Setup and Installation
Flowcept can be installed in multiple ways, depending on your needs.
Default Installation
To install Flowcept with its basic dependencies from PyPI, run:
pip install flowcept
This installs the minimal Flowcept package, not including MongoDB, Redis, MCP, or any adapter-specific dependencies.
Installing Specific Adapters and Additional Dependencies
Flowcept integrates with several tools and services, but you should only install what you actually need. Good practice is to cherry-pick the extras relevant to your workflow instead of installing them all.
pip install "flowcept[mongo]" # MongoDB support
pip install "flowcept[mlflow]" # MLflow adapter
pip install "flowcept[dask]" # Dask adapter
pip install "flowcept[tensorboard]" # TensorBoard adapter
pip install "flowcept[kafka]" # Kafka message queue
pip install "flowcept[nvidia]" # NVIDIA GPU runtime capture
pip install "flowcept[telemetry]" # CPU/GPU/memory telemetry capture
pip install "flowcept[lmdb]" # LMDB lightweight database
pip install "flowcept[mqtt]" # MQTT support
pip install "flowcept[llm_agent]" # MCP agent, LangChain, Streamlit integration
pip install "flowcept[llm_google]" # Google GenAI + Flowcept agent support
pip install "flowcept[llm_agent_audio]" # MCP agent with audio enabled (TTS)
pip install "flowcept[analytics]" # Extra analytics (seaborn, plotly, scipy)
pip install "flowcept[dev]" # Developer dependencies (docs, tests, lint, etc.)
Installing with Common Runtime Bundle
pip install "flowcept[extras]"
The extras group is a convenience shortcut that bundles the most common runtime dependencies.
It is intended for users who want a fairly complete, but not maximal, Flowcept environment.
You might choose flowcept[extras] if:
You want Flowcept to run out-of-the-box with Redis, telemetry, and MongoDB
You prefer not to install each extra one by one
Warning
If you only need one of these features, install it individually.
Installing All Optional Dependencies at Once
Flowcept provides a combined all extra, but installing everything into a single environment is not recommended for users.
Many of these dependencies are unrelated and should not be mixed in the same runtime.
This option is only intended for Flowcept developers who need to test across all adapters and integrations.
pip install "flowcept[all]"
Installing from Source
To install Flowcept from the source repository:
git clone https://github.com/ORNL/flowcept.git
cd flowcept
pip install .
You can then install specific extras the same way as above:
pip install ".[optional_dependency_name]"
Replace optional_dependency_name with any of the extras listed above for a customized installation from source.
Setup
The Quick Start example works with just pip install flowcept; no extra setup is required.
For online queries or distributed capture, Flowcept relies on two optional components:
Message Queue (MQ) — message broker / pub-sub / data stream
Database (DB) — persistent storage for historical queries
Message Queue (MQ)
Required for anything beyond the Quick Start
Flowcept publishes provenance data to the MQ during workflow runs
Developers can subscribe with custom consumers (see simple consumer example)
You can monitor or print messages in motion using:
flowcept --stream-messages --print
Supported MQs:
Database (DB)
Optional, but required for:
- Persisting provenance beyond MQ memory/disk buffers
- Running complex analytical queries on historical data
Supported DBs:
Notes
Without a DB:
- Provenance remains in the MQ only (persistence not guaranteed)
- Complex historical queries are unavailable
Flowcept’s architecture is modular: other MQs and DBs (graph, relational, etc.) can be added in the future
Deployment examples for MQ and DB are provided in the deployment directory
Downloading and Starting External Services (MQ or DB)
Flowcept uses external services for message queues (MQ) and databases (DB). You can start them with Docker Compose, plain containers, or directly on your host.
Using Docker Compose (recommended)
We provide a Makefile with shortcuts:
Redis only (no DB):
make services
(LMDB can be used in this setup as a lightweight DB)
Redis + MongoDB:
make services-mongo
Kafka + MongoDB:
make services-kafka
Mofka only (no DB):
make services-mofka
To customize, edit the YAML files in deployment and run:
docker compose -f deployment/<compose-file>.yml up -d
Using Docker (without Compose)
See the deployment compose files for expected images and configurations.
You can adapt them to your environment and use standard docker pull / run / exec commands.
Running on the Host (no containers)
Install binaries for the service you need:
Start services normally (redis-server, mongod, kafka-server-start.sh, etc.).
Flowcept Settings File
Flowcept uses a settings file for configuration.
To create a minimal settings file (recommended), run:
flowcept --init-settings
This creates ~/.flowcept/settings.yaml.
To create a full settings file with all options, run:
flowcept --init-settings --full
This also writes to ~/.flowcept/settings.yaml.
Recommended pattern:
flowcept --init-settings --full -y
flowcept --config-profile full-online -y
Meaning:
flowcept --init-settings: minimal file from DEFAULT_SETTINGS
flowcept --init-settings --full: copy resources/sample_settings.yaml
flowcept --config-profile ...: apply a runtime overlay to the existing file
What You Can Configure
Message queue and database routes, ports, and paths
MCP agent ports and LLM API keys
Buffer sizes and flush settings
Telemetry capture settings
Instrumentation and PyTorch details
Log levels
Data observability adapters
And more (see example file)
Common profiles:
full-online: Redis MQ + Redis KV + Mongo + online flush
full-offline: offline flush + dump buffer + MQ/KV/DB disabled
mq-only: MQ only, no KV/Mongo/LMDB
mq-only-no-flush: MQ enabled, tasks accumulate locally and are bulk-published to MQ in a single end-of-run flush; also dumps to local JSONL; use with Flowcept(check_safe_stops=False)
full-telemetry: telemetry on except GPU
Adapter flags are additive:
flowcept --init-settings --dask -y
flowcept --init-settings --mlflow -y
flowcept --init-settings --tensorboard -y
Custom Settings File
Flowcept looks for its settings in the following order:
1. Environment variable FLOWCEPT_SETTINGS_PATH: if set, Flowcept will use this path
2. ~/.flowcept/settings.yaml: created by flowcept --init-settings
3. Default sample file: sample_settings.yaml
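The lookup order above can be sketched in a few lines of Python. This is only an illustration of the documented precedence, not Flowcept's actual loader: the function name resolve_settings_path, its parameters, and the returned labels are hypothetical.

```python
import os
from pathlib import Path


# Hypothetical helper for illustration; not part of Flowcept's API.
def resolve_settings_path(env=None, home=None, packaged_sample="sample_settings.yaml"):
    """Return (source_label, path) following the documented lookup order."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. An explicit path in FLOWCEPT_SETTINGS_PATH wins.
    explicit = env.get("FLOWCEPT_SETTINGS_PATH")
    if explicit:
        return "env", Path(explicit)

    # 2. The per-user file created by `flowcept --init-settings`.
    user_file = home / ".flowcept" / "settings.yaml"
    if user_file.exists():
        return "user", user_file

    # 3. Otherwise fall back to the packaged sample file.
    return "sample", Path(packaged_sample)
```

Calling resolve_settings_path() with no arguments reads the real environment and home directory; passing env and home explicitly makes the order easy to check in isolation.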
Environment Variables
Note
Precedence: Environment variables override values in
~/.flowcept/settings.yaml and packaged sample settings.
If FLOWCEPT_USE_DEFAULT=true, Flowcept runs in strict default mode:
external settings files and runtime env overrides (MQ/DB host/ports/toggles, etc.)
are ignored.
Short version:
settings file controls the normal behavior
profiles modify the settings file
environment variables can still override those values at runtime
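As a rough mental model of this layering, the sketch below merges three flat dictionaries in precedence order. It is a simplification under stated assumptions: real Flowcept settings are nested YAML, and the function name effective_settings and the keys used in the usage note are hypothetical, chosen only to show which layer wins.

```python
# Hypothetical illustration of the precedence described above;
# not Flowcept's actual loading code.
def effective_settings(defaults, file_settings, env_overrides, use_default=False):
    """Combine settings layers; later layers override earlier ones."""
    if use_default:
        # Strict default mode: external files and env overrides are ignored.
        return dict(defaults)

    merged = dict(defaults)
    # The settings file, including any profile already applied to it.
    merged.update(file_settings)
    # Environment variables still win at runtime.
    merged.update(env_overrides)
    return merged
```

For example, with defaults {"mq_host": "localhost"}, an empty settings file, and an environment override {"mq_host": "broker.example.org"}, the result uses broker.example.org unless use_default is True.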
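General
| Variable | Purpose / Default |
|---|---|
| FLOWCEPT_USE_DEFAULT | If true, Flowcept runs in strict default mode: external settings files and runtime env overrides are ignored. |
| FLOWCEPT_SETTINGS_PATH | Path to a YAML settings file. If unset, Flowcept uses ~/.flowcept/settings.yaml if it exists, otherwise the packaged sample settings. |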
Message Queue (MQ)
| Variable | Purpose / Default |
|---|---|
| MQ_ENABLED | Enable MQ publishing. Accepts string values. Recommended: true or false. |
| | MQ kind (e.g., |
| | Channel/topic name. Default |
| | MQ host. Default |
| | MQ port (int). Default |
| | Full connection URI. Overrides host/port if set. Default unset. |
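The "URI overrides host/port" rule can be pictured with the sketch below. The helper name resolve_endpoint and its parameters are hypothetical, and falling back to the separate host/port values when the URI omits one of them is an assumption of this sketch, not documented Flowcept behavior.

```python
from urllib.parse import urlparse


# Hypothetical helper for illustration; not part of Flowcept's API.
def resolve_endpoint(host, port, uri=None):
    """Return the (host, port) pair a client would connect to."""
    if uri:
        parsed = urlparse(uri)
        # Assumption: parts missing from the URI fall back to host/port.
        return parsed.hostname or host, parsed.port or int(port)
    # No URI set: use the separate host and port values, casting the port to int.
    return host, int(port)
```

With uri="redis://mq.example.org:6380", the separate host and port values are ignored; without a URI they are used directly.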
Key-Value DB (KVDB)
| Variable | Purpose / Default |
|---|---|
| | KV host. Default |
| | KV port (int). Default |
| | Full connection URI. Default unset. |
MongoDB
| Variable | Purpose / Default |
|---|---|
| MONGO_ENABLED | Enable MongoDB persistence. Parsed as boolean: case-insensitive comparison to "true". |
| | Full MongoDB URI. If set, overrides host/port. Default unset. |
| | Mongo host. Default |
| | Mongo port (int). Default |
LMDB
| Variable | Purpose / Default |
|---|---|
| LMDB_ENABLED | Enable LMDB persistence. Parsed as boolean: case-insensitive comparison to "true". |
| | Override the LMDB database directory. Default from |
Agent / MCP
| Variable | Purpose / Default |
|---|---|
| | Enable agent audio. String accepted. Interpreted truthy for |
| | MCP host. Default |
| | MCP port (int). Default |
Parsing Notes
Ports (*_PORT) are cast to integers.
MONGO_ENABLED and LMDB_ENABLED are parsed strictly as booleans using case-insensitive comparison to "true".
MQ_ENABLED is read as a string and used as-is; prefer true or false to avoid surprises when checking truthiness in Python.
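The difference between strict boolean parsing and plain string truthiness is easy to see in a short sketch. The helper names parse_bool and parse_port are hypothetical and only mirror the rules stated above; they are not Flowcept functions.

```python
# Hypothetical helpers mirroring the parsing rules described above.
def parse_bool(value):
    """Strict boolean: only the string "true" (in any letter case) is True."""
    return value.lower() == "true"


def parse_port(value):
    """Ports arrive as strings from the environment and are cast to int."""
    return int(value)


# Strict parsing, as described for MONGO_ENABLED and LMDB_ENABLED:
mongo_enabled = parse_bool("True")   # True
lmdb_enabled = parse_bool("yes")     # False: anything other than "true" is False

# A raw string, as described for MQ_ENABLED, behaves differently in Python:
mq_enabled_raw = "false"
surprising = bool(mq_enabled_raw)    # True, because any non-empty string is truthy
```

This is why an explicit comparison is safer than checking a raw string value with `if value:`.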