Environment Configuration
Overview
Multimodal Video RAG follows the 12-Factor App methodology for configuration. All system behavior is controlled via environment variables, allowing for seamless transitions between local development, Docker containers, and cloud environments.
Core LLM Configuration
The system is designed to be provider-agnostic. You can choose between running models locally for maximum privacy or using cloud providers for higher performance.
LLM Provider Selection
Set the LLM_PROVIDER variable to determine which backend the LangGraph agent and vision modules use.
| Provider | Value | Description |
| :--- | :--- | :--- |
| Ollama | ollama | (Default) Local inference. Requires an Ollama instance with GPU access. |
| OpenRouter | openrouter | Cloud-based inference. Provides access to high-end models (Llama 3.1 405B, Claude 3.5, etc.). |
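The provider switch can be sketched as a small factory that maps LLM_PROVIDER to the settings each backend needs. This is an illustrative sketch only — `resolve_llm_backend` is a hypothetical helper, not part of the codebase; the variable names match the settings documented below.

```python
import os

def resolve_llm_backend() -> dict:
    """Map LLM_PROVIDER to backend connection settings (hypothetical helper;
    the real agent wiring may differ)."""
    provider = os.getenv("LLM_PROVIDER", "ollama").lower()
    if provider == "ollama":
        return {
            "provider": "ollama",
            # Service-name default so the backend container can reach Ollama.
            "base_url": os.getenv("OLLAMA_BASE_URL", "http://ollama:11434"),
        }
    if provider == "openrouter":
        key = os.getenv("OPENROUTER_API_KEY")
        if not key:
            # Fail fast: a missing key would otherwise surface as an opaque
            # HTTP error at request time.
            raise RuntimeError(
                "OPENROUTER_API_KEY is required when LLM_PROVIDER=openrouter"
            )
        return {"provider": "openrouter", "api_key": key}
    raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}")
```

Failing fast on a missing key keeps misconfiguration visible at startup rather than mid-request.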
Provider-Specific Settings
Ollama (Local)
If using Ollama, ensure the service is reachable from the backend container.
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://ollama:11434
Note: You must manually pull the required models inside the Ollama container as shown in the Quick Start guide.
OpenRouter (Cloud)
If using OpenRouter, an API key is required.
LLM_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-api-key-here
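A quick sanity check on the key format can catch copy/paste mistakes before any request is made. The check below is a heuristic sketch: it assumes keys follow the `sk-or-v1-` prefix shown in the example above, which may not hold for all OpenRouter key formats.

```python
def looks_like_openrouter_key(key: str) -> bool:
    """Heuristic format check: keys in this guide use the 'sk-or-v1-' prefix;
    anything else is a likely copy/paste mistake. Not an authoritative rule."""
    prefix = "sk-or-v1-"
    return key.startswith(prefix) and len(key) > len(prefix)
```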
Backend Infrastructure
These settings configure the FastAPI server and its connection to backing services like the vector database.
| Variable | Default | Description |
| :--- | :--- | :--- |
| APP_ENV | development | The environment mode (development, staging, production). |
| CHROMA_HOST | chromadb | The hostname for the ChromaDB vector store. |
| CHROMA_PORT | 8000 | The port for ChromaDB communication. |
| LOG_LEVEL | info | Logging verbosity (debug, info, warning, error). |
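The defaults in the table translate directly into environment lookups. A minimal sketch (the `backend_settings` function is hypothetical; only the variable names and defaults come from the table):

```python
import os

def backend_settings() -> dict:
    """Read the backend variables from the table above, applying their
    documented defaults when unset."""
    return {
        "app_env": os.getenv("APP_ENV", "development"),
        "chroma_host": os.getenv("CHROMA_HOST", "chromadb"),
        # Ports arrive as strings from the environment; coerce to int early.
        "chroma_port": int(os.getenv("CHROMA_PORT", "8000")),
        "log_level": os.getenv("LOG_LEVEL", "info"),
    }
```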
Ingestion & Processing
The ingestion pipeline uses specialized models for vision and speech-to-text.
ASR (Speech Recognition)
The system uses Faster-Whisper for high-speed transcription.
- Device Context: By default, the system attempts to use cuda. If no GPU is available, it will fall back to cpu (significantly slower).
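The cuda-to-cpu fallback can be expressed as a pure selection function. A sketch for illustration — the function name is hypothetical, and the real pipeline presumably queries the runtime (e.g., CTranslate2, which backs Faster-Whisper) for actual GPU availability:

```python
def select_asr_device(cuda_available: bool, requested: str = "cuda") -> str:
    """Pick the Faster-Whisper device: honor a cuda request only when a GPU
    is actually present; otherwise fall back to the (much slower) cpu path."""
    if requested == "cuda" and cuda_available:
        return "cuda"
    return "cpu"
```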
Vision LLM
The vision module generates descriptions for video frames.
- Local Model: Defaults to llava:7b.
- Visual Sampling: Controls how frequently frames are extracted for analysis.
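Fixed-interval sampling reduces to computing the timestamps at which frames are grabbed. A minimal sketch (illustrative only; the actual sampler, its interval variable, and its rounding behavior are not specified in this guide):

```python
def frame_timestamps(duration_s: float, interval_s: float) -> list[float]:
    """Timestamps (in seconds) at which frames would be extracted for the
    vision LLM, given a fixed sampling interval starting at t=0."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    ts, t = [], 0.0
    while t < duration_s:
        ts.append(round(t, 3))
        t += interval_s
    return ts
```

A larger interval means fewer frames, so ingestion cost trades off directly against visual coverage.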
Privacy & Security
To ensure data privacy during processing (especially when using cloud providers), the system integrates Microsoft Presidio for PII (Personally Identifiable Information) detection.
# Enable/Disable PII Anonymization
PII_DETECTION_ENABLED=true
# Entities to redact (e.g., PHONE_NUMBER, EMAIL_ADDRESS, PERSON)
PII_ENTITIES_TO_REDACT=["PHONE_NUMBER", "EMAIL_ADDRESS", "LOCATION"]
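Since PII_ENTITIES_TO_REDACT is written as a JSON list, it can be parsed with the standard library. A sketch, assuming a hypothetical `pii_config` helper (the entity names shown are standard Presidio recognizer types; how the codebase actually parses this variable is not specified here):

```python
import json
import os

def pii_config() -> tuple[bool, list[str]]:
    """Parse the privacy variables: a boolean enable flag plus a JSON-encoded
    list of Presidio entity names to redact."""
    enabled = os.getenv("PII_DETECTION_ENABLED", "true").lower() == "true"
    raw = os.getenv(
        "PII_ENTITIES_TO_REDACT",
        '["PHONE_NUMBER", "EMAIL_ADDRESS", "LOCATION"]',
    )
    return enabled, json.loads(raw)
```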
Environment Template (.env)
Create a .env file in the project root. You can use the template below for a standard local setup:
# --- Application Settings ---
APP_NAME="Multimodal Video RAG"
APP_ENV=development
DEBUG=true
# --- LLM Provider ---
# Options: ollama, openrouter
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OPENROUTER_API_KEY=
# --- Vector Store ---
CHROMA_HOST=localhost
CHROMA_PORT=8000
# --- Privacy ---
PII_DETECTION_ENABLED=true
# --- Frontend ---
NEXT_PUBLIC_API_URL=http://localhost:8000
Troubleshooting Configuration
- Connection Refused: If the backend cannot connect to Ollama or ChromaDB while running in Docker, ensure you are using the service name (e.g., http://ollama:11434) instead of localhost.
- Model Not Found: If using Ollama, ensure you have executed ollama pull for both the vision model (llava) and the instruct model (llama3.1).
- GPU Not Detected: Ensure the nvidia-container-toolkit is installed on your host machine and that the deploy.resources.reservations.devices section is present in your docker-compose file.