Experience-Guided Multi-Agent System for Multi-Modal Understanding
Multi-agent AI platform for video, audio, image, and document understanding. Processes all content types using ColPali, VideoPrism, ColQwen, and LateOn embeddings with Vespa-backed retrieval. Agents coordinate via A2A protocol with DSPy-powered reasoning, streaming responses, and Phoenix observability. 10-package UV workspace with multi-tenant isolation.
- Self-Optimizing: Learns from every interaction using GEPA (Experience-Guided Policy Adaptation) - routing strategies improve continuously from real usage
- Multi-Modal Intelligence: Process any content type (video, audio, images, documents, text, dataframes) with unified understanding
- Multi-Agent Orchestration: A2A protocol-based coordination of specialized DSPy 3.0 agents working together
- Cross-Modal Fusion: Intelligent combination of insights across modalities for richer understanding
- Production Performance: <500ms P95 latency at 500+ concurrent users with 9 Vespa ranking strategies
- Multiple SOTA Models: ColPali (frame-level), VideoPrism (global+temporal), ColQwen (multi-modal fusion)
- Multi-Tenant Ready: Complete schema-per-tenant isolation with independent Phoenix projects and memory
- Full Observability: Comprehensive Phoenix telemetry with traces, experiments, and real-time dashboards
- Evaluation Framework: Provider-agnostic metrics with reference-free, visual LLM, and classical evaluators
- Professional Architecture: 10-package layered structure (Foundation → Core → Implementation → Application)
For Individual Developers:
- Build intelligent content search applications across any modality
- Experiment with multiple state-of-the-art embedding models
- Learn multi-agent AI architectures with production-quality code
- Use locally with Ollama (no API costs)
For Researchers:
- Run experiments with different embedding strategies and evaluate results
- Optimize routing agents with synthetic data generation
- Track all experiments with comprehensive Phoenix telemetry
- Publish reproducible results with full observability
For Teams & Organizations:
- Deploy multi-tenant SaaS applications with complete data isolation
- Achieve production-scale performance (<500ms P95 at 500+ users)
- Monitor and optimize with comprehensive dashboards
- Scale from prototype to production with professional architecture
- Python 3.12+
- 16GB+ RAM
- CUDA-capable GPU (recommended for VideoPrism)
- Docker for Vespa and Phoenix
- uv package manager: `pip install uv`
```bash
# Clone repository
git clone <repo>
cd cogniverse

# Install dependencies
uv sync

# Start infrastructure
cogniverse up  # Starts Vespa, Phoenix, Ollama via k3d

# Verify services
curl -s http://localhost:8080/ApplicationStatus  # Vespa
curl -s http://localhost:6006/health             # Phoenix
```

```bash
# Ingest videos with ColPali embeddings
uv run python scripts/run_ingestion.py \
  --video_dir data/videos \
  --profile video_colpali_smol500_mv_frame \
  --tenant default

# Multi-modal multi-profile ingestion (video, audio, images, documents)
uv run python scripts/run_ingestion.py \
  --content_dir data/content \
  --profiles video_colpali_smol500_mv_frame \
             video_videoprism_base_mv_chunk_30s \
             video_colqwen_omni_mv_chunk_30s \
  --tenant default
```

```bash
# Multi-agent intelligent search across all content
uv run python tests/comprehensive_video_query_test_v2.py \
--profiles video_colpali_smol500_mv_frame \
--test-multiple-strategies
# Direct API query (text, image, or multi-modal)
curl -X POST http://localhost:8000/api/v1/search \
-H "Content-Type: application/json" \
-H "X-Tenant-ID: default" \
-d '{"query": "machine learning tutorial", "modalities": ["video", "document", "image"]}'# Run Phoenix experiments
uv run python scripts/run_experiments_with_visualization.py \
--dataset-name golden_eval_v1 \
--profiles video_colpali_smol500_mv_frame \
--test-multiple-strategies \
--quality-evaluators
# Launch Phoenix dashboard
uv run streamlit run libs/dashboard/cogniverse_dashboard/app.py
# Open http://localhost:8501cogniverse/
├── libs/                      # SDK Packages (UV workspace - 10 packages)
│   ├── sdk/                   # cogniverse_sdk (Foundation Layer)
│   │   └── cogniverse_sdk/
│   │       ├── interfaces/    # Backend interfaces
│   │       └── document.py    # Universal document model
│   ├── foundation/            # cogniverse_foundation (Foundation Layer)
│   │   └── cogniverse_foundation/
│   │       ├── config/        # Configuration base
│   │       └── telemetry/     # Telemetry interfaces
│   ├── core/                  # cogniverse_core (Core Layer)
│   │   └── cogniverse_core/
│   │       ├── agents/        # Agent base classes
│   │       ├── registries/    # Component registries
│   │       └── common/        # Shared utilities
│   ├── evaluation/            # cogniverse_evaluation (Core Layer)
│   │   └── cogniverse_evaluation/
│   │       ├── experiments/   # Experiment management
│   │       ├── metrics/       # Provider-agnostic metrics
│   │       └── datasets/      # Dataset handling
│   ├── telemetry-phoenix/     # cogniverse_telemetry_phoenix (Core Layer - Plugin)
│   │   └── cogniverse_telemetry_phoenix/
│   │       ├── provider.py    # Phoenix telemetry provider
│   │       └── evaluation/    # Phoenix evaluation provider
│   ├── agents/                # cogniverse_agents (Implementation Layer)
│   │   └── cogniverse_agents/
│   │       ├── routing/       # DSPy routing & optimization
│   │       ├── search/        # Multi-modal search & reranking
│   │       └── tools/         # A2A tools
│   ├── vespa/                 # cogniverse_vespa (Implementation Layer)
│   │   └── cogniverse_vespa/
│   │       ├── backends/      # Vespa backend (tenant schemas)
│   │       └── schema/        # Schema management
│   ├── synthetic/             # cogniverse_synthetic (Implementation Layer)
│   │   └── cogniverse_synthetic/
│   │       ├── generators/    # Synthetic data generators
│   │       └── service.py     # Synthetic data service
│   ├── runtime/               # cogniverse_runtime (Application Layer)
│   │   └── cogniverse_runtime/
│   │       ├── server/        # FastAPI server
│   │       └── ingestion/     # Video processing pipeline
│   └── dashboard/             # cogniverse_dashboard (Application Layer)
│       └── cogniverse_dashboard/
│           ├── phoenix/       # Phoenix dashboards
│           └── streamlit/     # Streamlit UI
├── docs/                      # Comprehensive documentation
│   ├── architecture/          # System architecture
│   ├── modules/               # Module documentation
│   ├── operations/            # Deployment & configuration
│   ├── development/           # Development guides
│   ├── diagrams/              # Architecture diagrams
│   └── testing/               # Testing guides
├── scripts/                   # Operational scripts
├── tests/                     # Test suite (by package)
├── configs/                   # Configuration & schemas
├── pyproject.toml             # Workspace root
└── uv.lock                    # Unified lockfile
```
Package Dependencies (Layered Architecture):

Foundation Layer:
- cogniverse_sdk (zero internal dependencies)
- cogniverse_foundation (depends on sdk)

Core Layer:
- cogniverse_core (depends on sdk, foundation, evaluation)
- cogniverse_evaluation (depends on sdk, foundation)
- cogniverse_telemetry_phoenix (plugin - depends on core, evaluation)

Implementation Layer:
- cogniverse_agents (depends on core)
- cogniverse_vespa (depends on core)
- cogniverse_synthetic (depends on core)

Application Layer:
- cogniverse_runtime (depends on core, agents, vespa, synthetic)
- cogniverse_dashboard (depends on core, evaluation)
```
                 ┌───────────────────┐
                 │  Composing Agent  │  ← ADK-based orchestrator
                 └─────────┬─────────┘
                           │ A2A Protocol
        ┌──────────────────┼──────────────────┬──────────────────┐
        ▼                  ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│ Video Search  │  │    Memory     │  │    Routing    │  │  Evaluation   │
│    Agent      │  │    Agent      │  │   Optimizer   │  │    Agent      │
└───────────────┘  └───────────────┘  └───────────────┘  └───────────────┘
```
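In this topology the Composing Agent fans each request out over A2A and fuses the replies. A minimal sketch of that dispatch pattern, with hypothetical stub agents standing in for the real classes in cogniverse_agents (names and message shape here are illustrative assumptions, not the project's A2A types):

```python
import asyncio
from dataclasses import dataclass

# Hypothetical task/agent shapes for illustration only - the real A2A
# protocol types and agent implementations live in cogniverse_agents.
@dataclass
class A2ATask:
    tenant_id: str
    query: str

class VideoSearchStub:
    async def handle(self, task: A2ATask) -> dict:
        # A real agent would run multi-modal retrieval against Vespa here.
        return {"agent": "video_search", "hits": []}

class MemoryStub:
    async def handle(self, task: A2ATask) -> dict:
        # A real agent would fetch per-tenant conversational memory here.
        return {"agent": "memory", "context": []}

class ComposingAgent:
    """Fan a task out to specialized agents and fuse their answers."""

    def __init__(self, agents):
        self.agents = agents

    async def run(self, task: A2ATask) -> list[dict]:
        # Await all downstream agents concurrently, mirroring the
        # one-to-many A2A dispatch in the diagram above.
        return list(await asyncio.gather(*(a.handle(task) for a in self.agents)))

async def main():
    composer = ComposingAgent([VideoSearchStub(), MemoryStub()])
    print(await composer.run(A2ATask(tenant_id="default", query="ml tutorial")))

asyncio.run(main())
```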
| Model | Type | Dimensions | Use Case |
|---|---|---|---|
| ColPali SmolVLM | Frame-level | 768 | Visual document search |
| VideoPrism Base | Global video | 768 | Semantic video understanding |
| VideoPrism LVT | Temporal | 768/1024 | Action/motion search |
| ColQwen2 Omni | Multi-modal | 768 | Text+visual fusion |
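Frame-level models such as ColPali keep one vector per query token and many vectors per frame, then score with late interaction (MaxSim) rather than a single cosine similarity. A self-contained numpy sketch of that scoring over toy data (dimensions from the table; illustrative, not the project's retrieval code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy late-interaction (MaxSim) scoring as used by ColBERT/ColPali-style
# models: one 768-d vector per query token, many per document frame.
query_tokens = rng.normal(size=(8, 768))        # 8 query token embeddings
frame_patches = rng.normal(size=(4, 256, 768))  # 4 frames x 256 patches each

def maxsim(query: np.ndarray, patches: np.ndarray) -> float:
    # For every query token, take its best-matching patch, then sum.
    sims = query @ patches.T  # (tokens, patches)
    return float(sims.max(axis=1).sum())

# Score each frame independently; a global model (VideoPrism Base) would
# instead compare one pooled 768-d video vector against one query vector.
scores = [maxsim(query_tokens, frame) for frame in frame_patches]
best_frame = int(np.argmax(scores))
print(f"best frame: {best_frame}, score: {scores[best_frame]:.2f}")
```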
- bm25_only - Text-only BM25
- float_float - Dense embeddings only
- binary_binary - Binary embeddings only
- hybrid_float_bm25 - BM25 + dense (recommended)
- phased - Two-phase ranking
- float_binary - Dense with binary fallback
- binary_bm25 - Binary + BM25
- bm25_float_rerank - BM25 then dense rerank
- bm25_binary_rerank - BM25 then binary rerank
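Each strategy corresponds to a Vespa rank profile that can be selected per query. A minimal sketch using Vespa's standard HTTP search API; the document source name below is an illustrative assumption following the tenant-suffix pattern shown in the multi-tenant example later:

```python
import requests

# Select a rank profile per query via Vespa's standard search API.
# The schema/source name is illustrative (profile + "_" + tenant_id).
response = requests.post(
    "http://localhost:8080/search/",
    json={
        "yql": "select * from video_colpali_smol500_mv_frame_default where userQuery()",
        "query": "machine learning tutorial",
        "ranking": "hybrid_float_bm25",  # any of the nine strategies above
        "hits": 10,
    },
    timeout=10,
)
response.raise_for_status()
for hit in response.json().get("root", {}).get("children", []):
    print(hit.get("relevance"), hit.get("id"))
```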
```python
from pathlib import Path

from cogniverse_foundation.config.unified_config import SystemConfig
from cogniverse_foundation.config.utils import create_default_config_manager
from cogniverse_core.schemas.filesystem_loader import FilesystemSchemaLoader
from cogniverse_agents.video_agent_refactored import VideoSearchAgent

# Configure tenant with complete isolation
config = SystemConfig(
    tenant_id="acme_corp",
    llm_model="gpt-4",
    backend_url="http://localhost",
    backend_port=8080,
    telemetry_url="http://localhost:6006",
)

# Create agent - profile-agnostic and tenant-agnostic at construction
config_manager = create_default_config_manager()
schema_loader = FilesystemSchemaLoader(Path("configs/schemas"))
agent = VideoSearchAgent(
    config_manager=config_manager,
    schema_loader=schema_loader,
)

# Search with per-request profile and tenant_id
# Agent automatically targets schema: video_colpali_smol500_mv_frame_acme_corp
results = agent.search(
    query="machine learning tutorial",
    profile="video_colpali_smol500_mv_frame",
    tenant_id="acme_corp",
    top_k=10,
)
```

```python
from cogniverse_agents.routing.config import RoutingConfig
from cogniverse_agents.routing.optimization_orchestrator import OptimizationOrchestrator
# Configure GEPA optimizer for tenant
routing_config = RoutingConfig(
    tenant_id="acme_corp",
    optimizer_type="GEPA",
    experience_buffer_size=10000,
    learning_rate=0.001,
    update_interval=300,  # 5 minutes
)

orchestrator = OptimizationOrchestrator(config=routing_config)
results = orchestrator.run_optimization()
```

Access comprehensive telemetry at http://localhost:8501:
- Traces: Request flow visualization
- Experiments: A/B testing results
- Metrics: Performance analytics
- Memory: Context tracking
- Configuration: Live config management
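Traces can also be emitted from your own code. A minimal sketch using the standard arize-phoenix-otel setup; Cogniverse wires telemetry through the cogniverse_telemetry_phoenix plugin, so the registration details and attribute names here are assumptions:

```python
from phoenix.otel import register

# Standard Phoenix OpenTelemetry registration (arize-phoenix-otel package).
# Project name and attributes below are illustrative assumptions.
tracer_provider = register(
    project_name="default",  # per-tenant Phoenix project
    endpoint="http://localhost:6006/v1/traces",
)
tracer = tracer_provider.get_tracer(__name__)

with tracer.start_as_current_span("search.request") as span:
    span.set_attribute("tenant.id", "default")
    span.set_attribute("profile", "video_colpali_smol500_mv_frame")
    # ...run the query here; the span appears under Traces in the dashboard
```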
- Reference-Free: Quality, Diversity, Distribution scores
- Visual LLM: LLaVA/GPT-4V visual relevance
- Classical: MRR, NDCG, Precision@k, Recall@k
- Phoenix Experiments: Automatic tracking and comparison
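For reference, the classical metrics over binary relevance judgments; a self-contained illustrative implementation, not the project's evaluator code:

```python
import math

def precision_at_k(ranked_ids: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the top-k results that are relevant."""
    return sum(1 for d in ranked_ids[:k] if d in relevant) / k

def recall_at_k(ranked_ids: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of all relevant items retrieved in the top-k."""
    return sum(1 for d in ranked_ids[:k] if d in relevant) / len(relevant)

def mrr(ranked_ids: list[str], relevant: set[str]) -> float:
    """Reciprocal rank of the first relevant hit (0 if none)."""
    for rank, d in enumerate(ranked_ids, start=1):
        if d in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked_ids: list[str], relevant: set[str], k: int = 10) -> float:
    """Binary-relevance NDCG: DCG of the ranking over DCG of the ideal ranking."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, d in enumerate(ranked_ids[:k], start=1) if d in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0

ranked = ["v3", "v1", "v9", "v2"]
relevant = {"v1", "v2"}
print(mrr(ranked, relevant), precision_at_k(ranked, relevant, k=4),
      recall_at_k(ranked, relevant, k=4), ndcg_at_k(ranked, relevant, k=4))
```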
```bash
# Run full test suite (30 min timeout for integration tests)
JAX_PLATFORM_NAME=cpu uv run pytest --timeout=1800

# Unit tests only
JAX_PLATFORM_NAME=cpu uv run pytest tests/unit/

# Integration tests
JAX_PLATFORM_NAME=cpu uv run pytest tests/integration/

# Specific component
JAX_PLATFORM_NAME=cpu uv run pytest tests/agents/ -v
```

- Architecture Overview - System design and multi-tenant architecture
- SDK Architecture - UV workspace and 10-package layered architecture
- Multi-Tenant Architecture - Complete tenant isolation patterns
- System Flows - 20+ architectural diagrams
- Setup & Installation - UV workspace installation
- Configuration Guide - Multi-tenant configuration
- Deployment Guide - Docker, Modal, Kubernetes
- Multi-Tenant Operations - Tenant lifecycle management
- Package Development - SDK package workflows
- Scripts & Operations - Operational scripts
- Testing Guide - SDK and multi-tenant testing
- Agents - Agent implementations
- Routing - Query routing and optimization
- Ingestion - Video processing pipeline
- Search & Reranking - Multi-modal search
- Telemetry - Phoenix integration
- Evaluation - Experiment tracking
- Backends - Vespa integration
- Common - Utilities and cache
```bash
# Start all services via k3d/Helm
cogniverse up

# Check status
cogniverse status
```

```bash
# Deploy to Modal
modal deploy src/modal/app.py

# Test endpoint
curl https://your-app.modal.run/search \
  -H "X-Tenant-ID: default" \
  -d '{"query": "tutorial"}'
```

- Multi-tenant isolation: Schema-per-tenant with JWT validation (see the sketch after this list)
- Rate limiting: Per-tenant QPS limits
- Authentication: JWT/API key support
- Audit logging: All operations tracked in Phoenix
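A minimal sketch of that tenant check as a FastAPI dependency using PyJWT; the claim names, secret handling, and route wiring are assumptions rather than the project's actual middleware:

```python
import jwt  # PyJWT
from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()
SECRET = "change-me"  # assumption: real deployments use a managed signing key

def require_tenant(
    x_tenant_id: str = Header(...),
    authorization: str = Header(...),
) -> str:
    # Hypothetical check: the JWT must carry a tenant claim matching the
    # X-Tenant-ID header before any schema-per-tenant query is allowed.
    token = authorization.removeprefix("Bearer ").strip()
    try:
        claims = jwt.decode(token, SECRET, algorithms=["HS256"])
    except jwt.InvalidTokenError:
        raise HTTPException(status_code=401, detail="invalid token")
    if claims.get("tenant_id") != x_tenant_id:
        raise HTTPException(status_code=403, detail="tenant mismatch")
    return x_tenant_id

@app.post("/api/v1/search")
def search(tenant_id: str = Depends(require_tenant)):
    # Downstream queries target only the tenant-suffixed schema.
    return {"tenant": tenant_id}
```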
| Metric | Target | Current |
|---|---|---|
| Query Latency P95 | < 500ms | 450ms |
| Ingestion Speed | 10 videos/min | 12 videos/min |
| Concurrent Users | 500 | 600 |
| Cache Hit Rate | > 40% | 45% |
| Routing Accuracy | > 90% | 92% |
See the Developer Guide for detailed contribution guidelines.
Code Standards:
- Use type hints for all function signatures
- Add docstrings to public functions (Google style)
- Follow PEP 8 with `ruff` for linting
- Use `uv run` for all Python commands

Commit Standards:
- Use imperative mood: `Add`, `Fix`, `Update`, `Refactor`, `Remove`
- Subject line: WHAT changed (under 72 chars)
- Body: WHY the change was needed (for non-trivial changes)

Pre-Commit Checklist:
- Run `uv run pytest` and ensure 100% pass rate
- Run `uv run ruff check` with no errors
- Update documentation for significant changes
- Never commit failing tests or skip markers
[License information here]
- GitHub Issues: Report bugs
- Documentation: Read the docs
- Phoenix Dashboard: http://localhost:8501