
Cogniverse - Self-Optimizing Content Intelligence Platform

Experience-Guided Multi-Agent System for Multi-Modal Understanding

Multi-agent AI platform for video, audio, image, and document understanding. Processes all content types using ColPali, VideoPrism, ColQwen, and LateOn embeddings with Vespa-backed retrieval. Agents coordinate via the A2A protocol with DSPy-powered reasoning, streaming responses, and Phoenix observability. 10-package UV workspace with multi-tenant isolation.

🎯 What Makes Cogniverse Different

  • 🧠 Self-Optimizing: Learns from every interaction using GEPA (Experience-Guided Policy Adaptation) - routing strategies improve continuously from real usage
  • 🎭 Multi-Modal Intelligence: Process any content type (video, audio, images, documents, text, dataframes) with unified understanding
  • 🤖 Multi-Agent Orchestration: A2A protocol-based coordination of specialized agents with DSPy 3.0 reasoning
  • 🔀 Cross-Modal Fusion: Intelligent combination of insights across modalities for richer understanding
  • ⚡ Production Performance: <500ms P95 latency at 500+ concurrent users with 9 Vespa ranking strategies
  • 🎯 Multiple SOTA Models: ColPali (frame-level), VideoPrism (global+temporal), ColQwen (multi-modal fusion)
  • 🏢 Multi-Tenant Ready: Complete schema-per-tenant isolation with independent Phoenix projects and memory
  • 📊 Full Observability: Comprehensive Phoenix telemetry with traces, experiments, and real-time dashboards
  • 🧪 Evaluation Framework: Provider-agnostic metrics with reference-free, visual LLM, and classical evaluators
  • 🏗️ Professional Architecture: 10-package layered structure (Foundation → Core → Implementation → Application)

🎬 Use Cases

For Individual Developers:

  • Build intelligent content search applications across any modality
  • Experiment with multiple state-of-the-art embedding models
  • Learn multi-agent AI architectures with production-quality code
  • Use locally with Ollama (no API costs)

For Researchers:

  • Run experiments with different embedding strategies and evaluate results
  • Optimize routing agents with synthetic data generation
  • Track all experiments with comprehensive Phoenix telemetry
  • Publish reproducible results with full observability

For Teams & Organizations:

  • Deploy multi-tenant SaaS applications with complete data isolation
  • Achieve production-scale performance (<500ms P95 at 500+ users)
  • Monitor and optimize with comprehensive dashboards
  • Scale from prototype to production with professional architecture

🚀 Quick Start

Prerequisites

  • Python 3.12+
  • 16GB+ RAM
  • CUDA-capable GPU (recommended for VideoPrism)
  • Docker for Vespa and Phoenix
  • uv package manager: pip install uv

Installation

# Clone repository
git clone https://github.com/amit-jain/cogniverse.git
cd cogniverse

# Install dependencies
uv sync

# Start infrastructure
cogniverse up  # Starts Vespa, Phoenix, Ollama via k3d

# Verify services
curl -s http://localhost:8080/ApplicationStatus  # Vespa
curl -s http://localhost:6006/health            # Phoenix

Basic Operations

1. Content Ingestion (All Modalities)

# Ingest videos with ColPali embeddings
uv run python scripts/run_ingestion.py \
    --video_dir data/videos \
    --profile video_colpali_smol500_mv_frame \
    --tenant default

# Multi-modal multi-profile ingestion (video, audio, images, documents)
uv run python scripts/run_ingestion.py \
    --content_dir data/content \
    --profiles video_colpali_smol500_mv_frame \
               video_videoprism_base_mv_chunk_30s \
               video_colqwen_omni_mv_chunk_30s \
    --tenant default

2. Multi-Modal Search

# Multi-agent intelligent search across all content
uv run python tests/comprehensive_video_query_test_v2.py \
    --profiles video_colpali_smol500_mv_frame \
    --test-multiple-strategies

# Direct API query (text, image, or multi-modal)
curl -X POST http://localhost:8000/api/v1/search \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: default" \
  -d '{"query": "machine learning tutorial", "modalities": ["video", "document", "image"]}'

3. Evaluation & Optimization

# Run Phoenix experiments
uv run python scripts/run_experiments_with_visualization.py \
    --dataset-name golden_eval_v1 \
    --profiles video_colpali_smol500_mv_frame \
    --test-multiple-strategies \
    --quality-evaluators

# Launch Phoenix dashboard
uv run streamlit run libs/dashboard/cogniverse_dashboard/app.py
# Open http://localhost:8501

πŸ“ UV Workspace Structure

cogniverse/
├── libs/                         # SDK Packages (UV workspace - 10 packages)
│   ├── sdk/                      # cogniverse_sdk (Foundation Layer)
│   │   └── cogniverse_sdk/
│   │       ├── interfaces/       # Backend interfaces
│   │       └── document.py       # Universal document model
│   ├── foundation/               # cogniverse_foundation (Foundation Layer)
│   │   └── cogniverse_foundation/
│   │       ├── config/           # Configuration base
│   │       └── telemetry/        # Telemetry interfaces
│   ├── core/                     # cogniverse_core (Core Layer)
│   │   └── cogniverse_core/
│   │       ├── agents/           # Agent base classes
│   │       ├── registries/       # Component registries
│   │       └── common/           # Shared utilities
│   ├── evaluation/               # cogniverse_evaluation (Core Layer)
│   │   └── cogniverse_evaluation/
│   │       ├── experiments/      # Experiment management
│   │       ├── metrics/          # Provider-agnostic metrics
│   │       └── datasets/         # Dataset handling
│   ├── telemetry-phoenix/        # cogniverse_telemetry_phoenix (Core Layer - Plugin)
│   │   └── cogniverse_telemetry_phoenix/
│   │       ├── provider.py       # Phoenix telemetry provider
│   │       └── evaluation/       # Phoenix evaluation provider
│   ├── agents/                   # cogniverse_agents (Implementation Layer)
│   │   └── cogniverse_agents/
│   │       ├── routing/          # DSPy routing & optimization
│   │       ├── search/           # Multi-modal search & reranking
│   │       └── tools/            # A2A tools
│   ├── vespa/                    # cogniverse_vespa (Implementation Layer)
│   │   └── cogniverse_vespa/
│   │       ├── backends/         # Vespa backend (tenant schemas)
│   │       └── schema/           # Schema management
│   ├── synthetic/                # cogniverse_synthetic (Implementation Layer)
│   │   └── cogniverse_synthetic/
│   │       ├── generators/       # Synthetic data generators
│   │       └── service.py        # Synthetic data service
│   ├── runtime/                  # cogniverse_runtime (Application Layer)
│   │   └── cogniverse_runtime/
│   │       ├── server/           # FastAPI server
│   │       └── ingestion/        # Video processing pipeline
│   └── dashboard/                # cogniverse_dashboard (Application Layer)
│       └── cogniverse_dashboard/
│           ├── phoenix/          # Phoenix dashboards
│           └── streamlit/        # Streamlit UI
├── docs/                         # Comprehensive documentation
│   ├── architecture/             # System architecture
│   ├── modules/                  # Module documentation
│   ├── operations/               # Deployment & configuration
│   ├── development/              # Development guides
│   ├── diagrams/                 # Architecture diagrams
│   └── testing/                  # Testing guides
├── scripts/                      # Operational scripts
├── tests/                        # Test suite (by package)
├── configs/                      # Configuration & schemas
├── pyproject.toml                # Workspace root
└── uv.lock                       # Unified lockfile

Package Dependencies (Layered Architecture):

Foundation Layer:
  cogniverse_sdk (zero internal dependencies)
    ↓
  cogniverse_foundation (depends on sdk)

Core Layer:
  cogniverse_core (depends on sdk, foundation, evaluation)
  cogniverse_evaluation (depends on sdk, foundation)
  cogniverse_telemetry_phoenix (plugin - depends on core, evaluation)

Implementation Layer:
  cogniverse_agents (depends on core)
  cogniverse_vespa (depends on core)
  cogniverse_synthetic (depends on core)

Application Layer:
  cogniverse_runtime (depends on core, agents, vespa, synthetic)
  cogniverse_dashboard (depends on core, evaluation)

πŸ—οΈ Architecture

Multi-Agent Orchestration

┌──────────────────┐
│  Composing Agent │ ← ADK-based orchestrator
└────────┬─────────┘
         │ A2A Protocol
         ├──────────────┬─────────────┬──────────────┐
         ▼              ▼             ▼              ▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│Video Search │ │   Memory    │ │   Routing   │ │ Evaluation  │
│   Agent     │ │   Agent     │ │  Optimizer  │ │   Agent     │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘

Embedding Models

Model            Type          Dimensions  Use Case
ColPali SmolVLM  Frame-level   768         Visual document search
VideoPrism Base  Global video  768         Semantic video understanding
VideoPrism LVT   Temporal      768/1024    Action/motion search
ColQwen2 Omni    Multi-modal   768         Text+visual fusion

Vespa Ranking Strategies

  1. bm25_only - Text-only BM25
  2. float_float - Dense embeddings only
  3. binary_binary - Binary embeddings only
  4. hybrid_float_bm25 - BM25 + dense (recommended)
  5. phased - Two-phase ranking
  6. float_binary - Dense with binary fallback
  7. binary_bm25 - Binary + BM25
  8. bm25_float_rerank - BM25 then dense rerank
  9. bm25_binary_rerank - BM25 then binary rerank
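
When debugging a single strategy, it can help to query Vespa directly. A minimal sketch, assuming the strategy names above are exposed as Vespa rank profiles on the per-tenant schema (the search agents normally handle this for you; schema and field names here are illustrative):

import requests

# Hypothetical example: default tenant, ColPali frame profile.
SCHEMA = "video_colpali_smol500_mv_frame_default"

def search_with_strategy(query_text: str, strategy: str = "hybrid_float_bm25") -> list:
    body = {
        "yql": f"select * from {SCHEMA} where userQuery()",
        "query": query_text,
        "ranking.profile": strategy,  # one of the nine strategies above
        "hits": 10,
    }
    resp = requests.post("http://localhost:8080/search/", json=body, timeout=10)
    resp.raise_for_status()
    return resp.json()["root"].get("children", [])

for hit in search_with_strategy("machine learning tutorial"):
    print(hit["relevance"], hit["fields"])  # field names depend on the schema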

🔧 Configuration

Multi-Tenant Setup

from cogniverse_foundation.config.unified_config import SystemConfig
from cogniverse_agents.video_agent_refactored import VideoSearchAgent
from cogniverse_foundation.config.utils import create_default_config_manager
from cogniverse_core.schemas.filesystem_loader import FilesystemSchemaLoader
from pathlib import Path

# Configure tenant with complete isolation
config = SystemConfig(
    tenant_id="acme_corp",
    llm_model="gpt-4",
    backend_url="http://localhost",
    backend_port=8080,
    telemetry_url="http://localhost:6006",
)

# Create agent β€” profile-agnostic, tenant-agnostic at construction
config_manager = create_default_config_manager()
schema_loader = FilesystemSchemaLoader(Path("configs/schemas"))
agent = VideoSearchAgent(
    config_manager=config_manager,
    schema_loader=schema_loader,
)

# Search with per-request profile and tenant_id
# Agent automatically targets schema: video_colpali_smol500_mv_frame_acme_corp
results = agent.search(
    query="machine learning tutorial",
    profile="video_colpali_smol500_mv_frame",
    tenant_id="acme_corp",
    top_k=10,
)
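
The schema name in the comment above suggests a simple naming convention behind tenant isolation; a one-line sketch of that mapping (inferred from the example, not a documented API):

def tenant_schema_name(profile: str, tenant_id: str) -> str:
    # ("video_colpali_smol500_mv_frame", "acme_corp")
    #   -> "video_colpali_smol500_mv_frame_acme_corp"
    return f"{profile}_{tenant_id}"

Because each tenant gets its own schema, documents and queries for one tenant never touch another tenant's index.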

DSPy Optimization

from cogniverse_agents.routing.config import RoutingConfig
from cogniverse_agents.routing.optimization_orchestrator import OptimizationOrchestrator

# Configure GEPA optimizer for tenant
routing_config = RoutingConfig(
    tenant_id="acme_corp",
    optimizer_type="GEPA",
    experience_buffer_size=10000,
    learning_rate=0.001,
    update_interval=300  # 5 minutes
)

orchestrator = OptimizationOrchestrator(config=routing_config)
results = orchestrator.run_optimization()

📊 Monitoring & Evaluation

Phoenix Dashboard

Access comprehensive telemetry at http://localhost:8501:

  • Traces: Request flow visualization
  • Experiments: A/B testing results
  • Metrics: Performance analytics
  • Memory: Context tracking
  • Configuration: Live config management

Evaluation Metrics

  • Reference-Free: Quality, Diversity, Distribution scores
  • Visual LLM: LLaVA/GPT-4V visual relevance
  • Classical: MRR, NDCG, Precision@k, Recall@k
  • Phoenix Experiments: Automatic tracking and comparison
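
The classical metrics have standard definitions; a self-contained sketch for reference (these helpers are illustrative, not the evaluation package's API):

from math import log2

def mean_reciprocal_rank(runs: list[list[bool]]) -> float:
    """Each inner list flags relevance of hits in rank order for one query."""
    total = 0.0
    for hits in runs:
        total += next((1.0 / (i + 1) for i, rel in enumerate(hits) if rel), 0.0)
    return total / len(runs)

def precision_at_k(hits: list[bool], k: int) -> float:
    """Fraction of the top-k hits that are relevant."""
    return sum(hits[:k]) / k

def ndcg_at_k(gains: list[float], k: int) -> float:
    """gains are graded relevance scores in rank order."""
    dcg = sum(g / log2(i + 2) for i, g in enumerate(gains[:k]))
    ideal = sum(g / log2(i + 2) for i, g in enumerate(sorted(gains, reverse=True)[:k]))
    return dcg / ideal if ideal else 0.0

print(mean_reciprocal_rank([[False, True], [True]]))  # (1/2 + 1) / 2 = 0.75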

🧪 Testing

# Run full test suite (30 min timeout for integration tests)
JAX_PLATFORM_NAME=cpu uv run pytest --timeout=1800

# Unit tests only
JAX_PLATFORM_NAME=cpu uv run pytest tests/unit/

# Integration tests
JAX_PLATFORM_NAME=cpu uv run pytest tests/integration/

# Specific component
JAX_PLATFORM_NAME=cpu uv run pytest tests/agents/ -v

📚 Documentation

  • Architecture: docs/architecture/
  • Operations & Deployment: docs/operations/
  • Development: docs/development/
  • Module Documentation: docs/modules/
  • Diagrams: docs/diagrams/

🚀 Production Deployment

Unified Deployment

# Start all services via k3d/Helm
cogniverse up

# Check status
cogniverse status

Modal (Serverless)

# Deploy to Modal
modal deploy src/modal/app.py

# Test endpoint
curl https://your-app.modal.run/search \
  -H "X-Tenant-ID: default" \
  -d '{"query": "tutorial"}'

πŸ” Security

  • Multi-tenant isolation: Schema-per-tenant with JWT validation
  • Rate limiting: Per-tenant QPS limits
  • Authentication: JWT/API key support
  • Audit logging: All operations tracked in Phoenix
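
To illustrate how tenant scoping and JWT validation fit together, here is a minimal FastAPI dependency sketch. Every name in it is hypothetical; this is not the runtime's actual middleware:

import jwt  # PyJWT
from fastapi import Depends, FastAPI, Header, HTTPException

app = FastAPI()
JWT_SECRET = "change-me"  # illustrative shared secret

def require_tenant(
    x_tenant_id: str = Header(...),
    authorization: str = Header(...),
) -> str:
    """Reject requests whose JWT tenant claim disagrees with X-Tenant-ID."""
    token = authorization.removeprefix("Bearer ").strip()
    try:
        claims = jwt.decode(token, JWT_SECRET, algorithms=["HS256"])
    except jwt.PyJWTError:
        raise HTTPException(status_code=401, detail="invalid token")
    if claims.get("tenant_id") != x_tenant_id:
        raise HTTPException(status_code=403, detail="tenant mismatch")
    return x_tenant_id

@app.post("/api/v1/search")
def search(tenant_id: str = Depends(require_tenant)) -> dict:
    # From here on, every query is scoped to the tenant's own schema.
    return {"tenant": tenant_id, "results": []}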

🎯 Performance Targets

Metric             Target         Current
Query Latency P95  < 500ms        450ms
Ingestion Speed    10 videos/min  12 videos/min
Concurrent Users   500            600
Cache Hit Rate     > 40%          45%
Routing Accuracy   > 90%          92%

🤝 Contributing

See the Developer Guide for detailed contribution guidelines.

Quick Reference

Code Standards:

  • Use type hints for all function signatures
  • Add docstrings to public functions (Google style)
  • Follow PEP 8 with ruff for linting
  • Use uv run for all Python commands

Commit Standards:

  • Use imperative mood: Add, Fix, Update, Refactor, Remove
  • Subject line: WHAT changed (under 72 chars)
  • Body: WHY the change was needed (for non-trivial changes)

Pre-Commit Checklist:

  • Run uv run pytest and ensure 100% pass rate
  • Run uv run ruff check with no errors
  • Update documentation for significant changes
  • Never commit failing tests or skip markers

πŸ“ License

[License information here]

🆘 Support

