ContextScope Eval

Evaluate, trace, and visualize how AI agents share and use context in multi-agent recommendation systems.

Overview

ContextScope Eval is an evaluation framework that measures how effectively autonomous agents pass and use information through multi-agent pipelines. Built for the MongoDB hackathon, it demonstrates context transmission quality using a Netflix-like movie recommendation system with the Mflix sample dataset.

Key Innovation: ContextScope Eval measures information flow quality, that is, how context survives, transforms, and degrades as it moves through agents.
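To make "survives" and "degrades" concrete, here is a minimal sketch of how fidelity and drift could be scored for a single handoff. It uses plain token-set overlap, which is an assumption chosen for illustration and not necessarily the metric the project implements; the function names `tokenize`, `fidelity`, and `drift` are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def fidelity(source: str, received: str) -> float:
    """Fraction of source tokens that still appear in the received context."""
    src = tokenize(source)
    if not src:
        return 1.0  # nothing to lose, so nothing was lost
    return len(src & tokenize(received)) / len(src)


def drift(source: str, received: str) -> float:
    """Fraction of received tokens that were not present in the source."""
    rec = tokenize(received)
    if not rec:
        return 0.0  # nothing was added
    return len(rec - tokenize(source)) / len(rec)


# Hypothetical context before and after one agent handoff.
sent = "user likes sci-fi thrillers from the 1990s"
got = "user likes thrillers and romantic comedies"
print(f"fidelity={fidelity(sent, got):.3f} drift={drift(sent, got):.3f}")
```

A token-overlap score is cheap and deterministic, which makes it a reasonable baseline; an embedding-based or judge-model score would capture paraphrases that this sketch treats as loss.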

Features

- Multi-agent movie recommendation system (User Profiler → Content Analyzer → Recommender → Explainer)
- Context fidelity and drift measurement at each agent handoff
- MongoDB Atlas integration with Vector Search
- Real-time visualization dashboard
- Comparison of structured (JSON) vs freeform (Markdown) context formats
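As an illustration of the structured vs freeform comparison, the sketch below renders the same handoff context once as JSON and once as Markdown. The field names (`user_id`, `preferred_genres`, `min_year`) and both helper functions are hypothetical and only show the shape of the idea.

```python
import json


def to_json_context(ctx: dict) -> str:
    """Serialize the context as JSON; the receiver can parse it back exactly."""
    return json.dumps(ctx, sort_keys=True)


def to_markdown_context(ctx: dict) -> str:
    """Render the same context as a freeform Markdown bullet list."""
    lines = ["## Context"]
    for key in sorted(ctx):
        value = ctx[key]
        if isinstance(value, list):
            value = ", ".join(str(item) for item in value)
        lines.append(f"- {key}: {value}")
    return "\n".join(lines)


# Hypothetical context passed from one agent to the next.
handoff = {
    "user_id": "u123",
    "preferred_genres": ["Sci-Fi", "Thriller"],
    "min_year": 1990,
}

print(to_json_context(handoff))
print(to_markdown_context(handoff))
```

The JSON form round-trips through `json.loads` without loss, while the Markdown form flattens lists and types into text, so a receiving agent has to re-interpret it. That difference is the kind of effect a format comparison would try to quantify.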

Quick Start

Prerequisites
- Python 3.10+
- MongoDB Atlas account with Mflix sample data loaded

Setup

# Clone and enter directory
cd mongodbhackathon

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp env.example .env
# Edit .env with your MongoDB connection string

Test Connection
```
python test_connection.py
```
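For orientation, a connection check of this kind could look roughly like the sketch below. It is not the contents of `test_connection.py`; it assumes the `pymongo` driver is installed and that the connection string is stored under a variable named `MONGODB_URI`, both of which are assumptions. `read_env_file` and `ping_atlas` are hypothetical helpers.

```python
import os
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, since URIs may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def ping_atlas(uri: str, timeout_ms: int = 5000) -> bool:
    """Return True if the deployment answers a ping within the timeout."""
    # Imported lazily so the .env parsing above works without pymongo.
    from pymongo import MongoClient
    from pymongo.errors import PyMongoError

    client = None
    try:
        client = MongoClient(uri, serverSelectionTimeoutMS=timeout_ms)
        client.admin.command("ping")
        return True
    except PyMongoError:
        return False
    finally:
        if client is not None:
            client.close()


if __name__ == "__main__":
    # MONGODB_URI is an assumed variable name; check env.example for the real one.
    uri = os.environ.get("MONGODB_URI") or read_env_file().get("MONGODB_URI")
    if uri:
        print("connected" if ping_atlas(uri) else "connection failed")
    else:
        print("MONGODB_URI is not set")
```

Setting `serverSelectionTimeoutMS` keeps a misconfigured connection string from hanging for the driver's default 30 seconds.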

Next Steps

See SETUP.md for detailed setup instructions and usage examples.

Project Structure

mongodbhackathon/
├── backend/           # Python backend with FastAPI
│   ├── config.py     # Configuration management
│   ├── db/           # MongoDB connection layer
│   ├── models/       # Data models for User, Movie, Comment
│   └── services/     # Business logic layer
├── test_connection.py # Connection test script
└── docs/
    ├── PROJECT.md    # Full project specification
    ├── SETUP.md      # Setup guide
    └── AGENTS.md     # Agent architecture details

Tech Stack

| Layer | Technology |
|-------|------------|
| Database | MongoDB Atlas + Vector Search |
| Backend | Python + FastAPI |
| Models | Pydantic |
| Agents | Granite 4.0 (Apache 2.0) |
| Judge | OLMo / Mistral (Apache 2.0) |
| Frontend | Next.js + D3.js (coming soon) |
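
For readers unfamiliar with the Vector Search entry above, the following sketch builds an Atlas `$vectorSearch` aggregation pipeline. The index name `vector_index`, the `plot_embedding` field (as found in the Mflix `embedded_movies` sample collection), and the helper name are assumptions; the project's actual index and field names may differ.

```python
def build_vector_search_pipeline(
    query_vector: list[float],
    limit: int = 5,
    num_candidates: int = 100,
) -> list[dict]:
    """Build an aggregation pipeline that returns the nearest movies by plot embedding."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",     # assumed index name
                "path": "plot_embedding",    # assumed embedding field
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Keep a few readable fields and expose the similarity score.
            "$project": {
                "_id": 0,
                "title": 1,
                "plot": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


# Placeholder vector; a real query vector must match the index dimensions.
pipeline = build_vector_search_pipeline([0.0] * 8, limit=3)
print(pipeline[0]["$vectorSearch"]["limit"])
```

With a `pymongo` collection handle, the pipeline would be run as `collection.aggregate(pipeline)`. `numCandidates` should be at least as large as `limit`; larger values trade latency for recall.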

Documentation

- PROJECT.md - Complete project specification and architecture
- SETUP.md - Setup guide and troubleshooting
- AGENTS.md - Agent design and implementation details

License

Apache-2.0 (Models) / MIT (Code)
