A multi-model AI orchestration MCP server for automated code review and LLM-powered analysis. Multi-MCP integrates with Claude Code CLI to orchestrate multiple AI models (OpenAI GPT, Anthropic Claude, Google Gemini) for code quality checks, security analysis (OWASP Top 10), and multi-agent consensus. Built on the Model Context Protocol (MCP), this tool enables Python developers and DevOps teams to automate code reviews with AI-powered insights directly in their development workflow.
- 🔍 Code Review - Systematic workflow with OWASP Top 10 security checks and performance analysis
- 💬 Chat - Interactive development assistance with repository context awareness
- 🔀 Compare - Parallel multi-model analysis for architectural decisions
- 🗣️ Debate - Multi-agent consensus workflow (independent answers + critique)
- 🤖 Multi-Model Support - OpenAI GPT, Anthropic Claude, Google Gemini, and OpenRouter
- 🖥️ CLI & API Models - Mix CLI-based (Gemini CLI, Codex CLI) and API models
- 🏷️ Model Aliases - Use short names like `mini`, `sonnet`, `gemini`
- 🧵 Threading - Maintain context across multi-step reviews
Multi-MCP acts as an MCP server that Claude Code connects to, providing AI-powered code analysis tools:
- Install the MCP server and configure your AI model API keys
- Integrate with Claude Code CLI automatically via `make install`
- Invoke tools using natural language (e.g., "multi codereview this file")
- Get results from multiple AI models orchestrated in parallel
Fast Multi-Model Analysis:
- ⚡ Parallel Execution - 3 models in ~10s (vs ~30s sequential)
- 🔄 Async Architecture - Non-blocking Python asyncio
- 💾 Conversation Threading - Maintains context across multi-step reviews
- 🚀 Low Latency - Response time = slowest model, not the sum of all models
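A minimal sketch of this dispatch pattern, assuming a hypothetical `query_model` coroutine (the real orchestration lives inside `multi_mcp`):

```python
import asyncio

async def query_model(model: str, prompt: str) -> str:
    """Stand-in for a single provider call (e.g., via an API client or CLI subprocess)."""
    await asyncio.sleep(1)  # simulate network latency
    return f"{model}: response to {prompt!r}"

async def query_all(models: list[str], prompt: str) -> list[str]:
    # All calls start at once, so total wall time tracks the slowest model,
    # not the sum of every model's latency.
    return await asyncio.gather(*(query_model(m, prompt) for m in models))

results = asyncio.run(query_all(["gpt-5-mini", "gemini-3-flash", "sonnet"], "Review this diff"))
print("\n".join(results))
```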
Prerequisites:
- Python 3.11+
- API key for at least one provider (OpenAI, Anthropic, Google, or OpenRouter)
```bash
# Clone and install
git clone https://github.com/religa/multi_mcp.git
cd multi_mcp

# Execute ./scripts/install.sh
make install

# The installer will:
# 1. Install dependencies (uv sync)
# 2. Generate your .env file
# 3. Automatically add to Claude Code config (requires jq)
# 4. Test the installation
```

Manual configuration (if you prefer not to run `make install`):
```bash
# Install dependencies
uv sync

# Copy and configure .env
cp .env.example .env

# Edit .env with your API keys
```

Add to Claude Code (`~/.claude.json`), replacing `/path/to/multi_mcp` with your actual clone path:
```json
{
  "mcpServers": {
    "multi": {
      "type": "stdio",
      "command": "/path/to/multi_mcp/.venv/bin/python",
      "args": ["-m", "multi_mcp.server"]
    }
  }
}
```

Multi-MCP loads settings from `.env` files in this order (highest priority first):
- Environment variables (already set in shell)
- Project `.env` (current directory or project root)
- User `.env` (`~/.multi_mcp/.env`) - fallback for pip installs
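For illustration, this is roughly how such layered loading can be done with python-dotenv; the paths mirror the list above, and `load_settings` is a hypothetical helper, not the package's actual API:

```python
import os
from pathlib import Path
from dotenv import load_dotenv

def load_settings() -> None:
    # override=False never clobbers variables already set in the shell,
    # so real environment variables always take priority.
    load_dotenv(Path.cwd() / ".env", override=False)                   # project .env
    load_dotenv(Path.home() / ".multi_mcp" / ".env", override=False)   # user fallback

load_settings()
print("OpenAI key configured:", bool(os.getenv("OPENAI_API_KEY")))
```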
Edit `.env` with your API keys:
```bash
# API Keys (configure at least one)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
OPENROUTER_API_KEY=sk-or-...

# Azure OpenAI (optional)
AZURE_API_KEY=...
AZURE_API_BASE=https://your-resource.openai.azure.com/

# AWS Bedrock (optional)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION_NAME=us-east-1

# Model Configuration
DEFAULT_MODEL=gpt-5-mini
DEFAULT_MODEL_LIST=gpt-5-mini,gemini-3-flash
```

Models are defined in YAML configuration files (user config wins):
- Package defaults: `multi_mcp/config/config.yaml` (bundled with the package)
- User overrides: `~/.multi_mcp/config.yaml` (optional, takes precedence)
To add your own models, create `~/.multi_mcp/config.yaml` (see `config.yaml` and `config.override.example.yaml` for examples):
```yaml
version: "1.0"

models:
  # Add a new API model
  my-custom-gpt:
    litellm_model: openai/gpt-4o
    aliases:
      - custom
    notes: "My custom GPT-4o configuration"

  # Add a custom CLI model
  my-local-llm:
    provider: cli
    cli_command: ollama
    cli_args:
      - "run"
      - "llama3.2"
    cli_parser: text
    aliases:
      - local
    notes: "Local LLaMA via Ollama"

  # Override an existing model's settings
  gpt-5-mini:
    constraints:
      temperature: 0.5  # Override default temperature
```

Merge behavior:
- New models are added alongside package defaults
- Existing models are merged (your settings override package defaults)
- Aliases can be "stolen" from package models to your custom models
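A simplified sketch of that merge behavior (assumed semantics, not the package's actual implementation):

```python
def merge_models(defaults: dict, overrides: dict) -> dict:
    """Recursively layer user overrides on top of package defaults."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_models(merged[key], value)  # deep-merge nested settings
        else:
            merged[key] = value  # new models and overridden values win outright
    return merged

defaults = {"gpt-5-mini": {"constraints": {"temperature": 1.0}, "aliases": ["mini"]}}
overrides = {"gpt-5-mini": {"constraints": {"temperature": 0.5}}}
print(merge_models(defaults, overrides))
# {'gpt-5-mini': {'constraints': {'temperature': 0.5}, 'aliases': ['mini']}}
```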
Once installed in Claude Code, you can use these commands:
💬 Chat - Interactive development assistance:
Can you ask Multi chat what's the answer to life, universe and everything?
🔍 Code Review - Analyze code with specific models:
Can you multi codereview this module for code quality and maintainability using gemini-3 and codex?
🔀 Compare - Get multiple perspectives (uses default models):
Can you multi compare the best state management approach for this React app?
🗣️ Debate - Deep analysis with critique:
Can you multi debate the best project code name for this project?
Edit `~/.claude/settings.json` and add the following entries under `permissions.allow` so Claude Code can use Multi-MCP tools without prompting for user permission (the `env` block raises the MCP timeouts so long-running multi-model reviews aren't cut off):
```json
{
  "permissions": {
    "allow": [
      ...
      "mcp__multi__chat",
      "mcp__multi__codereview",
      "mcp__multi__compare",
      "mcp__multi__debate",
      "mcp__multi__models"
    ]
  },
  "env": {
    "MCP_TIMEOUT": "300000",
    "MCP_TOOL_TIMEOUT": "300000"
  }
}
```

Use short aliases instead of full model names:
| Alias | Model | Provider |
|---|---|---|
| `mini` | `gpt-5-mini` | OpenAI |
| `nano` | `gpt-5-nano` | OpenAI |
| `gpt` | `gpt-5.2` | OpenAI |
| `codex` | `gpt-5.1-codex` | OpenAI |
| `sonnet` | `claude-sonnet-4.5` | Anthropic |
| `haiku` | `claude-haiku-4.5` | Anthropic |
| `opus` | `claude-opus-4.5` | Anthropic |
| `gemini` | `gemini-3-pro-preview` | Google |
| `flash` | `gemini-3-flash` | Google |
| `azure-mini` | `azure-gpt-5-mini` | Azure |
| `bedrock-sonnet` | `bedrock-claude-4-5-sonnet` | AWS |
Run `multi:models` to see all available models and aliases.
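Conceptually, alias resolution is a flat lookup built from the merged model config; a hypothetical sketch (abbreviated, not the actual code):

```python
MODELS = {
    "gpt-5-mini": {"aliases": ["mini"]},
    "claude-sonnet-4.5": {"aliases": ["sonnet"]},
    "gemini-3-pro-preview": {"aliases": ["gemini"]},
}

# Flatten aliases to canonical names; a user config that reuses an alias
# would overwrite the entry here ("stealing" it from the package model).
ALIAS_MAP = {alias: name for name, cfg in MODELS.items() for alias in cfg["aliases"]}

def resolve(name_or_alias: str) -> str:
    return ALIAS_MAP.get(name_or_alias, name_or_alias)

print(resolve("mini"))        # -> gpt-5-mini
print(resolve("gpt-5-nano"))  # full names pass through unchanged
```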
Multi-MCP can execute CLI-based AI models (like Gemini CLI, Codex CLI, or Claude CLI) alongside API models. CLI models run as subprocesses and work seamlessly with all existing tools.
Benefits:
- Use models with full tool access (file operations, shell commands)
- Mix API and CLI models in `compare` and `debate` workflows
- Leverage local CLIs without API overhead
Built-in CLI Models:
- `gemini-cli` (alias: `gem-cli`) - Gemini CLI with auto-edit mode
- `codex-cli` (alias: `cx-cli`) - Codex CLI with full-auto mode
- `claude-cli` (alias: `cl-cli`) - Claude CLI with acceptEdits mode
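Conceptually, each CLI model call is a prompt piped through a subprocess whose output is then parsed; a minimal sketch with a hypothetical helper (not the package's actual implementation):

```python
import asyncio

async def run_cli_model(command: str, args: list[str], prompt: str) -> str:
    """Run a CLI-based model as a subprocess and return its raw output."""
    proc = await asyncio.create_subprocess_exec(
        command, *args,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await proc.communicate(prompt.encode())
    if proc.returncode != 0:
        raise RuntimeError(f"{command} exited with {proc.returncode}: {stderr.decode()}")
    return stdout.decode()  # parsed downstream according to cli_parser (text/json/jsonl)

# Example (requires Ollama installed locally):
output = asyncio.run(run_cli_model("ollama", ["run", "llama3.2"], "Summarize this diff"))
print(output)
```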
Adding Custom CLI Models:
Add to `~/.multi_mcp/config.yaml` (see Model Configuration):
```yaml
version: "1.0"

models:
  my-ollama:
    provider: cli
    cli_command: ollama
    cli_args:
      - "run"
      - "codellama"
    cli_parser: text  # "json", "jsonl", or "text"
    aliases:
      - ollama
    notes: "Local CodeLlama via Ollama"
```

Prerequisites:
CLI models require the respective CLI tools to be installed:
```bash
# Gemini CLI
npm install -g @google/gemini-cli

# Codex CLI
npm install -g @openai/codex

# Claude CLI
npm install -g @anthropic-ai/claude-code
```

Multi-MCP includes a standalone CLI for code review without needing an MCP client.
```bash
# Review a directory
multi src/

# Review specific files
multi src/server.py src/config.py

# Use a different model
multi --model mini src/

# JSON output for CI/pipelines
multi --json src/ > results.json

# Verbose logging
multi -v src/

# Specify project root (for CLAUDE.md loading)
multi --base-path /path/to/project src/
```

| Feature | Multi-MCP | Single-Model Tools |
|---|---|---|
| Parallel model execution | ✅ | ❌ |
| Multi-model consensus | ✅ | Varies |
| Model debates | ✅ | ❌ |
| CLI + API model support | ✅ | ❌ |
| OWASP security analysis | ✅ | Varies |
"No API key found"
- Add at least one API key to your `.env` file
- Verify it's loaded: `uv run python -c "from multi_mcp.settings import settings; print(settings.openai_api_key)"`
Integration tests fail
- Set the `RUN_E2E=1` environment variable
- Verify API keys are valid and have sufficient credits
Debug mode:
```bash
export LOG_LEVEL=DEBUG  # INFO is default
uv run python -m multi_mcp.server
```

Check `logs/server.log` for detailed information.
Q: Do I need all three AI providers?
A: No, just one API key (OpenAI, Anthropic, or Google) is enough to get started.
Q: Does it truly run in parallel?
A: Yes! When you use the `codereview`, `compare`, or `debate` tools, all models are executed concurrently using Python's `asyncio.gather()`. This means you get responses from multiple models in the time it takes for the slowest model to respond, not the sum of all response times.
Q: How many models can I run at the same time?
A: There's no hard limit! You can run as many models as you want in parallel. In practice, 2-5 models work well for most use cases. All tools use your configured default models (typically 2-3), but you can specify any number of models you want.
We welcome contributions! See CONTRIBUTING.md for:
- Development setup
- Code standards
- Testing guidelines
- Pull request process
Quick start:
```bash
git clone https://github.com/YOUR_USERNAME/multi_mcp.git
cd multi_mcp
uv sync --extra dev
make check && make test
```

MIT License - see LICENSE file for details
