technicalboy2023/ai-router


Version Node.js License: MIT OpenAI Compatible Anthropic Compatible ESM GitHub Stars GitHub Forks


A production-grade, open-source AI gateway that unifies Groq, Gemini, OpenRouter, and Ollama behind a single endpoint compatible with both the OpenAI and Anthropic APIs. It bundles smart failover, multi-key rotation, response caching, 4 routing strategies, and a powerful CLI.


Quick Start · Configuration · API Reference · Usage · CLI · Contributing


The Problem It Solves

Building production AI apps is painful:

  • Rate limits kill your app at peak traffic
  • One API key = single point of failure
  • Different SDKs per provider = messy codebase
  • No fallback when Groq or Gemini goes down
  • Redundant API costs for repeated prompts

Universal AI Router addresses each of these. It is a self-hosted AI gateway that sits between your app and the supported LLM providers (Groq, Gemini, OpenRouter, and Ollama): one endpoint and one request format, with failover across providers and keys, built for developers who run real workloads.


✨ Features

| Feature | Description |
| --- | --- |
| OpenAI-Compatible API | Drop-in replacement at /v1/chat/completions, zero SDK changes |
| Anthropic-Compatible API | Full /v1/messages endpoint; works with Claude Code and Anthropic SDKs |
| Smart Failover | Automatic provider switching on failure with exponential backoff |
| Multi-Key Rotation | Add unlimited keys per provider; health-scored rotation bypasses rate limits |
| Response Caching | In-memory TTL cache; the same prompt costs zero tokens the second time |
| 4 Routing Strategies | model-based, priority, latency-aware, round-robin |
| Background Daemon | Runs as a persistent background process; close the terminal and the router stays alive |
| 4 Provider Support | Groq · Gemini · OpenRouter · Ollama, all unified |
| Live Metrics & Usage | /metrics, /usage, /health endpoints with per-key telemetry |
| Auth & Rate Limiting | Token-based auth + sliding-window IP rate limiter built in |
| Admin API | Reset cooldowns & clear cache via authenticated admin endpoints |
| Streaming SSE | Full streaming support; responses pipe directly to your client |
| Tool Call Support | OpenAI function calling / tool use, handled natively |
| Powerful Global CLI | init, start, stop, restart, status, remove: full lifecycle management |
| Multi-Router Support | Run multiple named routers on different ports simultaneously |
| n8n / Make / Zapier Ready | Works with any OpenAI-compatible no-code platform |
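
To make the response-caching feature concrete, here is a minimal Python sketch of an in-memory TTL cache keyed by a hash of the request. It is an illustration only, not the project's ResponseCache.js: the class name `TTLCache`, the key derivation in `make_key`, and the oldest-first eviction policy are all assumptions.

```python
import hashlib
import json
import time
from collections import OrderedDict


class TTLCache:
    """Hypothetical in-memory cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, max_size, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max_size = max_size
        self._clock = clock              # injectable so expiry can be tested
        self._entries = OrderedDict()    # key -> (expires_at, value)

    @staticmethod
    def make_key(model, messages):
        # Assumption: identical model + messages means an identical request.
        payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]       # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)   # evict the oldest entry


cache = TTLCache(ttl_seconds=30 * 60, max_size=512)
key = TTLCache.make_key("llama3-8b-8192", [{"role": "user", "content": "Hello!"}])
if cache.get(key) is None:
    cache.set(key, "response from the provider")   # first call: provider is hit
print(cache.get(key))                               # second call: served from cache
```

A cache hit skips the provider call entirely, which is why a repeated prompt consumes no tokens while the entry is still valid.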

Tech Stack

Node.js Express ESM Pino Undici Zod Commander Vitest Groq Google Gemini OpenRouter Ollama


Installation


Option A: Local Machine (Windows / macOS)

Requirement: Node.js LTS (v20+), available from https://nodejs.org

# 1. Clone the repository
git clone https://github.com/technicalboy2023/ai-router.git
cd ai-router

# 2. Install dependencies
npm install

# 3. Register the global CLI command
npm link

✅ The ai-router command is now available globally in your terminal.


🐧 Option B: Linux VPS (Ubuntu 22.04), Recommended for Production

Perfect for 24/7 hosting on Linode, DigitalOcean, Vultr, Hetzner, Contabo, etc.

Step 1: System Update & Dependencies

sudo apt update && sudo apt upgrade -y
sudo apt install -y git curl

Step 2: Install Node.js LTS (via NodeSource)

# Add NodeSource LTS repo
curl -fsSL https://deb.nodesource.com/setup_lts.x | sudo -E bash -

# Install Node.js
sudo apt install -y nodejs

# Verify
node -v    # v22.x.x or latest LTS
npm -v

Step 3: Clone & Install

git clone https://github.com/technicalboy2023/ai-router.git
cd ai-router
npm install
sudo npm link

Step 4: Open Firewall Port

sudo ufw allow 8000/tcp
sudo ufw reload
sudo ufw status

Step 5: Run as Background Daemon (PM2)

# Install PM2 globally
sudo npm install -g pm2

# Start the router
pm2 start npm --name "ai-router" -- run dev

# Save process list
pm2 save

# Enable auto-start on reboot (run the command PM2 outputs!)
pm2 startup systemd

# Verify
pm2 status
pm2 logs ai-router

✅ Router running at http://YOUR_VPS_IP:8000, and it survives reboots automatically!

Useful PM2 Commands

pm2 status                  # Check all running processes
pm2 logs ai-router          # Stream live logs
pm2 restart ai-router       # Restart after config changes
pm2 stop ai-router          # Stop the router
pm2 delete ai-router        # Remove from PM2

Configuration

Step 1: Create .env File

# ── Groq (add multiple keys for rotation) ─────────────────
GROQ_KEY_1=gsk_your_first_groq_key
GROQ_KEY_2=gsk_your_second_groq_key

# ── Google Gemini ──────────────────────────────────────────
GEMINI_KEY_1=AIzaSy_your_gemini_key

# ── OpenRouter ────────────────────────────────────────────
OPENROUTER_KEY_1=sk-or-v1-your_key
OPENROUTER_KEY_2=sk-or-v1-your_second_key

# ── Security ──────────────────────────────────────────────
AUTH_TOKEN=my_super_secret_token
ADMIN_TOKEN=my_admin_secret_token

⚠️ Never commit .env to Git. Add it to .gitignore.
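
The numbered naming scheme above (GROQ_KEY_1, GROQ_KEY_2, and so on) lends itself to a simple key pool per provider. The sketch below shows one way such variables could be gathered, assuming the PROVIDER_KEY_n pattern is the only convention in play. The function `collect_provider_keys` is hypothetical and is not taken from the router's source.

```python
import re


def collect_provider_keys(env, provider):
    """Return the values of <PROVIDER>_KEY_<n> variables, ordered by n."""
    pattern = re.compile(rf"^{re.escape(provider.upper())}_KEY_(\d+)$")
    numbered = []
    for name, value in env.items():
        match = pattern.match(name)
        if match and value:                      # skip empty placeholders
            numbered.append((int(match.group(1)), value))
    return [value for _, value in sorted(numbered)]


# Stand-in for os.environ, using the placeholder values from the .env example.
example_env = {
    "GROQ_KEY_2": "gsk_your_second_groq_key",
    "GROQ_KEY_1": "gsk_your_first_groq_key",
    "GEMINI_KEY_1": "AIzaSy_your_gemini_key",
    "AUTH_TOKEN": "my_super_secret_token",
}

groq_keys = collect_provider_keys(example_env, "groq")
print(groq_keys)   # ['gsk_your_first_groq_key', 'gsk_your_second_groq_key']
```

In a real process you would pass `os.environ` instead of `example_env`, after the .env file has been loaded.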


Step 2: Tune config/default.json

{
  "name": "default",
  "port": 8000,
  "host": "0.0.0.0",

  "routing": {
    "strategy": "model-based",
    "providerOrder": ["groq", "openrouter", "gemini", "ollama"],
    "modelMapping": {
      "llama*":   "groq",
      "mixtral*": "groq",
      "gemma*":   "groq",
      "gemini*":  "gemini",
      "gpt*":     "openrouter"
    }
  },

  "fallback": {
    "providers": ["groq", "openrouter", "gemini", "ollama"],
    "maxRetries": 4,
    "backoff": { "initial": 500, "factor": 2, "max": 16000 }
  },

  "cache":     { "enabled": true, "ttl": 30, "maxSize": 512 },
  "auth":      { "enabled": true, "tokens": ["my_super_secret_token"], "adminTokens": ["my_admin_secret_token"] },
  "rateLimit": { "enabled": true, "windowMs": 60000, "maxRequests": 100 },
  "logging":   { "level": "info", "file": "logs/gateway.log", "console": true }
}

| Key | Description |
| --- | --- |
| routing.strategy | model-based · priority · latency-aware · round-robin |
| routing.modelMapping | Glob patterns → provider (e.g. "llama*": "groq") |
| fallback.maxRetries | Provider switches before giving up (default: 4) |
| fallback.backoff | Exponential backoff in ms (initial → max) |
| cache.ttl | Cache TTL in minutes |
| auth.enabled | Toggle Bearer token authentication |
| rateLimit.windowMs | Sliding window duration in ms |
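
As a worked example of the fallback.backoff setting, the sketch below computes the delay before each retry from `initial`, `factor`, and `max`. It assumes plain exponential growth capped at `max`. Whether the router adds jitter or waits before the first attempt is not stated here, so treat the helper `backoff_delays` as illustrative.

```python
def backoff_delays(initial_ms, factor, max_ms, retries):
    """Delay in milliseconds before each retry: initial * factor**n, capped at max_ms."""
    delays = []
    delay = initial_ms
    for _ in range(retries):
        delays.append(min(delay, max_ms))
        delay *= factor
    return delays


# Values from config/default.json: initial 500, factor 2, max 16000, maxRetries 4
print(backoff_delays(500, 2, 16000, 4))   # [500, 1000, 2000, 4000]

# With more retries the cap takes effect
print(backoff_delays(500, 2, 16000, 7))   # [500, 1000, 2000, 4000, 8000, 16000, 16000]
```

With the default settings the cap of 16000 ms is never reached, because four retries only grow the delay to 4000 ms.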

Start the Router

# Development: foreground with live logs
npm run dev
# Router live at → http://localhost:8000

# Production: named instance in background
ai-router start myRouter -c config/default.json

Adding a New Router

You can run multiple routers simultaneously on the same server, each on a different port with its own config, auth token, and provider priority.

Step 1: Create a new config file

Copy config/default.json and give it a new name:

cp config/default.json config/myrouter.json

Step 2: Edit the new config file

Open config/myrouter.json and change the following values:

Mandatory Changes (must edit, or the router will crash or conflict)

| Field | Where | What to Change |
| --- | --- | --- |
| "name" | Top level | Change to a unique name, e.g. "myrouter" |
| "port" | Top level | Change to a different port, e.g. 8001 or 8080 |
| "logging.file" | logging block | Change to a new log file, e.g. "logs/myrouter.log" |

⚠️ Two routers cannot share the same port. If they do, the second one will crash with "Port already in use".

Optional Changes (only if you want different behavior)

| Field | Where | Why You'd Change It |
| --- | --- | --- |
| "auth.tokens" | auth block | Give this router a separate API password |
| "routing.providerOrder" | routing block | Prioritise a different provider first (e.g. ["gemini", "openrouter", "groq", "ollama"]) |
| "fallback.providers" | fallback block | Control which providers act as fallbacks |
| "rateLimit.maxRequests" | rateLimit block | Set a higher/lower request cap for this router |

Example: Minimal new router config

{
  "name": "myrouter",
  "port": 8001,
  "host": "0.0.0.0",

  "routing": {
    "strategy": "model-based",
    "providerOrder": ["openrouter", "gemini", "groq", "ollama"]
  },

  "fallback": {
    "providers": ["openrouter", "gemini", "groq", "ollama"],
    "maxRetries": 4,
    "backoff": { "initial": 500, "factor": 2, "max": 16000 }
  },

  "cache":     { "enabled": true, "ttl": 30, "maxSize": 512 },
  "auth":      { "enabled": true, "tokens": ["my_router2_token"], "adminTokens": ["my_router2_admin"] },
  "rateLimit": { "enabled": true, "windowMs": 60000, "maxRequests": 100 },
  "logging":   { "level": "info", "file": "logs/myrouter.log", "console": true }
}
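
If you create routers often, the three mandatory edits can be scripted. The following sketch derives a new config from an existing one by changing only "name", "port", and "logging.file". The helper `derive_router_config` is hypothetical (the CLI's own `init` command may be the better tool), and the `base` dict is a trimmed stand-in for config/default.json.

```python
import copy
import json


def derive_router_config(base, name, port):
    """Copy a router config and apply the three mandatory changes."""
    if port == base.get("port"):
        raise ValueError(f"port {port} is already used by router {base.get('name')!r}")
    config = copy.deepcopy(base)          # leave the original config untouched
    config["name"] = name
    config["port"] = port
    config.setdefault("logging", {})["file"] = f"logs/{name}.log"
    return config


# Trimmed stand-in for the contents of config/default.json
base = {
    "name": "default",
    "port": 8000,
    "host": "0.0.0.0",
    "logging": {"level": "info", "file": "logs/gateway.log", "console": True},
}

new_config = derive_router_config(base, "myrouter", 8001)
print(json.dumps(new_config, indent=2))
```

To use it for real, load config/default.json with `json.load`, call the helper, and write the result to config/myrouter.json with `json.dump`.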

Step 3: Start the new router

# Open firewall for the new port first (Linux VPS only)
sudo ufw allow 8001/tcp

# Start the new router
ai-router start myrouter -c config/myrouter.json

✅ Both routers are now running, :8000 (default) and :8001 (myrouter), completely independent of each other.


🌐 API Endpoints

| Method | Endpoint | Auth | Description |
| --- | --- | --- | --- |
| POST | /v1/chat/completions | User | Main LLM endpoint, OpenAI-compatible |
| POST | /v1/messages | User | Anthropic Messages API, Claude Code compatible |
| POST | /v1/messages/count_tokens | User | Token estimation, Claude Code compatible |
| POST | /v1/embeddings | User | OpenAI-compatible text embeddings endpoint |
| GET | /v1/models | User | List all models across all providers |
| GET | /health | None | Liveness probe with provider & key summary |
| GET | /metrics | None | Per-key telemetry: requests, errors, tokens, latency |
| GET | /usage | None | Anonymized per-key usage counters |
| GET | /router/status | None | Routing engine status |
| POST | /admin/reset-cooldowns | Admin | Reset all rate-limited/cooled-down keys |
| POST | /admin/cache/clear | Admin | Flush the response cache |
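
The three auth levels in the table can be exercised from Python's standard library. This sketch only builds the requests, so it runs without a live router. It assumes that admin endpoints accept the admin token as a Bearer token in the Authorization header, the same way user endpoints accept the user token; that detail is not confirmed by the table. `build_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_request(base_url, method, path, token=None, body=None):
    """Build (but do not send) a request to the router."""
    headers = {}
    data = None
    if token:
        headers["Authorization"] = f"Bearer {token}"   # assumed auth scheme
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


BASE_URL = "http://localhost:8000"

# Auth "None": no token needed
health = build_request(BASE_URL, "GET", "/health")

# Auth "User": token from auth.tokens
models = build_request(BASE_URL, "GET", "/v1/models", token="my_super_secret_token")

# Auth "Admin": token from auth.adminTokens
clear_cache = build_request(
    BASE_URL, "POST", "/admin/cache/clear", token="my_admin_secret_token"
)

for request in (health, models, clear_cache):
    print(request.get_method(), request.full_url)

# To actually send one of them against a running router:
#   with urllib.request.urlopen(health, timeout=5) as response:
#       print(response.status, response.read().decode("utf-8"))
```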

Usage Examples

🐍 Python (openai library)

import openai

client = openai.OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="my_super_secret_token"
)

response = client.chat.completions.create(
    model="llama3-8b-8192",
    messages=[{"role": "user", "content": "Explain neural networks simply."}]
)
print(response.choices[0].message.content)

🌊 Streaming (Python)

stream = client.chat.completions.create(
    model="gemini-1.5-flash",
    messages=[{"role": "user", "content": "Write a poem about space."}],
    stream=True
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

cURL

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my_super_secret_token" \
  -d '{"model": "openrouter/auto", "messages": [{"role": "user", "content": "Hello!"}]}'

Health Check

curl http://localhost:8000/health

Claude Code Setup

The router fully supports Claude Code via the /v1/messages endpoint. Configure it:

# Set your router as the Anthropic API base URL
export ANTHROPIC_BASE_URL="http://YOUR_VPS_IP:8000"
export ANTHROPIC_API_KEY="your-router-auth-token"

Or add to your shell config (~/.bashrc, ~/.zshrc) for persistence:

echo 'export ANTHROPIC_BASE_URL="http://YOUR_VPS_IP:8000"' >> ~/.bashrc
echo 'export ANTHROPIC_API_KEY="your-router-auth-token"' >> ~/.bashrc
source ~/.bashrc

Now launch Claude Code normally. It will route through your AI Router with full fallback support.

Anthropic SDK (Python)

import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8000",
    api_key="my_super_secret_token"
)

message = client.messages.create(
    model="openrouter/auto",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain quantum computing simply."}]
)
print(message.content[0].text)

Anthropic cURL (/v1/messages)

curl -X POST http://localhost:8000/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: my_super_secret_token" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "openrouter/auto",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
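
Because /v1/messages accepts Anthropic-format requests while the providers speak the OpenAI chat format, some translation has to happen in between. The sketch below shows the general shape of that mapping for text-only requests. It is not the project's AnthropicTranslator.js: the function name `anthropic_to_openai` is hypothetical, and tool use, images, and response translation are deliberately left out.

```python
def _text_of(content):
    """Flatten Anthropic content (a string or a list of blocks) to plain text."""
    if isinstance(content, str):
        return content
    return "".join(
        block.get("text", "") for block in content if block.get("type") == "text"
    )


def anthropic_to_openai(request):
    """Convert a text-only Anthropic Messages request to OpenAI chat format."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message with role "system".
    system = request.get("system")
    if system:
        messages.append({"role": "system", "content": _text_of(system)})
    for message in request["messages"]:
        messages.append(
            {"role": message["role"], "content": _text_of(message["content"])}
        )
    converted = {"model": request["model"], "messages": messages}
    if "max_tokens" in request:
        converted["max_tokens"] = request["max_tokens"]
    if request.get("stream"):
        converted["stream"] = True
    return converted


anthropic_request = {
    "model": "openrouter/auto",
    "max_tokens": 1024,
    "system": "You are concise.",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Hello!"}]}
    ],
}
print(anthropic_to_openai(anthropic_request))
```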

n8n / Make / Zapier

  • URL: http://YOUR_VPS_IP:8000/v1/chat/completions
  • Method: POST
  • Header: Authorization: Bearer my_super_secret_token
  • Body: Standard OpenAI JSON payload

Works natively with n8n's OpenAI node: just change the base URL.


CLI Reference

# Initialize a new named router config
ai-router init myRouter --port 8000

# Start a named router (background)
ai-router start myRouter -c config/myRouter.json

# Start ALL routers defined in config/
ai-router start-all

# Check status of all running routers
ai-router status

# Stream live logs
ai-router logs myRouter

# Restart a router (pick up config changes)
ai-router restart myRouter

# Stop a specific router
ai-router stop myRouter

# Stop ALL running routers
ai-router stop-all

# Remove a router config
ai-router remove myRouter

🎯 Routing Strategies

| Strategy | How It Works | Best For |
| --- | --- | --- |
| model-based | Routes by model name glob patterns | Predictable provider assignment |
| priority | Tries providers in providerOrder sequence | Simple primary + fallback setup |
| latency-aware | Prefers provider with lowest avg response time | Latency-sensitive apps |
| round-robin | Distributes evenly across all providers | Load balancing |
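
Two of these strategies are easy to picture in a few lines. The sketch below implements model-based selection with glob patterns and a simple round-robin rotation, using the mapping and provider order from config/default.json. It is a rough illustration and not the RoutingEngine itself. In particular, falling back to the first entry of providerOrder when no pattern matches is an assumption.

```python
from fnmatch import fnmatchcase
from itertools import cycle

MODEL_MAPPING = {
    "llama*": "groq",
    "mixtral*": "groq",
    "gemma*": "groq",
    "gemini*": "gemini",
    "gpt*": "openrouter",
}
PROVIDER_ORDER = ["groq", "openrouter", "gemini", "ollama"]


def pick_model_based(model, mapping, provider_order):
    """Return the provider of the first glob pattern that matches the model name."""
    for pattern, provider in mapping.items():
        if fnmatchcase(model, pattern):
            return provider
    return provider_order[0]   # assumption: no match falls back to the first provider


def make_round_robin(providers):
    """Return a function that yields providers in turn, wrapping around."""
    rotation = cycle(providers)
    return lambda: next(rotation)


print(pick_model_based("llama3-8b-8192", MODEL_MAPPING, PROVIDER_ORDER))    # groq
print(pick_model_based("gemini-1.5-flash", MODEL_MAPPING, PROVIDER_ORDER))  # gemini

next_provider = make_round_robin(PROVIDER_ORDER)
print([next_provider() for _ in range(5)])
# ['groq', 'openrouter', 'gemini', 'ollama', 'groq']
```

The priority strategy corresponds to walking PROVIDER_ORDER from the start on every request, while latency-aware selection would additionally need per-provider response-time measurements.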

Project Structure

ai-router/
│
├── bin/
│   └── ai-router.js              # Global CLI entrypoint
│
├── config/
│   └── default.json              # Full router configuration
│
├── src/
│   ├── index.js                  # Main export
│   ├── worker.js                 # Dev server entry (npm run dev)
│   │
│   ├── cli/
│   │   ├── orchestrator.js       # PM2 process manager wrapper
│   │   └── commands/             # init, start, startAll, stop, stopAll,
│   │                             # restart, status, logs, remove
│   ├── config/
│   │   ├── loader.js             # Config parser & merger
│   │   └── schema.js             # Zod validation schema
│   │
│   ├── providers/
│   │   ├── BaseProvider.js       # Abstract provider class
│   │   ├── ProviderRegistry.js   # Provider registry & lookup
│   │   ├── GroqProvider.js       # Groq
│   │   ├── GeminiProvider.js     # Google Gemini
│   │   ├── OpenRouterProvider.js # OpenRouter
│   │   └── OllamaProvider.js     # Ollama (local)
│   │
│   ├── router_core/
│   │   ├── KeyRegistry.js        # Per-provider key pool
│   │   ├── KeyHealth.js          # Health scoring per key
│   │   ├── ResponseCache.js      # In-memory TTL cache
│   │   └── UsageStore.js         # Usage counter persistence
│   │
│   ├── server/
│   │   ├── app.js                # Express app bootstrap
│   │   ├── middleware/           # auth, cors, rateLimiter,
│   │   │                         # errorHandler, requestId
│   │   └── routes/               # chatCompletions, messages, models,
│   │                             # health, metrics, usage, routerStatus, admin
│   └── services/
│       ├── RoutingEngine.js      # 4-strategy routing logic
│       ├── FallbackEngine.js     # Retry + failover
│       ├── KeyManager.js         # Key selection & rotation
│       ├── ToolCallHandler.js    # OpenAI tool/function calls
│       ├── AnthropicTranslator.js # Anthropic ↔ OpenAI format conversion
│       ├── ResponseNormalizer.js # Unified response format
│       └── ErrorNormalizer.js    # Unified error format
│
├── .env                          # ⚠️ Your keys (never commit!)
├── .env.example                  # Template for .env
├── package.json
└── README.md

Updating the Router

After making changes locally (or when a new version is available on GitHub), follow these steps to update the router on your VPS or cloud platform.

Option A: VPS (PM2 + Git)

# 1. SSH into your VPS
ssh user@YOUR_VPS_IP

# 2. Navigate to the project directory
cd ~/ai-router

# 3. Pull the latest changes from GitHub
git pull origin main

# 4. Install any new/updated dependencies
npm install

# 5. Restart all running routers to pick up changes
pm2 restart all

# 6. Verify everything is running
pm2 status
pm2 logs ai-router --lines 20

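Tip: git pull does not touch your .env (it is gitignored) or config files you created yourself, such as config/myrouter.json. If you edited a tracked file such as config/default.json, git may refuse the pull or report a conflict when upstream changes the same file, so prefer keeping your own settings in a separate config file.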

Option B: Render / Railway / Cloud Platforms

If your router is deployed on Render, Railway, or similar:

  1. Push your changes to GitHub:
    git add .
    git commit -m "fix: update router logic"
    git push origin main
  2. Auto-deploy: Most cloud platforms auto-detect the push and redeploy automatically.
  3. Manual deploy: If auto-deploy is off, go to your platform dashboard → click "Manual Deploy" → select the latest commit.

✅ No SSH needed. Cloud platforms handle the restart for you.

Option C: Quick Config-Only Update (No Code Changes)

If you just edited config/default.json or .env on the VPS directly:

# Just restart β€” no git pull needed
pm2 restart ai-router

# Or restart a specific named router
pm2 restart myrouter

Complete Uninstall & Cleanup

To fully remove the AI Router from your system, including all processes, configs, logs, and the CLI command, follow these steps.

Step 1: Stop & Remove All Router Processes

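# Stop all running routers
# (note: "all" targets every PM2 process, not only routers; if PM2 also
#  runs other apps, use the process name instead, e.g. pm2 stop ai-router)
pm2 stop all

# Delete all router processes from PM2
pm2 delete all

# Remove PM2 startup script (optional, if you don't use PM2 for anything else)
pm2 unstartup systemd
pm2 save --force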

Step 2: Unlink the Global CLI Command

# Navigate to the project directory
cd ~/ai-router

# Remove the global 'ai-router' command
sudo npm unlink

Step 3: Delete the Project Files

# Go back to home directory
cd ~

# Delete the entire project folder
rm -rf ai-router

Step 4: Clean Up Remaining Data (Optional)

# Empty the PM2 log files (pm2 flush clears logs for all PM2 processes, not only ai-router)
pm2 flush

# Close the firewall port (if you opened one)
sudo ufw delete allow 8000/tcp
sudo ufw delete allow 8001/tcp   # if you had a second router
sudo ufw reload

Step 5: Verify Everything Is Gone

# Should return "command not found"
ai-router status

# Should show no processes
pm2 status

# Should show the folder no longer exists
ls ~/ai-router

✅ That's it. The router's processes, CLI command, project files, and firewall rules are removed. PM2 and Node.js themselves stay installed, since these steps do not uninstall them.


Contributing

All contributions welcome!

# Fork the repository on GitHub, then clone your fork
# (YOUR_USERNAME is a placeholder for your GitHub account)
git clone https://github.com/YOUR_USERNAME/ai-router.git
cd ai-router

# Create feature branch
git checkout -b feature/add-mistral-provider

# Run tests
npm test

# Stage + commit + push, then open a PR
git add -A
git commit -m "feat: add Mistral AI provider"
git push origin feature/add-mistral-provider

Good first contributions:

  • New provider adapter (Mistral, Cohere, Together AI, Anthropic)
  • Web dashboard UI for metrics
  • Docker / docker-compose setup
  • Test coverage improvements
  • Docs & usage examples

License

MIT License: free for personal and commercial use. See LICENSE for full details.


Author

Built with ❤️ by AMAN

Self-hosted infrastructure enthusiast. Building open-source tools for AI developers who refuse vendor lock-in.


⭐ Support the Project

If this saved you time, money, or debugging pain, a star means everything.

⭐ Star · 🍴 Fork · Share

Every star helps more developers discover this project. Thank you!


Keywords

ai gateway · openai proxy · llm router · groq api · google gemini · openrouter · ollama · ai failover · api key rotation · self-hosted ai · open source llm · n8n ai · ai rate limit bypass · openai compatible · local ai server · llm proxy · multi-provider ai · ai load balancer

