serjs/openrouter-proxy-injector
OpenRouter-Proxy-Injector: OpenRouter API Proxy with key management for heavy usage cases.

Lightweight (~48 MB RAM in a container) and smart proxy server for OpenRouter key rotation with automatic mitigation of upstream server rate limits.


Service overview

OpenRouter Proxy Injector enables you to use and manage OpenRouter API keys for both DEV and PRODUCTION environments, while also tracking and handling rate limits from upstream providers for both paid and free models.

It’s ideal for "Vibe coding", intensive AI agent usage, or simply developing with the OpenRouter API.

Features

  • Smart Key Rotation: Uses a quota-aware strategy to prioritize keys with the most remaining daily capacity.
  • Mixed Key Support: Use different billing API keys for your agent swarm (supports both free and paid keys).
  • Daily Quota Management: Manage limits of 50 or 1000 free model requests per day for each API account, enabling nearly unlimited use of Free models through multiple accounts.
  • Proactive Throttling: Automatically respects the 20 requests-per-minute limit per key to avoid 429 errors.
  • Automatic Retries: Automatically retries requests until the upstream model responds when upstream service limits are hit (e.g., frequent Google Gemini 429 rate limits during intensive agent usage).
  • Full API Support: Supports all OpenRouter API methods for model interaction as-is.
  • Streaming Support: Supports both streaming and non-streaming requests.
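The quota-aware rotation described above can be sketched roughly as follows. This is an illustrative model, not the project's actual internals; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field
import time

@dataclass
class KeyState:
    key: str
    daily_limit: int            # e.g. 50 or 1000 free requests per day
    used_today: int = 0
    blocked: bool = False       # blocked until the daily quota reset
    recent_requests: list = field(default_factory=list)  # timestamps for the RPM check

    def remaining_pct(self) -> float:
        return (self.daily_limit - self.used_today) / self.daily_limit

def pick_key(keys, rpm_limit=20, now=None):
    """Pick the usable key with the highest remaining daily quota percentage."""
    now = now or time.time()
    usable = [
        k for k in keys
        if not k.blocked
        and k.used_today < k.daily_limit
        and sum(1 for t in k.recent_requests if now - t < 60) < rpm_limit
    ]
    if not usable:
        return None  # the proxy would answer "429 - All keys exhausted"
    return max(usable, key=KeyState.remaining_pct)

keys = [KeyState("sk...ab", 50, used_today=40),
        KeyState("sk...cd", 1000, used_today=100),
        KeyState("sk...ef", 50, blocked=True)]
best = pick_key(keys)
print(best.key)  # the 1000-limit key wins: 90% remaining vs 20%
```

The percentage-based comparison is what lets free keys (limit 50) and paid keys (limit 1000) coexist in one pool: a paid key with 900 requests left outranks a free key with 10 left, even though both are "usable".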

Tech details

Animation: howitworks_animation.mp4 (how the proxy works)

Technical architecture

graph TD
A["Client (VSCode / AnythingLLM / Custom Code)"] -->|"Request with APIKEY variable"| B["Proxy Server"]
B --> C{"Key Manager"}
C -->|"Filter Keys"| D["Check Blocked / Daily Quota / RPM"]
D -->|"Select Best"| E["Prioritize by Remaining Quota %"]
E -->|"No Keys"| F["429 - All keys exhausted"] --> G["Response to client"]
E -->|"Key Selected"| H["Send request to OpenRouter"]
H --> I{"Response from OpenRouter"}
I -->|"200 OK"| J["Successful response"] --> G
I -->|"429 - Provider error (retryable)"| K["Retry request (up to 10 times)"] --> I
I -->|"429 - Daily limit reached"| L["Block key until UTC Midnight"] --> M["Retry with new key"] --> C
I -->|"4XX / 5XX - Other error"| ERR["Return error to client"] --> G
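In code, the diagram's retry branches amount to a loop like the sketch below. The `send` and `pick_key` callables and the 429 "kind" markers are hypothetical stand-ins for the real upstream call and key manager, injected here so the flow is testable.

```python
RETRYABLE = "provider"      # hypothetical marker: transient upstream 429
DAILY_LIMIT = "daily"       # hypothetical marker: key's daily quota exhausted

def forward(send, pick_key, max_retries=10):
    key = pick_key()
    attempts = 0
    while key is not None:
        status, kind, body = send(key)
        if status == 200:
            return 200, body
        if status == 429 and kind == RETRYABLE and attempts < max_retries:
            attempts += 1           # retry the same request on the same key
            continue
        if status == 429 and kind == DAILY_LIMIT:
            key = pick_key()        # block this key until reset, rotate to the next
            attempts = 0
            continue
        return status, body         # other 4XX/5XX: pass through to the client
    return 429, "All keys exhausted"

# Fake upstream: two transient 429s, then success.
responses = iter([(429, RETRYABLE, ""), (429, RETRYABLE, ""), (200, None, "ok")])
status, body = forward(lambda k: next(responses), lambda: "sk...ab")
print(status, body)  # 200 ok
```

Note the asymmetry the diagram encodes: a transient provider 429 retries on the *same* key, while a daily-limit 429 blocks the key and restarts with a fresh one from the key manager.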

App config parameters

ENV variable      | Type   | Required | Default | Description
PROXY_API_KEY     | String | True     | EMPTY   | Your custom unified API key used to authenticate requests to the proxy
OPENROUTER_KEYS   | String | True     | EMPTY   | OpenRouter API keys. Supports optional per-key daily limits: key1:50,key2:1000,key3. Limit defaults to 50.
TIMEZONE          | String | False    | UTC     | Timezone used to reset daily free-model usage limits and unblock exhausted keys
UVICORN_PORT      | Int    | False    | 9999    | Port the app listens on
UVICORN_HOST      | String | False    | 0.0.0.0 | IP address the app listens on
UVICORN_LOG_LEVEL | String | False    | info    | Logging level. E.g. debug shows request details, including bodies and responses. OpenRouter API keys are obfuscated in debug logs.
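The OPENROUTER_KEYS format above (optional per-key limit after a colon, defaulting to 50) can be parsed roughly like this. A sketch, not the project's actual parser:

```python
def parse_openrouter_keys(value: str, default_limit: int = 50) -> dict:
    """Parse 'key1:50,key2:1000,key3' into {key: daily_limit}.

    Keys without an explicit limit fall back to the default of 50.
    """
    limits = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        key, sep, limit = entry.partition(":")
        limits[key] = int(limit) if sep else default_limit
    return limits

print(parse_openrouter_keys("sk...ab:50,sk...cd:1000,sk...ef"))
# {'sk...ab': 50, 'sk...cd': 1000, 'sk...ef': 50}
```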

Quickstart with Docker

  1. Install Docker engine

  2. Run the container: docker run -it -e PROXY_API_KEY=<RANDOM_UNIQUE_STRONG_KEY> -e OPENROUTER_KEYS=sk...ab,sk...cd -p 9999:9999 ghcr.io/serjs/openrouter-proxy-injector:latest

  3. Point your client's OpenRouter API base URL at your Docker host, e.g. http://<docker_host_ip>:9999

  4. That's it!

  5. Check the OpenRouter Proxy Injector API docs and key statuses:

  http://<docker_host_ip>:9999/docs

  http://<docker_host_ip>:9999/docs#/default/get_key_status_key_status_get
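Since the proxy passes OpenRouter API methods through as-is, any OpenAI-compatible client can simply point at it. The sketch below only constructs a request without sending it; the host, model name, and the assumption that the proxy mirrors OpenRouter's /api/v1/chat/completions path are placeholders to adapt.

```python
import json
import urllib.request

# Build (but don't send) a chat completion request against the proxy.
# localhost and the model name are placeholders; PROXY_API_KEY is passed
# as a standard Bearer token, and the proxy swaps in a real OpenRouter key
# before forwarding upstream.
req = urllib.request.Request(
    "http://localhost:9999/api/v1/chat/completions",
    data=json.dumps({
        "model": "google/gemini-2.0-flash-exp:free",
        "messages": [{"role": "user", "content": "Hello"}],
    }).encode(),
    headers={
        "Authorization": "Bearer <RANDOM_UNIQUE_STRONG_KEY>",
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would actually send it.
print(req.full_url)
```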

Quickstart with docker-compose

  1. Copy and edit .env: cp .env.sample .env

    Fill PROXY_API_KEY and OPENROUTER_KEYS as the minimum required parameters.

  2. Run docker compose up -d

  3. That's it!

  4. Check the OpenRouter Proxy Injector API docs and key statuses:

  http://<docker_host_ip>:9999/docs

  http://<docker_host_ip>:9999/docs#/default/get_key_status_key_status_get
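For the compose setup above, a minimal .env might look like this (values are placeholders; only the first two variables are required):

```
PROXY_API_KEY=<RANDOM_UNIQUE_STRONG_KEY>
OPENROUTER_KEYS=sk...ab:50,sk...cd:1000
TIMEZONE=UTC
```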

Setting up openrouter-proxy for clients

VSCode Continue

Example configuration and usage instructions: VSCode Continue Example

LobeChat

Example configuration and usage instructions: LobeChat Example Config

AnythingLLM

Example configuration and usage instructions: AnythingLLM Example

FAQ

Q: What types of API Rate Limiting exist in OpenRouter?

A: There are three different limits:

  1. :free model limits: 20 requests per minute; 50 requests per day for accounts with fewer than 10 credits, and 1000 requests per day for accounts with at least 10 credits. Official docs
  2. Cloudflare DDoS-protection rate limits: handled via the User-Agent in the current codebase.
  3. Upstream server rate limits: global to OpenRouter upstreams; handled with exponential retries (while respecting the 20 requests-per-minute limit).
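The 20 requests-per-minute limit from item 1 can be respected with a sliding-window counter per key, along the lines of the sketch below (illustrative only; the proxy's actual throttling code may differ):

```python
from collections import deque

class RpmWindow:
    """Sliding 60-second window allowing at most `limit` requests per key."""

    def __init__(self, limit: int = 20):
        self.limit = limit
        self.hits = {}  # key -> deque of request timestamps

    def allow(self, key: str, now: float) -> bool:
        window = self.hits.setdefault(key, deque())
        while window and now - window[0] >= 60:
            window.popleft()              # drop hits older than one minute
        if len(window) >= self.limit:
            return False                  # the proxy would hold or rotate the key
        window.append(now)
        return True

rpm = RpmWindow(limit=20)
allowed = [rpm.allow("sk...ab", now=float(t)) for t in range(25)]
print(allowed.count(True))  # 20: requests 21-25 within the same minute are refused
```

Proactively refusing the 21st request like this is what keeps the proxy from ever triggering the per-key 429 in the first place, rather than reacting to it.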

Q: How do I know which limit I'm hitting?

A: Rate-limit messages from OpenRouter API responses are logged by the proxy injector:

  1. <MODEL_NAME> is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits - logged as Key sk-12345... is temporarily rate-limited upstream. Retrying 15 times.
  2. Rate limit exceeded: free-models-per-day. Add 10 credits to unlock 1000 free model requests per day - logged as Key sk-12345... has reached the maximum success count and is temporarily blocked. Reached daily :free models requests; blocked until limit reset (00:00 UTC).
