Inference Gateway

An open-source, cloud-native, high-performance gateway unifying multiple LLM providers

📖 Documentation · 🚀 Getting Started · 💬 Discussions · 🐛 Issues

🌐 What is Inference Gateway?

Inference Gateway is a proxy server that exposes a single, unified API for multiple large language model (LLM) providers, from local runtimes such as Ollama to cloud providers such as OpenAI, Anthropic, Groq, Cohere, Cloudflare, and DeepSeek.

Stop juggling a separate SDK and API key in every application. Configure provider credentials once on the gateway and route all your LLM traffic through it.

# One endpoint. Every provider.
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
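The same request can be issued from application code. Below is a minimal Python sketch using only the standard library. It assumes the gateway is listening on localhost:8080 as in the curl example; the helper names `build_chat_request` and `send_chat` are hypothetical and not part of any official SDK, and the second model identifier is made up for illustration.

```python
import json
import urllib.request

# Assumption: same address and path as the curl example above.
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_request(model: str, prompt: str, url: str = GATEWAY_URL) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for the gateway."""
    payload = {
        "model": model,  # provider-prefixed identifier, e.g. "openai/gpt-4o"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Only the model string changes between providers; the endpoint stays the same.
# "ollama/llama3" is a hypothetical identifier used purely for illustration.
for model in ("openai/gpt-4o", "ollama/llama3"):
    req = build_chat_request(model, "Hello!")
    print(req.get_method(), req.full_url, json.loads(req.data)["model"])
```

With a running gateway, `send_chat(build_chat_request("openai/gpt-4o", "Hello!"))` would return the decoded JSON response. Since the endpoint is described as OpenAI-compatible, the reply text is presumably under `choices[0].message.content`, but check the documentation for the exact response shape.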

✨ Key Features

Feature	Description
🔀 Unified API	One OpenAI-compatible endpoint for all LLM providers
🔌 MCP Integration	Native Model Context Protocol support for automatic tool discovery
🤖 A2A Protocol	Agent-to-Agent coordination across specialized agents
🌊 Streaming	Real-time token streaming from all supported providers
☸️ Kubernetes Ready	First-class K8s support with Operator and HPA scaling
📊 Observability	OpenTelemetry integration for monitoring and tracing
🔒 Privacy First	Self-hosted, zero data collection, MIT licensed
🌿 Lightweight	~10.8 MB binary with minimal resource footprint
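To illustrate the streaming feature, here is a rough Python sketch of how a client might reassemble streamed tokens. It assumes the gateway follows the OpenAI streaming convention (server-sent events whose `data:` lines carry JSON chunks with a `choices[].delta.content` field, terminated by `data: [DONE]`) and that streaming is requested by adding `"stream": true` to the request body. Both are assumptions based on the "OpenAI-compatible" description, not confirmed gateway behavior. The function name `iter_stream_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample events in the assumed format (not captured gateway output).
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello!"
```

In a real client, `lines` would come from iterating over the decoded HTTP response body line by line rather than from a hard-coded list.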

🏗️ Ecosystem

Core

Repository	Description
inference-gateway	The core gateway server
operator	Kubernetes Operator for lifecycle management
cli	Agentic CLI assistant with project context awareness
schemas	MCP, A2A, and OpenAPI schemas
docs	Documentation site

SDKs

Repository	Language
sdk	Go
typescript-sdk	TypeScript
rust-sdk	Rust
python-sdk	Python

Agent Development

Repository	Description
adl	Agent Definition Language for declarative agent definitions
adl-cli	Scaffold and manage A2A-powered enterprise agents
adk	Agent Development Kit (Go)
typescript-adk	Agent Development Kit (TypeScript)
rust-adk	Agent Development Kit (Rust)

A2A Agents

Repository	Description
google-calendar-agent	Google Calendar scheduling & automation
browser-agent	Browser automation via Playwright
documentation-agent	Context7-style documentation access
grafana-agent	Grafana dashboard automation
n8n-agent	n8n workflow generation & automation
mock-agent	Mocking and testing

Tools & Community

Repository	Description
a2a-debugger	Troubleshooting tool for A2A agents
registry	Registry for A2A agents and skills
awesome-a2a	Curated list of A2A-compatible agents
infer-action	GitHub Action for the Infer CLI

🚀 Quick Start

# Run with Docker
docker run -p 8080:8080 \
  -e OPENAI_API_KEY=your-key \
  ghcr.io/inference-gateway/inference-gateway:latest

# Or install the CLI
curl -fsSL https://raw.githubusercontent.com/inference-gateway/cli/main/install.sh | bash
infer init && infer chat

👉 Full setup guide: docs.inference-gateway.com/getting-started
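After starting the container, it can be handy to confirm that something is listening on the published port before sending requests. The following Python sketch does a plain TCP check against localhost:8080, the port mapped in the Docker command above. The function names are hypothetical, and the check only shows that the port accepts connections, not that the gateway is healthy or that a provider key is valid.

```python
import socket
import time


def gateway_reachable(host: str = "localhost", port: int = 8080, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_gateway(host: str = "localhost", port: int = 8080,
                     attempts: int = 10, delay: float = 0.5) -> bool:
    """Poll until the port accepts connections or the attempts run out."""
    for _ in range(attempts):
        if gateway_reachable(host, port):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_gateway()` right after `docker run` returns `True` once the port is open, or `False` after roughly five seconds with the default settings.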

🤝 Contributing

We welcome contributions of all kinds: bug reports, feature requests, documentation improvements, and code!

⭐ Star the main repo to show your support
🐛 Report bugs via GitHub Issues
💬 Join discussions in GitHub Discussions
🔧 Submit PRs - see CONTRIBUTING.md in each repository

Released under the MIT License · Built with ❤️ in Go
