
🚀 LiteLLM Connector for Copilot


Welcome! Choose Your Own AI Adventure 🎯

Tired of being limited to a single AI model in Copilot Chat? Break free.

The LiteLLM Connector unlocks hundreds of models from any provider (OpenAI, Anthropic, Google, Mistral, local Llama, custom fine-tunes, you name it) and brings them directly into your VS Code Copilot Chat experience.

If LiteLLM can talk to it, Copilot can use it.

Whether you're a developer who wants to experiment with different models, a team that needs cost-effective options, or an organization running private LLMs behind your firewall, this extension gives you the freedom to choose the right model for the job, without leaving your editor.


⭐️ Support the Project

If this extension saves you time or helps you work more effectively, please consider supporting it.

Your support keeps this project alive and improving! ❤️


πŸ› οΈ Getting Started (It's Easier Than You Think!)

Prerequisites

  • ✅ VS Code 1.110+ (required)
  • ✅ A GitHub Copilot Individual subscription (the Free and paid Individual plans both work)
  • 🌐 A LiteLLM proxy running somewhere (locally or in the cloud)
  • 🔑 Your proxy's Base URL and, optionally, an API key

New to LiteLLM? Check out their documentation to learn how to set up a proxy that can route to any model provider.

Installation & Setup (60 seconds)

  1. Install the "LiteLLM Connector for Copilot" extension from the VS Code Marketplace
  2. Open the Command Palette: Ctrl+Shift+P (Windows/Linux) or Cmd+Shift+P (Mac)
  3. Run: Manage LiteLLM Provider
  4. Choose between Configure Single Backend (Legacy) for a quick setup or Manage Multiple Backends to aggregate models from several LiteLLM proxy instances.

Multi-Backend Power: You can now connect to multiple LiteLLM instances simultaneously (e.g., Local Llama + Cloud GPT-4 + Internal Proxy). Models are automatically namespaced (e.g., local/llama-3) to prevent conflicts.

  5. Enter your LiteLLM proxy details (Base URL and API Key).
  6. Open Copilot Chat and pick a model from the LiteLLM section.
  7. Start chatting! 🎉

That's it! Your models from the LiteLLM proxy will automatically appear in the model picker.
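
If you prefer configuring backends in your settings file rather than through the wizard, here is a rough sketch of what a multi-backend setup might look like in settings.json. The backend field names (name, url) are assumptions based on the configuration table further below, not a confirmed schema, so treat this as illustrative only:

```jsonc
// settings.json: illustrative sketch; backend field names are assumptions, not a confirmed schema
{
  // Single-backend (legacy) setup
  "litellm-connector.baseUrl": "http://localhost:4000",

  // Multi-backend setup: each entry needs a unique name and URL,
  // and models are namespaced by backend name (e.g., "local/llama-3")
  "litellm-connector.backends": [
    { "name": "local", "url": "http://localhost:4000" },
    { "name": "cloud", "url": "https://litellm.example.com" }
  ]
}
```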


💡 What Makes This Special?

This isn't just another AI connector. It's built with care and designed for real-world use:

🌍 Any Model, Any Provider

Access hundreds of models through your LiteLLM proxy: GPT-4, Claude 3.5, Gemini Pro, Llama 3, DeepSeek, local models, and custom fine-tunes. All in one place.

⛓️ Multi-Backend Aggregation

Connect to multiple LiteLLM instances at once. Mix and match local, cloud, and team proxies seamlessly. Models from different backends are clearly labeled and ready for use.

🌊 Smooth Streaming Experience

Real-time streaming responses, just like native Copilot models. No waiting for complete responses; watch as the AI thinks and types.

πŸ› οΈ Full Tool Calling Support

Models can use tools and functions to interact with your workspace. Perfect for code analysis, git operations, and complex workflows.

πŸ‘οΈ Vision Capabilities

Use image-capable models to analyze screenshots, diagrams, and code directly in chat. Upload images and get insights.

🧠 V2 Chat Provider (Experimental)

Supports VS Code's newer Language Model APIs, including LanguageModelChatMessage2 and LanguageModelThinkingPart for reasoning/thinking models. Emits structured text, thinking, data, and tool-call parts to the progress callback.

🧠 Smart, Automatic Compatibility

The extension automatically handles provider-specific quirks:

  • Strips unsupported parameters (like temperature for o1 models)
  • Retries with cleaned payloads when models reject flags
  • Normalizes tool call IDs for strict providers
  • No manual parameter tuning needed

📊 Token Awareness

See real-time token usage with context window indicators (e.g., "↑128K in / ↓16K out"). Helps you stay within limits and understand costs.

✍️ Git Commit Generation

Generate structured, conventional commit messages from your staged changes. The extension analyzes your diff and creates clear, professional commit messages.
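
If you want commit generation to use a different (for example, faster or cheaper) model than your chat sessions, you can run LiteLLM: Select Commit Message Model or set the override directly. A minimal settings.json sketch, with a hypothetical model ID:

```jsonc
{
  // Route commit-message generation to a specific model (hypothetical ID);
  // leave empty to disable the override
  "litellm-connector.commitModelIdOverride": "local/qwen2.5-coder"
}
```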

🧼 Smart Sanitization

Automatically strips Markdown code blocks from generated commit messages for a clean SCM experience.

πŸ” Built-in Diagnostics

Run LiteLLM: Check Connection anytime to verify your proxy configuration. Troubleshooting made easy.

⏱️ Reliable Timeout Handling

Optional inactivity watchdog prevents stuck streams. Configurable timeout keeps your workflow smooth.

🚫🧠 Cache Control

Send no-cache headers to bypass LiteLLM caching when you need fresh responses. Provider-aware behavior ensures compatibility.

πŸ” Secure by Design

Your API keys and URLs are stored safely in VS Code's encrypted SecretStorage. No plaintext secrets.

⌨️ Optional Inline Completions

Enable LiteLLM-powered inline completions as an alternative to Copilot's default. Great for experimentation.


🎯 Who Is This For?

  • Developers who want to experiment with different AI models without switching tools
  • Teams that need cost-effective or specialized models for specific tasks
  • Organizations running private LLMs behind firewalls for security/compliance
  • AI enthusiasts who want to test new models as soon as they're released
  • Researchers comparing model performance on real code
  • Anyone who's thought "I wish I could use [X model] in Copilot Chat"

🆕 What's New?

  • Multi-Repo Commit Generation – Commit message generation now correctly identifies the active repository in multi-repo workspaces. Generates the right diff from the right repo every time.
  • 🧪 Telemetry & Observability – PostHog-backed telemetry for feature-usage tracking, request metrics, and structured JSONL logging. All non-identifiable and opt-in.
  • 🔧 Model Capability Overrides – Manually override VS Code's capability detection (toolCalling, imageInput) when auto-detection is incorrect. Configure via litellm-connector.modelCapabilitiesOverrides.
  • 🧠 V2 Chat Provider – Experimental support for newer VS Code chat APIs, including thinking parts for reasoning models.
  • 📊 Advanced Token Counting – Smarter budgeting with local estimation, background refinement, and short-lived caching for faster, more accurate context management.
  • 🏎️ Optimized Model Discovery – Intelligent discovery throttling with in-flight deduplication and TTL caching to prevent excessive proxy lookups.
  • 🧼 SCM Message Sanitization – Clean commit messages by automatically stripping triple backticks and Markdown artifacts.
  • ✍️ Git Commit Generation – Generate structured, conventional commits directly from the SCM view using any LiteLLM-supported model.
  • 🔍 Connection Diagnostics – Use the LiteLLM: Check Connection command to instantly validate your proxy and authentication setup.
  • 🧱 Tool-call Hardening – Improved compatibility for strict providers (like GPT-5/o1) with normalized tool call IDs.

βš™οΈ Configuration Options

Fine-tune your experience with these settings (accessible via VS Code Settings):

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| litellm-connector.baseUrl | string | "" | Base URL for the LiteLLM proxy server. |
| litellm-connector.backends | array | [] | List of LiteLLM backends to connect to. Each requires a unique name and URL. |
| litellm-connector.apiKeySecretRef | string | "default" | Reference name for the API key stored in VS Code SecretStorage. |
| litellm-connector.commitModelIdOverride | string | "" | Override the model used for git commit message generation. Leave empty to disable. |
| litellm-connector.inactivityTimeout | number | 60 | Seconds of inactivity before the connection is considered idle. |
| litellm-connector.disableCaching | boolean | true | Send no-cache headers to bypass LiteLLM caching. |
| litellm-connector.enableResponsesApi | boolean | false | (Experimental) Enable the VS Code Responses API integration. |
| litellm-connector.disableQuotaToolRedaction | boolean | false | Disable automatic tool removal when a quota error is detected in chat history. |
| litellm-connector.modelOverrides | object | {} | Override or add tags for specific models (e.g., inline-completions, chat, tools). |
| litellm-connector.modelCapabilitiesOverrides | object | {} | Override model capabilities reported to VS Code (e.g., toolCalling, imageInput). |
| litellm-connector.inlineCompletions.enabled | boolean | false | Enable LiteLLM inline completions via VS Code's stable inline completion provider API. (Deprecated: will be removed.) |
| litellm-connector.inlineCompletions.modelId | string | "" | (Deprecated) Use VS Code's inlineChat.defaultModel setting instead. |
| litellm-connector.emitUsageData | boolean | false | (Experimental) Emit token usage metadata as a response data part. |
| litellm-connector.sendDefaultParameters | boolean | false | (Temporary, will be removed) Send default temperature, frequency_penalty, and presence_penalty if not provided. Recommended: false. |

Tip: Most users won't need to touch these settings; the defaults work great out of the box!
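
For reference, here is a small settings.json sketch combining a few of the options above. The values and model names are purely illustrative, and the comma-separated string format for the override objects is inferred from the examples in the table rather than taken from the extension's published schema:

```jsonc
{
  // Give slow local models more time before the inactivity watchdog fires
  "litellm-connector.inactivityTimeout": 120,

  // Tag a hypothetical model so it's offered for inline completions as well as chat
  "litellm-connector.modelOverrides": {
    "qwen2.5-coder": "inline-completions,chat"
  },

  // Correct capability detection for a hypothetical model that supports tools and images
  "litellm-connector.modelCapabilitiesOverrides": {
    "my-custom-model": "toolCalling,imageInput"
  }
}
```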


⌨️ Available Commands

Open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P) and try these:

| Command | What It Does |
| --- | --- |
| Manage LiteLLM Provider | Configure your Base URL and API key. Refreshes the model list. |
| LiteLLM: Check Connection | Test if your proxy is reachable and credentials are valid. |
| LiteLLM: Reload Models | Manually refresh the model list from your proxy. |
| LiteLLM: Reset All Configuration | ⚠️ Nuke option: clears all stored URLs and API keys. |
| LiteLLM: Select Commit Message Model | Choose which model generates your commit messages. |
| LiteLLM: Show Available Models | See all models currently discovered from your proxy. |
| LiteLLM: Select Inline Completion Model | Choose which model powers inline completions. |

πŸ› Troubleshooting & FAQ

"Models aren't showing up after configuration"

  1. Run LiteLLM: Check Connection to verify your Base URL and API key
  2. Ensure your LiteLLM proxy is running and accessible
  3. Try LiteLLM: Reload Models to force a refresh
  4. If still stuck, use LiteLLM: Reset All Configuration and start fresh

"Connection fails / timeout errors"

  • Check that your LiteLLM proxy is running and the Base URL is correct
  • Verify network connectivity (firewall, VPN, proxy settings)
  • If using a remote proxy, ensure CORS is configured appropriately
  • Check the proxy logs for incoming requests

"Reinstalling didn't fix the problem"

VS Code stores credentials in encrypted SecretStorage. Reinstalling doesn't clear this. Use LiteLLM: Reset All Configuration instead.

"I get 'Quota Exceeded' errors"

The extension automatically detects quota errors and can redact tools to recover. If this happens frequently:

  • Check your LiteLLM proxy's rate limits
  • Consider upgrading your plan or adding more API keys
  • The disableQuotaToolRedaction setting can control this behavior
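
If you'd rather keep your tools attached and see the raw quota error instead, the relevant setting from the configuration table can be flipped in settings.json:

```jsonc
{
  // Keep tools in the request even when a quota error appears in chat history
  "litellm-connector.disableQuotaToolRedaction": true
}
```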

"Tool calls are failing"

Some models have strict tool-call validation. The extension normalizes tool call IDs automatically, but if you encounter issues:

  • Verify your LiteLLM proxy supports the model's tool-calling format
  • Check proxy logs for rejected requests
  • Try a different model variant

📋 Feedback & Contributions

Bug reports and feature requests are welcome!


📜 License

Apache-2.0 © GethNet

About

Adds LiteLLM as a provider within Copilot in VS Code, significantly expanding the models you can access. Available on the Visual Studio Code Marketplace: https://marketplace.visualstudio.com/items?itemName=Gethnet.litellm-connector-copilot
