Tired of being limited to a single AI model in Copilot Chat? Break free.
The LiteLLM Connector unlocks hundreds of models from any provider (OpenAI, Anthropic, Google, Mistral, local Llama, custom fine-tunes, you name it) and brings them directly into your VS Code Copilot Chat experience.
If LiteLLM can talk to it, Copilot can use it.
Whether you're a developer who wants to experiment with different models, a team that needs cost-effective options, or an organization running private LLMs behind your firewall, this extension gives you the freedom to choose the right model for the job without leaving your editor.
If this extension saves you time or helps you work more effectively, please consider:
- ⭐ Star the repo on GitHub: https://github.com/gethnet/litellm-connector-copilot
- 📝 Leave a review on the VS Code Marketplace: https://marketplace.visualstudio.com/items?itemName=GethNet.litellm-connector-copilot
- ☕ Support development via Ko-fi or Buy Me a Coffee

Your support keeps this project alive and improving! ❤️
- ✅ VS Code 1.110+ (required)
- ✅ GitHub Copilot Individual subscription (Free or Paid Individual plans work).
  ⚠️ Important: GitHub Copilot Business (Organization) and Enterprise plans are not currently supported due to VS Code API limitations. For technical details, see the VS Code Language Model API documentation and the list of supported individual plans.
- 🔌 A LiteLLM proxy running somewhere (locally or in the cloud)
- 🔑 Your Base URL and optionally an API Key
New to LiteLLM? Check out their documentation to learn how to set up a proxy that can route to any model provider.
- Install the "LiteLLM Connector for Copilot" extension from the VS Code Marketplace
- Open the Command Palette: `Ctrl+Shift+P` (Windows/Linux) or `Cmd+Shift+P` (Mac)
- Run: `Manage LiteLLM Provider`
- Choose between Configure Single Backend (Legacy) for a quick setup or Manage Multiple Backends to aggregate models from several LiteLLM proxy instances.
Multi-Backend Power: You can now connect to multiple LiteLLM instances simultaneously (e.g., Local Llama + Cloud GPT-4 + Internal Proxy). Models are automatically namespaced (e.g., `local/llama-3`) to prevent conflicts.
- Enter your LiteLLM proxy details (Base URL and API Key).
- Open Copilot Chat and pick a model from the LiteLLM section.
- Start chatting! 🎉
That's it! Your models from the LiteLLM proxy will automatically appear in the model picker.
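
If you prefer to edit settings directly, the multi-backend flow ultimately populates the `litellm-connector.backends` array (see the configuration table below). A minimal `settings.json` sketch, where the per-backend field names (`name`, `baseUrl`) are assumptions, since the docs only require a unique name and URL and the wizard normally writes these for you:

```jsonc
// settings.json (illustrative): the "name" and "baseUrl" field names
// are assumptions; each backend needs a unique name and URL.
{
  "litellm-connector.backends": [
    { "name": "local", "baseUrl": "http://localhost:4000" },
    { "name": "cloud", "baseUrl": "https://litellm.example.com" }
  ]
}
```

With two backends named like this, a `llama-3` model served by the local proxy would appear as `local/llama-3` in the picker, matching the namespacing described above.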
This isn't just another AI connector: it's built with care and designed for real-world use.
Access hundreds of models through your LiteLLM proxy: GPT-4, Claude 3.5, Gemini Pro, Llama 3, DeepSeek, local models, and custom fine-tunes. All in one place.
Connect to multiple LiteLLM instances at once. Mix and match local, cloud, and team proxies seamlessly. Models from different backends are clearly labeled and ready for use.
Real-time, streaming responses just like native Copilot models. No waiting for complete responses; watch as the AI thinks and types.
Models can use tools and functions to interact with your workspace. Perfect for code analysis, git operations, and complex workflows.
Use image-capable models to analyze screenshots, diagrams, and code directly in chat. Upload images and get insights.
- 🧠 V2 Chat Provider (Experimental)
  Supports VS Code's newer Language Model APIs, including `LanguageModelChatMessage2` and `LanguageModelThinkingPart` for reasoning/thinking models. Emits structured text, thinking, data, and tool-call parts to the progress callback.
The extension automatically handles provider-specific quirks:
- Strips unsupported parameters (like `temperature` for O1 models)
- Retries with cleaned payloads when models reject flags
- Normalizes tool call IDs for strict providers
- No manual parameter tuning needed
See real-time token usage with context window indicators (e.g., "↑128K in / ↓16K out"). Helps you stay within limits and understand costs.
Generate structured, conventional commit messages from your staged changes. The extension analyzes your diff and creates clear, professional commit messages.
Automatically strips Markdown code blocks from generated commit messages for a clean SCM experience.
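
If you want commit generation pinned to a particular model regardless of your chat selection, use the `LiteLLM: Select Commit Message Model` command or set the documented override directly. A small sketch (the model ID is an example):

```jsonc
// settings.json: pin commit message generation to one model.
// "local/llama-3" is an example ID; use any model your proxy exposes.
{
  "litellm-connector.commitModelIdOverride": "local/llama-3"
}
```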
Run `LiteLLM: Check Connection` anytime to verify your proxy configuration. Troubleshooting made easy.
Optional inactivity watchdog prevents stuck streams. Configurable timeout keeps your workflow smooth.
Send no-cache headers to bypass LiteLLM caching when you need fresh responses. Provider-aware behavior ensures compatibility.
Your API keys and URLs are stored safely in VS Code's encrypted SecretStorage. No plaintext secrets.
Enable LiteLLM-powered inline completions as an alternative to Copilot's default. Great for experimentation.
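
To try this out, here's a hedged `settings.json` sketch. Note that the setting is marked deprecated in the configuration table below, and the `modelOverrides` value shape (tags keyed by model ID) is an assumption based on the table's example tags:

```jsonc
// settings.json: opt in to LiteLLM inline completions.
{
  // Deprecated per the configuration table, but currently the documented switch.
  "litellm-connector.inlineCompletions.enabled": true,
  // Assumed shape (model ID -> list of tags); only the tag names
  // (inline-completions, chat, tools) come from the documentation.
  "litellm-connector.modelOverrides": {
    "local/llama-3": ["inline-completions", "chat"]
  }
}
```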
- Developers who want to experiment with different AI models without switching tools
- Teams that need cost-effective or specialized models for specific tasks
- Organizations running private LLMs behind firewalls for security/compliance
- AI enthusiasts who want to test new models as soon as they're released
- Researchers comparing model performance on real code
- Anyone who's thought "I wish I could use [X model] in Copilot Chat"
- 🔀 Multi-Repo Commit Generation: Commit message generation now correctly identifies the active repository in multi-repo workspaces. Generates the right diff from the right repo every time.
- 🧪 Telemetry & Observability: PostHog-backed telemetry for feature-usage tracking, request metrics, and structured JSONL logging. All non-identifiable and opt-in.
- 🔧 Model Capability Overrides: Manually override VS Code's capability detection (`toolCalling`, `imageInput`) when auto-detection is incorrect. Configure via `litellm-connector.modelCapabilitiesOverrides` (see the sketch after this list).
- 🧠 V2 Chat Provider: Experimental support for newer VS Code chat APIs, including thinking parts for reasoning models.
- 🔢 Advanced Token Counting: Smarter budgeting with local estimation, background refinement, and short-lived caching for faster, more accurate context management.
- 🗂️ Optimized Model Discovery: Intelligent discovery throttling with in-flight deduplication and TTL caching to prevent excessive proxy lookups.
- 🧼 SCM Message Sanitization: Clean commit messages by automatically stripping triple backticks and Markdown artifacts.
- ✏️ Git Commit Generation: Generate structured, conventional commits directly from the SCM view using any LiteLLM-supported model.
- 🔍 Connection Diagnostics: Use the `LiteLLM: Check Connection` command to instantly validate your proxy and authentication setup.
- 🧱 Tool-call Hardening: Improved compatibility with strict providers (like GPT-5/o1) via normalized tool call IDs.
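
As referenced in the capability-overrides item above, here is a minimal sketch of `litellm-connector.modelCapabilitiesOverrides`. Only the setting name and the `toolCalling`/`imageInput` capability names come from this document; the per-model object shape is an assumption:

```jsonc
// settings.json: force-correct capability detection for one model.
// The model-ID-keyed object shape is an assumption.
{
  "litellm-connector.modelCapabilitiesOverrides": {
    "local/llama-3": {
      "toolCalling": true,   // model supports tools despite auto-detection
      "imageInput": false    // model is text-only
    }
  }
}
```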
Fine-tune your experience with these settings (accessible via VS Code Settings):
| Setting | Type | Default | Description |
|---|---|---|---|
| `litellm-connector.baseUrl` | string | `""` | Base URL for the LiteLLM proxy server. |
| `litellm-connector.backends` | array | `[]` | List of LiteLLM backends to connect to. Each requires a unique name and URL. |
| `litellm-connector.apiKeySecretRef` | string | `"default"` | Reference name for the API key stored in VS Code SecretStorage. |
| `litellm-connector.commitModelIdOverride` | string | `""` | Override the model used for git commit message generation. Leave empty to disable the override. |
| `litellm-connector.inactivityTimeout` | number | `60` | Seconds of inactivity before the connection is considered idle. |
| `litellm-connector.disableCaching` | boolean | `true` | Send no-cache headers to bypass LiteLLM caching. |
| `litellm-connector.enableResponsesApi` | boolean | `false` | (Experimental) Enable the VS Code Responses API integration. |
| `litellm-connector.disableQuotaToolRedaction` | boolean | `false` | Disable automatic tool removal when a quota error is detected in chat history. |
| `litellm-connector.modelOverrides` | object | `{}` | Override or add tags for specific models (e.g., `inline-completions`, `chat`, `tools`). |
| `litellm-connector.modelCapabilitiesOverrides` | object | `{}` | Override the model capabilities (`toolCalling`, `imageInput`) reported to VS Code. |
| `litellm-connector.inlineCompletions.enabled` | boolean | `false` | (Deprecated, will be removed) Enable LiteLLM inline completions via VS Code's stable inline completion provider API. |
| `litellm-connector.inlineCompletions.modelId` | string | `""` | (Deprecated) Use VS Code's `inlineChat.defaultModel` setting instead. |
| `litellm-connector.emitUsageData` | boolean | `false` | (Experimental) Emit token usage metadata as a response data part. |
| `litellm-connector.sendDefaultParameters` | boolean | `false` | (Temporary, will be removed) Send default `temperature`, `frequency_penalty`, and `presence_penalty` when not provided. Recommended: `false`. |
Tip: Most users won't need to touch these; the defaults work great out of the box!
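
For reference, a typical single-backend setup only touches a couple of these. A sketch with example values (the URL and timeout are illustrative, not recommendations):

```jsonc
// settings.json: minimal single-backend configuration.
{
  "litellm-connector.baseUrl": "http://localhost:4000", // example proxy address
  "litellm-connector.inactivityTimeout": 120,           // allow slower models more idle time
  "litellm-connector.disableCaching": true              // the default; shown for clarity
}
```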
Open the Command Palette (`Ctrl+Shift+P` / `Cmd+Shift+P`) and try these:
| Command | What It Does |
|---|---|
| Manage LiteLLM Provider | Configure your Base URL and API key. Refreshes the model list. |
| LiteLLM: Check Connection | Test if your proxy is reachable and credentials are valid. |
| LiteLLM: Reload Models | Manually refresh the model list from your proxy. |
| LiteLLM: Reset All Configuration | Clear all stored configuration and credentials, including API keys in SecretStorage. |
| LiteLLM: Select Commit Message Model | Choose which model generates your commit messages. |
| LiteLLM: Show Available Models | See all models currently discovered from your proxy. |
| LiteLLM: Select Inline Completion Model | Choose which model powers inline completions. |
- Run `LiteLLM: Check Connection` to verify your Base URL and API key
- Ensure your LiteLLM proxy is running and accessible
- Try `LiteLLM: Reload Models` to force a refresh
- If still stuck, use `LiteLLM: Reset All Configuration` and start fresh
- Check that your LiteLLM proxy is running and the Base URL is correct
- Verify network connectivity (firewall, VPN, proxy settings)
- If using a remote proxy, ensure CORS is configured appropriately
- Check the proxy logs for incoming requests
VS Code stores credentials in encrypted SecretStorage. Reinstalling doesn't clear this. Use `LiteLLM: Reset All Configuration` instead.
The extension automatically detects quota errors and can redact tools to recover. If this happens frequently:
- Check your LiteLLM proxy's rate limits
- Consider upgrading your plan or adding more API keys
- The `disableQuotaToolRedaction` setting can control this behavior (see the sketch below)
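
For example, to opt out of the automatic redaction entirely (using the documented setting):

```jsonc
// settings.json: keep tools attached even when a quota error
// is detected in chat history.
{
  "litellm-connector.disableQuotaToolRedaction": true
}
```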
Some models have strict tool-call validation. The extension normalizes tool call IDs automatically, but if you encounter issues:
- Verify your LiteLLM proxy supports the model's tool-calling format
- Check proxy logs for rejected requests
- Try a different model variant
Bug reports and feature requests are welcome!
- Issues: https://github.com/gethnet/litellm-connector-copilot/issues
- Pull Requests: Contributions are reviewed and appreciated
Apache-2.0 Β© GethNet