Run local AI models directly in VS Code — no cloud, no API keys, no manual server setup.
The extension scans your machine for models, starts the inference server automatically, and exposes a built-in MCP endpoint so any AI client (Claude Desktop, Cursor, Continue.dev) can use your local model too.
| Feature | Description |
|---|---|
| 🔍 Auto model discovery | Scans common folders for GGUF, SafeTensors, PyTorch, and Ollama models |
| ⚡ Auto server start | Starts llama-server or ollama serve automatically — no terminal needed |
| 🌐 MCP server | Built-in MCP endpoint at http://127.0.0.1:3333/mcp for any MCP client |
| 💬 Chat panel | Full streaming chat with your local model |
| 🔧 Code tools | Explain, refactor, and ask about selected code |
| 📁 Add any folder | Browse to any folder to add more models on the fly |
| 🔌 Format support | GGUF · SafeTensors · PyTorch · Ollama |
| Format | Runtime needed | Where to get models |
|---|---|---|
| GGUF | llama.cpp (`llama-server` in PATH) | HuggingFace — search any model + "GGUF" |
| SafeTensors / PyTorch | Python + `transformers` (auto-installed) | HuggingFace — any standard model |
| Ollama | Ollama | `ollama pull llama3` |
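Not sure which runtimes you already have? A minimal sketch to check your PATH for the binaries named in the table (it assumes the standard binary names `llama-server`, `ollama`, and `python3`):

```shell
# Check which of the supported runtimes are already on PATH.
# Assumes the standard binary names: llama-server, ollama, python3.
for bin in llama-server ollama python3; do
  if command -v "$bin" >/dev/null 2>&1; then
    echo "$bin: found"
  else
    echo "$bin: not installed"
  fi
done
```

Any format whose runtime shows `found` is ready to use immediately; the others need the setup steps below.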
The extension looks for models in these locations automatically:

```
~/models                ~/Models                ~/Downloads
~/Documents             ~/.cache/huggingface/hub
~/.cache/lm-studio/models
~/llama.cpp/models      ~/.local/share/nomic.ai/gpt4all
```
Plus any custom paths you add via the "+ Add folder..." option or settings.
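As a rough approximation of what the scanner does (the real logic lives in `localModelScanner.ts`; the file patterns here are an assumption — the actual scanner also understands HuggingFace cache layouts and Ollama manifests):

```shell
# Sketch of the extension's scan: look for common model file types in the
# default folders listed above. Missing folders are skipped silently.
for dir in ~/models ~/Models ~/Downloads ~/Documents \
           ~/.cache/huggingface/hub ~/.cache/lm-studio/models \
           ~/llama.cpp/models ~/.local/share/nomic.ai/gpt4all; do
  [ -d "$dir" ] && find "$dir" -maxdepth 3 \
      \( -name '*.gguf' -o -name '*.safetensors' -o -name '*.bin' \) 2>/dev/null
done
echo "scan complete"
```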
Pick one of the following (or both):
Option A — GGUF models (fastest)

```shell
# Build llama.cpp (or download a release binary)
git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --target llama-server -j$(nproc)
sudo cp build/bin/llama-server /usr/local/bin/

# Download any GGUF model to ~/models/
# e.g. from https://huggingface.co/models?library=gguf
```

Option B — Ollama models (easiest)
```shell
# Install from https://ollama.com, then:
ollama pull llama3   # or any model
```

From GitHub Releases (recommended)
- Go to github.com/hit1001/vscode-local-llm-/releases/latest
- Download `local-llm-connect-2.0.0.vsix`
- Install it:

```shell
code --install-extension local-llm-connect-2.0.0.vsix
```

Or install via VS Code UI:

- Open VS Code → Extensions panel (`Ctrl+Shift+X`)
- Click `...` (top-right) → Install from VSIX...
- Select the downloaded `.vsix` file
From VS Code Marketplace (coming soon)
- Search for `Local LLM Connect` in the Extensions panel
From source
```shell
git clone https://github.com/hit1001/vscode-local-llm-
cd vscode-local-llm-
npm install
npm run compile
# Press F5 in VS Code to run, or package with vsce
```

- Open VS Code — look for `⊙ Select Model` in the bottom-right status bar
- Press `Ctrl+Shift+M` to scan and pick a model
- The server starts automatically
- Press `Ctrl+Shift+L` to open the chat panel
The extension starts an MCP server automatically at `http://127.0.0.1:3333/mcp`.
Run `Ctrl+Shift+P` → `Local LLM: Show MCP Connection Info` for a full guide. Quick configs:
Edit `~/.config/claude/claude_desktop_config.json` (other MCP clients such as Cursor take the same `url`/`transport` pair):

```json
{
  "mcpServers": {
    "local-llm": {
      "url": "http://127.0.0.1:3333/mcp",
      "transport": "http"
    }
  }
}
```

Or call the endpoint directly:

```shell
curl -X POST http://127.0.0.1:3333/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {
      "name": "chat",
      "arguments": {
        "messages": [{"role": "user", "content": "Hello! What can you do?"}]
      }
    }
  }'
```

| Tool | Description |
|---|---|
| `chat` | Full conversation with message history |
| `explain_code` | Explain a code snippet |
| `refactor_code` | Suggest code improvements |
| `complete` | Complete any text prompt |
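To see these tools along with their exact input schemas, you can POST a standard MCP `tools/list` request (plain JSON-RPC, no arguments) to the same endpoint, reusing the curl pattern above — the server's response, not this README, is the authoritative source for each tool's argument names:

```json
{ "jsonrpc": "2.0", "id": 2, "method": "tools/list" }
```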
| Command | Shortcut | Description |
|---|---|---|
| Local LLM: Select Model | `Ctrl+Shift+M` | Scan machine and pick a model |
| Local LLM: Open Chat | `Ctrl+Shift+L` | Open the chat panel |
| Local LLM: Ask About Selection | `Ctrl+Shift+A` | Ask about selected code/text |
| Local LLM: Explain Selected Code | Right-click menu | Explain selected code |
| Local LLM: Refactor Selected Code | Right-click menu | Refactor selected code |
| Local LLM: Show MCP Connection Info | Command Palette | Get MCP endpoint and connection configs |
| Local LLM: Stop Server | Command Palette | Stop the running inference server |
| Setting | Default | Description |
|---|---|---|
| `localLLM.mcpPort` | `3333` | MCP server port |
| `localLLM.extraModelPaths` | `[]` | Extra directories to scan for models |
| `localLLM.temperature` | `0.7` | Generation temperature (0–2) |
| `localLLM.maxTokens` | `2048` | Max tokens per response |
| `localLLM.systemPrompt` | (coding assistant) | System prompt for all requests |
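For example, in your VS Code `settings.json` (the values below are just the defaults from the table, plus a placeholder extra model path — substitute your own):

```json
{
  "localLLM.mcpPort": 3333,
  "localLLM.extraModelPaths": ["/path/to/your/models"],
  "localLLM.temperature": 0.7,
  "localLLM.maxTokens": 2048
}
```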
```
VS Code Extension
├── localModelScanner.ts — scans filesystem for GGUF / HuggingFace / Ollama models
├── serverManager.ts     — starts llama-server, ollama serve, or Python/transformers server
├── mcpServer.ts         — MCP HTTP server (port 3333) for external AI clients
├── llmClient.ts         — HTTP client for Ollama/OpenAI-compatible APIs (streaming)
├── chatPanel.ts         — WebView chat UI
└── extension.ts         — commands, status bar, activation
```

```
External MCP clients ──→ MCP server (port 3333) ──→ local model
VS Code chat panel   ──→ LLM client ──→ local model
                              ↑
               llama-server / ollama / python
```
```shell
git clone https://github.com/YOUR_USERNAME/local-llm-connect
cd local-llm-connect
npm install
npm run compile   # or: npm run watch
# Press F5 in VS Code to launch the Extension Development Host
```

To package:

```shell
npm install -g @vscode/vsce
vsce package   # → local-llm-connect-2.0.0.vsix
```

Pull requests are welcome. For major changes, open an issue first to discuss.
- Fork the repo
- Create a feature branch (`git checkout -b feature/my-feature`)
- Commit your changes
- Push and open a Pull Request
MIT — see LICENSE