```
ollama pull qwen2.5-coder:7b
```
- Size: 4.7GB download
- RAM: 8GB+ required
- Tool Calling: ✅ Excellent
- Best for: Production use, complex coding tasks, reliable tool execution
```
ollama pull deepseek-coder:6.7b
```
- Size: 3.8GB download
- RAM: 8GB+ required
- Tool Calling: ✅ Excellent
- Best for: Fast responses, coding tasks, good balance
```
ollama pull qwen2.5:3b
```
- Size: 1.9GB download
- RAM: 4GB+ required
- Tool Calling: ⚠️ Limited (may have issues)
- Best for: Testing, simple tasks, resource-constrained systems
```
ollama pull qwen2.5-coder:1.5b
```
- Size: 986MB download
- RAM: 2GB+ required
- Tool Calling: ❌ Poor (frequent issues)
- Best for: Quick tests only, not for actual work
| Model | Parameters | Download | RAM | Speed | Tool Calling | Code Quality |
|---|---|---|---|---|---|---|
| qwen2.5-coder:7b | 7B | 4.7GB | 8GB+ | Medium | ✅ Excellent | ⭐⭐⭐⭐⭐ |
| deepseek-coder:6.7b | 6.7B | 3.8GB | 8GB+ | Fast | ✅ Excellent | ⭐⭐⭐⭐⭐ |
| qwen2.5:3b | 3B | 1.9GB | 4GB+ | Fast | ⚠️ Limited | ⭐⭐⭐ |
| qwen2.5-coder:1.5b | 1.5B | 986MB | 2GB+ | Very Fast | ❌ Poor | ⭐⭐ |
What is Tool Calling?

HiveTerminal uses "tools" to interact with your system:
- Read/write files
- Execute bash commands
- Search code
- Manage todos
- And more...
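As an illustration, a tool call is typically a small JSON object naming the tool and its arguments, which the client parses and routes to a handler. The sketch below is hypothetical — the tool names and JSON shape are assumptions for illustration, not HiveTerminal's actual schema:

```python
import json

# Hypothetical registry mapping tool names to handlers.
# Real tool schemas vary; this only illustrates the dispatch idea.
TOOLS = {
    "read_file": lambda args: f"<contents of {args['path']}>",
    "run_bash": lambda args: f"<output of `{args['command']}`>",
}

def dispatch(tool_call_json: str) -> str:
    """Parse a model-emitted tool call and run the matching handler."""
    call = json.loads(tool_call_json)
    handler = TOOLS.get(call["name"])
    if handler is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return handler(call["arguments"])

print(dispatch('{"name": "read_file", "arguments": {"path": "main.py"}}'))
# → <contents of main.py>
```

A model that handles tools well emits exactly this kind of structured call when (and only when) a tool is needed — which is what the size comparison below is really measuring.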
Why Model Size Matters:
- 7B+ models: Understand when and how to use tools correctly
- 3B models: Sometimes confuse tool usage, may output raw JSON
- 1.5B models: Frequently fail at tool calling, output malformed responses
✅ No issues - works as expected:
- Proper tool execution
- Natural language responses
- Understands context

⚠️ Occasional issues:
- May output raw JSON for simple queries
- Sometimes calls tools inappropriately
- Can get confused with complex instructions
Example issue:
```
You: hi
Model: {"name": "ask_user_question", "arguments": {...}}
```
Instead of just saying "Hello!"
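A client can guard against this kind of leakage by checking whether a chat reply parses as a tool-call object before showing it to the user. This is a hypothetical heuristic sketch, not something HiveTerminal necessarily does:

```python
import json

def looks_like_tool_call(reply: str) -> bool:
    """Heuristic guard: does a chat reply look like a leaked tool call?"""
    try:
        obj = json.loads(reply.strip())
    except json.JSONDecodeError:
        return False  # ordinary natural-language reply
    return isinstance(obj, dict) and "name" in obj and "arguments" in obj

# The failure mode from the example above is caught:
assert looks_like_tool_call('{"name": "ask_user_question", "arguments": {}}')
# A normal reply passes through:
assert not looks_like_tool_call("Hello!")
```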
❌ Frequent issues:
- Regularly outputs raw JSON instead of responses
- Calls tools when it shouldn't
- Struggles with multi-step tasks
- Poor code quality
Not recommended for actual work.
macOS:
```
brew install ollama
```
Linux:
```
curl -fsSL https://ollama.ai/install.sh | sh
```
Or download from: https://ollama.ai
```
# Best choice (recommended)
ollama pull qwen2.5-coder:7b

# Fast alternative
ollama pull deepseek-coder:6.7b

# For testing
ollama pull qwen2.5:3b

# Minimal (not recommended)
ollama pull qwen2.5-coder:1.5b
```
```
# List installed models
ollama list

# Test a model
ollama run qwen2.5-coder:7b "Hello, write a Python hello world"
```
You can switch models anytime:
```
# Run setup again
hive --setup
# Select Ollama
# Enter new model name (e.g., "deepseek-coder:6.7b")
```
Or manually edit `~/.vibe/config.toml`:
```toml
active_model = "qwen2.5-coder:7b"

[[models]]
name = "qwen2.5-coder:7b"
provider = "ollama"
alias = "qwen2.5-coder:7b"
temperature = 0.2
input_price = 0.0
output_price = 0.0
```
Minimum:
- CPU: Modern multi-core processor
- RAM: 2GB+ (for 1.5B models)
- Disk: 3GB+ free space
- OS: macOS or Linux
Recommended:
- CPU: 4+ cores
- RAM: 8GB+ (for 7B models)
- Disk: 10GB+ free space
- OS: macOS or Linux with recent kernel
Optimal:
- CPU: 8+ cores
- RAM: 16GB+
- Disk: SSD with 20GB+ free space
- GPU: Optional (Ollama can use GPU acceleration)
For faster responses:

1. Use GPU acceleration (if available):
   ```
   # Ollama automatically uses GPU if available
   # Check with: ollama ps
   ```
2. Reduce context size:
   - Edit `~/.vibe/config.toml`
   - Lower `max_chars` in `[project_context]`
3. Use smaller models for simple tasks:
   - 3B models are 2-3x faster than 7B
   - Good for quick edits and simple queries
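The context-size tip above boils down to capping how many characters of project context are sent to the model. A minimal sketch of such a cap — `trim_context` is a hypothetical helper, not HiveTerminal's actual code (only the `max_chars` setting name comes from the config):

```python
def trim_context(context: str, max_chars: int) -> str:
    """Cap the project context at max_chars characters, keeping the
    tail: recent files and edits usually matter most."""
    if len(context) <= max_chars:
        return context
    marker = "[...context truncated...]\n"
    return marker + context[-max_chars:]

# Short context is untouched; oversized context keeps its most recent chunk.
assert trim_context("short", max_chars=1000) == "short"
assert trim_context("a" * 5000, max_chars=1000).endswith("a" * 1000)
```

Halving `max_chars` roughly halves the prompt the model must process on every turn, which is why this tip helps slow systems.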
For better quality:
- Use larger models (7B+)
- Lower temperature (0.1-0.3 for coding)
- Provide clear, specific prompts
- Use Spec Mode for complex tasks
Problem: Model outputs raw JSON or fails at tool calls

Cause: Model too small for tool calling
Solution: Upgrade to a 7B model:
```
ollama pull qwen2.5-coder:7b
hive --setup  # Select new model
```
Problem: Responses are very slow

Cause: Model too large for your system
Solution: Try a smaller model:
```
ollama pull qwen2.5:3b
hive --setup  # Select new model
```
Problem: Ollama crashes or runs out of memory

Cause: Not enough RAM
Solutions:
- Close other applications
- Use smaller model (3B instead of 7B)
- Upgrade RAM
Problem: Model not found

Cause: Model name mismatch

Solution: Check the exact name:
```
ollama list   # See installed models
hive --setup  # Enter exact name with tag
```
You can try other Ollama models:
```
# Code-focused models
ollama pull codellama:7b
ollama pull starcoder2:7b
ollama pull phind-codellama:34b  # Needs 32GB+ RAM

# General models
ollama pull llama3.1:8b
ollama pull mistral:7b
ollama pull gemma2:9b
```
Note: Not all models work well with tool calling. The recommended models have been tested with HiveTerminal.
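Whichever model you pull, the name you configure must match the `ollama list` output exactly, tag included (see the name-mismatch issue above). A small sketch of that check — the column layout of `ollama list` output is an assumption here:

```python
def installed_models(ollama_list_output: str) -> set[str]:
    """Extract model names from `ollama list`-style output: first
    column of each row, header row skipped."""
    lines = ollama_list_output.strip().splitlines()
    return {line.split()[0] for line in lines[1:]}

sample = """\
NAME                 ID      SIZE    MODIFIED
qwen2.5-coder:7b     abc123  4.7 GB  2 days ago
mistral:7b           def456  4.1 GB  5 days ago"""

assert "qwen2.5-coder:7b" in installed_models(sample)
assert "qwen2.5-coder" not in installed_models(sample)  # the tag is part of the name
```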
If local models don't work for you, HiveTerminal also supports:
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude)
- OpenRouter (access to many models)
- Google AI Studio
- Groq
- Hugging Face
Run `hive --setup` and select your preferred provider.
For most users:
```
ollama pull qwen2.5-coder:7b
```
For fast systems:
```
ollama pull deepseek-coder:6.7b
```
For testing only:
```
ollama pull qwen2.5:3b
```
Avoid for production:
```
# Don't use 1.5B models for actual work
ollama pull qwen2.5-coder:1.5b
```
Choose based on your system resources and use case. When in doubt, go with the 7B model! 🐝