Open Source Computer Command Framework — v1.5.0
Voice-controlled, local-first AI agent that runs on your machine with any LLM.
No cloud. No subscription. No data leaves your computer.
opencodec.org · AVA Digital LLC
☕ Support the Project · 🏢 Enterprise Setup
CODEC turns your computer into a voice-controlled AI workstation. Press a key or say "Hey CODEC" — CODEC listens, thinks (using any LLM you choose), and acts: opening apps, drafting messages, reading your screen, analyzing documents, researching topics, writing code, and anything else you can describe.
A private, open-source alternative to Siri and Alexa that actually controls your computer — and writes its own plugins.
Built for macOS. Linux support planned.
"Hey CODEC, draft a reply to my last email saying I'll call tomorrow afternoon"
→ Reads Gmail, writes a professional reply, copies to clipboard
"CODEC, deep research artificial intelligence in healthcare — save to Google Docs"
→ Runs 8 AI agents, 20+ web searches, writes 10,000-word report with images to Drive
"Hey CODEC, open my competitor's website and write a 5-page analysis"
→ Reads the site via Chrome CDP, runs competitor analysis crew, saves to Google Docs
"CODEC, schedule my daily briefing every morning at 8"
→ Creates a cron schedule — every morning at 8am CODEC researches your industry and delivers a report
"Hey CODEC, what was I working on yesterday?"
→ Searches 1,800+ memory entries with FTS5, summarises your last 5 sessions
"CODEC, click the Submit button in Xcode"
→ ax_bridge reads the AX tree of Xcode, finds the button, clicks it
Hey CODEC, summarise this article [paste URL]
Hey CODEC, open Safari and search for Mac Studio reviews
Hey CODEC, write a LinkedIn post about my new project
Hey CODEC, what's on my calendar today?
Hey CODEC, install the bitcoin-price skill
Hey CODEC, run competitor analysis on OpenAI
git clone https://github.com/AVADSA25/codec.git
cd codec
./install.sh # one-line setup wizard
python3 codec.py # start CODECOr with Python directly:
pip3 install -r requirements.txt
python3 setup_codec.py # guided configuration
python3 codec.pyThe heart of CODEC. Press a key, speak, release — your command executes.
| Trigger | Action |
|---|---|
| Hold F18 (or F8 on laptop) | Record voice command |
| F16 / F9 | Type command in dialog |
| F13 / F5 | Toggle CODEC ON/OFF |
* * (double tap) |
Screenshot + describe screen |
+ + (double tap) |
Read document in clipboard |
| Say "Hey CODEC…" | Always-on wake word |
PTT Lock Mode: Double-tap F18 within 0.5s to lock recording hands-free. Tap again to stop.
Voice Session Warmup: CODEC pre-loads your memory the moment it detects you speaking — 500ms faster responses.
Hold any key, speak, release — text appears in any app.
- Whisper STT transcription (local, private)
- Smart post-processing: removes stutters, fixes "KODAK" → CODEC, strips hallucinations
- Works in any text field: Xcode, Word, Slack, browser, terminal
Select any text, right-click → CODEC transforms it instantly.
| Service | What it does |
|---|---|
| CODEC Proofread | Fix grammar, spelling, clarity |
| CODEC Elevate | Rewrite at executive level |
| CODEC Explain | Plain-English explanation |
| CODEC Translate | Translate to English |
| CODEC Prompt | Optimise text as an LLM prompt |
| CODEC Reply | Draft a reply (supports :tone syntax) |
| CODEC Read Aloud | Speak selected text via Kokoro TTS |
| CODEC Save | Save to Google Keep or local notes |
Install: System Settings → Privacy → Accessibility → Services
A 250K-context AI chat with file uploads, vision, web search, and multi-agent crews.
Features:
- 250K token context window
- File upload: PDF, code, images (vision)
- Force web search toggle (🔍 Web button)
- 8 built-in Agent Crews (see below)
- Custom agent builder — define role, tools, task
- Agent scheduling: run crews automatically on a cron schedule
8 Agent Crews:
| Crew | What it does |
|---|---|
deep_research |
Full research report → Google Docs (10,000 words, images) |
daily_briefing |
Morning news + industry summary → Google Docs |
trip_planner |
Itinerary, hotels, flights → Google Docs |
competitor_analysis |
SWOT + competitive positioning → Google Docs |
email_handler |
Triage, draft replies, summarise inbox |
social_media |
Platform-specific posts (Twitter, LinkedIn, Instagram) |
code_review |
Bug hunt + security audit + clean code suggestions |
data_analysis |
Gather data, find trends, write insights report |
Scheduled Agents:
"Hey CODEC, run daily briefing every morning at 8"
"Schedule competitor analysis every Monday"
"List schedules"
Monaco editor + AI chat + live preview + Skill Forge, all in your browser.
Features:
- Monaco editor (VS Code engine)
- AI chat panel with full context
- ▶ Run code in-browser
- 👁️ Live Preview with point-click Inspect Mode
- Click any element in the preview → jumps to that line in Monaco
- Inline Edit Panel: change text, color, font size, padding visually
- Apply + Sync to Editor — changes write back to Monaco code permanently
- ⚡ Skill Forge — paste code, GitHub URL, or describe in plain English → auto-converts to CODEC skill
- 💾 Save as Skill — save anything in the editor directly to
~/.codec/skills/
A WebRTC-style voice interface built from scratch. No third-party voice pipeline.
Features:
- Real-time voice activity detection (VAD)
- Wake-word detection with 4-layer noise gate:
- Smoothed energy threshold (decay rate 0.85)
- Speech fraction gate (≥12% samples above threshold)
- Whisper confidence filter (avg_logprob ≥ -1.0)
- Noise floor 30.0
- Voice energy ring animation (mic RMS → animated glow)
- Full conversation memory across calls
- Skill dispatch mid-call
- Native SwiftUI status bar overlay (replaces tkinter popups)
Control your Mac from anywhere via a mobile-optimised web dashboard.
- Real-time status: all services, memory stats, last command
- Send commands from phone
- View conversation history
- Secured via Cloudflare Zero Trust tunnel
- Works on any device with a browser
Setup:
cloudflared tunnel --url http://localhost:8090Access at: https://codec.yourdomain.com
CODEC exposes 43 tools as an MCP server. Claude Desktop, Cursor, VS Code Copilot, and any MCP client can invoke CODEC skills directly.
// ~/.claude/claude_desktop_config.json
{
"mcpServers": {
"codec": {
"command": "python3",
"args": ["/Users/you/codec-repo/codec_mcp.py"]
}
}
}Then in Claude: "Use the CODEC google_calendar skill to check my schedule for tomorrow."
Browse, install, and publish skills from the community registry.
codec install bitcoin-price # Install from marketplace
codec search weather # Search available skills
codec list # All installed skills
codec update # Update marketplace skills
codec publish my-skill.py # Submit your skillOr by voice: "Hey CODEC, install the bitcoin price skill"
Schedule any agent crew to run automatically.
# Via voice
"Run daily briefing every morning at 8"
"Schedule competitor analysis every Monday at 9am"
"List schedules"
# Via CLI
python3 codec_scheduler.py add daily_briefing --hour 8
python3 codec_scheduler.py listPowered by codec-scheduler PM2 process — checks every minute, fires crews on schedule.
codec-heartbeat runs every 30 minutes:
- ✅ Checks all 5 AI services are alive
- 📊 Reports memory stats (entries, sessions)
- 🔍 Finds tasks saved during voice calls
- 🚀 Auto-executes pending tasks via
/api/command
Example: Say "Save a task for CODEC to open YouTube and search for AI news" during a voice call. Next heartbeat picks it up and executes it automatically.
A compiled Swift binary giving CODEC full accessibility control over any macOS app.
# Click a button by name
./ax_bridge --pid <pid> --action click --selector "role:AXButton name:OK"
# Read a text field
./ax_bridge --pid <pid> --action value --selector "role:AXTextField"
# Show full UI tree
./ax_bridge --pid <pid> --action tree --depth 2Voice: "Hey CODEC, click the Submit button" → clicks in the frontmost app
Chrome DevTools Protocol integration for precise browser control.
Fill form fields by CSS selector
Click specific elements
Extract structured data (tables, prices, links)
Take page screenshots
Run JavaScript in any tab
Falls back to AppleScript if Chrome CDP is unavailable.
Full-text search across every conversation CODEC has ever had.
"Hey CODEC, what did I ask about React hooks last week?"
"Search my memory for anything about the Berlin trip"
SQLite FTS5 index — instant results across thousands of entries.
CODEC ships with 48 built-in skills. Add more via the Marketplace or write your own.
Google Workspace
| Skill | Triggers |
|---|---|
google_calendar |
"what's on my calendar", "schedule meeting", "add event" |
gmail_reader |
"read my emails", "check Gmail", "latest emails" |
google_drive |
"search my Drive", "find file", "list Drive files" |
google_docs_create |
"create a doc", "write to Google Docs" |
google_sheets |
"open spreadsheet", "create sheet" |
google_slides |
"create presentation", "make slides" |
google_tasks |
"add to tasks", "check my tasks" |
google_keep |
"save note", "add to Keep" |
Productivity
| Skill | Triggers |
|---|---|
calculator |
"calculate", "what is", "how much is" |
clipboard |
"paste that", "copy to clipboard" |
timer |
"set timer", "remind me in" |
pomodoro |
"pomodoro", "focus timer" |
file_search |
"find file", "where is", "search files" |
memory_search |
"what did I ask about", "search my memory" |
scheduler |
"schedule agent", "run every morning" |
marketplace |
"install skill", "search skills" |
System
| Skill | Triggers |
|---|---|
process_manager |
"what's using CPU", "kill process" |
network_info |
"what's my IP", "check network" |
brightness |
"dim screen", "brightness up" |
screenshot_ocr |
"read what's on screen", "OCR this" |
terminal_run |
"run command", "execute in terminal" |
Chrome (AppleScript + CDP)
| Skill | Triggers |
|---|---|
chrome_open |
"open Chrome", "open website" |
chrome_search |
"search for", "Google" |
chrome_read |
"read this page", "what does this page say" |
chrome_tabs |
"list tabs", "close tab", "switch tab" |
chrome_fill |
"fill in", "fill form" |
chrome_extract |
"extract from page", "get data from page" |
chrome_scroll |
"scroll down", "scroll to bottom" |
chrome_automate |
"morning tabs", "open work tabs" |
Research & Writing
| Skill | Triggers |
|---|---|
web_search |
"search the web", "look up" |
web_fetch |
"summarise this article", "fetch URL" |
draft_email |
"draft email", "write email to" |
proofread |
"proofread this", "fix my writing" |
CODEC runs entirely on your machine. Nothing is sent to any external server unless you configure it.
- Subprocess isolation: All skill scripts run as subprocesses
- Resource limits: 512MB RAM, 120s CPU per skill execution
- Safety blocklist: 30+ dangerous command patterns blocked
- Command Preview: Review and Approve/Deny before any script runs
- Skill verification: Marketplace skills require confirmation; unverified skills show warnings
- Google OAuth: Read+write scopes, token stored in
~/.codec/google_token.json
| Key | Action |
|---|---|
| F13 | Toggle CODEC ON/OFF |
| F18 (hold) | Record voice, release to send |
| F18 (double-tap) | PTT Lock — hands-free dictation |
| F16 | Text input dialog |
* * |
Screenshot mode |
+ + |
Document mode |
| Key | Action |
|---|---|
| F5 | Toggle CODEC ON/OFF |
| F8 (hold) | Record voice, release to send |
| F9 | Text input dialog |
For laptop mode: Enable "Use F1, F2, etc. as standard function keys" in System Settings → Keyboard.
Custom shortcuts: Edit ~/.codec/config.json:
{
"key_toggle": "f5",
"key_voice": "f8",
"key_text": "f9"
}Then restart: pm2 restart open-codec
CODEC works with any OpenAI-compatible local or cloud API:
| LLM | Setup |
|---|---|
| Qwen 3.5 35B (recommended) | mlx-lm.server --model mlx-community/Qwen3.5-35B-A3B-4bit |
| Llama 3.3 70B | mlx-lm.server --model mlx-community/Llama-3.3-70B-Instruct-4bit |
| Mistral 24B | mlx-lm.server --model mlx-community/Mistral-Small-3.1-24B-Instruct-2503-4bit |
| Gemma 3 27B | mlx-lm.server --model mlx-community/gemma-3-27b-it-4bit |
| GPT-4o (cloud) | "llm_url": "https://api.openai.com/v1" |
| Claude API (cloud) | Use OpenAI-compatible proxy |
| Ollama | "llm_url": "http://localhost:11434/v1" |
Change in ~/.codec/config.json:
{
"llm_url": "http://localhost:8081/v1",
"model": "mlx-community/Qwen3.5-35B-A3B-4bit"
}- Laptop keyboard? Run
python3 setup_codec.py→ select "Laptop / Compact" in Step 4 - macOS stealing F-keys? System Settings → Keyboard → enable "Use F1, F2, etc. as standard function keys"
- Specific conflict? System Settings → Keyboard → Keyboard Shortcuts → Mission Control → uncheck conflicting key
- After config change: Always restart:
pm2 restart open-codec
- Check Whisper:
pm2 logs whisper-stt --lines 5 --nostream - Check mic permission: System Settings → Privacy & Security → Microphone
- Speak clearly: "Hey CODEC" — 3 distinct syllables
- Background noise? The 4-layer noise gate filters most noise, but loud music near the mic can interfere
- Check Kokoro:
curl http://localhost:8085/v1/models - Fallback to macOS
say: set"tts_engine": "say"in config.json - Disable TTS: set
"tts_engine": "none"
- Check:
curl http://localhost:8090/ - Restart:
pm2 restart codec-dashboard - Remote access:
pm2 logs cloudflared --lines 3 --nostream
- Check:
pm2 logs open-codec --lines 20 --nostream | grep -i skill - Count:
ls ~/.codec/skills/*.py | wc -l - Test:
python3 -c "from calculator import run; print(run('2+2'))"
- The first run takes 2-5 min — wait for it
- Check:
pm2 logs codec-dashboard --lines 30 --nostream | grep Agents - Verify Serper API key in
deep_research.py - Agents use background jobs — no Cloudflare timeout
codec.py — Entry point
codec_config.py — Configuration and clean_transcript()
codec_keyboard.py — Keyboard listener, PTT lock, wake word
codec_dispatch.py — Skill matching and dispatch
codec_agent.py — LLM session builder
codec_overlays.py — Overlay popups
codec_compaction.py — Context compaction
codec_memory.py — FTS5 memory search
codec_agents.py — Multi-agent crew framework (8 crews)
codec_voice.py — WebSocket voice pipeline
codec_voice.html — Voice call UI
codec_dashboard.py — Web API + dashboard
codec_dashboard.html — Dashboard page
codec_chat.html — Deep Chat UI
codec_vibe.html — Vibe Code IDE
codec_textassist.py — 8 right-click services
codec_search.py — DuckDuckGo + Serper search
codec_mcp.py — MCP server (43 tools)
codec_heartbeat.py — Health monitoring + task auto-execution
codec_scheduler.py — Cron-like agent scheduling
codec_marketplace.py — Skill Marketplace CLI
deep_research.py — Deep Research crew (10k-word reports + images)
ax_bridge/ — Swift AX accessibility bridge
swift-overlay/ — SwiftUI status bar app
skills/ — 48 built-in skills
tests/ — 212+ pytest tests
install.sh — One-line installer
setup_codec.py — Setup wizard (9 steps)
- Linux support (keyboard daemon, TTS, Whisper all work — packaging needed)
- Windows support via WSL
- Multi-machine sync (skills, memory across devices)
- CODEC iOS app (dictation + remote dashboard)
- Plugin API (third-party skill packages)
- Web UI skill editor (visual trigger + code editor)
- Streaming voice responses (first token plays while rest generates)
- Multi-LLM routing (fast model for simple, powerful for complex)
See CONTRIBUTING.md. All skill submissions welcome — 48 built-in, marketplace growing.
git clone https://github.com/AVADSA25/codec.git
cd codec && ./install.sh
python3 -m pytest # all tests must passIf CODEC saves you time, consider supporting development:
- ⭐ Star on GitHub
- ☕ Buy me a coffee
- 🏢 Enterprise / Professional Setup: avadigital.ai
Need CODEC configured for your business, integrated with your tools, or deployed across a team? Contact AVA Digital for professional setup and custom skill development.
Built with ❤️ by AVA Digital LLC · MIT License
