security: Phase B/C top-10 remediation + full re-audit wave (1 critical + highs)#154
Conversation
The idiom `with ThreadPoolExecutor() as ex: fut.result(timeout=T)` defeats its own timeout: the context-manager __exit__ calls shutdown(wait=True), which blocks until the runaway task finishes. A 50ms timeout on a 5s task took ~5s, so codec_mcp tool dispatch and codec_observer OCR could hang on a slow skill or a stuck screencapture popup. Add codec_concurrency.run_with_timeout (daemon thread + join(timeout), no shutdown(wait=True)): surfaces the timeout promptly, abandons the daemon worker (never blocks shutdown), re-raises fn exceptions, and raises TimeoutError (== concurrent.futures.TimeoutError on py3.11+). Migrate both call sites; mirrors the proven codec_hooks._run_hook_with_timeout pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CodecMemory shared one sqlite3 connection across threads (check_same_thread=False) with no app-level lock — concurrent save/search on the same connection can corrupt cursor state or even segfault (the new test reproduced a hard crash before the fix). Add a threading.RLock and serialize every runtime connection use (save, search, search_recent, get_sessions, cleanup, rebuild_fts, close). RLock so get_context->search and cleanup->close don't self-deadlock; WAL + busy_timeout stay the cross-process layer. Route the unguarded ~/.codec JSON read-modify-writes through the cross-process file_lock: - grant_permission (routes/agents.py): new codec_agent_plan.grants_lock (mirrors _status_lock) held across load_grants -> modify -> save_grants -> set_grants_hash, so concurrent /grant calls can't clobber grants.json. - both notifications.json writers in codec_ask_user (question-write + mark-read) now also hold file_lock(NOTIFICATIONS_PATH), matching the pairing the pending_questions path already used. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…s (H1) Attacker-influenceable URLs (chat injection, clipboard, crew task) could be fetched against 127.0.0.1, 10/8, 192.168/16, 169.254.169.254 (cloud metadata), etc. — and the body flows back into the LLM transcript, so a read is an exfil path even with egress otherwise denied. Add codec_ssrf.validate_url: enforces http(s), resolves the host, and rejects if ANY resolved address is loopback / private / link-local / multicast / reserved / unspecified (dual-record + IPv4-mapped-IPv6 aware). Wire it BEFORE the HTTP call in skills/web_fetch.py (which clipboard_url_fetch delegates to) and in codec_agents._web_fetch. DNS-rebinding TOCTOU limitation documented. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…(H2/H6) The review-and-approve flow wrote an approved skill to the skills dir by basename with no name-collision check. An approved skill named after a hash-pinned built-in (calculator.py, file_write.py, ...) would take that trusted name — shadowing the real built-in, or overwriting it if the write dir is the repo skills dir. skill_approve now refuses any filename matching a name in the committed skills/.manifest.json (case-insensitive, so the guard also holds on case-insensitive filesystems), emitting skill_approve_blocked. PR-1A's load-time hash/AST gate remains the last line of defense; this refuses at the write point so the trusted file is never displaced. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
#10) - tests/test_a12_invariant.py: fails if a new inline `.post(...chat/completions ...)` appears outside codec_llm. Allowlists the documented vision sites (dashboard, watcher, both bridges, screenshot_text) + codec_core's generated session-script string. Runs in CI via the full pytest suite (no separate workflow step needed). Includes a synthetic-violation test proving the guard is not a no-op. - tests/test_oauth_provider.py: regression tests for the refresh-token scope-escalation defense (requested scopes must be a subset of the original grant) — previously untested. permission_gate mutation cases are already covered by test_agent_runner.py (8 tests incl. D-5 traversal/symlink); not duplicated. The flaky time.sleep -> threading.Event refactor is deferred — no specific flaky test identified, and rewriting passing tests without evidence risks regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…store registry) Design-only, per CLAUDE.md §11 + the SECURITY-REMEDIATION-DESIGN gate for the two remaining structural fixes. No production code. Each note grounds scope, seam, migration phasing, test plan, rollback, and open decisions in the actual HEAD surface (codec_dashboard chat cluster ~lines 2207-3005; ~30 ad-hoc json.dump writers + 2 duplicate atomic helpers). Awaiting sign-off before implementation. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…k + registry (C8) Fix #9, scoped to the high-value, low-risk core (recommended decisions applied): - Phase 0: atomic_write_json gains `default=` (custom serializer, e.g. str for datetime) and `sort_keys=` passthrough; new read_modify_write(path, fn) standardizes lock+load+mutate+atomic-write so a future RMW can't forget the file_lock. - Phase 2: routes/_shared._save_notification now holds the cross-process file_lock across its load->insert->write (it ran under the in-process _notif_lock only, so separate PM2 daemons clobbered each other). - Phase 5: docs/STATE-FILES.md registry of every ~/.codec state file + lock policy + migration status. Deferred to a documented follow-on (see STATE-FILES.md): Phase 1 mass overwrite-durability swaps; Phase 3 helper convergence (entangles the don't-touch pending_questions/grants write paths — byte-preserving, reviewed per-helper); Phase 4 don't-touch files (config.json, OAuth/Google tokens) with per-file sign-off. The broad json.dump CI ratchet was rejected as low-signal (rationale in STATE-FILES.md). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…start) Fix #8, intra-file CC reduction. The routes/chat.py module split is deferred: chat_completion shares _load_prompt_overrides + CHAT_SYSTEM_PROMPT with the prompt-override endpoints, so a clean split needs those relocated first (tracked in docs/FIX8-ROUTES-CHAT-EXTRACTION-DESIGN.md). Extract two of the heaviest blocks out of the 356-line chat_completion into named, independently-testable helpers — behavior-preserving (the 48-test chat/stream/budget/escalation baseline is byte-identical before/after): - _build_chat_system_prompt(config, budget, has_attachment, last_user_text): override + step-budget + attachment / content-rewrite / observer-inject logic. - _chat_vision_response(body, messages): the image -> vision-model branch. chat_completion: 356 -> 252 lines. Adds tests/test_chat_helpers.py (5 unit tests now possible because the logic is no longer trapped inside the handler). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AuthMiddleware subclasses Starlette BaseHTTPMiddleware, which only runs on the
`http` scope — so the /ws/voice WebSocket handshake was completely
unauthenticated. Combined with the voice pipeline running any trigger-matched
skill (terminal/file_ops/pilot, no allowlist), an exposed dashboard (Cloudflare
tunnel / dashboard_host=0.0.0.0) was an open, auth-free path to code execution.
Add _ws_authorized() mirroring the HTTP AuthMiddleware Layers 0/1/2 (open when
nothing configured; else dashboard_token via ?token=/Bearer, OR a valid
TOTP-verified biometric session cookie) and reject before accept(). The
middleware author already intended /ws to be gated (the 401 branch checks
path.startswith("/ws")) — it just never executed on the websocket scope.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…re-audit N3) Fix #7 validated only the INITIAL URL, but both fetch sites follow redirects by default (requests allow_redirects=True; httpx follow_redirects=True), so a public URL returning 302 -> http://169.254.169.254/ (cloud metadata) or http://127.0.0.1/ reached the internal target unchecked and returned its body into the LLM transcript — defeating the H1 guard. Both sites now follow redirects MANUALLY with allow_redirects/follow_redirects False, calling codec_ssrf.validate_url on every hop and capping at 5 redirects. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… (re-audit N4/N10) Both _is_safe_target (file_write) and _is_safe_path (file_ops) realpath-resolved only the PARENT dir then re-joined the raw basename, so a symlinked FINAL component (e.g. ~/Documents/x.md -> ~/.zshrc, or x.json -> ~/.codec/ oauth_state.json) passed the blocklist and open() followed it — a write-through to ~/.zshrc/~/.codec (RCE/secret overwrite) or a read-exfil of secrets over the MCP-exposed surface, defeating D-4. Realpath the FULL path in both safety checks (resolves a symlinked basename to its blocked target), and open()/read()/write() the resolved target so the op can't be redirected between check and open (TOCTOU). realpath handles not-yet-existing files. Matches codec_agents._file_write, which already did it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
AuthMiddleware's ?s=<token> GET fallback (for mobile img/stream URLs) checked only token existence + age — NOT totp_verified, unlike the cookie path. Since auth_verify/auth_pin hand out a usable session token BEFORE TOTP is completed, a pre-TOTP (or stolen pre-TOTP) token could reach any GET /api endpoint via ?s=, bypassing 2FA. Extract _session_token_valid() (existence + age + TOTP) and route BOTH the cookie path (_verify_biometric_session) and the ?s= fallback through it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…lock/TTL (re-audit N7/N8/N9) N7: _chat_vision_response runs OUTSIDE chat_completion's try/except, so a non-200/malformed vision-backend response surfaced as a raw 500. Guard the POST + parse and return a graceful 502. N8: codec_observer._ocr_call did sys.path.insert() twice per poll (up to ~5760 entries/day) — and the very next comment says no import happens. Dead + a slow unbounded leak; removed. N9: _research_jobs had no lock and no eviction (unlike _agent_jobs post-H-4) — unbounded memory growth + a dict-changed-size race vs the worker. Add _evict_stale_research_jobs (+ _research_jobs_lock) mirroring the agent-jobs pattern; deep_research_start evicts + guards the add/update under the lock. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…re-audit N19/N20/N21) N19: add `builtins` to _SKILL_DANGEROUS_MODULES — `import builtins; builtins.exec(src)` bypassed the gate (builtins.exec is an Attribute call, not the bare-name Call the dangerous-calls check catches). N20: _pinned_builtin_names() returned an empty set on a manifest read failure, so skill_approve's pinned-name guard blocked nothing during a transient failure. Return None on failure and FAIL CLOSED (refuse the approve). N21: routes/auth.py compared the PIN hash with `==` (timing side-channel) — use hmac.compare_digest. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
#6) Live-env investigation (read-only + staged, rolled back) found: the deployed python3.13 runs fastmcp 3.1.1 while requirements declares >=3.3.1, and an in-place upgrade to 3.3.1 left a broken install state on that interpreter. In a CLEAN env the code imports fine on 3.3.1 — so the fix is a clean reinstall, NOT a code port. The fastmcp SSRF (CVE-2026-32871) is in OpenAPIProvider, which CODEC does not use, so it isn't exploitable here (hygiene, not an emergency). - requirements.txt: add advisory-safe floors for the transitive/runtime deps that were unpinned — urllib3>=2.7.0, pillow>=12.2.0, cryptography>=46.0.7. (fastmcp>=3.3.1 and requests>=2.34.2 were already correct; the deployed interpreters were just stale vs the declared floors.) - tests/test_fastmcp_compat.py: assert fastmcp meets the 3.3.1 floor and that codec_mcp + codec_oauth_provider import against it — runs in CI and would have caught both the version drift and any future fastmcp API move. NO live mutation: the deployed venvs are untouched (rolled back to their original state). Refresh them at a maintenance window — see PR runbook. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Fix #6 — live dependency-refresh runbook (run at a maintenance window)The deployed interpreters are stale vs the (now CVE-safe) # 1) codec-mcp-http interpreter — clean fastmcp reinstall (in-place upgrade left a broken state)
/usr/local/bin/python3.13 -m pip uninstall -y fastmcp fastmcp-slim
/usr/local/bin/python3.13 -m pip install "fastmcp>=3.3.1" "urllib3>=2.7.0" "pillow>=12.2.0" "cryptography>=46.0.7" "requests>=2.34.2"
# VERIFY before any restart — must print OK:
/usr/local/bin/python3.13 -c "import codec_mcp, codec_oauth_provider; from fastmcp import FastMCP; print('OK')"
pm2 restart codec-mcp-http && pm2 logs codec-mcp-http --lines 30 # confirm it comes up clean
# 2) the Homebrew interpreter (dashboard/bridges/etc.)
/opt/homebrew/bin/python3 -m pip install "urllib3>=2.7.0" "pillow>=12.2.0" "cryptography>=46.0.7" "requests>=2.34.2"
pm2 restart codec-dashboard # + any other Homebrew-python services as neededstarlette (BadHost, CVE-2026-48710) is a major bump requiring a coordinated fastapi upgrade + middleware review — deliberately deferred, not in the floors above. The repo + tests are already on the safe floors; |
…mediums) CHAIN-002: skill_forge writes forged code to disk WITHOUT the review gate, yet was on CHAT_SKILL_ALLOWLIST — a prompt-injected [SKILL:skill_forge:...] tag could persist an unsandboxed skill. Removed it (+ the stale ask_codec_to_build entry that had no backing skill); create_skill stays as the review-gated path. Atomic config writes (truncated-read → empty-fallback race): route save_triggers (custom_triggers.json), custom-agent save, and update_schedule (schedules.json) through codec_jsonstore.atomic_write_json instead of truncate-then-write, so a concurrent reader can't catch a half-written file. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Step-3 strict-consent gate lives only in codec_agent_runner; the chat [SKILL:] + MCP tool paths reach high-power skills with only is_dangerous / path-blocklists (red-team CHAIN-001/002/006). This >1-module, UX-changing fix is design-gated per CLAUDE.md §11 — note covers the two chokepoints (codec_dispatch.run_skill, codec_mcp.tool_fn), the per-transport consent UX (MCP hard-refuse / chat prompt / voice announce), destructive classification, test/rollback, and the open decisions. No code until approved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…1/C) — and regen skill manifest Consent gate, increment 1 (Decisions B1 + C): - codec_consent: is_destructive_skill (per-skill SKILL_DESTRUCTIVE flag via the registry + _HTTP_BLOCKED backstop + known high-power built-ins) + gate_enabled kill switch (CONSENT_GATE_ENABLED). - codec_skill_registry: AST-extract SKILL_DESTRUCTIVE + get_destructive() accessor (parallel to SKILL_MCP_EXPOSE) — the extensible per-skill path. - codec_mcp.tool_fn: hard-refuse destructive skills over MCP (claude.ai can't consent at the operator tier) — closes the red-team's file_ops/imessage_send/ pilot-over-MCP chains; emits a `denied` audit. Also REGENERATES skills/.manifest.json: the earlier Fix #7 (web_fetch) and N4/N10 (file_write, file_ops) edits changed those built-ins' bytes, so their hash-pins were stale — the load-time gate would have AST-refused file_write/ file_ops (they import os) in production, and CI --check would fail. Manifest now matches the 76 built-ins. Chat-side consent (A1: consent_required + re-dispatch) is the next increment. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Consent gate increment 2 (Decision A2 — reuse the proven Phase 1 Step 3 PWA consent panel, no new frontend). Both chat skill-runners now gate destructive skills before firing: - _try_skill (pre-LLM hijack) and _try_skill_by_name (post-LLM [SKILL:] tag — the prompt-injection vector) call codec_consent.chat_consent_ok, which for a destructive skill routes through codec_ask_user.ask(destructive=True, asked_from="chat") — literal verb-match consent rendered by the existing PWA panel; declined/timeout/ask_user-unavailable → skill skipped (fail-closed). - Non-destructive skills are unaffected (no prompt). Kill switch: CONSENT_GATE_ENABLED=false. With B1 (MCP hard-refuse) this closes the red-team's cross-cutting root cause: the chat [SKILL:] + MCP paths reached high-power skills with only is_dangerous. is_dangerous is retained as the layered terminal command heuristic (Decision D). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Update — re-audit mediums + consent gate landedAdded since the PR opened:
230 tests green across the touched modules. Every Critical/High/actionable-Medium from both audits is now in this PR. |
Supersedes #153 — same Phase B/C commits plus a full re-audit remediation wave. Phase A (C1/C2/C3/C6/C7) is already on
main(PR #152). One squash-merge lands everything below.All TDD'd; 224 tests green across touched modules; ruff clean; rebased cleanly on current
main.Phase B/C — top-10 completion
run_with_timeout, kill hanging ThreadPoolExecutorRe-audit wave (fresh specialist sweep of the post-remediation tree)
/ws/voicewas unauthenticated — BaseHTTPMiddleware never runs on the websocket scope. Added_ws_authorizedmirroring HTTP auth; reject beforeaccept().file_write/file_opsresolved only the parent dir → symlinked final component slipped the blocklist. Realpath the full path + open the resolved target.?s=session fallback (skippedtotp_verified). Both paths now route through_session_token_valid.sys.pathevery poll (dead lines) → removed._research_jobsunbounded + unlocked → lock + TTL eviction (mirrors_agent_jobs).import builtins; skill-approve fails closed when the manifest can't be read; PIN compared withhmac.compare_digest.Still requires you (not in this PR)
fastmcp9.3 SSRF (CVE‑2026‑32871) is installed on the internet-facingcodec-mcp-http; plus urllib3/pillow/cryptography/requests/starlette CVEs and the PM2 interpreter pin.requirements.txtalready floors the safe versions; the installed venvs are stale. Needs you to confirm the venv path + watch thepm2 restart.🤖 Generated with Claude Code