security: license-readiness — close 7 stress-test blockers (LS-1..LS-7) + SR-6 bonus #157

Merged
AVADSA25 merged 3 commits into main from reaudit-phase2 on May 30, 2026

@AVADSA25
Owner

Summary

Closes the 7 BLOCKS-LICENSE-SALE findings from CODEC-STRESS-TEST-2026-05-28 + the SR-6 obfuscation bypass.

Pytest baseline: 1957 → 2001 passed (+44 net: 43 new license-blocker regression tests + 1 inverted-old-test). 0 regressions. Full suite runtime: 76s.

Critical security fixes

  • LS-1: chat_consent_ok no longer silently approves destructive skills when the user types no / cancel / stop. In strict-consent mode, _is_consenting_answer now rejects any free text that is not an exact option match or a literal verb match. Two rejections → existing ambiguous_consent timeout → TIMEOUT_SENTINEL → consent denied → skill blocked.
  • LS-3: Crew.run(mode='parallel') now slices zip(agents, tasks)[:max_steps]. Previously, Crew(mode="parallel", agents=[12]) spawned 12 unbounded concurrent coroutines.
  • LS-4: the SAFE_CMDS exemption now rejects commands containing shell metacharacters (; & | < > $ backtick). `echo hi; touch /tmp/owned`, `cat README && nc`, etc. no longer skip the preview gate.
  • SR-6 (bonus): is_dangerous() now catches env-aliased dangerous binaries (`B=base64; echo cm0gLXJmIC8= | $B -d` and family). PR-2G's first-token extractor missed these.
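
The LS-4 and SR-6 checks above can be pictured with a minimal sketch. This is not the code in codec_session.py or codec_config.py: the contents of `SAFE_CMDS` and `SENSITIVE_BINARIES`, the helper names `skips_preview` and `aliases_sensitive_binary`, and the exact regex are assumptions made for illustration. Only the metacharacter list and the `VAR=<sensitive_binary>` idea come from the description.

```python
import re

# Hypothetical stand-ins; the real lists live in the repository.
SAFE_CMDS = ("echo", "cat", "ls", "pwd")
SENSITIVE_BINARIES = ("base64", "sh", "bash", "curl", "python", "python3")

# Metacharacters named in LS-4: ; & | < > $ and backtick.
SHELL_METACHARS = set(";&|<>$`")

# Matches VAR=<sensitive_binary>, optionally with a path prefix,
# e.g. "B=base64" or "X=/usr/bin/curl".
ALIAS_RE = re.compile(
    r"(?:^|[\s;&|])[A-Za-z_][A-Za-z0-9_]*=(?:\S*/)?(?:%s)(?=$|[\s;&|])"
    % "|".join(re.escape(b) for b in SENSITIVE_BINARIES)
)


def skips_preview(command: str) -> bool:
    """Return True only for a safe-prefix command with no shell metacharacters."""
    stripped = command.strip()
    if not stripped:
        return False
    if any(ch in SHELL_METACHARS for ch in stripped):
        return False  # chaining, piping, redirection or expansion: keep the gate
    return stripped.split()[0] in SAFE_CMDS


def aliases_sensitive_binary(command: str) -> bool:
    """Return True if the command assigns a sensitive binary to a shell variable."""
    return ALIAS_RE.search(command) is not None
```

Under these assumptions, `skips_preview("echo hi")` is true while `skips_preview("echo hi; touch /tmp/owned")` is false, and `aliases_sensitive_binary` flags the `B=base64; ... | $B -d` form that a first-token check would miss.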

High infrastructure fixes

  • LS-2: Kokoro PM2 entry fixed (args: kokoro_server.py → args: -m mlx_audio.server --host 0.0.0.0 --port 8085). Fresh clones can now start the TTS service.
  • LS-6: pilot/ module vendored into the repo. PM2 pilot-runner cwd /Users/mickaelfarina/codec → __dirname. Includes 15 source modules + 18 test files + 12 design docs.
  • LS-7: Telegram log lines redact bot tokens via _sanitize_log(). Applied to all 5 log.error/log.warning sites.

Docs

  • LS-5: README.md sample config llm_url 8081 → 8083 (the old value would have silently broken every fresh install).
  • README badges: tests 1300+ → 2000+, lines 67K+ → 52K+.
  • FEATURES.md header v3.1 → v3.2. Counts updated to match repo state.
  • FEATURES.md:259 audit rotation: 50MB rotation → daily rotation, 30-day retention (resolves the self-contradiction with line 319).
  • PRIVACY.md Kokoro default port 8080 → 8085.
  • setup_codec.py banner v2.1.0 → v3.2.0; product/skill/crew counts refreshed.

Test plan

  • pytest tests/test_license_blockers.py — 43/43 new regression tests pass
  • pytest tests/test_dangerous_command.py — 77/77 PR-2G tests still pass (no regression on existing hardening)
  • pytest tests/test_destructive_consent.py — 21/21 pass (inverted test, added general-mode test)
  • pytest tests/test_a12_invariant.py — passes (pilot allowlisted)
  • pytest tests/ (full suite) — 2001 passed, 80 skipped, 0 failed in 76s
  • node -e "require('./ecosystem.config.js')" — config parses, kokoro args + pilot cwd verified correct
  • python3 -c "import pilot" — vendored module imports cleanly from repo root
  • Manual smoke test: pm2 restart codec-dashboard && pm2 restart kokoro-82m && pm2 restart pilot-runner (operator action — verify all services come up green after merge)

Report

Full findings: ~/codec-audit-reports/CODEC-STRESS-TEST-2026-05-28.md (420 lines, 7 phases, 49 tests, 33 PASS / 9 NEW BUG / 22 DRIFT).

🤖 Generated with Claude Code

Mikarina13 and others added 3 commits May 29, 2026 16:36
…rtbeat (Fix #9 Phase 2)

The dashboard / ask_user notifications writers hold codec_jsonstore.file_lock
across their read-modify-write (Fix #5 / B-11), but the codec-scheduler and
codec-heartbeat daemons wrote notifications.json directly (load→insert→json.dump)
with no lock — so a scheduled-task or heartbeat-alert notification could clobber
a concurrent dashboard write (and vice-versa). Both now hold
file_lock(notif_path) across the whole RMW + use atomic_write_json. Every
notifications.json writer is now serialized.
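
For readers unfamiliar with the pattern, here is a minimal sketch of a locked read-modify-write followed by an atomic replace. The names `file_lock` and `atomic_write_json` come from the commit message, but the bodies below are assumptions (a POSIX `fcntl.flock` sidecar lock and a temp-file rename), not the actual codec_jsonstore implementation. `add_notification` is a hypothetical caller.

```python
import fcntl
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def file_lock(path):
    """Hold an exclusive advisory lock on a sidecar '<path>.lock' file (POSIX only)."""
    with open(path + ".lock", "w") as lock_fh:
        fcntl.flock(lock_fh, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_fh, fcntl.LOCK_UN)


def atomic_write_json(path, data):
    """Write to a temp file in the same directory, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, path)  # readers see the old or the new file, never a partial one
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def add_notification(notif_path, notification):
    """Load, insert and write back while holding the lock for the whole cycle."""
    with file_lock(notif_path):
        try:
            with open(notif_path) as fh:
                items = json.load(fh)
        except (FileNotFoundError, json.JSONDecodeError):
            items = []
        items.insert(0, notification)
        atomic_write_json(notif_path, items)
```

The lock only serializes writers that all take it, which is why the commit moves the scheduler and heartbeat daemons onto the same lock the dashboard writers already hold.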

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…7) + SR-6 bonus

Closes the CRITICAL / HIGH findings from CODEC-STRESS-TEST-2026-05-28.md
that gated paid-license sale. Pytest: 1957 → 2001 (+44 net, 0 regressions).

CRITICAL security fixes
- LS-1: chat_consent_ok no longer auto-approves destructive skills when
  the user types "no" / "cancel" / "stop". In strict-consent mode
  (destructive_verb non-empty), _is_consenting_answer now rejects all
  free-text that's not an exact option-match or literal verb-match.
  Two rejections → existing ambiguous_consent timeout → TIMEOUT_SENTINEL
  → chat_consent_ok returns False → skill blocked. Affects every
  destructive skill on the chat path (terminal, file_write, file_ops,
  imessage_send, pilot, skill_forge, ax_control, pm2_control,
  process_manager). MCP + agent-runner paths were unaffected by the bug.
  codec_ask_user.py:_is_consenting_answer

- LS-3: Crew parallel mode now honors max_steps cap (was sequential-only).
  Before: Crew(mode="parallel", agents=[12]) spawned 12 unbounded
  coroutines. Now slices zip(agents, tasks)[:max_steps] in both modes.
  codec_agents.py:866

- LS-4: SAFE_CMDS exemption refuses safe-prefix commands containing shell
  metacharacters (`;` `&` `|` `<` `>` `$` backtick). `echo hi; touch X`,
  `cat README && nc`, `grep -r password ~/Documents` no longer skip the
  preview gate.
  codec_session.py:699

- SR-6 (bonus): is_dangerous() now catches env-aliased dangerous binaries
  like `B=base64; echo cm0gLXJmIC8= | $B -d`. PR-2G first-token extractor
  missed shell-var assignments to base64/sh/curl/python/etc. New Layer
  E-bis matches `VAR=<sensitive_binary>` regex. Fixes the one ~3% bypass
  the T1 obfuscation sweep found.
  codec_config.py:Layer E-bis
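
The LS-1 rule above can be sketched as follows. This is a simplified model, not the code in codec_ask_user.py: the function names, signatures, the single `confirm_option`, and the shape of `TIMEOUT_SENTINEL` are assumptions. It only illustrates the described contract: in strict mode an answer is accepted only on an exact option match or a literal verb match, and two rejections end in a denial.

```python
TIMEOUT_SENTINEL = object()  # hypothetical stand-in for the real sentinel value


def is_acceptable_answer(answer, options, destructive_verb):
    """Strict mode (non-empty verb): exact option match or literal verb match only."""
    text = answer.strip().lower()
    if not destructive_verb:
        # General (non-strict) mode: any non-empty free text counts as an answer.
        return bool(text)
    allowed = {opt.strip().lower() for opt in options}
    allowed.add(destructive_verb.strip().lower())
    return text in allowed


def collect_answer(answers, options, destructive_verb, max_rejections=2):
    """Return the first acceptable answer, or the sentinel after two rejections."""
    rejections = 0
    for answer in answers:
        if is_acceptable_answer(answer, options, destructive_verb):
            return answer
        rejections += 1
        if rejections >= max_rejections:
            break
    return TIMEOUT_SENTINEL


def consent_ok(answers, confirm_option, destructive_verb):
    """Approve only when an accepted answer is the confirm option or the verb."""
    result = collect_answer(answers, [confirm_option], destructive_verb)
    if result is TIMEOUT_SENTINEL:
        return False
    return result.strip().lower() in {confirm_option.lower(), destructive_verb.lower()}
```

In this model, `consent_ok(["no", "cancel"], "Confirm", "delete")` returns False because both replies are rejected and the sentinel is returned, while `consent_ok(["delete"], "Confirm", "delete")` returns True.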

HIGH infrastructure fixes
- LS-2: Kokoro PM2 entry args changed from a reference to the missing
  `kokoro_server.py` to `python3 -m mlx_audio.server --host 0.0.0.0 --port 8085`.
  Fresh clones can now actually start the TTS service.
  ecosystem.config.js:107

- LS-6: pilot/ module vendored into the repo from legacy /Users/.../codec.
  pilot-runner cwd changed from hardcoded /Users/mickaelfarina/codec
  (non-portable) to __dirname. Includes 15 source modules + 18 test
  files + 12 design docs.
  pilot/* (new) + ecosystem.config.js:189

- LS-7: Telegram log lines no longer leak bot tokens. New _sanitize_log()
  helper rewrites `bot<id>:<secret>` → `bot<REDACTED>`. Applied to all
  5 log.error/log.warning sites in codec_telegram.py that print raw
  Telegram API response dicts or exception strings.
  codec_telegram.py:_sanitize_log + 5 log sites

HIGH docs fixes (LS-5 + Tier 2)
- README.md sample config snippet llm_url 8081 → 8083 (would have
  silently broken every fresh install).
- README.md badge counts: tests 1300+ → 2000+, lines 67K+ → 52K+.
- FEATURES.md header v3.1 → v3.2, skills 75 → 76, tests 940+ → 2000+,
  lines 58K+ → 52K+. Same fixes on line 597 summary.
- FEATURES.md:259 audit rotation "50MB rotation" → "daily rotation,
  30-day retention" — was contradicting line 319 of the same doc.
- PRIVACY.md Kokoro default port 8080 → 8085 (matched code default).
- setup_codec.py banner v2.1.0 → v3.2.0; 7 products → 9, 50+ skills →
  76, 5 crews enumeration extended to all 12 in CREW_REGISTRY.

Tests
- 43 new regression tests in tests/test_license_blockers.py covering
  LS-1, LS-3, LS-4, LS-7, SR-6. Each parametrized to exercise both the
  vulnerable and the safe paths.
- Updated tests/test_destructive_consent.py: the old
  `test_consent_freetext_accepted_as_non_confirming` test pinned the
  broken behavior. Renamed + inverted to
  `test_consent_freetext_rejected_in_strict_mode`. Added
  `test_consent_freetext_accepted_in_general_mode` to pin the non-strict
  branch contract.
- Updated tests/test_a12_invariant.py: pilot/pilot_agent.py allowlisted
  (vendored CODEC Pilot module uses its own LLM client, not codec_llm).

Full pytest: 2001 passed, 80 skipped (env-dependent), 0 failed.
Total runtime: 76 seconds.

Report: ~/codec-audit-reports/CODEC-STRESS-TEST-2026-05-28.md

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…blockers

CI smoke ruff gate failed on PR #157 after the license-readiness commit.
Two unused imports (re — was used before parametrize simplification; Agent —
only Crew is needed in the parallel-cap test). No behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@AVADSA25 merged commit 8982e5c into main on May 30, 2026
1 check passed
