security(secrets): redact credentials before disk write + purge-history CLI (closes #3, v1.0.7) #8
Merged
…ry CLI (closes #3, v1.0.7)

The Apr 29 2026 audit's only release-blocking finding: pasted command output containing AWS keys / GCP service-account JSON / JWTs / bearer tokens lands unredacted in ~/.ghosthunter/audit.log and the memory palace, persisting indefinitely (and replicating via backups, Time Machine, cloud sync).

This PR closes that gap with three pieces:

1. New module `src/ghosthunter/security/secrets_redactor.py`
   - 9 credential pattern classes (the issue spec required >= 8): aws_access_key (AKIA*), aws_temp_access_key (ASIA*), github_token (gh[psru]_*), anthropic_key (sk-ant-*), openai_key (sk-* x48), jwt (eyJ*.*.*), bearer_token, auth_header (authorization/api-key/x-api-key in shell/JSON/YAML forms), gcp_private_key + pem_private_key.
   - Order matters: jwt fires before bearer_token so Bearer eyJ... gets the more specific label.
   - Each match is replaced with [REDACTED:<type>], preserving structure.
   - Returns RedactionResult with per-pattern hit counts.
   - redact_dict() helper recurses dicts/lists/tuples for the audit writer's nested entry shape; does NOT mutate inputs.

2. Wiring at every disk-write path
   - cli.py::_append_audit_log: runs redact_dict() over the entry before json.dump. Conclusion text from Opus may quote stdout snippets, so even a successful investigation can leak tokens without this pass.
   - memory/palace.py::Palace.remember: runs redact_secrets() over content before storage. Palace memories are long-lived; a leaked token persists across sessions.
   - chat_history is owned by prompt_toolkit and not on our write path; documented as a known limitation in SECURITY.md.

3. New `ghosthunter purge-history` CLI command
   - Wipes ~/.ghosthunter/chat_history, audit.log, palace/ after y/N confirmation. --yes / -y skips the prompt.
   - Configuration files (config.toml) are preserved.
   - Gives users who pasted sensitive output during v1.0.6 (no redaction available then) a clean migration path.

SECURITY.md item 2 ("What Ghosthunter does NOT protect against") updated with v1.0.7 mitigation notes: honest about the chat_history limitation and the best-effort nature of pattern-based redaction.

Verified
- pytest tests/test_secrets_redactor.py -> 29 passed (8 pattern classes covered + redact_dict + registry invariants + JWT-vs-bearer ordering + false-positive suite)
- pytest tests/ -> 1255 passed (zero regressions)
- ruff check + ruff format --check -> clean

Closes #3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
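To make the ordering rule in the commit message concrete, here is a minimal sketch of an ordered pattern registry with per-pattern hit counts. It is not the code from `secrets_redactor.py`: the regular expressions, the four-pattern subset, and the field names on `RedactionResult` are assumptions chosen for illustration; only the labels, the `[REDACTED:<type>]` format, and the jwt-before-bearer_token ordering come from the description above.

```python
import re
from dataclasses import dataclass, field

# Illustrative subset only. These regexes are assumptions, not the
# expressions used by the real module. List order is evaluation order:
# "jwt" sits before "bearer_token" so "Bearer eyJ..." is labelled as a JWT.
PATTERNS = [
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("github_token", re.compile(r"\bgh[psru]_[A-Za-z0-9]{36,}\b")),
    ("jwt", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
    ("bearer_token", re.compile(r"(?<=bearer )[A-Za-z0-9._~+/-]{16,}=*", re.IGNORECASE)),
]


@dataclass
class RedactionResult:
    # Field names are hypothetical; the source only says the result
    # carries per-pattern hit counts.
    text: str
    hits: dict[str, int] = field(default_factory=dict)


def redact_secrets(text: str) -> RedactionResult:
    """Apply each pattern in order, replacing matches with [REDACTED:<type>]."""
    hits: dict[str, int] = {}
    for name, pattern in PATTERNS:
        text, count = pattern.subn(f"[REDACTED:{name}]", text)
        if count:
            hits[name] = count
    return RedactionResult(text, hits)


if __name__ == "__main__":
    sample = "Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0.c2lnbmF0dXJl"
    print(redact_secrets(sample).text)
```

Because the JWT has already been rewritten to `[REDACTED:jwt]` by the time `bearer_token` runs, and `[` is outside the bearer character class, the generic pattern does not relabel it. Reversing the two entries would tag the same input as `bearer_token` instead.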
Closes #3.
The audit's only release-blocking finding. Three pieces:
1. `security/secrets_redactor.py`: 9 pattern classes (AWS keys, GitHub tokens, Anthropic/OpenAI keys, JWTs, bearer tokens, auth headers, PEM/GCP private keys). Replace matches with `[REDACTED:<type>]`. JWT fires before bearer_token so `Bearer eyJ...` gets the specific label. `redact_dict()` recurses for nested entries.
2. `_append_audit_log` redacts before `json.dump`. `palace.remember()` redacts content before storage. chat_history is prompt_toolkit-owned (out of our write path), documented as a known limitation.
3. `ghosthunter purge-history` CLI: wipes chat_history + audit.log + palace/ with y/N confirmation. `--yes` to skip. Migration path for users who pasted sensitive output under v1.0.6.

SECURITY.md item 2 updated with honest scope notes.
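The audit-log wiring can be sketched as a non-mutating recursive walk followed by the JSON write. This is an illustration under assumptions, not the PR's code: `redact_text` is a one-pattern stand-in for the real redactor, `append_audit_entry` is a hypothetical name for the write step, and leaving dictionary keys untouched is a choice made here that the source does not specify.

```python
import io
import json
import re

# Hypothetical stand-in for the real redactor: a single AWS-style pattern.
_AKIA = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def redact_text(text: str) -> str:
    return _AKIA.sub("[REDACTED:aws_access_key]", text)


def redact_dict(value):
    """Return a redacted copy of a nested structure; the input is never modified."""
    if isinstance(value, str):
        return redact_text(value)
    if isinstance(value, dict):
        # Assumption: only values are redacted, keys are copied as-is.
        return {key: redact_dict(item) for key, item in value.items()}
    if isinstance(value, list):
        return [redact_dict(item) for item in value]
    if isinstance(value, tuple):
        return tuple(redact_dict(item) for item in value)
    return value  # numbers, booleans, None pass through unchanged


def append_audit_entry(stream, entry: dict) -> None:
    """Redact first, then serialise one JSON object per line."""
    json.dump(redact_dict(entry), stream)
    stream.write("\n")


if __name__ == "__main__":
    entry = {"conclusion": "stdout contained AKIAIOSFODNN7EXAMPLE", "steps": [{"rc": 0}]}
    buffer = io.StringIO()
    append_audit_entry(buffer, entry)
    print(buffer.getvalue(), end="")
```

Building new containers rather than editing in place means the caller can keep using the original entry (for display, say) while only the redacted copy reaches disk.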
Verification:
- `pytest tests/test_secrets_redactor.py` → 29 passed
- `pytest tests/` → 1255 passed (zero regressions)

🤖 Generated with Claude Code
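For reference, the `purge-history` behaviour described in this PR (confirm, delete three targets, leave `config.toml` alone) can be approximated with the standard library. This is a hedged sketch, not the shipped command: the function name, its parameters, and the injectable `ask` callback are hypothetical, and the real CLI's framework and prompt wording are not shown in the source.

```python
import shutil
from pathlib import Path

# Names taken from the PR description; everything else in the directory
# (for example config.toml) is left untouched.
TARGETS = ("chat_history", "audit.log", "palace")


def purge_history(home: Path, assume_yes: bool = False, ask=input) -> list[str]:
    """Delete the history targets under `home` and return the names removed.

    `assume_yes` plays the role of --yes / -y. `ask` is injectable so the
    prompt can be replaced in tests (an assumption of this sketch).
    """
    if not assume_yes:
        answer = ask(f"Delete chat_history, audit.log and palace/ under {home}? [y/N] ")
        if answer.strip().lower() not in ("y", "yes"):
            return []  # default is "no": nothing is touched

    removed = []
    for name in TARGETS:
        path = home / name
        if path.is_dir() and not path.is_symlink():
            shutil.rmtree(path)
        elif path.exists() or path.is_symlink():
            path.unlink()
        else:
            continue  # target absent, nothing to do
        removed.append(name)
    return removed


if __name__ == "__main__":
    print(purge_history(Path.home() / ".ghosthunter"))
```

Deleting the files removes only the local copies; anything already replicated to backups or cloud sync, as the commit message notes, is outside the reach of a command like this.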