feat(cli): soul diff enhancements — --require-same-did, --fail-on, diff-driver-install (#217)#243
Open
sshekhar563 wants to merge 11 commits into
…st-0.4.0 chore: sync dev with main post-0.4.0
Three new commands wrap the org-level SQLite WAL journal as a CLI for shell hooks, CI scripts, and non-Python runtimes:

- soul journal init <path>: bootstrap a standalone journal file (no root soul, no scope tree, no founder).
- soul journal append <path>: write one event from flags, or batch JSONL events from stdin. Echoes the committed EventEntry to stdout with backend-assigned seq + prev_hash so callers can chain causation ids.
- soul journal query <path>: filter by --action / --action-prefix (trailing-dot tolerated), --scope, --correlation-id, --since/--until, plus --at <iso> for point-in-time replay. Rich table by default; --json emits a parseable JSON array.

Foundation for replacing memory-heavy soul-sync.sh hooks with structured, queryable journal events.

Closes qbtrix#189
Read-only command compares two soul files at the soul level, not the byte level.

Sections covered: identity, OCEAN/DNA, state, core memory, memories per layer + per domain, bond (default + per-user), skills, trust chain, self-model, evolution.

Memory diff strategy: by id. Added = in right not left, removed = in left not right, modified = same id different fields. Superseded memories are filtered from the modified list by default since they still live in the file; --include-superseded surfaces the chain explicitly.

Output formats: text (Rich panel, sections omitted when empty), --format json (full SoulDiff Pydantic dump for tooling), --format markdown (paste-ready table for PR bodies).

Other flags: --section narrows to one section (with hyphen/underscore aliases), --summary-only collapses to per-section counts.

Schema mismatch raises SchemaMismatchError → exit 1 with a clean message pointing at `soul migrate`.

New public Python API at soul_protocol.runtime: diff_souls, SoulDiff, SchemaMismatchError. The SoulDiff model is fully Pydantic-roundtripable so PR review bots and CI checks can consume it.

Closes qbtrix#191
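The by-id strategy described in this commit message can be sketched roughly as follows. This is an illustration only, not the code behind `diff_souls`: memories are modelled as plain dicts, and the `superseded_by` field used to detect superseded memories is a hypothetical name chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MemoryDiff:
    added: list[str] = field(default_factory=list)
    removed: list[str] = field(default_factory=list)
    modified: list[str] = field(default_factory=list)


def diff_memories(left, right, include_superseded=False):
    """Compare two lists of memory dicts, matching them on their "id" key."""
    left_by_id = {m["id"]: m for m in left}
    right_by_id = {m["id"]: m for m in right}

    result = MemoryDiff()
    # Added: present on the right only. Removed: present on the left only.
    result.added = sorted(right_by_id.keys() - left_by_id.keys())
    result.removed = sorted(left_by_id.keys() - right_by_id.keys())

    # Modified: same id on both sides, but at least one field differs.
    for mem_id in sorted(left_by_id.keys() & right_by_id.keys()):
        before, after = left_by_id[mem_id], right_by_id[mem_id]
        if before == after:
            continue
        # Assumption: a superseded memory carries a "superseded_by" pointer.
        if after.get("superseded_by") and not include_superseded:
            continue
        result.modified.append(mem_id)
    return result
```

Keying both sides by id keeps the comparison independent of ordering inside the file, which is what distinguishes a soul-level diff from a byte-level one.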
…rediction-error gating (qbtrix#192) (qbtrix#209)

Drafts the v0.5.0 RFC for six brain-aligned memory operations: confirm, update, supersede (extends 0.4.0), forget (semantics shift to weight decay), purge (new hard-delete with backup), and reinstate. Adds the schema additions (retrieval_weight, supersedes back-edge, prediction_error, revisions), recall changes (weight filter + provenance), trust chain hooks, CLI/MCP surface, a 2-3 hour spike scope for daily-use validation before the production build, the open questions for captain review, the 0.4 → 0.5 migration walkthrough, the SPEC.md follow-up stanzas, and the cog-sci references.
…rix#160) (qbtrix#211)

Add a soul-aware eval framework: a YAML-driven format and runner that seeds the soul with explicit state (memories, OCEAN, bonds, mood, energy) before each case runs, so behaviour can be measured against a known starting point rather than being treated as a stateless function.

New modules
- src/soul_protocol/eval/{schema,runner,scoring}.py: Pydantic schema with five scoring kinds (keyword, regex, semantic, judge, structural), the run_eval orchestrator, plus run_eval_against_soul for the MCP variant that runs against the live soul without re-birthing.
- src/soul_protocol/cli/eval_cmd.py: `soul eval` command. Runs one spec or every .yaml under a directory; --json, --filter, --judge-engine, --verbose options. Exit 0 on all-pass (skips OK), 1 on any failure or spec error.
- soul_eval MCP tool (src/soul_protocol/mcp/server.py): runs a YAML spec against the active soul. seed block ignored; live state is the seed. Accepts yaml_path or yaml_string.

Shipped examples (tests/eval_examples/)
- personality_expression.yaml: high-openness OCEAN seed surfaces creative memories
- memory_recall_filtering.yaml: multi-user attribution prevents cross-user bleed
- domain_isolation.yaml: domain-scoped recall stays inside its domain (qbtrix#41)
- bond_strength_effect.yaml: bonded-visibility memories gate on bond_threshold
- trust_chain_provenance.yaml: observe → recall side-effects flow through

Tests: 66 new under tests/test_eval/ (schema, runner, examples, cli, mcp). Wired into pytest as smoke tests so the example specs never drift. Total: 2537 → 2603.

Docs: full schema reference at docs/eval-format.md; soul eval section in cli-reference.md; soul_eval section in mcp-server.md; Evaluation section in api-reference.md; Unreleased note in CHANGELOG.md.
…oad typing + key rotation tests (qbtrix#210)

Closes qbtrix#199, qbtrix#200, qbtrix#205, qbtrix#204.

* qbtrix#199: verify_chain rejects entries whose timestamp predates the previous entry's timestamp by more than 60s (skew tolerance), closing a backdating gap at the chain head.
* qbtrix#200: _canonical_json no longer silently stringifies non-JSON-native types via default=str. A strict default raises TypeError with an actionable message so hash-determinism cannot drift across Python versions.
* qbtrix#205: compute_payload_hash refuses BaseModel inputs at the public entry point. Callers must pre-serialize via model_dump(mode='json') so a BaseModel and a dict cannot accidentally produce different hashes for the same logical payload.
* qbtrix#204: Keystore gains previous_public_keys allow-list, persisted as keys/previous.keys (newline-separated base64). Soul.verify_chain accepts entries whose public_key matches either the current key or any in the allow-list, enabling key rotation. Default empty list preserves the v0.4.0 strict-current-key behavior.

Existing chain-append payloads in runtime/soul.py and runtime/bond.py were audited under the strict canonical JSON rule; they were already JSON-native dicts so they keep hashing cleanly.

Tests: 34 new test cases in test_verification_hardening.py and test_key_rotation.py covering monotonicity, strict JSON refusal, BaseModel guard, mixed-signer chains, allow-list mechanics, and keystore round-trips through directory + archive layouts.

Docs: CHANGELOG Unreleased section, docs/trust-chain.md threat model + key management + on-disk layout, docs/SPEC.md §10A.6 Verification contract + §10A.7 Identity binding.
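To make the first two hardening items concrete, here is a minimal sketch of a strict canonical-JSON encoder and a timestamp check with a 60-second skew tolerance. The function names, the error wording, and the exact `json.dumps` options are assumptions made for illustration; the project's `_canonical_json` and `verify_chain` may differ in detail.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: the 60s tolerance from the commit message, as a constant.
MAX_BACKWARD_SKEW = timedelta(seconds=60)


def _strict_default(obj):
    # Called by json.dumps only for values it cannot encode natively.
    raise TypeError(
        f"{type(obj).__name__} is not JSON-native; "
        "convert it explicitly before hashing"
    )


def canonical_json(payload) -> bytes:
    """Deterministic encoding: sorted keys, no whitespace, no silent str()."""
    text = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), default=_strict_default
    )
    return text.encode("utf-8")


def timestamp_ok(prev_ts: datetime, ts: datetime) -> bool:
    """Reject an entry dated more than 60s before its predecessor."""
    return ts >= prev_ts - MAX_BACKWARD_SKEW
```

With `default=str`, a `datetime` or `Path` in a payload would hash according to whatever its `str()` happens to produce; raising instead forces the caller to choose a representation.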
…ilure logging (qbtrix#201, qbtrix#202) (qbtrix#213)

- TrustEntry gains a non-cryptographic ``summary`` field excluded from the canonical bytes used for ``compute_entry_hash`` and signing.
- TrustChainManager.append accepts an optional ``summary=`` parameter; when omitted, an action-keyed default formatter registry covers the actions Soul emits (memory.write, memory.forget, memory.supersede, bond.strengthen, bond.weaken, evolution.proposed/applied, learning.event).
- ``Soul.audit_log()`` rows include ``summary``. ``soul audit`` Rich table adds a Summary column; ``--no-summary`` restores the 0.4.0 hash-only view. JSON output always carries summary.
- ``Soul._safe_append_chain`` splits the log path: verification-only (no public key, _PublicOnlyProvider) stays at DEBUG; an unexpected exception during ``append`` now logs at WARNING under the ``runtime.chain_append_skipped`` event with action, error type, error message, and soul name. BondRegistry's on_change callback failure path follows the same pattern under ``runtime.bond_callback_failed``.
- evolution.applied and learning.event Soul callsites pass an explicit summary because their on-chain payloads don't carry the keys the registry default expects.
- Tests: 39 new (registry coverage, manager-level summary behaviour, cryptographic-exclusion guarantee, back-compat read of pre-qbtrix#201 chains, Soul-level integration, read-only soul DEBUG-only behaviour, unexpected-exception WARNING shape, BondRegistry callback failure, long-horizon 50+ ops with flaky provider).
- Docs: trust-chain.md (summary excluded from signed bytes), cli-reference.md (Summary column + --no-summary flag), api-reference.md (TrustEntry.summary + TrustChainManager.append signature), SPEC.md §10A.1 (summary as non-cryptographic field + §10A.2 canonical encoding clarification), CHANGELOG.md Unreleased.
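The cryptographic-exclusion guarantee for ``summary`` can be illustrated with a small sketch. It assumes entries are plain dicts and that the hash is SHA-256 over sorted-key JSON; neither detail is confirmed by the commit message, and the real ``compute_entry_hash`` may canonicalize differently.

```python
import hashlib
import json

# Assumption: "summary" is the only field left out of the signed bytes.
NON_CRYPTOGRAPHIC_FIELDS = {"summary"}


def entry_hash(entry: dict) -> str:
    """Hash an entry over every field except the human-readable summary."""
    signed_fields = {
        key: value
        for key, value in entry.items()
        if key not in NON_CRYPTOGRAPHIC_FIELDS
    }
    canonical = json.dumps(
        signed_fields, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

Because the summary never reaches the hashed bytes, an entry written without one hashes the same as it did before the field existed, which is consistent with the back-compat read of older chains mentioned in the tests.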
…) (qbtrix#214)

Adds a configurable cap on trust-chain length, enforced at append time. When the cap is reached, every non-genesis entry is compressed into a single signed `chain.pruned` marker and the chain resumes growing from there. Genesis (seq=0) is always preserved.

The verifier gains exactly one carve-out from strict seq monotonicity: entries with action == `chain.pruned` may have a seq strictly greater than `prev.seq + 1`. Every other action remains strictly monotonic, so a tampered chain that injects a forged seq gap still fails verification.

Surfaces:
- Biorhythms.trust_chain_max_entries: int = 0 (unbounded by default)
- TrustChainManager.prune(keep, *, reason) and dry_run_prune(keep)
- soul prune-chain CLI (dry-run by default; --apply to mutate)
- soul_prune_chain MCP tool (apply=False by default)

Spec extension lands as SPEC.md §10A.10 (optional pruning extension). The full archival design (a separate trust_chain/archive/ directory with checkpoint entries) is deferred to v0.5.x. This release is the touch-time stub.
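The seq carve-out for `chain.pruned` might look like the sketch below. It models the chain as a list of `(seq, action)` tuples and checks only sequence numbers; the helper names and the `"genesis"` action label are hypothetical, and the actual verifier also checks hashes and signatures.

```python
PRUNE_ACTION = "chain.pruned"


def seq_step_ok(prev_seq: int, seq: int, action: str) -> bool:
    """Check one step of the chain for sequence-number continuity."""
    if action == PRUNE_ACTION:
        # The only carve-out: a prune marker may jump past prev_seq + 1.
        return seq >= prev_seq + 1
    # Every other action must follow its predecessor exactly.
    return seq == prev_seq + 1


def seqs_valid(entries) -> bool:
    """entries: list of (seq, action) tuples, genesis first with seq 0."""
    if not entries or entries[0][0] != 0:
        return False
    return all(
        seq_step_ok(prev[0], cur[0], cur[1])
        for prev, cur in zip(entries, entries[1:])
    )
```

Restricting the gap to a single action keeps the tamper check intact: a forged gap under any other action still fails.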
…ff-driver-install (qbtrix#217)

Implements items 1, 3, and 4 from issue qbtrix#217 as a single feat(cli) PR:

1. soul diff-driver-install (trivial):
   - Configures .gitattributes (*.soul diff=soul) and git config (diff.soul.command) so git diff/log -p/PR tools show soul-level diffs natively.
   - Supports --global (user-level) and --local (repo-level, default).
   - Idempotent: re-running doesn't duplicate the .gitattributes line.

3. --require-same-did foot-gun guard (trivial):
   - Exits non-zero when left and right souls have different DIDs.
   - Prevents accidental diffs between unrelated souls.
   - Override with --allow-cross-did for intentional cross-DID comparison.

4. --fail-on <category> CI guard (small):
   - Exits non-zero when the named change category has count > 0.
   - Repeatable: --fail-on memory.added --fail-on identity.
   - 15 categories mapped from SoulDiff.summary() keys.
   - Unknown categories fail fast (exit 2) with a list of known names.
   - Diff output still renders before failing, so CI logs stay useful.

Tests: 11 new tests covering all three features (27 total, all passing).
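The idempotency property claimed for diff-driver-install can be sketched as an append-if-missing helper. The function name and return convention are hypothetical; only the `*.soul diff=soul` line comes from the commit message, and the git config half of the command is not shown.

```python
from pathlib import Path

ATTR_LINE = "*.soul diff=soul"


def ensure_gitattributes_line(path: Path, line: str = ATTR_LINE) -> bool:
    """Append `line` to a .gitattributes file unless it is already there.

    Returns True when the file was modified, False when nothing changed.
    """
    if path.exists():
        existing = path.read_text(encoding="utf-8").splitlines()
    else:
        existing = []
    # Compare stripped lines so trailing whitespace does not cause a duplicate.
    if any(current.strip() == line for current in existing):
        return False
    existing.append(line)
    path.write_text("\n".join(existing) + "\n", encoding="utf-8")
    return True
```

Checking before writing is what makes a second run a no-op, so the command is safe to call from setup scripts.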
This PR has been automatically marked as stale because it has not had activity in the last 14 days. It will be closed in 7 days if no further activity occurs. If you're still working on this, please push an update or leave a comment.
Summary
Implements items 1, 3, and 4 from #217 as a single feat(cli) PR, as suggested in the issue description.

1. soul diff-driver-install (trivial)
Configures .gitattributes (*.soul diff=soul) and git config (diff.soul.command) so git diff, git log -p, and PR tools show soul-level diffs natively.
- Supports --global (user-level) and --local (repo-level, default)
- Idempotent: re-running doesn't duplicate the .gitattributes line

3. --require-same-did foot-gun guard (trivial)
Exits non-zero when left and right souls have different DIDs. Prevents accidental diffs between unrelated souls.
- Override with --allow-cross-did for intentional cross-DID comparison

4. --fail-on <category> CI guard (small)
Exits non-zero when the named change category has count > 0 in the diff summary.
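As a rough sketch of the --fail-on decision, the function below maps a diff summary and the requested categories to an exit code. The function name is hypothetical, the known-category set lists only the two names that appear in the commit message (out of the 15 it mentions), and exit code 1 for a triggered guard is an assumption; the source only says "non-zero" for that case and 2 for unknown categories.

```python
# Assumption: only the two category names quoted in the commit message.
KNOWN_CATEGORIES = frozenset({"memory.added", "identity"})


def fail_on_exit_code(summary, fail_on, known=KNOWN_CATEGORIES) -> int:
    """Return the process exit code for a set of --fail-on categories.

    summary: mapping of category name to change count.
    fail_on: category names passed via repeated --fail-on flags.
    """
    unknown = [name for name in fail_on if name not in known]
    if unknown:
        # Fail fast on typos instead of silently passing the CI check.
        return 2
    if any(summary.get(name, 0) > 0 for name in fail_on):
        return 1  # assumed value; the source specifies only "non-zero"
    return 0
```

A CI job would render the diff first and then exit with this code, so the log still shows what changed when the guard trips.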