feat(distill): distill environment-level memories into a shared catalog #3
Merged
Add a "distill" capability that lifts transferable lessons (shell/OS quirks, CLI gotchas, toolchain, user identity) out of a single project's Claude Code memories and makes them reusable across projects.

The work is split between judgment and mechanics:

- internal/distill (Go, mechanical): regenerates the DISTILLED.md catalog index from per-lesson <slug>.md entry files, prunes stale entries whose source lost the marker or vanished, and reports a worklist of marked-but-not-yet-distilled memories plus cross-project conflicts. Ships a tolerant frontmatter parser (handles both flat and nested metadata schemas) with no new dependency. Conservative: never prunes when the projects tree is invisible; ignores MEMORY.md and *.tmp.* litter.
- claude-memsync distill (CLI): --prune and --dry-run; loads config with a pre-init defaults fallback.
- daemon: rebuilds the index locally after every sync. DISTILLED.md is a derived artifact (git-ignored, regenerated per-PC) so the generated table never causes merge conflicts; entry files sync via the existing git add -A.
- skills/distill + skills/distill-apply: /distill is the classifier of record (classify, generalize, write entries, tag originals scope:environment); /distill-apply seeds chosen entries into the current project's memory.
- init prints a one-time permission allow-rule for ~/.claudesync/distilled/ (non-destructive: it does not edit the user's global settings.json).

Catalog lives at ~/.claudesync/distilled/, inside the sync work-tree, so entries propagate across workstations with no extra transport.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
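The conservative pruning rule described in this commit can be pictured as a small decision function. The following is only a sketch under assumptions: `source`, `shouldPrune`, and the `projectsVisible` flag are hypothetical names chosen for illustration and are not taken from internal/distill.

```go
package main

import "fmt"

// source is a hypothetical summary of what is known about the memory
// a catalog entry was distilled from.
type source struct {
	exists bool // the source memory file is still on disk
	marked bool // it still carries the environment-scope marker
}

// shouldPrune reports whether a catalog entry counts as stale.
// When the projects tree cannot be seen at all, nothing is pruned,
// because "missing" cannot be told apart from "unreadable".
func shouldPrune(projectsVisible bool, src source) bool {
	if !projectsVisible {
		return false
	}
	return !src.exists || !src.marked
}

func main() {
	fmt.Println(shouldPrune(true, source{exists: true, marked: true}))  // false
	fmt.Println(shouldPrune(true, source{exists: true, marked: false})) // true
	fmt.Println(shouldPrune(true, source{}))                            // true
	fmt.Println(shouldPrune(false, source{}))                           // false
}
```

Checking visibility first means a missing or unmounted projects directory can never be mistaken for "every source vanished".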
Step-by-step walkthrough: setup, distilling out of a project, applying into another, keeping the catalog fresh, troubleshooting, and the mental model. Linked from the README. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move the full claude-memsync manual (prerequisites, setup, lifecycle, how it works, deletes, on-disk layout, auth, limitations) into docs/claude-memsync.md, matching the style of the distilling guide. Trim the README to a capabilities overview, a quick start, and links to both docs; keep Project layout, Releasing, and License. Repoint the distilling guide's setup cross-reference at the new claude-memsync doc. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Deterministic LF in the repo on every platform (Go/shell/markdown tolerate LF on Windows), with CRLF reserved for Windows launchers and a binary guardrail. Stops the LF→CRLF checkout warnings on Windows. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…talog dir

The skills assumed claude-memsync was on PATH and that ~/.claudesync/distilled/ already existed; when neither held, the agent probed $HOME to orient and tripped needless permission prompts.

Now both skills:

- work only with the two known paths (project memory dir + distilled catalog) and explicitly do not probe $HOME or the .claudesync parent
- create the catalog dir directly instead of test-and-search
- treat `claude-memsync distill` as best-effort: skip with a note if the binary isn't on PATH (the daemon or a later manual run rebuilds the index)
- (distill-apply) read the entry files directly as the source of truth rather than depending on DISTILLED.md existing

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Explain that /distill calls `claude-memsync distill` to rebuild the index and degrades gracefully if the binary isn't found, and show the forwarding-shim trick for running out of a dev checkout without a stale duplicate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The /distill skill normalizes catalog slugs (kebab-case), so a catalog entry's name deliberately differs from its source memory's human-readable name. The pending-worklist check matched on name, which mis-reported every renamed entry as "pending". Match on provenance (originProject/originFile) instead, the same key Reconcile already uses. Add a regression test where the source name and catalog slug differ.

Also harden the skill against the two issues this surfaced:

- require `name` to be a kebab-case slug equal to the filename (don't carry over the source's sentence-style name)
- add a "cross-stack test" so narrow library/framework/version references stay project-scoped rather than polluting unrelated projects

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
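To make the name-versus-provenance distinction concrete, here is a minimal sketch of a pending check keyed on provenance. The `provenance` and `memory` types and the `pending` function are hypothetical stand-ins; only the originProject/originFile key comes from the commit message.

```go
package main

import "fmt"

// provenance is the (originProject, originFile) pair the commit uses as the match key.
type provenance struct {
	originProject string
	originFile    string
}

// memory is a hypothetical, simplified record for both source memories
// and catalog entries.
type memory struct {
	name string
	prov provenance
}

// pending returns the marked source memories that have no catalog entry
// with the same provenance. Names are deliberately ignored, because the
// catalog slug is normalized and rarely equals the source name.
func pending(sources, catalog []memory) []memory {
	seen := make(map[provenance]bool, len(catalog))
	for _, e := range catalog {
		seen[e.prov] = true
	}
	var out []memory
	for _, s := range sources {
		if !seen[s.prov] {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	p := provenance{originProject: "proj-a", originFile: "git_bash.md"}
	src := memory{name: "Git Bash mangles backslashes", prov: p}
	entry := memory{name: "git-bash-backslashes", prov: p}
	// Names differ, provenance matches, so nothing is pending.
	fmt.Println(len(pending([]memory{src}, []memory{entry})))
}
```

A name-based comparison would have reported `src` as pending here, which is the mis-report the commit describes.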
git invokes the merge driver through its bundled sh, which treats backslashes as escape characters. A Windows driver path like S:\src\rdl\claude-utils\bin\claude-memmerge.exe was mangled to "S:srcrdlclaude-utilsbinclaude-memmerge.exe: command not found", silently disabling the MEMORY.md union merge — so every concurrent MEMORY.md edit conflicted, breaking the daemon's rebases and stranding it on a detached HEAD. filepath.ToSlash keeps the path intact through sh (quoted or not). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
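The path rewrite can be illustrated in isolation. This sketch uses `strings.ReplaceAll` so that it behaves identically on every OS; the commit itself uses `filepath.ToSlash`, which replaces the platform separator and therefore only rewrites backslashes when built for Windows. `shellSafePath` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// shellSafePath converts backslashes to forward slashes so that a POSIX-style
// sh does not treat them as escape characters. It is a platform-independent
// stand-in for filepath.ToSlash, which performs this replacement on Windows.
func shellSafePath(p string) string {
	return strings.ReplaceAll(p, `\`, "/")
}

func main() {
	driver := `S:\src\rdl\claude-utils\bin\claude-memmerge.exe`
	fmt.Println(shellSafePath(driver))
	// Prints: S:/src/rdl/claude-utils/bin/claude-memmerge.exe
}
```

Windows accepts forward slashes in executable paths, which is why the converted form still resolves to the same binary.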
The daemon's filter only skipped names ending in ".tmp", but interrupted memory writes leave "<name>.md.tmp.<pid>.<hash>" files that don't match that suffix — so 40+ of them synced into the repo as litter. Ignore any name containing ".tmp." and add the pattern to the init .gitignore template. Adds a unit test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
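The difference between the old and new filter is easiest to see side by side. In this sketch `isTempLitter` is a hypothetical function name, the sample file names are made up, and keeping the original ".tmp" suffix check alongside the new substring check is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// isTempLitter reports whether a file name looks like a leftover temp file.
// The suffix check mirrors the old filter; the substring check catches
// names of the form "<name>.md.tmp.<pid>.<hash>".
func isTempLitter(name string) bool {
	return strings.HasSuffix(name, ".tmp") || strings.Contains(name, ".tmp.")
}

func main() {
	for _, n := range []string{"note.md", "note.md.tmp", "note.md.tmp.1234.ab12cd"} {
		fmt.Printf("%s skip=%v\n", n, isTempLitter(n))
	}
}
```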
Summary
Adds a distill capability to claude-utils: it lifts transferable lessons (shell/OS quirks, CLI gotchas, toolchain, user identity) out of a single project's Claude Code memories and makes them reusable across projects, so Claude doesn't have to re-learn the same environment lessons in every repo.

The design splits judgment from mechanics.
What's included
- `internal/distill/` (mechanical, Go): `BuildIndex` regenerates the `DISTILLED.md` catalog index from per-lesson `<slug>.md` entry files; `Reconcile` (`--prune`) drops entries whose source memory lost the marker or vanished; `Preview` backs `--dry-run`; `analyzeSources` surfaces the worklist of marked-but-not-yet-distilled memories and cross-project conflicts. Includes a tolerant frontmatter parser (handles both the flat and the nested-`metadata` schemas) with no new dependency. Conservative by design: never prunes when the projects tree is invisible; ignores `MEMORY.md` and the `*.tmp.*` litter.
- `claude-memsync distill` CLI: `--prune` and `--dry-run`; loads config with a pre-`init` defaults fallback.
- `DISTILLED.md` is a derived artifact (git-ignored and regenerated per-PC), so the generated table never causes merge conflicts, while the entry files sync via the existing `git add -A`.
- `skills/distill`, `skills/distill-apply`: `/distill` is the classifier of record (classify → generalize → write entries → tag originals `scope: environment`); `/distill-apply` seeds chosen entries into the current project's memory.
- `init` + README: `init` prints a one-time permission allow-rule for `~/.claudesync/distilled/` (it does not edit the user's global `settings.json`); README documents the flow, skill install, and permissions.

Design notes
- … `scope:` marker on its own; rather than depend on that, `/distill` makes and records the classification, and the marker is the cached result of its judgment. The Go daemon only mechanically aggregates/indexes what the skill produced.
- `DISTILLED.md` is derived, not synced. Entry files are the synced source of truth; the index is regenerated locally to avoid merge conflicts on a generated table.

Testing
- `go build ./...`, `go vet ./...` clean.
- `internal/distill` has 7 tests (frontmatter parsing both schemas, sorted index, catalog + source conflicts, pending worklist, pruning, never-prune-blind guard), all passing.
- … `claude-memsync distill` binary against a temp catalog: indexes entries, detects pending, and generates a clean `DISTILLED.md`.

Follow-ups (not in this PR)
- … `init` write the permission allow-rule automatically (behind an opt-in flag, with a careful JSON merge) instead of only printing it.
- `*.tmp.*` files in memory dirs look like leftover merge-driver temp files; worth a separate cleanup fix in the sync path.

🤖 Generated with Claude Code