fix: remove introspection framework #51
Open
MichielDean wants to merge 2 commits into
Remove the entire introspection/self_assessment system:
- Deleted `internal/introspect/` and `internal/skillpatch/` packages
- Removed `introspect`, `learn`, `track-review` CLI commands
- Removed `self_assessment` memory type from store, retriever, migrations
- Stripped dream REM behavioral insight generation (themes only now)
- Removed `SkillPatchDreamHook`, `ReviewOutcomeTracker`, `TestOutcomeTracker`
- Removed `proposed-changes.md` support, skill patch validation
- Cleaned all documentation, skills, and plugin references

The introspection loop never closed: patterns were detected but never changed behavior. Without calibration data and a promotion pipeline, the system was sophisticated self-documentation, not self-improvement.
Pull request overview
This PR removes LLMem’s introspection/self-assessment and skill-patching subsystems end-to-end (Go packages, CLI commands, migrations/default types, plugins, skills, and docs), and simplifies the dream REM phase to theme-only output.
Changes:
- Removed `internal/introspect` + `internal/skillpatch` packages and the `introspect`, `learn`, `track-review` CLI commands.
- Dropped the `self_assessment` memory type from defaults/migrations/reranking docs and updated plugins/skills accordingly.
- Simplified REM dreaming to theme extraction only and removed behavioral insight / calibration / patch-validation plumbing.
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| skills/llmem/SKILL.md | Removes self_assessment type docs and introspection/learn/track-review command references; updates hook + REM descriptions. |
| skills/llmem-setup/SKILL.md | Updates setup docs to reflect fewer shipped skills and simplified hooks/injection guidance. |
| skills/introspection/SKILL.md | Deletes the introspection skill documentation. |
| skills/introspection-review-tracker/SKILL.md | Deletes the review-tracker skill documentation. |
| README.md | Removes introspection skills from the skills list and updates CLI/examples accordingly. |
| plugins/opencode/llmem.js | Removes behavioral/proposed search injection from session.created. |
| plugins/agent/skills/llmem/SKILL.md | Mirrors skills/llmem removals inside the agent plugin bundle. |
| plugins/agent/skills/llmem-setup/SKILL.md | Mirrors setup guidance changes inside the agent plugin bundle. |
| plugins/agent/skills/introspection/SKILL.md | Deletes bundled introspection skill. |
| plugins/agent/skills/introspection-review-tracker/SKILL.md | Deletes bundled review-tracker skill. |
| plugins/agent/hooks/hooks.json | Removes behavioral self-assessment search from SessionStart hook command. |
| migrations/003_register_default_types.sql | Removes self_assessment from default type registration and down-migration delete list. |
| internal/taxonomy/taxonomy.go | Removes introspection parsing/field helpers; keeps error taxonomy constants. |
| internal/taxonomy/taxonomy_test.go | Removes tests tied to removed taxonomy parsing/field helpers; adds local parsing tests. |
| internal/store/store_test.go | Updates expected default registered types list (drops self_assessment). |
| internal/store/models.go | Drops self_assessment from DefaultRegisteredTypes(). |
| internal/store/migration_test.go | Updates expected memory_types count after migrations (8 → 7). |
| internal/skillpatch/skillpatch.go | Deletes the skill patching implementation. |
| internal/skillpatch/skillpatch_test.go | Deletes skillpatch tests. |
| internal/retriever/retriever.go | Removes self_assessment from default type-priority weights. |
| internal/ollama/ollama.go | Updates package comment to remove introspection mention. |
| internal/introspect/introspect.go | Deletes introspection implementation. |
| internal/introspect/introspect_test.go | Deletes introspection tests. |
| internal/extract/extract.go | Removes self-assessment category prompting and self_assessment extraction support. |
| internal/dream/dream.go | Removes behavioral-insight generation, ollama wiring, and patch validation; REM becomes themes-only. |
| internal/dream/dream_test.go | Removes behavioral-insight and patch-validation tests; updates REM/report expectations. |
| internal/config/config.go | Removes introspection/skillpatch/dream-LLM config fields and related helpers; simplifies DreamerConfig wiring. |
| internal/config/config_test.go | Updates DreamerConfig tests to reflect removed behavioral fields. |
| docs/RERANKING.md | Updates type-priority docs to remove self_assessment. |
| docs/INTEGRATIONS.md | Updates integration docs to remove introspection skills and adjust hook descriptions. |
| docs/INSTALLATION.md | Updates installation doc to remove introspection mention. |
| docs/DREAM.md | Updates REM phase docs and Go API examples to remove behavioral insight/skill patch sections. |
| docs/CONFIGURATION.md | Removes config knobs for introspection, proposed changes, and skill patching. |
| docs/CLI.md | Removes introspect, learn, track-review command documentation and hook flags for introspection. |
| docs/API.md | Removes introspection/skillpatch API sections and self_assessment references; updates examples/tables. |
| cmd/llmem/main.go | Removes introspection-related commands, wiring, and helpers from the Go CLI. |
| cmd/llmem/main_test.go | Deletes CLI flag tests for the removed introspection command. |
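
The `internal/store` rows above (default types list, `memory_types` count 8 → 7) can be illustrated with a minimal sketch. This is not the PR's code: `defaultRegisteredTypes` and `contains` are hypothetical stand-ins, the type names are taken from the tables and diff hunks in this PR, and their order is an assumption.

```go
package main

import "fmt"

// defaultRegisteredTypes is a hypothetical stand-in for store.DefaultRegisteredTypes().
// Names come from this PR's tables and diff hunks; the ordering is assumed.
func defaultRegisteredTypes() []string {
	return []string{
		"fact",
		"decision",
		"preference",
		"event",
		"project_state",
		"procedure",
		"conversation",
	}
}

// contains reports whether name appears in types.
func contains(types []string, name string) bool {
	for _, t := range types {
		if t == name {
			return true
		}
	}
	return false
}

func main() {
	types := defaultRegisteredTypes()
	// After the removal, seven types remain and self_assessment is absent.
	fmt.Println(len(types), contains(types, "self_assessment")) // prints "7 false"
}
```

A check along these lines is roughly what the updated expectations in `store_test.go` and `migration_test.go` would need to assert, assuming the seven names above are the full default set.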
Comments suppressed due to low confidence (1)
docs/API.md:1084
- The “Type Priority Weights” table omits `conversation` (0.7), but the Go retriever defaults include it. Please add `conversation` to this table to keep the docs consistent with `internal/retriever/retriever.go`.
#### Type Priority Weights
| Type | Priority | | Type | Priority |
|------|----------|-|------|----------|
| decision | 1.2 | | fact | 1.0 |
| preference | 1.1 | | project_state | 1.0 |
| procedure | 1.1 | | event | 0.9 |
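
To make the docs/code mismatch concrete, here is a minimal sketch of a priority map that returns a defensive copy, as the `retriever.DefaultTypePriority()` comment elsewhere in this PR describes. The function and variable names are hypothetical, and the weights are copied from the table above plus the `conversation` (0.7) value quoted in the suppressed comment; the actual retriever code may differ.

```go
package main

import "fmt"

// typePriority holds the assumed defaults after this PR: the six documented
// weights plus conversation (0.7), with self_assessment removed.
var typePriority = map[string]float64{
	"decision":      1.2,
	"preference":    1.1,
	"procedure":     1.1,
	"fact":          1.0,
	"project_state": 1.0,
	"event":         0.9,
	"conversation":  0.7,
}

// defaultTypePriority is a hypothetical stand-in for retriever.DefaultTypePriority().
// It returns a copy so callers cannot mutate the package-level defaults.
func defaultTypePriority() map[string]float64 {
	out := make(map[string]float64, len(typePriority))
	for k, v := range typePriority {
		out[k] = v
	}
	return out
}

func main() {
	p := defaultTypePriority()
	p["decision"] = 0 // mutate the copy only

	fresh := defaultTypePriority()
	_, hasSelf := fresh["self_assessment"]
	fmt.Println(fresh["decision"], fresh["conversation"], hasSelf) // prints "1.2 0.7 false"
}
```

If the real defaults do include `conversation`, the docs table would need a seventh row to match.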
| { | ||
| "type": "command", | ||
| "command": "llmem stats 2>/dev/null && echo '---' && llmem search behavioral --type self_assessment --limit 5 2>/dev/null && echo '---' && llmem search proposed --type procedure --limit 5 2>/dev/null", | ||
| "command": "llmem stats 2>/dev/null && echo '---' && llmem search proposed --type procedure --limit 5 2>/dev/null", |
Comment on lines 68 to 76
| const stats = run(["stats"]); | ||
| if (stats) { | ||
| log(client, "info", STATS_TAG + "\n" + stats); | ||
| } | ||
|
|
||
| const context = run(["context", "--session-id", sessionId || "start"]); | ||
| if (context) { | ||
| log(client, "info", INJECT_TAG + "\n" + context); | ||
| } |
Comment on lines 94 to 98
| This runs `install.js` which: | ||
| 1. Copies 4 skill directories to `~/.agents/skills/` | ||
| 1. Copies 2 skill directories to `~/.agents/skills/` | ||
| 2. Auto-detects your platform (OpenCode, Claude Code, Copilot CLI) | ||
| 3. Deploys the correct plugin to the right location | ||
| 4. Deploys OpenCode custom tools to `.opencode/tools/` (if OpenCode detected) |
Comment on lines 143 to +149
| ## Skills | ||
|
|
||
| LLMem ships four skills focused on memory management. They load on-demand via the skill system — no need to paste their content into instruction files. | ||
|
|
||
| | Skill | Description | | ||
| |-------|-------------| | ||
| | **llmem** | Manage LLMem memories — add, search, consolidate, dream, introspect, and track review outcomes. | | ||
| | **llmem** | Manage LLMem memories — add, search, consolidate, and dream. | |
Comment on lines 1061 to 1064
| // Get default type priority map (returns defensive copy). | ||
| priorities := retriever.DefaultTypePriority() | ||
| // map[decision:1.2 preference:1.1 procedure:1.1 fact:1.0 project_state:1.0 self_assessment:1.0 event:0.9] | ||
| // map[decision:1.2 preference:1.1 procedure:1.1 fact:1.0 project_state:1.0 event:0.9] | ||
| ``` |
Comment on lines 1481 to 1485
| // Get ordered category keys | ||
| keys := taxonomy.ErrorTaxonomyKeys() | ||
| // ["NULL_SAFETY", "ERROR_HANDLING", "OFF_BY_ONE", "RACE_CONDITION", "AUTH_BYPASS", | ||
| // "DATA_INTEGRITY", "MISSING_VERIFICATION", "EDGE_CASE", "PERFORMANCE", "DESIGN", "REVIEW_PASSED"] | ||
|
|
||
| // Parse a formatted self-assessment line | ||
| parsed := taxonomy.ParseSelfAssessment("NULL_SAFETY: null pointer dereference") | ||
| // map[string]string{"Category": "NULL_SAFETY", "What": "null pointer dereference"} | ||
|
|
||
| // Parse a specific field from self-assessment content (used by introspect and skillpatch) | ||
| proposedUpdate := taxonomy.ParseSelfAssessmentField(content, "Proposed_update") | ||
| category := taxonomy.ParseSelfAssessmentField(content, "Category") | ||
| // Returns empty string if field not found; never returns an error | ||
|
|
||
| // Get comma-separated category choices | ||
| choices := taxonomy.IntrospectCategoryChoices() | ||
| // "DATA_INTEGRITY", "MISSING_VERIFICATION", "EDGE_CASE", "PERFORMANCE", "DESIGN"] | ||
| ``` |
| @@ -11,14 +11,12 @@ Agent Session | |||
| │ | |||
| ├── Plugin (auto, no instructions needed) | |||
| │ ├── session.created/start → llmem stats + search → inject context | |||
Comment on lines 157 to 160
| "project_state", | ||
| "procedure", | ||
| "conversation", | ||
| "self_assessment", | ||
| } |
Summary
- Deleted `internal/introspect/` and `internal/skillpatch/` packages (1,654 lines)
- Removed `introspect`, `learn`, `track-review` CLI commands
- Removed `self_assessment` memory type from store, retriever, migrations

Why
The introspection loop never closed: patterns were detected but never changed behavior. Without calibration data and a promotion pipeline, the system was sophisticated self-documentation, not self-improvement. The compounding loop was broken at the critical joint: detected patterns → changed behavior.
Changes
Deleted packages:
- `internal/introspect/` (387 + 660 test = 1,047 lines)
- `internal/skillpatch/` (410 + 567 test = 977 lines)

Deleted skills:
- `skills/introspection/SKILL.md`, `skills/introspection-review-tracker/SKILL.md`
- `plugins/agent/skills/introspection/SKILL.md`, `plugins/agent/skills/introspection-review-tracker/SKILL.md`

Simplified:
- `BehavioralInsight` struct, `extractBehavioralInsights`, `buildBehavioralInsightPrompt`, `validatePatches`, Ollama client wiring in dream
- `behavioral_threshold`, `behavioral_lookback_days`, `skill_patch_threshold`, `proposed_changes_path`, `model`, `model_timeout`, `call_model_timeout`

Net change: 37 files, +219 −5,477 lines (5,258 net reduction)