feat: Proposal A v3 - Configurable Feedback Amplitudes (Phase 3)#505
jlin53882 wants to merge 15 commits into CortexReach:master
- Add pendingRecall Map for tracking session recalls
- Add agent_end hook to store response text for usage scoring
- Add before_prompt_build hook (priority 5) to score recall usage
- Add session_end hook to clean up pending recalls
- Add isRecallUsed function to reflection-slices.ts
- Guard: skip scoring for empty responseText (<=24 chars)

Implements: recall usage tracking for Proposal A Phase 1
1. Bug 1 (CRITICAL): injectedIds regex in feedback hook never matched
- The feedback hook used a regex /\[([a-f0-9]{8,})\]/gi to parse IDs
from prependContext, but auto-recall injects memories in format
[preferences:global], [facts:dc-channel], NOT [hex-id].
- Fix: read recallIds directly from pendingRecall (which is populated
by auto-recall's before_prompt_build from the previous turn).
Also added code in auto-recall to store selected IDs into
pendingRecall[sessionKey].recallIds before returning.
2. Bug 2 (MAJOR): stripEnvelopeMetadata regex had literal backspace (0x08)
- In src/smart-extractor.ts line 76, a literal backspace character
(byte 0x08) was embedded in the regex pattern between 'agent' and '.',
producing 'agent[0x08].*?' instead of 'agent\b.*?'.
- Fix: replaced the 0x08 byte with the proper \b word boundary.
3. Bug 3 (MAJOR): WeakSet.clear() does not exist
- In index.ts resetRegistration(), _registeredApis.clear() was called,
but WeakSet has no clear() method.
- Fix: removed the .clear() call per the comment's own note.
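The two regex problems above (Bug 1 and Bug 2) can be reproduced in isolation. This is a minimal sketch, not the plugin's code: `captureAll` is a hypothetical helper, and the sample context and envelope strings are made up to match the formats quoted in the commit message.

```typescript
// Collect capture group 1 for every match of a global regex.
function captureAll(re: RegExp, text: string): string[] {
  const out: string[] = [];
  re.lastIndex = 0; // global regexes keep state between exec() calls
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    out.push(m[1] ?? "");
  }
  return out;
}

// Bug 1: the feedback hook looked for [hex-id] tags...
const HEX_ID_RE = /\[([a-f0-9]{8,})\]/gi;

// ...but auto-recall injects [category:scope] tags (sample text is illustrative).
const recallContext =
  "[preferences:global] prefers short answers\n[facts:dc-channel] deploys on Friday";
const idsFromRecallContext = captureAll(HEX_ID_RE, recallContext);

// Bug 2: a pattern containing a raw backspace (U+0008) instead of the \b assertion.
const brokenPattern = new RegExp("agent\u0008.*?"); // literal 0x08 after "agent"
const fixedPattern = /agent\b.*?/; // word boundary after "agent"

const envelope = "agent: hello"; // hypothetical envelope line
const brokenMatches = brokenPattern.test(envelope); // needs a backspace in the input
const fixedMatches = fixedPattern.test(envelope);
```

The hex-ID pattern finds nothing in the `[category:scope]` text, which is why reading `recallIds` from pendingRecall is the more robust source. The backspace variant only matches input that itself contains byte 0x08, so it silently stops stripping ordinary envelope lines.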
…g, parseSmartMetadata, importance row update)
Bug 1 (P1): pendingRecall was written with recallIds from Turn N but responseText
from Turn N-1, causing feedback to score the wrong memories.
Fix: before_prompt_build (auto-recall) now CREATES pendingRecall with recallIds.
agent_end now only WRITES responseText to an existing entry (never creates).
Bug 2 (P2): parseSmartMetadata was called with empty placeholder metadata,
returning fallback values instead of real entry data.
Fix: use store.getById(recallId) to get the real entry before parsing.
Bug 3 (P2): patchMetadata only updates the metadata JSON blob, not the
entry.importance ROW column. applyImportanceWeight reads entry.importance,
so importance adjustments never affected ranking.
Fix: use store.update(id, { importance: newValue }) to update the row directly.
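The create-versus-write split from Bug 1 above can be sketched as follows. The `PendingRecall` shape and the function names `onRecallInjected` and `onAgentEnd` are assumptions for illustration; the real hooks receive event objects that are not shown in this PR text.

```typescript
// Assumed shape of one pendingRecall entry.
interface PendingRecall {
  recallIds: string[];
  responseText?: string;
}

const pendingRecall = new Map<string, PendingRecall>();

// before_prompt_build (auto-recall): the only place that creates an entry,
// so recallIds always belong to the turn that is about to run.
function onRecallInjected(sessionKey: string, recallIds: string[]): void {
  pendingRecall.set(sessionKey, { recallIds });
}

// agent_end: attaches the response to an existing entry and never creates one.
// Returns false when there is nothing to attach to.
function onAgentEnd(sessionKey: string, responseText: string): boolean {
  const entry = pendingRecall.get(sessionKey);
  if (!entry) return false;
  entry.responseText = responseText;
  return true;
}
```

Because agent_end cannot create entries, a response can only ever be paired with the recallIds injected for the same turn.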
Bug 1 [P1]: pendingRecall.delete() moved from session_end to the feedback hook's finally block. This prevents repeated scoring of the same recallIds/responseText pair when subsequent turns skip auto-recall (greeting, short input). The entry is now deleted immediately after scoring completes.
Bug 2 [P2]: confirmed use now resets bad_recall_count to 0, so the penalty threshold (3) only applies to truly consecutive misses, not interleaved confirmed-use/miss patterns.
Bug 3 [P3]: retrieveWithTrace now forwards source to hybridRetrieval(), aligning debug/trace retrieval with real manual-recall behavior.
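A minimal sketch of the consecutive-miss counter described in Bug 2, assuming the penalty fires once the count reaches the threshold of 3. `RecallStats` and `recordRecallOutcome` are hypothetical names, not the plugin's API.

```typescript
const PENALTY_THRESHOLD = 3; // threshold stated in the commit message

// Assumed per-memory counter; the real value lives in the entry metadata.
interface RecallStats {
  bad_recall_count: number;
}

// Records one scored turn and returns true when the penalty should apply.
function recordRecallOutcome(stats: RecallStats, used: boolean): boolean {
  if (used) {
    stats.bad_recall_count = 0; // a confirmed use breaks the miss streak
    return false;
  }
  stats.bad_recall_count += 1;
  return stats.bad_recall_count >= PENALTY_THRESHOLD;
}
```

With the reset in place, the sequence miss, miss, use, miss, miss never reaches the threshold, whereas three misses in a row do.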
…anup, env-resolve gate, recency double-boost)
P1-1 (isRecallUsed): Add direct injected-ID check
- The function accepted injectedIds but never used them
- Added loop to check if response contains any injected memory ID
- This complements the existing stock-phrase check
P1-2 (rerank env vars): Add rerank-enabled guard
- Only resolve environment-variable placeholders when rerank is actually enabled
- Prevents startup failure when rerankApiKey has unresolved placeholder
but reranking is disabled (rerank='none')
P2 (multi-line wrapper stripping): Strip boilerplate continuation lines
- stripLeadingRuntimeWrappers now also strips lines matching
AUTO_CAPTURE_RUNTIME_WRAPPER_BOILERPLATE_RE (e.g.
'Results auto-announce to your requester.', 'Do not use any memory tools.')
while strippingLeadIn is still true, preventing these lines from
being kept when they appear right after the wrapper prefix line
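The rerank guard from P1-2 above might look roughly like this. The `${NAME}` placeholder syntax, the `resolveEnvPlaceholders` helper, and the `RerankConfig` shape are all assumptions for illustration; only `rerank='none'` and `rerankApiKey` come from the commit message.

```typescript
// Hypothetical resolver: replaces ${NAME} with env[NAME], throwing if it is unset.
function resolveEnvPlaceholders(
  value: string,
  env: Record<string, string | undefined>,
): string {
  return value.replace(/\$\{([A-Z0-9_]+)\}/g, (_match: string, name: string) => {
    const resolved = env[name];
    if (resolved === undefined) {
      throw new Error("Unresolved placeholder: " + name);
    }
    return resolved;
  });
}

// Assumed subset of the plugin config relevant to reranking.
interface RerankConfig {
  rerank: string;
  rerankApiKey?: string;
}

// Only touch the API key when reranking is enabled, so an unresolved
// placeholder cannot fail startup for rerank='none'.
function resolveRerankKey(
  cfg: RerankConfig,
  env: Record<string, string | undefined>,
): string | undefined {
  if (cfg.rerank === "none" || cfg.rerankApiKey === undefined) return undefined;
  return resolveEnvPlaceholders(cfg.rerankApiKey, env);
}
```

The design choice is to gate resolution rather than swallow the error, so a misconfigured key still fails loudly whenever reranking is actually turned on.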
…d configurable feedback amplitudes
…er prompt extraction, parsePluginConfig feedback, bad_recall_count double-increment)
Bug 1 (P1): isRecallUsed() only checked stock phrases and raw IDs,
but auto-recall injects [category:scope] summary format text.
Fix: store injectedSummaries (item.line) in pendingRecall on auto-recall
injection; pass them to isRecallUsed() which now checks if the response
contains any of the injected summary text verbatim.
Bug 2 (P1): confirm/error keywords were checked against pending.responseText
(previous-turn assistant response) instead of the current-turn user prompt.
Fix: read event.prompt (array of {role, content} messages) in the
before_prompt_build feedback hook and check keywords against the last user
message in that array.
Bug 3 (P2): parsePluginConfig() never copied cfg.feedback to the returned
config object, so all deployments fell back to hardcoded defaults.
Fix: add feedback block to the return object in parsePluginConfig.
Bug 4 (P2): bad_recall_count was incremented in BOTH the auto-recall
injection path AND the feedback hook, causing double-counting that made
the 3-consecutive-miss penalty trigger after only 2 actual misses.
Fix: remove +1 from the feedback hook; counter now only increments once
(in the auto-recall injection path where staleInjected is evaluated).
…ssages user prompt, agentId keying
Bug 1 (P1): Score each recall independently instead of one usedRecall for the whole batch.
- Build summaryMap: recallId -> injected summary
- Call isRecallUsed per recallId with its specific summary
- Prevents unused memories from being boosted or used ones penalized
Bug 2 (P2): Extract user prompt from event.messages array, not event.prompt.
- event.prompt is a plain string (confirmed by codebase usage), not an array
- Extract last user message from event.messages (same pattern as agent_end)
Bug 3 (P2): pendingRecall key includes agentId to avoid cross-agent overwrite.
- Key format: sessionKey:agentId (both in auto-recall and feedback/agent_end hooks)
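Per-recall scoring and the composite key from this commit can be sketched as below. `isSummaryUsed` is a simplified stand-in for isRecallUsed that only does the verbatim-summary check; `scoreRecalls` and `pendingKey` are hypothetical names.

```typescript
// Simplified stand-in for isRecallUsed: verbatim containment of one summary.
function isSummaryUsed(responseText: string, summary: string): boolean {
  return summary.length > 0 && responseText.includes(summary);
}

// Score every recalled memory on its own summary instead of sharing one
// usedRecall flag across the whole batch.
function scoreRecalls(
  responseText: string,
  summaryMap: Map<string, string>, // recallId -> injected summary
): Map<string, boolean> {
  const used = new Map<string, boolean>();
  summaryMap.forEach((summary, recallId) => {
    used.set(recallId, isSummaryUsed(responseText, summary));
  });
  return used;
}

// Composite key so two agents sharing a session do not overwrite each other.
function pendingKey(sessionKey: string, agentId: string): string {
  return sessionKey + ":" + agentId;
}
```

Scoring per recallId means a response that quotes one memory boosts only that memory, while the others in the same batch are treated as misses.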
P1 fix: replace single-char CJK keywords (是/對/不/錯) with longer phrases (是對的/確認/錯誤/更正) to avoid false positives on ordinary conversation.
P3 fix: session_end hook was not cleaning pendingRecall at all. Add cleanup of all pendingRecall entries that match the sessionId or sessionKey:agentId composite key pattern.
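The false-positive problem with single-character keywords is easy to demonstrate. In this sketch, `containsAnyKeyword` is a hypothetical helper, the sample sentences are made up, and the split of keywords into a "confirm" list is an assumption (the commit lists the keywords but not which list each belongs to).

```typescript
// True when the text contains at least one non-empty keyword.
function containsAnyKeyword(text: string, keywords: readonly string[]): boolean {
  return keywords.some((k) => k.length > 0 && text.includes(k));
}

// Assumed confirm lists before and after the fix.
const SINGLE_CHAR_CONFIRM = ["是", "對"];
const PHRASE_CONFIRM = ["是對的", "確認"];

// "Today is Friday": ordinary conversation, not a confirmation.
const ordinarySentence = "今天是星期五";
// "That is correct, thanks": an actual confirmation.
const confirmSentence = "是對的，謝謝";

const oldFalsePositive = containsAnyKeyword(ordinarySentence, SINGLE_CHAR_CONFIRM);
const newOnOrdinary = containsAnyKeyword(ordinarySentence, PHRASE_CONFIRM);
const newOnConfirm = containsAnyKeyword(confirmSentence, PHRASE_CONFIRM);
```

The single character 是 appears in almost any Chinese sentence, so the old list treated ordinary statements as confirmations; the longer phrases only match when the user actually confirms.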
…ory leak
When config.autoCapture === false, the auto-capture session_end (priority 10) was skipped, leaving only the Phase 1 session_end (priority 20) to clean up. The old code only deleted pendingRecall[sessionKey] - a simple key - but not composite keys (sessionKey:agentId). Now uses pattern matching (startsWith) to clean all related keys regardless of format.
Fixes: P1 issue from Phase 1 audit
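A sketch of the startsWith-based cleanup, assuming keys are either `sessionKey` or `sessionKey:agentId`. `cleanupPendingRecall` is a hypothetical name, and matching on the prefix plus the `:` separator is a choice made here so that session `s1` does not also clear session `s10`; the PR text does not say whether the real code includes the separator.

```typescript
// Remove the simple key and every composite key for one session.
// Returns how many entries were deleted.
function cleanupPendingRecall<V>(pending: Map<string, V>, sessionKey: string): number {
  const prefix = sessionKey + ":";
  const doomed: string[] = [];
  // Collect first, delete afterwards, to keep iteration simple.
  pending.forEach((_value, key) => {
    if (key === sessionKey || key.startsWith(prefix)) {
      doomed.push(key);
    }
  });
  for (const key of doomed) {
    pending.delete(key);
  }
  return doomed.length;
}
```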
…ompatible field mapping
- parsePluginConfig() now properly parses feedback fields with correct type guards
- Maps legacy Phase 1 field names (boostOnUse, penaltyOnMiss) to new names (importanceBoostOnUse, importancePenaltyOnMiss)
- Applies Math.max(1, Math.floor()) for minRecallCount fields
- Filters array fields (confirmKeywords, errorKeywords) for type safety
- Removes redundant inline config reading in before_prompt_build hook
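The parsing rules listed above could be implemented along these lines. This is a sketch under assumptions: `parseFeedbackConfig`, `asFiniteNumber`, and `asStringArray` are hypothetical names, only a subset of the fields is shown, and letting the new field name win over the legacy one is a guess at the intended precedence.

```typescript
// Subset of the feedback settings; undefined means "fall back to the default".
interface ParsedFeedback {
  importanceBoostOnUse: number | undefined;
  importancePenaltyOnMiss: number | undefined;
  minRecallCountForPenalty: number | undefined;
  confirmKeywords: string[] | undefined;
  errorKeywords: string[] | undefined;
}

function asFiniteNumber(v: unknown): number | undefined {
  return typeof v === "number" && isFinite(v) ? v : undefined;
}

// Keep only string items; non-arrays are rejected entirely.
function asStringArray(v: unknown): string[] | undefined {
  if (!Array.isArray(v)) return undefined;
  return v.filter((item): item is string => typeof item === "string");
}

function parseFeedbackConfig(raw: unknown): ParsedFeedback {
  const cfg: Record<string, unknown> =
    typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  const minCount = asFiniteNumber(cfg["minRecallCountForPenalty"]);
  return {
    // New name first, then the legacy Phase 1 name.
    importanceBoostOnUse:
      asFiniteNumber(cfg["importanceBoostOnUse"]) ?? asFiniteNumber(cfg["boostOnUse"]),
    importancePenaltyOnMiss:
      asFiniteNumber(cfg["importancePenaltyOnMiss"]) ?? asFiniteNumber(cfg["penaltyOnMiss"]),
    // Clamp to a whole number of at least 1.
    minRecallCountForPenalty:
      minCount === undefined ? undefined : Math.max(1, Math.floor(minCount)),
    confirmKeywords: asStringArray(cfg["confirmKeywords"]),
    errorKeywords: asStringArray(cfg["errorKeywords"]),
  };
}
```

A legacy config such as `{ boostOnUse: 0.1 }` then surfaces as `importanceBoostOnUse`, and a malformed `minRecallCountForPenalty` like `0.4` is clamped to 1 rather than disabling the threshold.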
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ce42d63be
api.on("agent_end", agentEndAutoCaptureHook);
}

// ========================================================================
Close auto-capture block before registering other hooks
The if (config.autoCapture !== false) block is no longer closed after api.on("agent_end", agentEndAutoCaptureHook), so subsequent registration code is now gated by autoCapture. In this commit that means disabling auto-capture also disables unrelated behavior (including the new recall-feedback hooks, self-improvement hooks, and later registration paths in onRegister), which is a functional regression for configurations that intentionally set autoCapture: false. Add the missing block close before the Proposal A feedback hook section so only auto-capture logic is conditional.
See review on #507: the branch topology for the Proposal A chain needs to be fixed first. This PR should target the Phase 1 branch, not master.
Summary
Proposal A Phase 3: Configurable Feedback Amplitudes. Extracts hardcoded feedback parameters into a FeedbackConfigManager class, allowing deployment-level customization.
What was built
FeedbackConfigManager (src/feedback-config.ts)
Configurable Parameters
- importanceBoostOnUse
- importanceBoostOnConfirm
- importancePenaltyOnMiss
- importancePenaltyOnError
- minRecallCountForPenalty
- confirmKeywords
- errorKeywords

Backward Compatibility
All defaults match Phase 1 hardcoded values exactly. No breaking change for existing deployments.
Files changed
- src/feedback-config.ts: +91 lines (new class)
- index.ts: +49/-10 lines (parsePluginConfig + hook integration)

Related
Requires Phase 1 (#PR to be determined). Enables Phase 4 test coverage.