Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
12aa085
feat: Proposal A Phase 1 - Recall usage tracking hooks
jlin53882 Apr 3, 2026
32d29c8
fix: 3 critical bugs in feat/proposal-a-v3
jlin53882 Apr 3, 2026
14100f9
fix: resolve 3 bugs in Proposal A feedback hooks (pendingRecall timin…
jlin53882 Apr 3, 2026
6084f73
fix(proposal-a): 3 recall-usage bugs in feat/proposal-a-v3
jlin53882 Apr 3, 2026
421f637
fix(proposal-a): 4 final bugs (importance fallback, pendingRecall cle…
jlin53882 Apr 3, 2026
d68ee69
fix(P1-1,P1-2,P2): Codex round-6 verification fixes
Apr 3, 2026
5bb2137
feat(proposal-a): add user confirmation signals, min_recall_count, an…
jlin53882 Apr 3, 2026
36a978f
fix(proposal-a): 4 bugs in feedback hooks (isRecallUsed summaries, us…
jlin53882 Apr 3, 2026
c6668eb
fix(proposal-a): OpenCode review fixes - per-recall scoring, event.me…
jlin53882 Apr 3, 2026
718fd74
fix(proposal-a): longer CJK keywords + session_end pendingRecall cleanup
jlin53882 Apr 3, 2026
ec4d643
fix: change bad_recall threshold from 3 to 2 per spec (issue #445)
jlin53882 Apr 3, 2026
2a8d0e5
fix(recall): AND logic in isRecallUsed + bad_recall_count increment fix
jlin53882 Apr 4, 2026
0b4aeff
fix(proposal-a): session_end hook clean composite keys to prevent mem…
jlin53882 Apr 4, 2026
b371f0c
fix(proposal-a): P0-1 pendingRecall TTL cleanup, P0-3 summary AND gat…
jlin53882 Apr 12, 2026
d3d0b71
fix(proposal-a): P2 align suppression threshold with scoring path (>= 2)
jlin53882 Apr 12, 2026
8990c9d
fix(proposal-a): P1 natural usage detection in Summary path
jlin53882 Apr 12, 2026
d33fe19
fix(proposal-a): P1 align scoring penalty with injection increment
jlin53882 Apr 12, 2026
45247f2
fix(proposal-a): P1/P2 refine feedback loop
jlin53882 Apr 12, 2026
84ddffa
fix(proposal-a): P1 fix errorKeywords precedence + prevent double-count
jlin53882 Apr 13, 2026
3d9f27d
fix(proposal-a): P1 restore autoCapture closing brace + P2 coerce fee…
jlin53882 Apr 13, 2026
64767ee
fix(proposal-a): P1/P2 Codex round-5 fixes - session cleanup, summary…
jlin53882 Apr 13, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
280 changes: 279 additions & 1 deletion index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@ import {
import {
extractReflectionLearningGovernanceCandidates,
extractInjectableReflectionMappedMemoryItems,
isRecallUsed,
} from "./src/reflection-slices.js";
import { createReflectionEventId } from "./src/reflection-event-store.js";
import { buildReflectionMappedMetadata } from "./src/reflection-mapped-metadata.js";
Expand Down Expand Up @@ -225,6 +226,22 @@ interface PluginConfig {
skipLowValue?: boolean;
maxExtractionsPerHour?: number;
};
feedback?: {
/** Boost importance when a recalled memory is used (default: 0.05) */
boostOnUse?: number;
/** Penalty when a recalled memory is not used after consecutive misses (default: 0.03) */
penaltyOnMiss?: number;
/** Extra boost when user explicitly confirms a recalled memory is correct (default: 0.15) */
boostOnConfirm?: number;
/** Extra penalty when user explicitly corrects a non-recalled memory (default: 0.10) */
penaltyOnError?: number;
/** Minimum recall (injection) count before penalty applies to this memory (default: 2) */
minRecallCountForPenalty?: number;
/** Keywords indicating user confirmation of a recalled memory */
confirmKeywords?: string[];
/** Keywords indicating user correction/error for a non-recalled memory */
errorKeywords?: string[];
};
}

type ReflectionThinkLevel = "off" | "minimal" | "low" | "medium" | "high";
Expand Down Expand Up @@ -2006,6 +2023,32 @@ const memoryLanceDBProPlugin = {
const autoCapturePendingIngressTexts = new Map<string, string[]>();
const autoCaptureRecentTexts = new Map<string, string[]>();

// ========================================================================
// Proposal A Phase 1: Recall Usage Tracking Hooks
// ========================================================================
// Track pending recalls per session for usage scoring
type PendingRecallEntry = {
recallIds: string[];
responseText: string;
injectedAt: number;
/** Summary text lines actually injected into the prompt, used for usage detection. */
injectedSummaries: string[];
};
// P0-1 fix: pendingRecall TTL-based cleanup to prevent unbounded memory growth.
// Entries older than 10 minutes are cleaned up on each set() call.
const PENDING_RECALL_MAX_AGE_MS = 10 * 60 * 1000; // 10 minutes
function cleanupPendingRecall(): void {
const now = Date.now();
for (const [key, entry] of pendingRecall.entries()) {
if (now - entry.injectedAt > PENDING_RECALL_MAX_AGE_MS) {
pendingRecall.delete(key);
}
}
}
const pendingRecall = new Map<string, PendingRecallEntry>();
// Clean up on module load (handles re-registration edge cases)
cleanupPendingRecall();

const logReg = isCliMode() ? api.logger.debug : api.logger.info;
logReg(
`memory-lancedb-pro@${pluginVersion}: plugin registered (db: ${resolvedDbPath}, model: ${config.embedding.model || "text-embedding-3-small"}, smartExtraction: ${smartExtractor ? 'ON' : 'OFF'})`
Expand Down Expand Up @@ -2469,7 +2512,9 @@ const memoryLanceDBProPlugin = {
const nextBadRecallCount = staleInjected
? meta.bad_recall_count + 1
: meta.bad_recall_count;
const shouldSuppress = nextBadRecallCount >= 3 && minRepeated > 0;
// P2 fix: suppress threshold aligned with scoring path (>= 2). After 2 bad recalls,
// both the scoring penalty and suppression kick in simultaneously.
const shouldSuppress = nextBadRecallCount >= 2 && minRepeated > 0;
await store.patchMetadata(
item.id,
{
Expand All @@ -2496,6 +2541,27 @@ const memoryLanceDBProPlugin = {
`memory-lancedb-pro: injecting ${selected.length} memories into context for agent ${agentId}`,
);

// Create or update pendingRecall for this turn so the feedback hook
// (which runs in the NEXT turn's before_prompt_build after agent_end)
// sees a matching pair: Turn N recallIds + Turn N responseText.
// agent_end will write responseText into this same pendingRecall
// entry (only updating responseText, never clearing recallIds).
// Include agentId in the key so different agents in the same session do not overwrite each other's pendingRecall.
const sessionKeyForRecall = `${ctx?.sessionKey || ctx?.sessionId || "default"}:${agentId ?? ""}`;
// Bug 1 fix: also store the injected summary lines so the feedback hook
// can detect usage even when the agent doesn't use stock phrases or IDs
// but directly incorporates the memory content into the response.
// P1 fix: store summary content only (without prefix) for accurate matching.
const injectedSummaries = selected.map((item) => item.summary);
// P0-1 fix: run TTL cleanup before each set to prevent unbounded growth
cleanupPendingRecall();
pendingRecall.set(sessionKeyForRecall, {
recallIds: selected.map((item) => item.id),
responseText: "", // Will be populated by agent_end
injectedAt: Date.now(),
injectedSummaries,
});

return {
prependContext:
`<relevant-memories>\n` +
Expand Down Expand Up @@ -2539,6 +2605,15 @@ const memoryLanceDBProPlugin = {
recallHistory.delete(sessionId);
turnCounter.delete(sessionId);
lastRawUserMessage.delete(sessionId);
// P3 fix: clean all pendingRecall entries for this session.
// pendingRecall keys use format: sessionKey (or sessionKey:agentId with composite key).
// P1 fix: use sessionId only when sessionKey is absent to avoid clearing unrelated sessions.
const sessionKeyToClean = ctx?.sessionKey ?? sessionId;
for (const key of pendingRecall.keys()) {
if (key === sessionId || key.startsWith(`${sessionId}:`) || (sessionKeyToClean && key.startsWith(`${sessionKeyToClean}:`))) {
pendingRecall.delete(key);
}
}
}
// Also clean by channelId/conversationId if present (shared cache key)
const cacheKey = ctx?.channelId || ctx?.conversationId || "";
Expand Down Expand Up @@ -2891,8 +2966,205 @@ const memoryLanceDBProPlugin = {
};

api.on("agent_end", agentEndAutoCaptureHook);

// ========================================================================
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restore closing brace for auto-capture guard

The if (config.autoCapture !== false) block that starts earlier is no longer closed here, so it now stays open until the end of register (around line 4033). That makes unrelated hook registration (self-improvement, reflection, lifecycle/backup setup, etc.) conditional on autoCapture, so deployments with autoCapture: false will silently lose those features.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both issues fixed in commit 3d9f27d:

P1 - autoCapture closing brace: The if (config.autoCapture !== false) block (agent_end auto-capture, line 2625) was missing its closing brace. Added } after the Phase 1 before_prompt_build hook (priority 5) to properly close the block. Self-improvement, reflection, and lifecycle hooks now run independently of autoCapture.

P2 - feedback config type coercion: Added Number() coercion for all numeric feedback config values to prevent string concatenation.

// Proposal A Phase 1: agent_end hook - Store response text for usage tracking
// ========================================================================
// NOTE: Only writes responseText to an EXISTING pendingRecall entry created
// by before_prompt_build (auto-recall). Does NOT create a new entry.
// This ensures recallIds (written by auto-recall in the same turn) and
// responseText (written here) remain paired for the feedback hook.
api.on("agent_end", (event: any, ctx: any) => {
// Use same key format as auto-recall hook (sessionKey:agentId) so we update the right entry.
const agentIdForKey = resolveHookAgentId(ctx?.agentId, (event as any).sessionKey);
const sessionKey = `${ctx?.sessionKey || ctx?.sessionId || "default"}:${agentIdForKey ?? ""}`;
if (!sessionKey) return;

// Get the last message content
let lastMsgText: string | null = null;
if (event.messages && Array.isArray(event.messages)) {
const lastMsg = event.messages[event.messages.length - 1];
if (lastMsg && typeof lastMsg === "object") {
const msgObj = lastMsg as Record<string, unknown>;
lastMsgText = extractTextContent(msgObj.content);
}
}

// Only update an existing pendingRecall entry — do NOT create one.
// This preserves recallIds written by auto-recall earlier in this turn.
const existing = pendingRecall.get(sessionKey);
if (existing && lastMsgText && lastMsgText.trim().length > 0) {
existing.responseText = lastMsgText;
}
}, { priority: 20 });

// ========================================================================
// Proposal A Phase 1: before_prompt_build hook (priority 5) - Score recalls
// ========================================================================
api.on("before_prompt_build", async (event: any, ctx: any) => {
// Use same key format as auto-recall hook (sessionKey:agentId) so we read the right entry.
const agentIdForKey = resolveHookAgentId(ctx?.agentId, (event as any).sessionKey);
const sessionKey = `${ctx?.sessionKey || ctx?.sessionId || "default"}:${agentIdForKey ?? ""}`;
// P2 fix: also cleanup on read path to handle idle sessions that never trigger set()
cleanupPendingRecall();
const pending = pendingRecall.get(sessionKey);
if (!pending) return;

// Guard: only score if responseText has substantial content
const responseText = pending.responseText;
if (!responseText || responseText.length <= 24) {
// Skip scoring for empty or very short responses
// Bug 5 fix: also clear pendingRecall so the next turn does not
// re-trigger feedback on stale recallIds / old responseText.
pendingRecall.delete(sessionKey);
return;
}

// Read recall IDs directly from pendingRecall (populated by auto-recall's
// before_prompt_build hook from the PREVIOUS turn). This replaces the
// broken regex-based parsing of prependContext which never matched the
// actual [category:scope] format used by auto-recall injection.
const injectedIds = pending.recallIds ?? [];

// Bug 1 fix: also retrieve the injected summary lines so isRecallUsed can
// detect when the agent directly incorporates memory content into the response.
const injectedSummaries = pending.injectedSummaries ?? [];

// Check if any recall was actually used by checking if the response contains reference to the injected content
// This is a heuristic - we check if the response shows awareness of injected memories
let usedRecall = false;
if (injectedIds.length > 0 || injectedSummaries.length > 0) {
// Use the real isRecallUsed function from reflection-slices
usedRecall = isRecallUsed(responseText, injectedIds, injectedSummaries);
}

// Read feedback config values with defaults
// P2 fix: coerce to Number and use ?? to preserve explicit zero values.
// P2 fix: use nullish coalescing to allow 0 as a valid config value.
const fb = config.feedback ?? {};
const boostOnUse = Number(fb.boostOnUse ?? 0) ?? 0.05;
const penaltyOnMiss = Number(fb.penaltyOnMiss ?? 0) ?? 0.03;
const boostOnConfirm = Number(fb.boostOnConfirm ?? 0) ?? 0.15;
const penaltyOnError = Number(fb.penaltyOnError ?? 0) ?? 0.10;
const minRecallCountForPenalty = fb.minRecallCountForPenalty ?? 2;
const confirmKeywords = fb.confirmKeywords ?? ["correct", "right", "yes", "confirmed", "exactly", "對", "沒錯", "正確", "確認", "好的"];
const errorKeywords = fb.errorKeywords ?? ["wrong", "incorrect", "not right", "that's wrong", "error", "mistake", "fix it", "change that", "改成", "改為", "不是這樣", "不對", "錯了"];

// event.prompt is a plain string in the current hook contract (confirmed by codebase usage).
// We extract the user's last message from event.messages array instead.
let userPromptText = "";
try {
if (event.messages && Array.isArray(event.messages)) {
for (let i = event.messages.length - 1; i >= 0; i--) {
const msg = event.messages[i];
if (msg && msg.role === "user" && typeof msg.content === "string" && msg.content.trim().length > 0) {
userPromptText = msg.content.trim();
break;
}
if (msg && msg.role === "user" && Array.isArray(msg.content)) {
// Handle array-form content
const text = extractTextContent(msg.content);
if (text && text.trim().length > 0) {
userPromptText = text.trim();
break;
}
}
}
}
} catch (_e) {
userPromptText = "";
}

// Helper: check if text contains any of the keywords (case-insensitive)
const containsKeyword = (text: string, keywords: string[]): boolean =>
keywords.some((kw) => text.toLowerCase().includes(kw.toLowerCase()));

// Score the recall - update importance based on usage
// Score each recall individually — do NOT compute a single usedRecall for the whole batch.
// Bug 1 fix (P1): when auto-recall injects multiple memories, the agent may use only some of them.
// Scoring them all with one decision corrupts ranking: unused memories get boosted, used ones get penalized.
if (injectedIds.length > 0) {
try {
// Build lookup: recallId -> injected summary text for this specific recall
const summaryMap = new Map<string, string>();
for (let i = 0; i < injectedIds.length; i++) {
if (injectedSummaries[i]) {
summaryMap.set(injectedIds[i], injectedSummaries[i]);
}
}

for (const recallId of injectedIds) {
const summaryText = summaryMap.get(recallId) ?? "";
// Score this specific recall independently
const usedRecall = isRecallUsed(
responseText,
[recallId],
summaryText ? [summaryText] : [],
);

const entry = await store.getById(recallId, undefined);
if (!entry) continue;
const meta = parseSmartMetadata(entry.metadata, entry);
const currentImportance = meta.importance ?? entry.importance ?? 0.5;

// P1 fix (Codex): check errorKeywords BEFORE usedRecall.
// If user explicitly corrects, that overrides the heuristic usage detection.
// Also set last_confirmed_use_at here to prevent injection path's staleInjected
// from double-counting in the same turn.
const hasError = containsKeyword(userPromptText, errorKeywords);
if (hasError) {
if ((meta.injected_count || 0) >= minRecallCountForPenalty) {
await store.update(recallId, { importance: Math.max(0.1, currentImportance - penaltyOnError) }, undefined);
}
await store.patchMetadata(recallId, { bad_recall_count: (meta.bad_recall_count || 0) + 1, last_confirmed_use_at: Date.now() }, undefined);
} else if (usedRecall) {
// Pure positive use: boost importance
let newImportance = Math.min(1.0, currentImportance + boostOnUse);
if (containsKeyword(userPromptText, confirmKeywords)) {
newImportance = Math.min(1.0, newImportance + boostOnConfirm);
}
await store.update(recallId, { importance: newImportance }, undefined);
await store.patchMetadata(recallId, { last_confirmed_use_at: Date.now(), bad_recall_count: 0 }, undefined);
} else {
// P1 fix: align scoring penalty threshold with injection increment.
// Silent miss: apply penalty if badCount >= 1 (injection path handles increment).
const badCount = meta.bad_recall_count || 0;
if (badCount >= 1 && (meta.injected_count || 0) >= minRecallCountForPenalty) {
await store.update(recallId, { importance: Math.max(0.1, currentImportance - penaltyOnMiss) }, undefined);
}
// No increment here - injection path already increments via staleInjected.
}
}
} catch (err) {
api.logger.warn(`memory-lancedb-pro: recall usage scoring failed: ${String(err)}`);
} finally {
pendingRecall.delete(sessionKey);
}
}
}, { priority: 5 });
}

// ========================================================================
// Proposal A Phase 1: session_end hook - Clean up pending recalls
// ========================================================================
api.on("session_end", (_event: any, ctx: any) => {
// P1 fix: clean all pendingRecall entries for this session, including composite keys.
// When autoCapture is false, the auto-capture session_end (priority 10) is skipped,
// so this hook must handle composite keys (sessionKey:agentId) as well.
const sessionId = ctx?.sessionId || "";
const sessionKey = ctx?.sessionKey || "";
for (const key of pendingRecall.keys()) {
if (
key === sessionKey ||
key === sessionId ||
key.startsWith(`${sessionKey}:`) ||
key.startsWith(`${sessionId}:`)
) {
pendingRecall.delete(key);
}
}
}, { priority: 20 });

// ========================================================================
// Integrated Self-Improvement (inheritance + derived)
// ========================================================================
Expand Down Expand Up @@ -4045,6 +4317,12 @@ export function parsePluginConfig(value: unknown): PluginConfig {
: 30,
}
: { skipLowValue: false, maxExtractionsPerHour: 30 },
// Bug 3 fix: parse and return the feedback config block so deployments
// that specify custom feedback values actually take effect instead of
// falling back to hardcoded defaults.
feedback: typeof cfg.feedback === "object" && cfg.feedback !== null
? { ...(cfg.feedback as Record<string, unknown>) }
Comment on lines +4323 to +4324
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Normalize feedback config types before using them

This shallow copy keeps raw feedback values without type coercion, but the scoring path later performs numeric arithmetic with these fields. If a deployment provides values as strings (common with env-driven config), ?? will keep the string, currentImportance + boostOnUse becomes string concatenation, and downstream Math.min/Math.max can produce NaN, leading to invalid importance updates in storage.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both issues fixed in commit 3d9f27d:

P1 - autoCapture closing brace: The if (config.autoCapture !== false) block (agent_end auto-capture, line 2625) was missing its closing brace. Added } after the Phase 1 before_prompt_build hook (priority 5) to properly close the block. Self-improvement, reflection, and lifecycle hooks now run independently of autoCapture.

P2 - feedback config type coercion: Added Number() coercion for all numeric feedback config values to prevent string concatenation.

: {},
};
}

Expand Down
11 changes: 11 additions & 0 deletions src/auto-capture-cleanup.ts
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,17 @@ function stripLeadingRuntimeWrappers(text: string): string {
continue;
}

// Bug fix: also strip known boilerplate continuation lines (e.g.
// "Results auto-announce to your requester.", "Do not use any memory tools.")
// that appear right after the wrapper prefix. These lines do NOT match the
// wrapper prefix regex but are part of the wrapper boilerplate.
if (strippingLeadIn) {
AUTO_CAPTURE_RUNTIME_WRAPPER_BOILERPLATE_RE.lastIndex = 0;
if (AUTO_CAPTURE_RUNTIME_WRAPPER_BOILERPLATE_RE.test(current)) {
continue;
}
}

strippingLeadIn = false;
cleanedLines.push(line);
}
Expand Down
Loading
Loading