
Bug: L2 Candidate Policy stuck in candidate forever (nextStatus never called for untouched policies) #1932

@a844810597


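Bug: L2 candidate policy meets the promotion conditions but is never promoted

Symptoms

The policies table contains policies with status='candidate' whose support and gain already meet the promotion thresholds (support >= minSupport && gain >= minGain), yet they stay in candidate indefinitely and are never automatically promoted to active.

Measured data: of 84 candidates, 29 meet the conditions but have not been promoted; the longest has been stuck for more than 24 hours.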
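
The count above can be reproduced with a small filter over the policies rows. This is a minimal sketch, assuming each row exposes status, support, and gain as described in this issue; the `PolicyRow` and `Thresholds` types and the `findStuckCandidates` helper are hypothetical names for illustration, not part of the codebase.

```typescript
// Hypothetical row shape; field names follow the issue text, not the real schema.
interface PolicyRow {
  id: string;
  status: string; // e.g. "candidate" or "active"
  support: number;
  gain: number;
}

// Hypothetical threshold bundle mirroring minSupport / minGain from the issue.
interface Thresholds {
  minSupport: number;
  minGain: number;
}

// Returns candidates whose stored stats already satisfy the promotion rule
// (support >= minSupport && gain >= minGain) but are still not active.
function findStuckCandidates(rows: PolicyRow[], t: Thresholds): PolicyRow[] {
  return rows.filter(
    (p) =>
      p.status === "candidate" &&
      p.support >= t.minSupport &&
      p.gain >= t.minGain,
  );
}
```

Running this against a dump of the policies table gives the number of eligible but unpromoted candidates, which is a quick check before and after applying a fix.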


Root cause

The promotion check in nextStatus() runs on only one call path:

reward.updated → subscriber.processReward() → runL2() → applyGain() → nextStatus()

Step 4 of runL2() iterates over the touched set, which contains only the policies that traces from the current episode matched via cosine similarity ≥ minSimilarity:

// l2.ts Step 4
for (const policy of touched.values()) {
    ... applyGain(... nextStatus() ...)
}

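Problem: if a candidate policy is no longer semantically matched in later episodes, it never enters touched, so nextStatus() never runs against it, even when its gain/support already satisfy the conditions.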
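
The failure mode can be shown in isolation. The sketch below is not the real l2.ts code: `promoteIfEligible` is an assumed stand-in for the promotion rule in nextStatus(), `step4` mimics only the shape of the touched loop, and the threshold values are made up.

```typescript
type Status = "candidate" | "active";

interface Policy {
  id: string;
  status: Status;
  support: number;
  gain: number;
}

// Assumed stand-in for the candidate→active rule described in this issue.
function promoteIfEligible(p: Policy, minSupport: number, minGain: number): Status {
  const eligible = p.support >= minSupport && p.gain >= minGain;
  return p.status === "candidate" && eligible ? "active" : p.status;
}

// Mimics Step 4: only policies present in `touched` are re-evaluated.
function step4(touched: Map<string, Policy>, minSupport: number, minGain: number): void {
  for (const policy of touched.values()) {
    policy.status = promoteIfEligible(policy, minSupport, minGain);
  }
}

// p1 was matched by the current episode; p2 is equally eligible but was not.
const p1: Policy = { id: "p1", status: "candidate", support: 3, gain: 0.5 };
const p2: Policy = { id: "p2", status: "candidate", support: 3, gain: 0.5 };

const touchedThisEpisode = new Map<string, Policy>([[p1.id, p1]]);
step4(touchedThisEpisode, 2, 0.1);
// p1 has been promoted; p2 is still a candidate because the loop never saw it.
```

Both policies carry identical stats, so the only thing separating them is membership in the touched map.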


Code locations

  • packages/memos-core/src/core/memory/l2/l2.ts, Step 4: for (const policy of touched.values()) processes only touched policies
  • core/memory/l2/gain.ts: nextStatus() defines the deterministic candidate→active rule, but only a touched policy can trigger it

Related comments

l2.ts already records a fix for a similar bug (the comment about inductionEvidenceByPolicy):

"Without this bookkeeping, Step 4 would see zero withIds ... leaving it stuck in candidate forever. This was a real bug observed in end-to-end testing."

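That fix addressed the case where empty withIds at creation time produced a negative gain and left the policy stuck. It did not address the case where a policy is never touched again and nextStatus() therefore never runs. Both are the same class of bug: status changes only inside the touched loop.


Proposed fix

Add a Step 5 after Step 4 of runL2() that iterates over all candidate policies and runs the nextStatus() check: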

// Step 5: Re-evaluate untouched candidates
// A candidate policy that was induced in a previous episode but no longer
// matches any trace will never enter `touched` and therefore never have
// `nextStatus()` run against it. Without this sweep, it stays `candidate`
// forever even though its stored gain/support already satisfy the
// promotion thresholds.
{
  // Lists every candidate; the ones Step 4 already handled are skipped below.
  const candidates = repos.policies.list({ status: "candidate" });
  for (const policy of candidates) {
    if (touched.has(policy.id)) continue; // already handled in Step 4
    const next = nextStatus({
      currentStatus: policy.status,
      support: policy.support,
      gain: policy.gain,
      thresholds,
    });
    if (next !== policy.status) {
      repos.policies.updateStats(policy.id, {
        support: policy.support,
        gain: policy.gain,
        status: next,
        updatedAt: input.now ?? Date.now(),
      });
      emit(bus, {
        kind: "l2.policy.updated",
        episodeId: input.episodeId,
        policyId: policy.id,
        status: next,
        support: policy.support,
        gain: policy.gain,
      });
      log.info("run.recheck_candidate_promoted", {
        policyId: policy.id,
        status: next,
        support: policy.support,
        gain: policy.gain,
      });
    }
  }
}

The import in l2.ts also needs nextStatus:

-import { applyGain, computeGain, smoothGain } from "./gain.js";
+import { applyGain, computeGain, nextStatus, smoothGain } from "./gain.js";

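Optional optimization

Consider adding a throttle (limit the scan range by updated_at, or enforce a minimum interval between scans) to avoid a full scan of candidates at the end of every episode. The number of candidates is usually small (<1000), though, so a full scan is cheap.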
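
The minimum-interval variant could look like the sketch below. `makeSweepThrottle` and `shouldSweep` are hypothetical names, and the interval value is left to the caller; nothing here exists in the codebase.

```typescript
// Hypothetical throttle: allows the candidate sweep at most once per interval.
// Time is passed in explicitly so the caller can reuse its own clock value.
function makeSweepThrottle(minIntervalMs: number): (now: number) => boolean {
  // Negative infinity guarantees the very first call is allowed.
  let lastSweepAt = Number.NEGATIVE_INFINITY;
  return function shouldSweep(now: number): boolean {
    if (now - lastSweepAt < minIntervalMs) return false;
    lastSweepAt = now;
    return true;
  };
}
```

Step 5 would then be wrapped in a check such as `if (shouldSweep(input.now ?? Date.now())) { ... }`. The closure keeps state in process memory only, so a restart resets the interval; that seems acceptable here because an extra sweep is cheap.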


Labels: enhancement (New feature or improvement), memos (Core MemOS logic: memory, MCP, scheduler, API, database)
