
v0.4: eval suite for Cognee graph extraction quality (per (route × model)) #265

@thinmintdev


Deferred from v0.3 per ADR-0014 §4 and audit gap G2 in docs/internal/audit-2026-05-22-phase8-skill-review.md.

Scope

Build the eval suite that measures graph-extraction quality for each (route × model) pair. This lets the v0.3 "configurable model" gate graduate from "configurable + caveat" to "configurable + measured", and lets future versions auto-flip the default route when a local model demonstrably passes the bar.

  • Standard corpus (held-out documents covering technical + casual + multi-entity passages).
  • Metrics: entity recall, relation precision, schema-violation rate, structured-output reliability.
  • Per-route runner (upstream / primary / agent slot).
  • Output report consumable by docs (docs/memory/graph.md) + dashboard.
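A rough sketch of how the four metrics listed above might be computed for a single (route × model) run. The `Gold` and `Extraction` types, the field names, and the choice to macro-average per document are all assumptions made for illustration; the issue does not specify a gold-label format or an aggregation rule.

```python
from dataclasses import dataclass

# Hypothetical shape for a relation: (subject, predicate, object).
Relation = tuple[str, str, str]


@dataclass
class Gold:
    """Hand-labelled expectations for one corpus document (assumed format)."""
    entities: set[str]
    relations: set[Relation]


@dataclass
class Extraction:
    """What one (route, model) pair produced for one document (assumed format)."""
    entities: set[str]
    relations: set[Relation]
    parsed_ok: bool         # structured output parsed without error
    schema_violations: int  # emitted items rejected by schema validation
    total_items: int        # all emitted items, valid or not


def entity_recall(gold: set[str], predicted: set[str]) -> float:
    """Fraction of gold entities that were recovered."""
    if not gold:
        return 1.0  # nothing to find; treated as perfect by assumption
    return len(gold & predicted) / len(gold)


def relation_precision(gold: set[Relation], predicted: set[Relation]) -> float:
    """Fraction of predicted relations that are in the gold set."""
    if not predicted:
        return 0.0  # assumption: empty output earns no precision credit
    return len(gold & predicted) / len(predicted)


def score(golds: list[Gold], preds: list[Extraction]) -> dict[str, float]:
    """Aggregate the four metrics over a corpus (macro-averaged per document)."""
    if not golds or len(golds) != len(preds):
        raise ValueError("golds and preds must be non-empty and aligned")
    n = len(golds)
    emitted = sum(p.total_items for p in preds)
    pairs = list(zip(golds, preds))
    return {
        "entity_recall": sum(entity_recall(g.entities, p.entities) for g, p in pairs) / n,
        "relation_precision": sum(relation_precision(g.relations, p.relations) for g, p in pairs) / n,
        "schema_violation_rate": (sum(p.schema_violations for p in preds) / emitted) if emitted else 0.0,
        "structured_output_reliability": sum(p.parsed_ok for p in preds) / n,
    }
```

A per-route runner could call `score` once per (route, model) pair and collect the resulting dictionaries into the report consumed by the docs and dashboard.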

When this lands

  • Once landed, ADR-0014 §4 caveat copy ("Graph quality varies by model. We don't currently measure it for you — your results may vary.") can be replaced with a per-model quality readout.
  • Could enable an auto-default-on path for proven model families (additive; no schema migration).
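One way the "passes the bar" check behind an auto-default-on path could look, sketched under assumptions. The report is assumed to map each (route, model) pair to a dictionary with the keys `entity_recall`, `relation_precision`, `schema_violation_rate`, and `structured_output_reliability`. The threshold values are placeholders, not numbers from this issue or ADR-0014, and the model names in the usage example are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityBar:
    # Placeholder thresholds for illustration only.
    min_entity_recall: float = 0.80
    min_relation_precision: float = 0.80
    max_schema_violation_rate: float = 0.05
    min_structured_output_reliability: float = 0.95


def passes_bar(metrics: dict[str, float], bar: QualityBar = QualityBar()) -> bool:
    """True only if every metric clears its threshold."""
    return (
        metrics["entity_recall"] >= bar.min_entity_recall
        and metrics["relation_precision"] >= bar.min_relation_precision
        and metrics["schema_violation_rate"] <= bar.max_schema_violation_rate
        and metrics["structured_output_reliability"] >= bar.min_structured_output_reliability
    )


def eligible_pairs(
    report: dict[tuple[str, str], dict[str, float]],
    bar: QualityBar = QualityBar(),
) -> list[tuple[str, str]]:
    """Return the (route, model) pairs that could be auto-enabled, sorted."""
    return sorted(pair for pair, metrics in report.items() if passes_bar(metrics, bar))


# Usage with made-up numbers and hypothetical model names.
report = {
    ("primary", "model-a"): {
        "entity_recall": 0.91,
        "relation_precision": 0.88,
        "schema_violation_rate": 0.01,
        "structured_output_reliability": 0.99,
    },
    ("agent", "model-b"): {
        "entity_recall": 0.62,
        "relation_precision": 0.90,
        "schema_violation_rate": 0.02,
        "structured_output_reliability": 0.97,
    },
}
print(eligible_pairs(report))
```

Requiring every metric to clear its threshold keeps the gate conservative: a model with strong precision but weak recall stays on the "configurable + caveat" path.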

Tag

v0.4 (no v0.3 label).


Labels: enhancement (New feature or request)
