Problem
RetentionEngine::select_eviction_candidates() loads ALL nodes into memory, deserializes them, sorts by (importance ASC, created_at ASC), then takes the first N (the least important, oldest nodes).
At 100k+ nodes this becomes expensive: a full table scan, plus deserialization of every node, plus a full sort.
Current code
crates/cortex-core/src/policies/retention.rs — select_eviction_candidates()
Fix options
- Secondary index: maintain a NODES_BY_IMPORTANCE multimap table (bucketed f32 → node IDs), query lowest bucket first
- Materialized priority queue: maintain a sorted eviction queue in a dedicated redb table, updated on put_node
- Streaming sort with early exit: iterate by created_at (already ordered in redb), keep a bounded heap of size N by importance
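Option 3 (streaming sort) is simplest and avoids new indexes. It still visits every node, but it holds only N candidates in memory and replaces the full sort with heap updates. Options 1 and 2 (secondary index, materialized queue) add index maintenance to every write, roughly O(log n) per put_node for a B-tree-backed table, but let eviction read only the lowest-ranked entries instead of scanning the whole table.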
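The bounded-heap idea behind the streaming option could be sketched roughly as below. This is a minimal illustration, not the actual cortex-core code: `NodeMeta`, its fields, and the free-function signature are hypothetical stand-ins, and the input is a plain iterator rather than a redb table scan.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a stored node; the real struct will differ.
#[derive(Debug, Clone, PartialEq)]
struct NodeMeta {
    id: u64,
    importance: f32,
    created_at: u64,
}

/// Heap entry ordered by (importance ASC, created_at ASC, id ASC).
/// f32::total_cmp gives a total order, so NaN cannot break the heap.
struct Candidate(NodeMeta);

impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0
            .importance
            .total_cmp(&other.0.importance)
            .then(self.0.created_at.cmp(&other.0.created_at))
            .then(self.0.id.cmp(&other.0.id))
    }
}
impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Candidate {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Candidate {}

/// Returns the `n` nodes that sort first by (importance, created_at),
/// holding at most `n + 1` entries in memory at any time.
fn select_eviction_candidates<I>(nodes: I, n: usize) -> Vec<NodeMeta>
where
    I: IntoIterator<Item = NodeMeta>,
{
    if n == 0 {
        return Vec::new();
    }
    // BinaryHeap is a max-heap, so the top is the least evictable
    // candidate currently kept; popping it keeps the n smallest.
    let mut heap: BinaryHeap<Candidate> = BinaryHeap::with_capacity(n + 1);
    for node in nodes {
        heap.push(Candidate(node));
        if heap.len() > n {
            heap.pop();
        }
    }
    // into_sorted_vec yields ascending order: best eviction candidate first.
    heap.into_sorted_vec().into_iter().map(|c| c.0).collect()
}
```

Each node costs one O(log N) heap update, so the whole pass is O(n log N) time and O(N) memory for n stored nodes. The result does not depend on iteration order, so scanning by created_at is a convenience rather than a requirement in this sketch.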
Priority
Low — only matters at scale (100k+ nodes). Current usage is well below this.