This feature implements intelligent context management to prevent token overflow and hallucinations caused by excessive context window usage. When the conversation grows large, the system automatically summarizes older messages to maintain context continuity.
Purpose: Tracks context size, generates statistics, and handles message compaction.
Features:
- Token estimation using character-based approximation
- Context statistics tracking (usage %, message count)
- Sliding window for recent messages
- System message preservation
- Summary generation prompts
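The statistics tracking above can be pictured with a small sketch. The struct and field names here are illustrative assumptions, not the actual nca-runtime types:

```rust
// Sketch of context statistics; names are assumptions, not nca-runtime's API.
#[derive(Debug, Clone)]
pub struct ContextStats {
    pub estimated_tokens: usize, // character-based approximation
    pub message_count: usize,
    pub usage_percent: f64, // estimated tokens relative to the target window
}

impl ContextStats {
    pub fn new(estimated_tokens: usize, message_count: usize, target: usize) -> Self {
        Self {
            estimated_tokens,
            message_count,
            usage_percent: estimated_tokens as f64 / target as f64 * 100.0,
        }
    }
}

fn main() {
    // 24k estimated tokens against the default 32k target is 75% usage.
    let stats = ContextStats::new(24_000, 40, 32_000);
    assert!((stats.usage_percent - 75.0).abs() < 1e-9);
    println!("{} messages, {:.0}% used", stats.message_count, stats.usage_percent);
}
```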
Configuration Options:
[memory.context]
# Target context window size (approximate tokens)
context_window_target = 32000
# Maximum messages to retain after compaction
max_retained_messages = 50
# Percentage of context window that triggers auto-summarize (0-100)
auto_summarize_threshold = 75
# Enable automatic context summarization
enable_auto_summarize = true

Purpose: Integrates context management into the session lifecycle.
Flow:
- Before each run_turn: Check if context needs attention
- After each run_turn: Check if summarization should trigger
- If threshold exceeded: Call AI to generate summary
- Apply summary: Replace old messages with concise summary
- Emit events: ContextWarning, ContextCompaction
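The threshold check performed after each run_turn can be sketched as a small predicate. The function name and signature are assumptions for illustration; the 75% threshold and 32k target mirror the defaults documented above:

```rust
// Minimal sketch of the post-turn summarization check.
// Integer math avoids float comparison: tokens/target >= threshold%.
fn should_summarize(estimated_tokens: usize, target: usize, threshold_pct: u8, enabled: bool) -> bool {
    enabled && estimated_tokens * 100 >= target * threshold_pct as usize
}

fn main() {
    // 24_000 of 32_000 tokens = 75%, exactly at the default threshold.
    assert!(should_summarize(24_000, 32_000, 75, true));
    // Below threshold: no summarization.
    assert!(!should_summarize(20_000, 32_000, 75, true));
    // Feature disabled: never summarize.
    assert!(!should_summarize(24_000, 32_000, 75, false));
    println!("ok");
}
```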
Events Emitted:
- ContextWarning: When context reaches 80% of target
- ContextCompaction: During summarization phases (starting/completed)
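The two events could be modeled as enum variants along these lines. This is a sketch; the payload fields are assumptions, not the real event types:

```rust
// Hypothetical event shapes; nca-runtime's actual definitions may differ.
#[derive(Debug)]
enum SessionEvent {
    ContextWarning { usage_percent: f64 },
    ContextCompaction { phase: CompactionPhase },
}

#[derive(Debug)]
enum CompactionPhase {
    Starting,
    Completed,
}

fn main() {
    let events = vec![
        SessionEvent::ContextWarning { usage_percent: 82.0 },
        SessionEvent::ContextCompaction { phase: CompactionPhase::Starting },
        SessionEvent::ContextCompaction { phase: CompactionPhase::Completed },
    ];
    for e in &events {
        println!("{:?}", e); // a UI layer would subscribe and render these
    }
}
```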
New ContextConfig:
pub struct ContextConfig {
    pub context_window_target: usize,  // Default: 32000
    pub max_retained_messages: usize,  // Default: 50
    pub auto_summarize_threshold: u8,  // Default: 75
    pub enable_auto_summarize: bool,   // Default: true
}

Nested under memory:
[memory]
file_path = ".nca/memory.json"
max_notes = 128
auto_compact_on_finish = false
[memory.context]
context_window_target = 32000
max_retained_messages = 50
auto_summarize_threshold = 75
enable_auto_summarize = true

Token Estimation:
// Rough approximation: tokens ≈ characters / 4
// Tool messages: more token-dense (3.5 divisor)
// System messages: standard (4.0 divisor)
// + 10 base overhead + ~50 per tool call

Compaction Strategy:
- Preserve System Messages: Always keep at start
- Sliding Window: Keep last N messages (configurable)
- Summarize Middle: Old messages get summarized by AI
- Insert Summary: Summary inserted as system message with special header
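The estimation rules above translate into a short function. This is a sketch under the stated constants; the `Message` type and function name are assumptions, not the real API:

```rust
// Character-based token estimate, per the documented approximation.
#[allow(dead_code)]
enum Role {
    System,
    User,
    Assistant,
    Tool,
}

struct Message {
    role: Role,
    content: String,
    tool_calls: usize,
}

fn estimate_tokens(msg: &Message) -> usize {
    let divisor = match msg.role {
        Role::Tool => 3.5, // tool output is more token-dense
        _ => 4.0,          // standard approximation for other roles
    };
    let content_tokens = (msg.content.len() as f64 / divisor).ceil() as usize;
    content_tokens + 10 + 50 * msg.tool_calls // base overhead + per-tool-call cost
}

fn main() {
    // 400 chars at divisor 4.0 -> 100 tokens, plus 10 base overhead.
    let user = Message { role: Role::User, content: "x".repeat(400), tool_calls: 0 };
    assert_eq!(estimate_tokens(&user), 110);
    // 35 chars at divisor 3.5 -> 10 tokens, +10 base, +50 for one tool call.
    let tool = Message { role: Role::Tool, content: "y".repeat(35), tool_calls: 1 };
    assert_eq!(estimate_tokens(&tool), 70);
    println!("ok");
}
```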
## Conversation Summary (Earlier Context)
[AI-generated concise summary covering:]
- Key topics and goals discussed
- Important decisions or findings
- Critical context (file paths, variable names, errors)
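The compaction steps, including the summary header, can be sketched as follows. The `Msg` type, `compact` signature, and summary text are illustrative assumptions:

```rust
// Sketch of compaction: preserve system messages, keep the last N messages,
// and replace the middle with an AI-generated summary under a special header.
#[derive(Clone, Debug)]
struct Msg {
    system: bool,
    text: String,
}

fn compact(messages: &[Msg], keep_last: usize, summary: &str) -> Vec<Msg> {
    let system: Vec<Msg> = messages.iter().filter(|m| m.system).cloned().collect();
    let non_system: Vec<Msg> = messages.iter().filter(|m| !m.system).cloned().collect();
    if non_system.len() <= keep_last {
        return messages.to_vec(); // nothing old enough to summarize
    }
    let recent = &non_system[non_system.len() - keep_last..];
    let mut out = system; // system messages stay at the start
    out.push(Msg {
        system: true,
        text: format!("## Conversation Summary (Earlier Context)\n{summary}"),
    });
    out.extend_from_slice(recent); // sliding window of recent messages
    out
}

fn main() {
    // One system message followed by nine conversation messages.
    let msgs: Vec<Msg> = (0..10)
        .map(|i| Msg { system: i == 0, text: format!("m{i}") })
        .collect();
    let compacted = compact(&msgs, 3, "earlier discussion of goals and findings");
    // 1 system + 1 summary + 3 recent = 5 messages retained.
    assert_eq!(compacted.len(), 5);
    println!("ok");
}
```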
Simply start a session - context management is enabled by default with sensible defaults.
For very long conversations, increase the thresholds:
[memory.context]
context_window_target = 64000 # For 128k context models
auto_summarize_threshold = 80
max_retained_messages = 100

To disable automatic summarization:
[memory.context]
enable_auto_summarize = false

Benefits:
- Prevents Token Overflow: Automatic compaction before hitting limits
- Reduces Hallucinations: Smaller, focused context = more accurate responses
- Context Continuity: Important context preserved via summarization
- Cost Efficiency: Uses fewer tokens per request
- Transparency: Events emitted for UI feedback
Run tests:
cargo test -p nca-runtime context_manager

Test coverage:
- Token estimation
- Context statistics
- Sliding window behavior
- Summary application

Future Enhancements:
- Hierarchical Summaries: Multiple levels of summarization
- Importance Scoring: Preserve critical messages
- Tool Call Preservation: Keep summaries of tool executions
- Async Summarization: Background summarization without blocking
- Model-Specific Tuning: Adjust based on provider context limits