Incremental Mining
Nick edited this page Nov 21, 2025 · 1 revision
Process only new messages instead of re-analyzing the entire dataset.
Incremental mining uses the last_processed_message_id stored in a checkpoint to process only the messages added after the last mining run.
Benefits:
- Faster processing (only new messages)
- Lower costs (fewer LLM/embedding calls)
- Suitable for continuous operation
Limitations:
- Old patterns are not re-evaluated
- New patterns are discovered only from new messages
- Use full mining periodically to catch pattern evolution
# First run (full mining)
patas mine-patterns --days=7
# Subsequent runs (incremental, from last checkpoint)
patas mine-patterns --days=7 --since-checkpoint <checkpoint_id>

# List recent checkpoints
patas list-checkpoints
# Output shows checkpoint IDs and last_processed_message_id

import requests
# Get last checkpoint
response = requests.get("http://localhost:8000/api/v1/checkpoints?limit=1")
checkpoint_id = response.json()[0]["id"]
# Run incremental mining
response = requests.post(
    "http://localhost:8000/api/v1/patterns/mine",
    json={"days": 7, "since_checkpoint": checkpoint_id}
)

- Checkpoint stores last_processed_message_id after each mining run
- Incremental mining filters messages: WHERE id > last_processed_message_id
- Only new messages are processed for pattern discovery
- Old patterns remain unchanged (not re-evaluated)
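
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the actual PATAS implementation: the `messages` table layout, the in-memory SQLite database, and the helper names `fetch_new_messages` and `run_incremental` are assumptions made for the example. Only the `WHERE id > last_processed_message_id` filter and the idea of advancing the checkpoint after a run come from the description above.

```python
import sqlite3

def fetch_new_messages(conn, last_processed_message_id):
    """Return (id, text) rows with an id above the checkpoint value."""
    cur = conn.execute(
        "SELECT id, text FROM messages WHERE id > ? ORDER BY id",
        (last_processed_message_id,),
    )
    return cur.fetchall()

def run_incremental(conn, checkpoint):
    """Select only new messages, then advance the (hypothetical) checkpoint dict."""
    new_messages = fetch_new_messages(conn, checkpoint["last_processed_message_id"])
    if new_messages:
        # Pattern discovery would run here, on new_messages only.
        checkpoint["last_processed_message_id"] = new_messages[-1][0]
    return new_messages

# Hypothetical data: three messages, the first already processed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany(
    "INSERT INTO messages (id, text) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)
checkpoint = {"last_processed_message_id": 1}

first = run_incremental(conn, checkpoint)   # picks up messages 2 and 3
second = run_incremental(conn, checkpoint)  # nothing new since the first run
```

Because the second call sees no rows above the stored id, it leaves the checkpoint untouched, which is why repeated incremental runs stay cheap.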
# Daily cron job
0 2 * * * patas mine-patterns --days=1 --since-checkpoint $(patas list-checkpoints --limit=1 --status=completed | grep -o '[0-9]*' | head -1)

# Weekly full re-analysis (no --since-checkpoint)
0 3 * * 0 patas mine-patterns --days=7

- Daily: Incremental mining (new messages only)
- Weekly: Full mining (catch pattern evolution)
- Monthly: Full evaluation of all rules
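
For callers that drive mining through the API rather than cron, the daily/weekly split could be expressed as a small helper that builds the request body. This is a sketch under assumptions: `build_mine_request` is a hypothetical name, Sunday is chosen as the full-run day only because the weekly cron entry uses day-of-week 0, and falling back to a full run when no checkpoint exists is a design choice made for the example. The payload keys `days` and `since_checkpoint` are the ones used in the API example earlier. The monthly rule evaluation is not covered, since no command for it is given here.

```python
from datetime import date

def build_mine_request(today, latest_checkpoint_id=None):
    """Return the JSON body for a mining request on the given day."""
    # date.weekday() returns 6 for Sunday; run a full 7-day pass then,
    # and also whenever there is no checkpoint to resume from.
    if today.weekday() == 6 or latest_checkpoint_id is None:
        return {"days": 7}
    # Otherwise run incrementally from the latest completed checkpoint.
    return {"days": 1, "since_checkpoint": latest_checkpoint_id}

friday_body = build_mine_request(date(2025, 11, 21), latest_checkpoint_id=42)
sunday_body = build_mine_request(date(2025, 11, 23), latest_checkpoint_id=42)
no_checkpoint_body = build_mine_request(date(2025, 11, 21))
```

Unlike the cron schedule, which runs both jobs on Sundays, this helper picks a single mode per day.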
Incremental (100K new messages):
- Time: ~20-30 min (vs ~3.5 hours for full 500K)
- Cost: ~$18 (vs ~$91 for full run)
Summary: based on the figures above, an incremental run is roughly 7-10x faster and about 5x cheaper than a full run, which makes it the better fit for daily operations.
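
The ratios can be checked directly from the figures quoted in this section. The snippet below uses only those numbers; the variable names are illustrative.

```python
# Figures quoted above: full run over 500K messages vs incremental over 100K.
full_minutes = 3.5 * 60            # ~3.5 hours
incremental_minutes = (20, 30)     # ~20-30 min
full_cost = 91                     # ~$91
incremental_cost = 18              # ~$18

speedup_low = full_minutes / incremental_minutes[1]   # slowest incremental run
speedup_high = full_minutes / incremental_minutes[0]  # fastest incremental run
cost_ratio = full_cost / incremental_cost

print(f"Speedup: {speedup_low:.1f}x to {speedup_high:.1f}x")
print(f"Cost ratio: {cost_ratio:.2f}x")
```

The cost ratio tracks the 5:1 ratio in message volume almost exactly, so the quoted cost per message is about the same in both modes; the saving comes from processing fewer messages per run.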