Checkpointing
PATAS supports checkpointing for long-running pattern mining operations, allowing you to resume from where you left off if the process is interrupted.
Checkpointing saves progress during pattern mining operations, including:
- Last processed message ID
- Intermediate pattern results
- Current stage (for two-stage pipeline)
- Operation metadata
This is especially important for:
- Large datasets (millions of messages)
- Long-running operations (hours)
- Unstable environments
- Resource-constrained deployments
Checkpoints are automatically created and updated during pattern mining:
- Checkpoint Creation: Created at the start of pattern mining
- Periodic Updates: Updated every 5 chunks (configurable)
- Completion: Marked as completed when mining finishes
- Failure Handling: Marked as failed on errors
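The lifecycle above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PATAS's actual implementation: the `Checkpoint` class, the `mine_with_checkpoints` function, and the `update_every` parameter are hypothetical names, chunks are assumed to be lists of message IDs, and the checkpoint lives in memory rather than in the database.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Checkpoint:
    """Hypothetical in-memory stand-in for a checkpoint record."""
    status: str = "running"                       # set when mining starts
    chunk_index: int = -1                         # last chunk covered by a save
    last_processed_message_id: Optional[int] = None
    saves: int = 0                                # number of periodic updates


def mine_with_checkpoints(
    chunks: List[List[int]],
    process_chunk: Callable[[List[int]], None],
    update_every: int = 5,
) -> Checkpoint:
    """Process chunks in order, updating the checkpoint every `update_every` chunks."""
    checkpoint = Checkpoint()                     # creation: at the start of mining
    try:
        for index, chunk in enumerate(chunks):
            process_chunk(chunk)
            # Periodic update: only after every N-th completed chunk.
            if (index + 1) % update_every == 0 and chunk:
                checkpoint.chunk_index = index
                checkpoint.last_processed_message_id = chunk[-1]
                checkpoint.saves += 1
        checkpoint.status = "completed"           # completion
    except Exception:
        checkpoint.status = "failed"              # failure handling
    return checkpoint
```

With the default interval, a run over 12 chunks updates the checkpoint twice (after the 5th and 10th chunks), so a failure in chunk 11 would lose at most the work done since chunk 10.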
Each checkpoint stores:
- Parameters: days, min_spam_count
- Progress: last_processed_message_id, chunk_index
- State: patterns_in_progress (intermediate results)
- Stage: Current pipeline stage (stage1, stage2, completed)
- Metadata: Additional operation information
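To illustrate how the progress fields make a resume possible, here is a small sketch. It assumes messages are processed in ascending ID order, which this document does not state, and `remaining_message_ids` is a hypothetical helper rather than a PATAS function.

```python
from typing import Iterable, List, Optional


def remaining_message_ids(
    all_ids: Iterable[int],
    last_processed_message_id: Optional[int],
) -> List[int]:
    """Return the message IDs still to be processed, in ascending order.

    Assumes IDs are processed in ascending order, so everything up to and
    including `last_processed_message_id` is already done.
    """
    ordered = sorted(all_ids)
    if last_processed_message_id is None:
        # No progress recorded yet: start from the beginning.
        return ordered
    return [mid for mid in ordered if mid > last_processed_message_id]
```

For example, with message IDs 1 through 5 and a checkpoint whose `last_processed_message_id` is 3, only IDs 4 and 5 remain.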
Use the CLI to resume a failed or interrupted pattern mining operation:
# List recent checkpoints
patas list-checkpoints
# Resume from a specific checkpoint
patas resume-mining <checkpoint_id> [--use-llm] [--use-semantic]
# Start pattern mining (creates checkpoint automatically)
patas mine-patterns 30
# If interrupted, list checkpoints
patas list-checkpoints
# Resume from checkpoint ID 5
patas resume-mining 5 --use-semantic
Checkpoints have three statuses:
- RUNNING: Operation in progress
- COMPLETED: Operation finished successfully
- FAILED: Operation failed or was interrupted
# List recent checkpoints (default: 10)
patas list-checkpoints [limit] [status]
# Examples
patas list-checkpoints 20 # List 20 most recent
patas list-checkpoints 10 running # List running checkpoints only
patas list-checkpoints 5 failed # List failed checkpoints
patas resume-mining <checkpoint_id> [--use-llm] [--use-semantic]
# Examples
patas resume-mining 5 # Resume without LLM
patas resume-mining 5 --use-llm # Resume with LLM
patas resume-mining 5 --use-semantic # Resume with semantic mining
Checkpoints are stored in the pattern_mining_checkpoints table:
CREATE TABLE pattern_mining_checkpoints (
id INTEGER PRIMARY KEY,
started_at TIMESTAMP NOT NULL,
last_updated TIMESTAMP NOT NULL,
status VARCHAR NOT NULL, -- 'running', 'completed', 'failed'
days INTEGER NOT NULL,
min_spam_count INTEGER NOT NULL,
last_processed_message_id INTEGER,
patterns_in_progress JSON,
stage VARCHAR,
metadata JSON
);
To add checkpoint support to an existing database:
python scripts/migrate_add_checkpoint_table.py
Alternatively, the table is created automatically on the next startup if the application uses SQLAlchemy's create_all.
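For readers who want to see what an idempotent table migration can look like, here is a minimal sketch using Python's standard `sqlite3` module. It is not the contents of `scripts/migrate_add_checkpoint_table.py`; it assumes a SQLite database, reuses the schema shown above, and the helper names `ensure_checkpoint_table` and `table_exists` are hypothetical.

```python
import sqlite3

# Same columns as the schema above; IF NOT EXISTS makes the statement safe to re-run.
CHECKPOINT_DDL = """
CREATE TABLE IF NOT EXISTS pattern_mining_checkpoints (
    id INTEGER PRIMARY KEY,
    started_at TIMESTAMP NOT NULL,
    last_updated TIMESTAMP NOT NULL,
    status VARCHAR NOT NULL,
    days INTEGER NOT NULL,
    min_spam_count INTEGER NOT NULL,
    last_processed_message_id INTEGER,
    patterns_in_progress JSON,
    stage VARCHAR,
    metadata JSON
)
"""


def ensure_checkpoint_table(conn: sqlite3.Connection) -> None:
    """Create the checkpoint table if it is missing; do nothing otherwise."""
    conn.execute(CHECKPOINT_DDL)
    conn.commit()


def table_exists(conn: sqlite3.Connection, name: str) -> bool:
    """Check SQLite's catalog for a table with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None
```

Because of `IF NOT EXISTS`, running the sketch against a database that already has the table leaves existing checkpoint rows untouched.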
- Monitor Checkpoints: Regularly check for failed checkpoints
- Cleanup Old Checkpoints: Remove completed checkpoints older than 30 days
- Resume Promptly: Resume failed operations as soon as possible
- Checkpoint Frequency: Adjust the chunk size if checkpoint updates are too frequent or too infrequent
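The 30-day cleanup can be scripted. The sketch below is one possible approach, not a built-in PATAS command: it assumes a SQLite database, assumes `last_updated` is stored as an ISO-8601 UTC string (so string comparison matches chronological order), and `delete_old_completed` is a hypothetical helper name.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def delete_old_completed(
    conn: sqlite3.Connection,
    max_age_days: int = 30,
    now: Optional[datetime] = None,
) -> int:
    """Delete completed checkpoints last updated more than `max_age_days` ago.

    Returns the number of rows removed. Running and failed checkpoints are
    kept so they can still be inspected or resumed.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=max_age_days)).isoformat()
    cursor = conn.execute(
        "DELETE FROM pattern_mining_checkpoints "
        "WHERE status = 'completed' AND last_updated < ?",
        (cutoff,),
    )
    conn.commit()
    return cursor.rowcount
```

If timestamps are stored in another format or time zone, the cutoff value needs to be built to match before this comparison is reliable.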
If checkpoints aren't being created:
- Verify database connectivity
- Check database permissions
- Review application logs for errors
If resuming from checkpoint fails:
- Verify checkpoint exists: patas list-checkpoints
- Check checkpoint status (should be running or failed)
- Review checkpoint metadata for clues
- Check database for related data integrity
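The status check in this list can be expressed as a small helper. This is a sketch for illustration only: `CheckpointStatus` and `is_resumable` are hypothetical names, and the rule encoded here (only running or failed checkpoints can be resumed) is taken from the checklist above rather than from the PATAS source code.

```python
from enum import Enum


class CheckpointStatus(str, Enum):
    """The three checkpoint statuses, using the lowercase values stored in the table."""
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Statuses the troubleshooting checklist treats as eligible for resuming.
RESUMABLE = frozenset({CheckpointStatus.RUNNING, CheckpointStatus.FAILED})


def is_resumable(status: str) -> bool:
    """Return True if a checkpoint with this status should be resumable."""
    try:
        parsed = CheckpointStatus(status.strip().lower())
    except ValueError:
        # Unknown status values are treated as not resumable.
        return False
    return parsed in RESUMABLE
```

A completed checkpoint, or one with an unrecognized status, returns `False`, which points to a wrong checkpoint ID or inconsistent data rather than a failure in the resume step itself.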
If checkpoint updates are impacting performance:
- Checkpoint updates happen every 5 chunks by default
- This is a balance between progress tracking and performance
- For very large datasets, consider increasing chunk size
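To see how chunk size affects checkpoint overhead, the number of updates per run can be estimated with simple arithmetic. The sketch below assumes one update per five completed chunks, as described above; `checkpoint_update_count` is a hypothetical helper and the figures in the note are illustrative, not measurements.

```python
import math


def checkpoint_update_count(
    total_messages: int,
    chunk_size: int,
    update_every: int = 5,
) -> int:
    """Estimate how many periodic checkpoint updates a run performs."""
    if chunk_size <= 0 or update_every <= 0:
        raise ValueError("chunk_size and update_every must be positive")
    # The last chunk may be partial, so round up.
    chunks = math.ceil(total_messages / chunk_size)
    # One update after every `update_every` completed chunks.
    return chunks // update_every
```

For 1,000,000 messages, a chunk size of 1,000 gives 1,000 chunks and 200 updates, while a chunk size of 10,000 gives 100 chunks and 20 updates. The cost is that more work is repeated after a failure, since each update covers a larger span of messages.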
- Pattern Mining Guide - Pattern mining overview
- Production Deployment Guide - Production best practices
- Performance Guide - Performance optimization