
Checkpointing

Nick edited this page Mar 10, 2026 · 2 revisions

Pattern Mining Checkpointing

PATAS supports checkpointing for long-running pattern mining operations, allowing you to resume from where you left off if the process is interrupted.

Overview

Checkpointing saves progress during pattern mining operations, including:

  • Last processed message ID
  • Intermediate pattern results
  • Current stage (for two-stage pipeline)
  • Operation metadata

This is especially important for:

  • Large datasets (millions of messages)
  • Long-running operations (hours)
  • Unstable environments
  • Resource-constrained deployments

How It Works

Automatic Checkpointing

Checkpoints are automatically created and updated during pattern mining:

  1. Checkpoint Creation: Created at the start of pattern mining
  2. Periodic Updates: Updated every 5 chunks (configurable)
  3. Completion: Marked as completed when mining finishes
  4. Failure Handling: Marked as failed on errors
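The four lifecycle steps above can be sketched in Python. This is a minimal illustration, not the actual PATAS implementation: the names `Checkpoint`, `CHECKPOINT_EVERY`, and `mine_with_checkpoint` are hypothetical, and the real mining work is elided.

```python
CHECKPOINT_EVERY = 5  # default: write a checkpoint update every 5 chunks


class Checkpoint:
    def __init__(self):
        self.status = "running"               # 1. created at the start of mining
        self.last_processed_message_id = None
        self.chunk_index = 0

    def update(self, chunk_index, last_message_id):
        self.chunk_index = chunk_index
        self.last_processed_message_id = last_message_id


def mine_with_checkpoint(chunks, checkpoint):
    """Process chunks of message IDs, checkpointing every CHECKPOINT_EVERY chunks."""
    try:
        for i, chunk in enumerate(chunks):
            # ... pattern mining work on this chunk would happen here ...
            if (i + 1) % CHECKPOINT_EVERY == 0:
                checkpoint.update(i, chunk[-1])  # 2. periodic update
        checkpoint.status = "completed"          # 3. mark completed on success
    except Exception:
        checkpoint.status = "failed"             # 4. mark failed on error
        raise
```

Because the checkpoint always holds the last fully processed position, a resume only has to skip messages up to `last_processed_message_id`.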

Checkpoint Data

Each checkpoint stores:

  • Parameters: days, min_spam_count
  • Progress: last_processed_message_id, chunk_index
  • State: patterns_in_progress (intermediate results)
  • Stage: Current pipeline stage (stage1, stage2, completed)
  • Metadata: Additional operation information
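The fields above map naturally onto a single record. A sketch of that shape as a dataclass, with assumed defaults (`CheckpointRecord` is an illustrative name, and the contents of `patterns_in_progress` and `metadata` are not specified here):

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CheckpointRecord:
    # Parameters
    days: int
    min_spam_count: int
    # Progress
    last_processed_message_id: Optional[int] = None
    chunk_index: int = 0
    # State: intermediate pattern results (dict shape is illustrative)
    patterns_in_progress: dict = field(default_factory=dict)
    # Stage: one of "stage1", "stage2", "completed"
    stage: str = "stage1"
    # Metadata: additional operation information
    metadata: dict = field(default_factory=dict)
```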

Usage

Resuming from Checkpoint

Use the CLI to resume a failed or interrupted pattern mining operation:

# List recent checkpoints
patas list-checkpoints

# Resume from a specific checkpoint
patas resume-mining <checkpoint_id> [--use-llm] [--use-semantic]

Example Workflow

# Start pattern mining (creates checkpoint automatically)
patas mine-patterns 30

# If interrupted, list checkpoints
patas list-checkpoints

# Resume from checkpoint ID 5
patas resume-mining 5 --use-semantic

Checkpoint Status

Checkpoints have three statuses:

  • RUNNING: Operation in progress
  • COMPLETED: Operation finished successfully
  • FAILED: Operation failed or was interrupted
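As an enum, with a helper capturing which statuses can be resumed (the enum name and helper are illustrative; only the three status values come from the page):

```python
from enum import Enum


class CheckpointStatus(Enum):
    RUNNING = "running"      # operation in progress
    COMPLETED = "completed"  # operation finished successfully
    FAILED = "failed"        # operation failed or was interrupted


def is_resumable(status: CheckpointStatus) -> bool:
    # Only running or failed checkpoints make sense to resume;
    # a completed checkpoint has nothing left to do.
    return status in (CheckpointStatus.RUNNING, CheckpointStatus.FAILED)
```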

CLI Commands

List Checkpoints

# List recent checkpoints (default: 10)
patas list-checkpoints [limit] [status]

# Examples
patas list-checkpoints 20              # List 20 most recent
patas list-checkpoints 10 running      # List running checkpoints only
patas list-checkpoints 5 failed        # List failed checkpoints

Resume Mining

patas resume-mining <checkpoint_id> [--use-llm] [--use-semantic]

# Examples
patas resume-mining 5                  # Resume without LLM
patas resume-mining 5 --use-llm        # Resume with LLM
patas resume-mining 5 --use-semantic   # Resume with semantic mining

Database Schema

Checkpoints are stored in the pattern_mining_checkpoints table:

CREATE TABLE pattern_mining_checkpoints (
    id INTEGER PRIMARY KEY,
    started_at TIMESTAMP NOT NULL,
    last_updated TIMESTAMP NOT NULL,
    status VARCHAR NOT NULL,  -- 'running', 'completed', 'failed'
    days INTEGER NOT NULL,
    min_spam_count INTEGER NOT NULL,
    last_processed_message_id INTEGER,
    patterns_in_progress JSON,
    stage VARCHAR,
    metadata JSON
);
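The schema can be exercised directly with Python's stdlib `sqlite3`. The sketch below creates the table, writes one interrupted checkpoint, and then runs the kind of query `list-checkpoints` performs: resumable checkpoints, newest first. The sample values (days=30, min_spam_count=3, message ID 1042) are illustrative.

```python
import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pattern_mining_checkpoints (
        id INTEGER PRIMARY KEY,
        started_at TIMESTAMP NOT NULL,
        last_updated TIMESTAMP NOT NULL,
        status VARCHAR NOT NULL,
        days INTEGER NOT NULL,
        min_spam_count INTEGER NOT NULL,
        last_processed_message_id INTEGER,
        patterns_in_progress JSON,
        stage VARCHAR,
        metadata JSON
    )
""")

now = datetime.now(timezone.utc).isoformat()
conn.execute(
    "INSERT INTO pattern_mining_checkpoints "
    "(started_at, last_updated, status, days, min_spam_count, "
    " last_processed_message_id, patterns_in_progress, stage, metadata) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (now, now, "failed", 30, 3, 1042,
     json.dumps({"example pattern": 7}), "stage1", json.dumps({})),
)

# Resumable checkpoints, newest first (conceptually what list-checkpoints shows)
row = conn.execute(
    "SELECT id, status, stage, last_processed_message_id "
    "FROM pattern_mining_checkpoints "
    "WHERE status IN ('running', 'failed') "
    "ORDER BY last_updated DESC LIMIT 1"
).fetchone()
```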

Migration

To add checkpoint support to an existing database:

python scripts/migrate_add_checkpoint_table.py

Alternatively, the table will be created automatically on the next startup if the application uses SQLAlchemy's create_all.

Best Practices

  1. Monitor Checkpoints: Regularly check for failed checkpoints
  2. Cleanup Old Checkpoints: Remove completed checkpoints older than 30 days
  3. Resume Promptly: Resume failed operations as soon as possible
  4. Checkpoint Frequency: Adjust the batch size if checkpoint updates are too frequent or too infrequent
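Practice #2 above (removing completed checkpoints older than 30 days) can be done with a single DELETE. A sketch, assuming the `pattern_mining_checkpoints` table from the schema section and ISO-formatted `last_updated` timestamps; the function name is illustrative:

```python
import sqlite3


def cleanup_old_checkpoints(conn, days=30):
    """Remove completed checkpoints older than `days`; return how many were deleted."""
    cur = conn.execute(
        "DELETE FROM pattern_mining_checkpoints "
        "WHERE status = 'completed' "
        "AND last_updated < datetime('now', ?)",
        (f"-{days} days",),
    )
    conn.commit()
    return cur.rowcount
```

Only completed checkpoints are touched, so old failed checkpoints remain visible for investigation and resumption.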

Troubleshooting

Checkpoint Not Created

If checkpoints aren't being created:

  1. Verify database connectivity
  2. Check database permissions
  3. Review application logs for errors

Resume Fails

If resuming from checkpoint fails:

  1. Verify checkpoint exists: patas list-checkpoints
  2. Check the checkpoint status (it must be running or failed; completed checkpoints cannot be resumed)
  3. Review checkpoint metadata for clues
  4. Check the database for data-integrity issues in the related tables

Checkpoint Updates Too Slow

If checkpoint updates are impacting performance:

  1. Checkpoint updates happen every 5 chunks by default
  2. This is a balance between progress tracking and performance
  3. For very large datasets, consider increasing chunk size
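The trade-off in point 3 can be quantified: with the default of one checkpoint write every 5 chunks, the number of messages between writes scales with the chunk size. A trivial sketch (the 1,000-message chunk size is illustrative, not a PATAS default):

```python
def messages_between_checkpoints(chunk_size, checkpoint_every=5):
    # Larger chunks mean fewer checkpoint writes per message processed,
    # at the cost of losing more progress if the process dies mid-interval.
    return chunk_size * checkpoint_every
```

For example, 1,000-message chunks give one checkpoint write per 5,000 messages.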
