Constitutional.seq includes comprehensive error handling and recovery mechanisms to ensure reliable operation even in the face of network issues, API rate limits, and other failures.
The tool automatically classifies errors into categories:
- Network Timeout: Connection timeouts and network interruptions
- API Rate Limit: 429 errors and rate limiting
- Invalid Gene Name: Gene not found or invalid symbol
- Database Error: Database connection issues
- Validation Error: Input format or data validation failures
- File I/O Error: Permission or file access issues
- Parse Error: JSON/XML parsing failures
Each error type has specific recovery strategies and user-friendly suggestions.
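To make the categories above concrete, here is a minimal sketch of how exceptions might be mapped onto them. It is not the tool's actual classifier: the `ErrorType` enum, the `classify_error` function, and the choice of which Python exceptions stand in for each category (for example `sqlite3.Error` for database errors and `LookupError` for unknown gene symbols) are all assumptions made for illustration.

```python
import json
import sqlite3
from enum import Enum
from typing import Optional


class ErrorType(Enum):
    """Hypothetical enum mirroring the documented error categories."""
    NETWORK_TIMEOUT = "Network Timeout"
    API_RATE_LIMIT = "API Rate Limit"
    INVALID_GENE_NAME = "Invalid Gene Name"
    DATABASE_ERROR = "Database Error"
    VALIDATION_ERROR = "Validation Error"
    FILE_IO_ERROR = "File I/O Error"
    PARSE_ERROR = "Parse Error"


def classify_error(exc: Exception, status_code: Optional[int] = None) -> Optional[ErrorType]:
    """Map an exception (and optional HTTP status) to a category, or None if unknown."""
    if status_code == 429:
        return ErrorType.API_RATE_LIMIT
    # JSONDecodeError subclasses ValueError, so it must be checked first.
    if isinstance(exc, json.JSONDecodeError):
        return ErrorType.PARSE_ERROR
    if isinstance(exc, (TimeoutError, ConnectionError)):
        return ErrorType.NETWORK_TIMEOUT
    if isinstance(exc, (PermissionError, FileNotFoundError, IsADirectoryError)):
        return ErrorType.FILE_IO_ERROR
    if isinstance(exc, sqlite3.Error):  # stand-in for any database driver error
        return ErrorType.DATABASE_ERROR
    if isinstance(exc, LookupError):  # assumed signal for "gene not found"
        return ErrorType.INVALID_GENE_NAME
    if isinstance(exc, ValueError):
        return ErrorType.VALIDATION_ERROR
    return None
```

Checking the most specific exception types first matters here, because several of them share base classes.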
Resilient network handling with:
- Automatic retry with exponential backoff
- Connection health checking
- API-specific rate limiting
- Configurable timeout and retry settings
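As an illustration of automatic retry with exponential backoff, the sketch below shows one common way to implement it. The helper names `backoff_delays` and `retry_with_backoff`, and the `base_delay` and `max_delay` parameters, are hypothetical. Only `max_retries` and `backoff_factor` correspond to settings named in this document, and their exact semantics in the tool may differ.

```python
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int = 5, base_delay: float = 1.0,
                   backoff_factor: float = 2.0, max_delay: float = 60.0) -> List[float]:
    """Delay before each retry: base_delay * backoff_factor**attempt, capped at max_delay."""
    return [min(base_delay * backoff_factor ** attempt, max_delay)
            for attempt in range(max_retries)]


def retry_with_backoff(func: Callable[[], T], max_retries: int = 5,
                       base_delay: float = 1.0, backoff_factor: float = 2.0,
                       retry_on: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError),
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying on the given exceptions with growing delays."""
    for delay in backoff_delays(max_retries, base_delay, backoff_factor):
        try:
            return func()
        except retry_on:
            sleep(delay)  # wait longer after each consecutive failure
    # Final attempt: any exception now propagates to the caller.
    return func()
```

With the defaults, the waits between attempts are 1, 2, 4, 8, and 16 seconds. Production implementations often add random jitter to the delays so that many clients do not retry in lockstep.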
# The tool automatically handles network issues
genbank-tool genes.txt output.tsv
# Customize network behavior
genbank-tool genes.txt output.tsv --config network_config.json

For large batch operations, the tool saves progress automatically:
# Start a large batch (checkpoints saved automatically)
genbank-tool large_gene_list.txt output.tsv
# If interrupted, resume from checkpoint
genbank-tool --resume batch_1234567890 output.tsv
# Retry only failed items
genbank-tool --retry-failed batch_1234567890 output.tsv
# List available checkpoints
genbank-tool --list-checkpoints

Multiple log levels with detailed information:
# Enable verbose logging
genbank-tool genes.txt output.tsv --verbose
# Set specific log level
genbank-tool genes.txt output.tsv --log-level DEBUG
# Specify log directory
genbank-tool genes.txt output.tsv --log-dir ./my_logs

Log files include:
- constitutional_seq_YYYYMMDD.log: General application logs
- errors_YYYYMMDD.log: Error-specific logs with stack traces
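A two-file layout like this can be reproduced with Python's standard `logging` module. The sketch below is an assumption about how such a setup could look, not the tool's own configuration code. The function name `configure_logging`, the logger name, and the message format are hypothetical.

```python
import logging
from datetime import datetime
from pathlib import Path


def configure_logging(log_dir: str = ".genbank_logs", level: str = "INFO") -> logging.Logger:
    """Write all records to a general log and ERROR-and-above to a separate error log."""
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d")

    logger = logging.getLogger("constitutional_seq")
    logger.setLevel(level.upper())
    logger.handlers.clear()  # avoid duplicate handlers if called twice

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

    general = logging.FileHandler(directory / f"constitutional_seq_{stamp}.log")
    general.setFormatter(fmt)

    errors = logging.FileHandler(directory / f"errors_{stamp}.log")
    errors.setLevel(logging.ERROR)  # only errors reach this file
    errors.setFormatter(fmt)

    logger.addHandler(general)
    logger.addHandler(errors)
    return logger
```

Calling `logger.exception(...)` inside an `except` block records the stack trace alongside the message, which is how stack traces would end up in the error log in this sketch.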
Generate detailed error reports for analysis:
# Export error report after processing
genbank-tool genes.txt output.tsv --error-report error_analysis.json

The report includes:
- Error summary by type and severity
- Detailed error contexts with timestamps
- Recovery suggestions
- Processing statistics
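The schema of the report file is not documented here, so the following is only a sketch of how a report with these four parts could be assembled. Every field name (`type`, `severity`, `item`, `suggestion`, `summary`, `statistics`, and so on) is hypothetical and should not be read as the tool's actual output format.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_error_report(errors: List[Dict[str, Any]], total_items: int) -> Dict[str, Any]:
    """Assemble a report: summary by type/severity, error details, and statistics."""
    failed_items = {e["item"] for e in errors}
    succeeded = total_items - len(failed_items)
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "summary": {
            "by_type": dict(Counter(e["type"] for e in errors)),
            "by_severity": dict(Counter(e["severity"] for e in errors)),
        },
        # Each record is assumed to carry its own timestamp and suggestion.
        "errors": errors,
        "statistics": {
            "total_items": total_items,
            "failed_items": len(failed_items),
            "success_rate": succeeded / total_items if total_items else 0.0,
        },
    }
```

A report in this shape can be written with `json.dump` and later filtered by error type to decide which items are worth retrying.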
Create a configuration file to customize error handling:
{
"error_handling": {
"max_retries": 5,
"enable_checkpoints": true,
"checkpoint_interval": 10,
"log_dir": ".genbank_logs",
"checkpoint_dir": ".genbank_checkpoints"
},
"network": {
"timeout": 60,
"backoff_factor": 2.0,
"retry_on_status": [408, 429, 500, 502, 503, 504]
}
}

Configure different settings per API:
{
"api_configs": {
"ncbi": {
"timeout": 60,
"max_retries": 5,
"rate_limit_per_second": 3
},
"uniprot": {
"timeout": 30,
"max_retries": 3,
"rate_limit_per_second": 10
},
"ensembl": {
"timeout": 45,
"max_retries": 4,
"rate_limit_per_second": 15
}
}
}

# Process genes with automatic error handling
genbank-tool genes.txt output.tsv
# Enable checkpoint for large batches
genbank-tool large_list.txt output.tsv --checkpoint
# Quiet mode (only show errors)
genbank-tool genes.txt output.tsv --quiet

# Resume interrupted processing
genbank-tool --resume batch_1234567890 output.tsv
# Retry failed items with increased timeout
genbank-tool --retry-failed batch_1234567890 output.tsv --config high_timeout.json
# Process with parallel workers and checkpoints
genbank-tool genes.txt output.tsv --parallel --workers 10 --checkpoint

# Real-time monitoring with verbose output
genbank-tool genes.txt output.tsv --verbose
# Debug mode with detailed logging
genbank-tool genes.txt output.tsv --log-level DEBUG
# Generate error report for analysis
genbank-tool genes.txt output.tsv --error-report analysis.json

- Automatic retry with exponential backoff
- Wait for network connectivity restoration
- Fallback to cached data when available
- Respect Retry-After headers
- Token bucket rate limiting
- Automatic throttling for sustained operation
- Detailed error messages with suggestions
- Alternative gene symbol lookup
- Batch continuation despite individual failures
- Continue processing valid items
- Save failed items for retry
- Generate detailed failure reports
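Token bucket rate limiting, listed among the strategies above, can be sketched in a few lines. This is a generic implementation of the technique, not the tool's code. The class name `TokenBucket` and its methods are hypothetical, and `rate_per_second` is only assumed to play the role of the `rate_limit_per_second` setting.

```python
import time
from typing import Callable, Optional


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilling at `rate_per_second`."""

    def __init__(self, rate_per_second: float, capacity: Optional[float] = None,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_second
        self.capacity = capacity if capacity is not None else rate_per_second
        self.tokens = self.capacity  # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def _refill(self) -> None:
        """Add tokens for the time elapsed since the last update, up to capacity."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Consume tokens if available; return False when the caller should wait."""
        self._refill()
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False

    def wait_time(self, tokens: float = 1.0) -> float:
        """Seconds until `tokens` will be available (0.0 if available now)."""
        self._refill()
        return max(0.0, (tokens - self.tokens) / self.rate)
```

A caller would typically loop on `try_acquire()` and sleep for `wait_time()` when it returns `False`. When a server sends a Retry-After header, that value would take precedence over the locally computed wait.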
- Use Checkpoints for Large Batches
  - Enable checkpoints for lists of more than 50 genes
  - Checkpoints allow resuming without reprocessing
- Monitor Rate Limits
  - Use API keys for higher rate limits
  - Configure appropriate rate limits per API
  - Monitor rate limit statistics in logs
- Handle Errors Gracefully
  - Review error reports regularly
  - Adjust retry strategies based on error types
  - Use validation to catch issues early
- Optimize for Your Use Case
  - Increase workers for I/O bound operations
  - Adjust timeouts for slow networks
  - Configure appropriate checkpoint intervals
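To show why checkpoints allow resuming without reprocessing, and what a checkpoint interval controls, here is a minimal sketch of a checkpointed batch loop. It is an illustration under assumptions, not the tool's BatchProcessor: the function `process_batch`, the JSON checkpoint layout, and the `worker` callback are hypothetical, and items are assumed to be strings with JSON-serializable results.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, Iterable


def process_batch(items: Iterable[str], worker: Callable[[str], Any],
                  checkpoint_path: str, checkpoint_interval: int = 10) -> Dict[str, Dict[str, Any]]:
    """Process items, skipping completed ones and saving state every N items."""
    path = Path(checkpoint_path)
    state: Dict[str, Dict[str, Any]] = {"completed": {}, "failed": {}}
    if path.exists():
        state = json.loads(path.read_text())  # resume from the previous run

    def save() -> None:
        # Write to a temp file, then rename, so an interruption
        # cannot leave a half-written checkpoint behind.
        tmp = path.with_suffix(path.suffix + ".tmp")
        tmp.write_text(json.dumps(state))
        tmp.replace(path)

    since_save = 0
    for item in items:
        if item in state["completed"]:
            continue  # already done in an earlier run
        try:
            state["completed"][item] = worker(item)
            state["failed"].pop(item, None)
        except Exception as exc:  # one failure must not stop the batch
            state["failed"][item] = str(exc)
        since_save += 1
        if since_save >= checkpoint_interval:
            save()
            since_save = 0
    save()
    return state
```

A smaller interval loses less work on interruption at the cost of more frequent writes. Running the function again with the same checkpoint path retries only the items that previously failed or were never reached.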
- Persistent Network Failures
  # Increase timeout and retries
  genbank-tool genes.txt output.tsv --config high_tolerance.json
- Rate Limit Errors
  # Reduce parallel workers
  genbank-tool genes.txt output.tsv --workers 1
  # Add API key for higher limits
  export NCBI_API_KEY=your_key_here
  genbank-tool genes.txt output.tsv
- Checkpoint Corruption
  # List and clean old checkpoints
  genbank-tool --list-checkpoints
  rm .genbank_checkpoints/corrupted_checkpoint.json
When reporting issues, include:
- Error report: --error-report issue_report.json
- Debug logs: --log-level DEBUG --log-dir debug_logs
- Checkpoint files if relevant
- Configuration file used
- Checkpoint saves have minimal overhead (~1ms)
- Error handling adds < 5% processing time
- Network recovery may extend total time based on failures
- Parallel processing with checkpoints scales linearly
See the API documentation for detailed information about:
- ErrorHandler class
- NetworkRecoveryManager
- BatchProcessor
- Logging configuration