High-performance Kafka backup and restore with point-in-time recovery
kafka-backup is a production-grade tool, written in Rust, for backing up and restoring Apache Kafka topics to cloud storage or a local filesystem. It supports point-in-time recovery (PITR) with millisecond precision and solves the consumer group offset discontinuity problem when restoring to a different cluster.
- Multi-cloud storage – S3, Azure Blob, GCS, or local filesystem
- Point-in-time recovery – Restore to any millisecond within your backup window
- Consumer offset recovery – Automatically reset consumer group offsets after restore
- Compliance evidence – Signed JSON/PDF reports for SOX, CMMC, and GDPR auditors
- High performance – 100+ MB/s throughput with zstd/lz4 compression
- Incremental backups – Resume from where you left off
- Topic filtering – Wildcard patterns for include/exclude
- Auto-repartitioning – Restore to clusters with different partition counts
- Deployment agnostic – Bare metal, VM, Docker, or Kubernetes
Download the latest binary from the GitHub Releases page.
```shell
brew install osodevops/tap/kafka-backup
```

```shell
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/osodevops/kafka-backup/releases/latest/download/kafka-backup-cli-installer.sh | sh
```

Download the appropriate binary for your architecture from releases:

```shell
# Example for x86_64
curl -LO https://github.com/osodevops/kafka-backup/releases/latest/download/kafka-backup-cli-x86_64-unknown-linux-gnu.tar.xz
tar -xJf kafka-backup-cli-x86_64-unknown-linux-gnu.tar.xz
sudo mv kafka-backup /usr/local/bin/
```

```powershell
powershell -ExecutionPolicy ByPass -c "irm https://github.com/osodevops/kafka-backup/releases/latest/download/kafka-backup-cli-installer.ps1 | iex"
```

We use Scoop to distribute releases for Windows.

```shell
scoop bucket add oso https://github.com/osodevops/scoop-bucket.git
scoop install kafka-backup
```

```shell
docker pull osodevops/kafka-backup
docker run --rm -v /path/to/config:/config osodevops/kafka-backup backup --config /config/backup.yaml
```

See the image on Docker Hub.

```shell
git clone https://github.com/osodevops/kafka-backup.git
cd kafka-backup
cargo build --release
```

Binary location: `target/release/kafka-backup`
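Since the tool ships as a single binary and a Docker image, a scheduled backup on Kubernetes can be sketched as a CronJob. This is an illustrative sketch, not project-provided manifests: the image tag, schedule, and ConfigMap name are assumptions you would adapt to your cluster.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kafka-backup-daily
spec:
  schedule: "0 2 * * *"           # 02:00 UTC daily
  concurrencyPolicy: Forbid       # never overlap two backup runs
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: kafka-backup
              image: osodevops/kafka-backup:latest   # pin a version in practice
              args: ["backup", "--config", "/config/backup.yaml"]
              volumeMounts:
                - name: config
                  mountPath: /config
          volumes:
            - name: config
              configMap:
                name: kafka-backup-config            # holds backup.yaml
```

`concurrencyPolicy: Forbid` matters here: incremental backups resume from checkpoint state, so overlapping runs against the same `backup_id` should be avoided.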
Want to see kafka-backup in action? Check out our Demo Repository with ready-to-run examples:
```shell
git clone https://github.com/osodevops/kafka-backup-demos
cd kafka-backup-demos
docker compose up -d
cd cli/backup-basic && ./demo.sh
```

Available demos:
- Backup & Restore – Full backup/restore cycle with S3/MinIO
- Point-in-Time Recovery – Restore to any millisecond with rollback safety
- Large Messages – Handle 1-10MB payloads with compression comparisons
- Offset Management – Consumer group offset snapshots and resets
- Kafka Streams – PITR with stateful stream processing apps
- Spring Boot – Microservice integration patterns
- Benchmarks – Throughput, latency, and scaling tests
Create a backup configuration file `backup.yaml`:

```yaml
mode: backup
backup_id: "daily-backup-001"

source:
  bootstrap_servers: ["kafka:9092"]
  topics:
    include: ["orders-*", "payments-*"]
    exclude: ["*-internal"]

storage:
  backend: s3
  bucket: my-kafka-backups
  region: us-east-1
  prefix: prod/

backup:
  compression: zstd
  segment_max_bytes: 134217728  # 128MB
```

Run the backup:

```shell
kafka-backup backup --config backup.yaml
```

Create a restore configuration file `restore.yaml`:
```yaml
mode: restore
backup_id: "daily-backup-001"

target:
  bootstrap_servers: ["kafka-dr:9092"]

storage:
  backend: s3
  bucket: my-kafka-backups
  region: us-east-1
  prefix: prod/

restore:
  # Point-in-time recovery (optional)
  time_window_start: 1736899200000  # epoch millis
  time_window_end: 1736985600000
  # Remap topics (optional)
  topic_mapping:
    orders-prod: orders-recovered
```

Run the restore:

```shell
kafka-backup restore --config restore.yaml
```

| Feature | OSO Kafka Backup | itadventurer/kafka-backup | Kannika Armory | Confluent Replicator | MirrorMaker 2 | Lenses K2K |
|---|---|---|---|---|---|---|
| PITR | Yes (ms precision) | No | Yes (ms precision) | No | No | No |
| Cloud storage | S3, Azure, GCS | Filesystem only | S3, Azure, GCS & K8s PV | No | No | No |
| Offset recovery | Yes (multi-strategy) | Partial | Yes | Limited | Limited | Limited |
| Compliance evidence | Yes (signed JSON/PDF) | No | No | No | No | No |
| SOX/CMMC/GDPR mapping | Yes (automatic) | No | No | No | No | No |
| Air-gapped DR | Yes | Partial | Yes (commercial) | No | No | No |
| Auto-repartitioning | Yes | No | No | No | No | No |
| Platform dependency | None (single binary) | Kafka Connect | K8s platform | Confluent Platform | MM2 framework | Lenses platform |
| License | MIT (OSS) | MIT (unmaintained) | Commercial | Commercial | Apache 2.0 | Commercial |
See the full comparison guide for detailed analysis of each solution.
OSO Kafka Backup is the only option that combines millisecond-precision PITR, cloud-native cold backups, and automated consumer offset recovery in a single, OSS-friendly binary.
Competing tools either:
- Only do filesystem backups
- Are commercial platforms you have to buy and operate
- Are replication tools that don't give you true, air-gapped backups

This makes OSO Kafka Backup the highest-leverage choice for teams that need real Kafka disaster recovery without adopting a whole new proprietary platform.
- Real-time replication – Use MirrorMaker 2 for active-active or active-passive replication
- Schema evolution – kafka-backup preserves bytes exactly; it doesn't handle schema registry
- Infinite retention – For long-term archival, consider Tiered Storage (KIP-405)
Full documentation is available at osodevops.github.io/kafka-backup-docs
| Document | Description |
|---|---|
| Quick Start | Get started in 5 minutes |
| Configuration Reference | All configuration options |
| Storage Guide | S3, Azure, GCS setup |
| Restore Guide | Restore scenarios and examples |
| Offset Recovery | Consumer offset strategies |
| Offset Reset & Rollback | Bulk offset resets and rollback safety net |
```shell
# Backup & restore
kafka-backup backup --config backup.yaml
kafka-backup restore --config restore.yaml
kafka-backup three-phase-restore --config restore.yaml  # restore + offset recovery

# Inspect backups
kafka-backup list --path s3://bucket/prefix
kafka-backup describe --path s3://bucket --backup-id backup-001 --format json
kafka-backup status --config backup.yaml --watch  # live monitoring
kafka-backup validate --path s3://bucket --backup-id backup-001 --deep

# Restore validation (dry-run)
kafka-backup validate-restore --config restore.yaml

# Offset mapping & consumer offset management
kafka-backup show-offset-mapping --path s3://bucket --backup-id backup-001 --format json
kafka-backup offset-reset plan --path s3://bucket --backup-id backup-001 --groups my-group
kafka-backup offset-reset execute --path s3://bucket --backup-id backup-001 --groups my-group
kafka-backup offset-reset script --path s3://bucket --backup-id backup-001 --groups my-group

# Bulk parallel offset reset (~50x faster)
kafka-backup offset-reset-bulk --path s3://bucket --backup-id backup-001 \
  --groups group1,group2 --bootstrap-servers kafka:9092

# Offset snapshots & rollback (safety net)
kafka-backup offset-rollback snapshot --path s3://bucket --groups my-group --bootstrap-servers kafka:9092
kafka-backup offset-rollback list --path s3://bucket
kafka-backup offset-rollback rollback --path s3://bucket --snapshot-id <id> --bootstrap-servers kafka:9092

# Validation & compliance evidence
kafka-backup validation run --config validation.yaml
kafka-backup validation run --config validation.yaml --triggered-by "KPMG Q1 audit"
kafka-backup validation evidence-list --path s3://bucket
kafka-backup validation evidence-get --path s3://bucket --report-id <id> --format pdf --output report.pdf
kafka-backup validation evidence-verify --report report.json --signature report.sig --public-key key.pem
```

Validate restored data against the original backup and generate signed evidence reports for auditors:
```yaml
# validation.yaml
backup_id: "production-daily-001"

storage:
  backend: s3
  bucket: my-kafka-backups

target:
  bootstrap_servers: ["restored-kafka:9092"]

checks:
  message_count:
    enabled: true
    mode: exact
  offset_range:
    enabled: true

evidence:
  formats: [json, pdf]
  signing:
    enabled: true
    private_key_path: "/etc/kafka-backup/signing-key.pem"
  storage:
    prefix: "evidence-reports/"
    retention_days: 2555  # 7 years (SOX)

notifications:
  slack:
    webhook_url: "https://hooks.slack.com/services/..."
```

Validation checks: `MessageCountCheck`, `OffsetRangeCheck`, `ConsumerGroupOffsetCheck`, `CustomWebhookCheck`
Evidence outputs: JSON (machine-readable) + PDF (auditor-ready) + ECDSA-P256-SHA256 signatures
Compliance mappings: SOX ITGC, CMMC RE.3.139, GDPR Article 32 – automatically included in every report
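The supported verification path is `kafka-backup validation evidence-verify`, but because the signatures are plain ECDSA-P256-SHA256 you can also cross-check a report with stock `openssl` — assuming the signature is a standard DER-encoded ECDSA signature over the raw JSON bytes, which is an assumption on our part. The sketch below generates its own throwaway key pair so it runs end to end without a real evidence report:

```shell
# Throwaway ECDSA P-256 key pair (stand-in for the real signing key)
openssl ecparam -name prime256v1 -genkey -noout -out signing-key.pem
openssl ec -in signing-key.pem -pubout -out signing-key.pub.pem

# Sign a dummy report the way an ECDSA-P256-SHA256 signer would
echo '{"report_id":"demo","status":"passed"}' > report.json
openssl dgst -sha256 -sign signing-key.pem -out report.sig report.json

# Verify: prints "Verified OK" on success
openssl dgst -sha256 -verify signing-key.pub.pem -signature report.sig report.json
```

For a real report you would substitute the public key distributed alongside your evidence store for `signing-key.pub.pem`.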
Backups are stored in a structured format:
```
s3://kafka-backups/
└── {prefix}/
    └── {backup_id}/
        ├── manifest.json        # Backup metadata
        ├── state/
        │   └── offsets.db       # Checkpoint state (synced from local)
        └── topics/
            └── {topic}/
                └── partition={id}/
                    ├── segment-0001.zst
                    └── segment-0002.zst
```
A local SQLite offset database is maintained at `$TMPDIR/{backup_id}-offsets.db` (configurable via `offset_storage.db_path`) and periodically synced to remote storage for durability. To enable incremental one-shot backups (resume from where the last run stopped), add the `offset_storage` section to your config.
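As a sketch, that section could look like this. Only `db_path` is named in the paragraph above; the path value is illustrative (a persistent location rather than `$TMPDIR`, so checkpoints survive reboots):

```yaml
# Resume the next one-shot backup from the last checkpoint
offset_storage:
  db_path: /var/lib/kafka-backup/daily-backup-001-offsets.db
```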
kafka-backup exposes Prometheus metrics at /metrics for monitoring backup operations:
```yaml
# Enable metrics in your config
metrics:
  enabled: true
  port: 8080
```

Key metrics:

- `kafka_backup_lag_records` – Consumer lag per partition
- `kafka_backup_records_total` – Total records backed up
- `kafka_backup_compression_ratio` – Compression efficiency
- `kafka_backup_storage_write_latency_seconds` – Storage I/O latency
- `kafka_backup_validation_checks_passed_total` – Validation checks passed
- `kafka_backup_validation_consecutive_failures` – Consecutive validation failures (SLO alerting)
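The consecutive-failures gauge is the natural hook for SLO alerting. A minimal Prometheus alerting rule against it might look like this — the threshold, severity label, and rule group name are illustrative choices, not project defaults:

```yaml
# prometheus rule file (illustrative)
groups:
  - name: kafka-backup
    rules:
      - alert: KafkaBackupValidationFailing
        # Fire once validation has failed three runs in a row
        expr: kafka_backup_validation_consecutive_failures >= 3
        labels:
          severity: critical
        annotations:
          summary: "kafka-backup validation has failed 3+ consecutive runs"
```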
A complete Grafana + Prometheus monitoring stack is available in the demos repository:
```shell
cd kafka-backup-demos/monitoring-stack
docker-compose -f docker-compose.metrics.yml up -d
# Grafana at http://localhost:3000 (admin/admin)
```

| Metric | Target |
|---|---|
| Throughput | 100+ MB/s per partition |
| Checkpoint latency | <100ms p99 |
| Compression ratio | 3-5x (typical JSON/Avro) |
| Memory usage | <500MB for 4 partitions |
Requirements:
- Rust 1.75+
- OpenSSL development libraries
```shell
# Clone the repository
git clone https://github.com/osodevops/kafka-backup.git
cd kafka-backup

# Build release binary
cargo build --release

# Run tests
cargo test

# Run with debug logging
RUST_LOG=debug cargo run -p kafka-backup-cli -- --help
```

```shell
# Unit tests
cargo test

# Integration tests (requires Docker)
cargo test --test integration_suite_tests

# All tests including ignored (Docker required)
cargo test -- --include-ignored

# With coverage
cargo tarpaulin --out Html
```

```
kafka-backup/
├── crates/
│   ├── kafka-backup-core/       # Core library
│   │   ├── src/
│   │   │   ├── backup/          # Backup engine
│   │   │   ├── restore/         # Restore engine + offset recovery
│   │   │   ├── validation/      # Validation check framework
│   │   │   ├── evidence/        # Evidence reports, signing, PDF
│   │   │   ├── notification/    # Slack, PagerDuty webhooks
│   │   │   ├── kafka/           # Kafka protocol client
│   │   │   ├── storage/         # S3, Azure, GCS, filesystem
│   │   │   ├── metrics/         # Prometheus metrics
│   │   │   └── compression.rs
│   │   └── tests/               # Unit, integration, chaos tests
│   └── kafka-backup-cli/        # CLI binary
├── config/                      # Example configs
└── docs/                        # Documentation
```
OSO engineers are solely focused on deploying, operating, and maintaining Apache Kafka platforms. If you need SLA-backed support or advanced features for compliance and security, our Enterprise Edition extends the core tool with capabilities designed for large-scale, regulated environments.
| Feature Category | Enterprise Capability |
|---|---|
| Security & Compliance | AES-256 Encryption (client-side encryption at rest) |
| | GDPR Compliance Tools (right-to-be-forgotten, PII masking) |
| | Audit Logging (comprehensive trail of all backup/restore ops) |
| | Role-Based Access Control (granular permissions) |
| Advanced Integrations | Schema Registry Integration (backup & restore schemas with ID remapping) |
| | Secrets Management (Vault / AWS Secrets Manager integration) |
| | SSO / OIDC (Okta, Azure AD, Google Auth) |
| Scale & Operations | Multi-Region Replication (active-active disaster recovery) |
| | Log Shipping (Datadog, Splunk, Grafana Loki) |
| | Advanced Metrics & Dashboard (throughput, latency, drill-down UI) |
| Support | 24/7 SLA-Backed Support & dedicated Kafka consulting |
Need help resolving operational issues or planning a failover strategy? Our team of experts can recover data from non-responsive clusters, fix configuration errors, and get your environment operational as fast as possible.
Talk with an expert today or email us at enquiries@oso.sh.
We welcome contributions of all kinds!
- Report Bugs: Found a bug? Open an issue on GitHub.
- Suggest Features: Have an idea? Request a feature.
- Contribute Code: Check out our good first issues for beginner-friendly tasks.
- Improve Docs: Help us improve the documentation by submitting pull requests.
See CLAUDE.md for development guidelines and architecture overview.
kafka-backup is licensed under the MIT License © OSO.
Built with these excellent Rust crates:
- kafka-protocol – Kafka protocol implementation
- object_store – Cloud storage abstraction
- tokio – Async runtime
- zstd – Compression

Made with ❤️ by OSO