Skip to content

Optimize database backup strategy for high-load production tracker #1651

@josecelano

Description

@josecelano

cc @da2ce7

Today in our weekly meeting, we were discussing this topic.

Problem Statement

Backing up the production Torrust Tracker database (torrust-tracker-demo) while the tracker is active is impossible due to database locks. The tracker continuously writes to the database (especially the torrents table), and the backup process cannot keep up with these changes. Backups timeout or fail because they never catch up with the active writes.

Current Production Database Stats

  • Database Size: ~17 GB
  • Primary Bottleneck: torrents table containing ~161 million rows

Torrents Table Analysis

Table Rows Estimated Size % of DB
torrents 161,454,700 ~8 GB 99.8%
whitelist 30,076 ~1.4 MB < 0.01%
keys 10 ~0.5 KB < 0.01%
torrent_aggregate_metrics 1 ~0.1 KB < 0.01%

Download Statistics Distribution

Within the torrents table:

Category Rows Percentage Data Value
Never completed (completed = 0) 156,482,966 96.9% Low
Completed at least once (completed > 0) 4,971,734 3.1% Higher

Current Solution (Workaround)

Stop the tracker → Perform backup → Restart tracker within a maintenance window. This is not ideal for a production 24/7 service.

Research Findings & Alternative Solutions

1. Exclude Statistics from Backups ✅ (Tested)

Concept: Make statistics backup optional via configuration, excluding the torrents table metrics.

Pros:

  • Simple to implement
  • Reduces backup size by ~99.8%
  • Backup completes quickly
  • Optional feature for users who don't need metrics

Cons:

  • Loses historical statistics
  • Not suitable if metrics are important for your use case

2. Pause Database Writes During Backup (Tested)

Concept: Stop the tracker from writing to the database for ~15 seconds during backup, but allow it to continue responding to peers with outdated metrics.

Pros:

  • Maintains data consistency
  • Short downtime (~15 seconds)
  • No schema changes needed
  • Metrics update resumes after backup completes

Cons:

  • Brief metrics stalling period
  • Peers see slightly outdated statistics during maintenance window

3. Database Schema Refactoring (Proposed)

Concept: Split the torrents table into two separate tables:

  1. torrents_registry (append-only)

    • Stores the list of all torrents ever announced
    • Write-once, never updated
    • Cleaner separation of concerns
  2. torrents_metrics (high-churn)

    • Only contains torrents with completed > 0 (≤3% of records)
    • Smaller table with frequent updates
    • Separate lock for independent backup operations

Current Table Schema:

CREATE TABLE `torrents` (
  `id` int NOT NULL AUTO_INCREMENT,
  `info_hash` varchar(40) NOT NULL,
  `completed` int NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `info_hash` (`info_hash`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;

Pros:

  • Cleaner, more maintainable schema
  • Separate locks could enable independent backups
  • Potentially better query performance (smaller working set for metrics queries)
  • Future-proof for querying "all torrents ever seen vs active torrents"

Cons:

  • Requires database migration
  • Schema change to the tracker code
  • More complex queries using joins

4. MySQL Performance Testing (In Progress)

All tests above were conducted using SQLite. Production is migrating to MySQL. Need to:

  • Test backup performance with MySQL
  • Evaluate how different storage engines handle the large torrents table
  • Test the schema refactoring approach with InnoDB

Important Context

Note on Database Persistence: Persistence is totally optional for Torrust Tracker:

  • If you don't use the whitelist feature, you don't need a private tracker
  • If you don't want to persist metrics, you don't need persistence at all
  • The tracker can operate memory-only if needed

Legacy Data Issue: The production database may contain historical zeros from an earlier database design. Current code only inserts torrents that have been downloaded at least once (see increase_downloads_for_torrent). Monitoring if new zero records are still being created after cleanup.

Next Steps & Action Items

  • Delete records with completed = 0 from production database and monitor
  • Verify no new zero-download records are being created
  • Monitor database growth rate after zero-record deletion
  • Test all solutions with MySQL in the new production environment
  • Performance benchmark comparison: SQLite vs MySQL for backup operations
  • If schema refactoring is pursued: Implementation plan and migration strategy
  • Document findings and finalize backup strategy recommendations

Related Documentation

Metadata

Metadata

Assignees

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions