This enhanced version adds robust failure recovery using persistent JSON caching. When ClickHouse is unavailable (due to restarts, upgrades, or server issues), the scraper automatically caches spot data to disk and replays it when ClickHouse becomes available again.
- Default:
/var/lib/wsprnet/cache - Survives Linux reboots
- Survives ClickHouse server restarts
- Falls back to
/tmp/wsprnet-cacheif default isn't writable
When ClickHouse insert fails:
[2025-11-10 14:23:45] WARNING: ClickHouse insert failed - caching spots for later replay
[2025-11-10 14:23:45] INFO: Cached 1234 spots to /var/lib/wsprnet/cache/spots_20251110_142345_123456.json
When ClickHouse comes back online:
[2025-11-10 14:28:30] INFO: Found 5 cached spot files to replay
[2025-11-10 14:28:31] INFO: Loaded 1234 spots from cache file spots_20251110_142345_123456.json
[2025-11-10 14:28:31] INFO: Inserted 1234 spots into wsprnet.spots
[2025-11-10 14:28:31] INFO: Successfully replayed and deleted cache file spots_20251110_142345_123456.json
[2025-11-10 14:28:35] INFO: Cache replay: 5 succeeded, 0 still pending
- On startup: Automatically checks for and replays cached files
- Every 5 cycles: Periodically checks for cached files during normal operation
- Non-blocking: Replay happens between download cycles, doesn't block new data
# Create persistent cache directory
sudo mkdir -p /var/lib/wsprnet/cache
sudo chown wsprdaemon:wsprdaemon /var/lib/wsprnet/cache
sudo chmod 755 /var/lib/wsprnet/cache# Backup current version
sudo cp /usr/local/bin/wsprnet_scraper.py /usr/local/bin/wsprnet_scraper.py.v2.0.0
# Install new version
sudo cp wsprnet_scraper.py /usr/local/bin/wsprnet_scraper.py
sudo chmod 755 /usr/local/bin/wsprnet_scraper.pysudo systemctl restart wsprnet_scraper
sudo systemctl status wsprnet_scraperwsprnet_scraper.py --cache-dir /path/to/cache [other options]{
"cache_dir": "/var/lib/wsprnet/cache"
}If not specified, uses /var/lib/wsprnet/cache
Cache files are named: spots_YYYYMMDD_HHMMSS_microseconds.json
Example: spots_20251110_142345_123456.json
Structure:
{
"timestamp": "20251110_142345_123456",
"spot_count": 1234,
"spots": [
[id, time, band, rx_sign, ...],
...
]
}1. ClickHouse goes down at 14:00
2. Scraper downloads spots at 14:02 → insert fails → cached
3. Scraper downloads spots at 14:04 → insert fails → cached
4. ClickHouse comes back at 14:05
5. Scraper downloads spots at 14:06 → insert succeeds
6. Next cycle at 14:08 → replays 2 cached files
Result: No data loss
1. Stop ClickHouse for upgrade at 15:00
2. Scraper continues running, caches all spots (15:02, 15:04, 15:06...)
3. Upgrade completes at 15:20
4. Start ClickHouse
5. Next scraper cycle at 15:22:
- Replays all 10 cached files
- Continues normal operation
Result: All spots preserved and inserted
1. Server reboots at 16:00
2. Cache directory persists: /var/lib/wsprnet/cache
3. Server comes back at 16:05
4. ClickHouse starts at 16:06
5. wsprnet_scraper service starts at 16:06
6. Startup sequence:
- Checks for cached files
- Replays any found
- Starts normal operation
Result: Pre-reboot cached data is preserved and inserted
ls -lh /var/lib/wsprnet/cache/find /var/lib/wsprnet/cache -name "spots_*.json" | wc -l# Get spot count from cache file
jq '.spot_count' /var/lib/wsprnet/cache/spots_20251110_142345_123456.json
# View first cached spot
jq '.spots[0]' /var/lib/wsprnet/cache/spots_20251110_142345_123456.json# Watch for caching events
tail -f /var/log/wsprdaemon/wsprnet_scraper.log | grep -i cache
# Count successful replays
grep "Successfully replayed" /var/log/wsprdaemon/wsprnet_scraper.log | wc -l- Cache files are automatically deleted after successful replay
- No manual cleanup needed under normal operation
# Remove all cache files (only if you're sure they're not needed)
rm -f /var/lib/wsprnet/cache/spots_*.json
# Or move to backup before removing
mkdir -p /var/lib/wsprnet/cache_backup
mv /var/lib/wsprnet/cache/spots_*.json /var/lib/wsprnet/cache_backup/[2025-11-10 14:00:00] ERROR: Failed to create/access cache directory /var/lib/wsprnet/cache: Permission denied
[2025-11-10 14:00:00] WARNING: Cache directory /var/lib/wsprnet/cache is not accessible - falling back to /tmp/wsprnet-cache
Fix:
sudo mkdir -p /var/lib/wsprnet/cache
sudo chown wsprdaemon:wsprdaemon /var/lib/wsprnet/cache
sudo chmod 755 /var/lib/wsprnet/cache
sudo systemctl restart wsprnet_scraperCheck logs:
grep "replay" /var/log/wsprdaemon/wsprnet_scraper.logCommon causes:
- ClickHouse still down → Files remain cached (normal)
- Permission issues → Check ownership of cache files
- Corrupted JSON → Check file contents with
jq
If ClickHouse is down for extended period:
# Count files
ls /var/lib/wsprnet/cache/spots_*.json | wc -l
# Estimate total spots cached
for f in /var/lib/wsprnet/cache/spots_*.json; do jq -r '.spot_count' "$f"; done | awk '{sum+=$1} END {print sum}'This is normal and safe - files will be replayed when ClickHouse returns.
- Minimal overhead when ClickHouse is healthy
- Cache write: ~10ms for 1000 spots
- Cache replay: Same speed as normal insert
- Memory: Spots loaded one file at a time (no memory accumulation)
This version is 100% backwards compatible:
- Same command-line arguments
- Same database schema
- Same behavior when ClickHouse is healthy
- Additional caching is transparent to existing monitoring
# Stop ClickHouse
sudo systemctl stop clickhouse-server
# Wait for next scraper cycle - should see caching in logs
tail -f /var/log/wsprdaemon/wsprnet_scraper.log
# Restart ClickHouse
sudo systemctl start clickhouse-server
# Wait for next cycle - should see replay in logs# Create some cache files by stopping ClickHouse briefly
sudo systemctl stop clickhouse-server
sleep 300 # Wait for 2-3 scraper cycles
sudo systemctl start clickhouse-server
# Verify files exist
ls -l /var/lib/wsprnet/cache/
# Reboot
sudo reboot
# After reboot, check that files were replayed
grep "Successfully replayed" /var/log/wsprdaemon/wsprnet_scraper.log- ✅ Persistent cache directory - survives reboots
- ✅ Automatic caching on ClickHouse failure
- ✅ Automatic replay when ClickHouse recovers
- ✅ Startup replay - processes old cached files on restart
- ✅ Periodic replay - every 5 cycles during operation
- ✅ Automatic cleanup - removes files after successful insert
- ✅ Fallback directory - /tmp if default not writable
- ✅ Zero data loss - all spots preserved through failures
- ✅ Non-blocking - doesn't slow down new data collection
- ✅ Full logging - all cache operations logged
- Version: 2.1.0
- Previous: 2.0.0
- Changes: Added persistent JSON caching for ClickHouse resilience