Expand WAL growth troubleshooting guide with detailed diagnostics #3807

Open
KyleAMathews wants to merge 1 commit into main from claude/electric-migration-issue-kWdlG

@KyleAMathews (Contributor)

Summary

Significantly expanded the "WAL growth" troubleshooting section with guidance on diagnosing and resolving replication slot issues. The original brief explanation is replaced by diagnostic procedures, monitoring strategies, and PostgreSQL configuration recommendations.

Key Changes

  • Added replication slot status query with explanation of key columns (active, wal_status, retained_wal, confirmed_flush_lsn)
  • Documented wal_status values with a reference table explaining reserved, extended, unreserved, and lost states and their implications
  • Expanded troubleshooting sections covering three common scenarios:
    • Electric disconnection (with slot removal guidance)
    • Slot active but not advancing (with step-by-step diagnostic tests)
    • High write volume (with max_slot_wal_keep_size configuration)
  • Added PostgreSQL configuration recommendations table with max_slot_wal_keep_size and wal_keep_size settings and their purposes
  • Included monitoring guidance referencing Prometheus metrics for replication slot health (slot_retained_wal_size, slot_confirmed_flush_lsn_lag)
  • Added quick diagnostic checklist for users to systematically verify slot health
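
For reviewers skimming this description, the slot status check from the first bullet can be approximated as follows. This is a minimal sketch, not the query from the guide itself: the SQL text and the names `SLOT_STATUS_SQL`, `lsn_to_int`, and `retained_wal_bytes` are assumptions made for illustration. It relies only on the standard `pg_replication_slots` view (the `wal_status` column requires PostgreSQL 13 or later) and on the fact that an LSN is printed as the two hexadecimal halves of a 64-bit byte position.

```python
# Hypothetical sketch: the query text and helper names are illustrative,
# not copied from the guide added in this PR.
SLOT_STATUS_SQL = """
SELECT slot_name,
       active,
       wal_status,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal,
       confirmed_flush_lsn
FROM pg_replication_slots;
"""


def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the upper 32 bits, the part after is the lower 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL held back by the slot, mirroring what pg_wal_lsn_diff computes."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)
```

For example, `retained_wal_bytes("0/3000000", "0/1000000")` evaluates to 33554432, that is, 32 MiB of WAL retained between the slot's `restart_lsn` and the current write position.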

Notable Details

  • Includes practical SQL examples for testing whether changes flow through the replication slot
  • Clarifies the behavior when slots exceed max_slot_wal_keep_size (invalidation and recreation)
  • Provides context-specific guidance for AWS RDS users
  • Uses warning callout to highlight the trade-off between disk space and shape log recreation
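
The "slot active but not advancing" check and the invalidation behavior noted above can be sketched as a small decision helper. Everything here is an assumption for illustration: the function names, the order of the checks, and the idea of comparing `confirmed_flush_lsn` before and after a test write are not taken from the guide. The `wal_status` descriptions paraphrase the PostgreSQL documentation for `pg_replication_slots`.

```python
# Hypothetical helpers; names, return labels, and check order are illustrative only.
WAL_STATUS_MEANING = {
    "reserved": "retained WAL is within max_wal_size",
    "extended": "max_wal_size is exceeded, but the required WAL is still retained",
    "unreserved": "required WAL may be removed at the next checkpoint",
    "lost": "required WAL has been removed and the slot is no longer usable",
}


def _lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN ('high/low' in hexadecimal) to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def slot_advanced(flush_before: str, flush_after: str) -> bool:
    """True if confirmed_flush_lsn moved forward between two readings."""
    return _lsn_to_int(flush_after) > _lsn_to_int(flush_before)


def diagnose(active: bool, wal_status: str, advanced: bool) -> str:
    """Map one slot observation to one of the scenarios listed in this PR."""
    if wal_status == "lost":
        return "invalidated"   # slot must be dropped and recreated
    if not active:
        return "disconnected"  # no consumer is connected to the slot
    if not advanced:
        return "stalled"       # connected, but confirmed_flush_lsn is stuck
    return "healthy"
```

A typical use would be to read `confirmed_flush_lsn`, make a test write to a replicated table, read the value again, and pass the result of `slot_advanced` to `diagnose` together with the slot's `active` and `wal_status` columns.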

https://claude.ai/code/session_01H49sfABH4wS2g45H5nQ89C

…ostgreSQL settings

Add comprehensive guidance for diagnosing and resolving replication slot
WAL accumulation issues:

- Diagnostic SQL query to check slot health
- Explanation of wal_status values (reserved, extended, unreserved, lost)
- Common causes: disconnected Electric, slot not advancing, high write volume
- Step-by-step verification that changes are flowing through
- Recommended PostgreSQL settings (max_slot_wal_keep_size)
- Monitoring guidance using Electric's Prometheus metrics
- Quick diagnostic checklist

https://claude.ai/code/session_01H49sfABH4wS2g45H5nQ89C
@netlify

netlify bot commented Feb 1, 2026

Deploy Preview for electric-next ready!

Name Link
🔨 Latest commit cc82fd4
🔍 Latest deploy log https://app.netlify.com/projects/electric-next/deploys/697f6d3ac0d4df0008f1994f
😎 Deploy Preview https://deploy-preview-3807--electric-next.netlify.app
