@joecorall joecorall commented Oct 26, 2025

When multiple Traefik routers use this plugin, each creates a separate instance that competes for the same state file. This caused verified IPs to lose their TTLs and get re-challenged across instances.

Changes:

  • Add expiration timestamps to State struct for proper TTL serialization
  • Implement file locking to prevent concurrent write conflicts
  • Add state reconciliation to merge in-memory and file-based state
  • Keep periodic full state saves (10 min) as backup for rate/bot caches

Verified IPs now maintain their TTLs across plugin instances, preventing unnecessary re-challenges when requests hit different routers.

Closes #52 #18

…n persistence

When multiple Traefik routers use this plugin, each creates a separate instance that competes for the same state file. This caused verified IPs to lose their TTLs and get re-challenged across instances.

Changes:
  - Add expiration timestamps to State struct for proper TTL serialization
  - Implement file locking (flock) to prevent concurrent write conflicts
  - Add state reconciliation to merge in-memory and file-based state
  - Persist verified IPs immediately after CAPTCHA success (lightweight)
  - Keep periodic full state saves (1 min) as backup for rate/bot caches

Verified IPs now maintain their TTLs across plugin instances, preventing unnecessary re-challenges when requests hit different routers.

joecorall commented Oct 26, 2025

Going to sit on this for a little bit, and perhaps deploy it locally in a prod environment for a stress test. I am worried about the IOPS added by the frequent state reconciliation between memory and disk. Since we can't use syscalls in Traefik plugins, we have no way to be notified when another service writes to the shared state file. That leaves us needing to lock and reconcile disk and memory pretty aggressively. Since the reconcile happens in a goroutine, though, it should be fine: it won't lock or block during requests.

We probably aren't getting hit across services by the same IP before a reconcile loop has run.
@joecorall joecorall changed the title Fix multi-instance state coordination with file locking and expiration persistence Fix multi-instance state coordination with file locking and expiration persistence [minor] Oct 26, 2025
@joecorall joecorall force-pushed the shared-state branch 2 times, most recently from 8bd020f to 6957ca8 October 26, 2025 12:53
@joecorall joecorall changed the title Fix multi-instance state coordination with file locking and expiration persistence [minor] Use redis for multi-instance state coordination [minor] Oct 26, 2025
@joecorall joecorall changed the title Use redis for multi-instance state coordination [minor] Fix multi-instance state coordination with file locking and expiration persistence [minor] Oct 27, 2025
use synctest to test state expiry
joecorall pushed a commit to lehigh-university-libraries/isle-preserve that referenced this pull request Oct 27, 2025
@joecorall joecorall enabled auto-merge (squash) October 27, 2025 18:04
@joecorall joecorall merged commit 1656712 into main Oct 27, 2025
14 of 27 checks passed
@joecorall joecorall deleted the shared-state branch October 27, 2025 18:10

Development

Successfully merging this pull request may close these issues.

Persistence file not working as expected and state is not shared between plugin instances