30 changes: 8 additions & 22 deletions .github/workflows/github-release.yml
@@ -5,27 +5,13 @@ on:
- main
types:
- closed
permissions:
contents: write
actions: write
jobs:
release:
if: github.event.pull_request.merged == true && !contains(github.event.pull_request.title, '[skip-release]')
runs-on: ubuntu-24.04
steps:
- name: Checkout
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5
with:
fetch-depth: 0

- name: install autotag binary
run: curl -sL https://git.io/autotag-install | sudo sh -s -- -b /usr/bin

- name: create release
run: |-
TAG=$(autotag)
git push origin v$TAG
gh release create v$TAG --title "v$TAG" --generate-notes
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

if: github.event.pull_request.merged == true && !contains(github.event.pull_request.title, 'skip-release')
uses: libops/actions/.github/workflows/bump-release.yaml@main
with:
prefix: v
permissions:
contents: write
actions: write
secrets: inherit
28 changes: 25 additions & 3 deletions .github/workflows/lint-test.yml
@@ -53,14 +53,36 @@ jobs:
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}

integration-test:
integration-test-latest:
needs: [run]
permissions:
contents: read
runs-on: ubuntu-24.04
strategy:
matrix:
traefik: [v2.11, v3.0, v3.1, v3.2, v3.3, v3.4]
traefik: [latest]
steps:
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5

- name: run
run: go run test.go
working-directory: ./ci
env:
TRAEFIK_TAG: ${{ matrix.traefik }}

- name: cleanup
if: ${{ always() }}
run: docker compose logs --tail 100 nginx nginx2 traefik && docker compose down
working-directory: ./ci

integration-test:
needs: [integration-test-latest]
permissions:
contents: read
runs-on: ubuntu-24.04
strategy:
matrix:
traefik: [v2.11, v3.0, v3.1, v3.2, v3.3, v3.4, v3.5]
steps:
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5

@@ -72,5 +94,5 @@ jobs:

- name: cleanup
if: ${{ always() }}
run: docker compose down
run: docker compose logs --tail 100 nginx nginx2 traefik && docker compose down
working-directory: ./ci
1 change: 1 addition & 0 deletions .traefik.yml
@@ -11,3 +11,4 @@ testData:
CaptchaProvider: turnstile
SiteKey: 1x00000000000000000000AA
SecretKey: 1x0000000000000000000000000000000AA
EnableStateReconciliation: "false"
18 changes: 14 additions & 4 deletions CLAUDE.md
@@ -16,9 +16,10 @@ This is a Traefik middleware plugin that protects websites from bot traffic by c
- `CaptchaProtect` struct: Main middleware handler with rate limiting, bot detection, and challenge serving
- `Config` struct: Configuration from Traefik labels
- Three in-memory caches (using `github.com/patrickmn/go-cache`):
- `rateCache`: Tracks request counts per subnet
- `rateCache`: Tracks request counts per subnet (TTL = `window` config value)
- `verifiedCache`: Stores IPs that have passed challenges (24h default TTL)
- `botCache`: Caches reverse DNS lookups for bot verification
- `botCache`: Caches reverse DNS lookups for bot verification (1h TTL)
- **Why go-cache instead of sync.Map?** The plugin requires automatic TTL-based expiration for all caches. `sync.Map` has no built-in expiration mechanism, requiring manual cleanup goroutines. `go-cache` provides thread-safe maps with automatic expiration and cleanup.

### Request Flow Decision Tree

@@ -120,9 +121,18 @@ Regex is significantly slower (~41ns vs ~3.4ns per operation) - see README bench
### State Persistence

When `persistentStateFile` is configured:
- State saves every 1 minute to JSON file (`saveState()` at `main.go:695-727`)
- On startup, loads previous state from file (`loadState()` at `main.go:729-756`)
- State saves every 10 seconds (with 0-2s random jitter) to JSON file (`saveState()` at `main.go:716-746`)
- Uses file locking (`.lock` files) to prevent concurrent writes (`internal/state/state.go:61-129`)
- On startup, loads previous state from file (`loadState()` at `main.go:729-761`)
- Contains: rate limits per subnet, bot verification cache, verified IPs
- **Important**: Each middleware instance runs its own save goroutine. If multiple instances share the same `persistentStateFile`, they will write more frequently (e.g., 2 instances = writes every ~5 seconds)
- **State Reconciliation**: When `enableStateReconciliation: "true"`, each save performs a read-modify-write cycle to merge state from other instances. This adds I/O overhead but prevents data loss in multi-instance deployments (see `internal/state/state.go:86-100`)

**Why not Redis?** Traefik plugins are loaded via Yaegi (a Go interpreter), which has significant limitations:
- Yaegi cannot interpret Go packages that use `unsafe`, cgo, or complex reflection patterns
- Popular Redis clients like `go-redis/redis` are incompatible with Yaegi

**Current solution**: File-based persistence with reconciliation avoids these issues. Local caches remain fast (no network overhead), state saves are batched (every 10s), and reconciliation handles conflicts without complex coordination. The tradeoff is accepting slightly stale data across instances (max 10s delay) rather than the complexity and performance cost of real-time Redis synchronization.
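The merge step of the read-modify-write cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `internal/state/state.go`: the `item` type, the `mergeState` function, and the "keep the entry that expires later, drop anything already expired" rule are all hypothetical choices made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// item is a hypothetical persisted cache entry: a counter plus a Unix expiry.
type item struct {
	Count     int   `json:"count"`
	ExpiresAt int64 `json:"expiresAt"`
}

// mergeState combines the state read back from disk with this instance's
// in-memory state. Assumed rule: skip entries already expired at now, and for
// a key present on both sides keep the entry with the later expiry (the disk
// entry wins ties because it is visited first).
func mergeState(disk, local map[string]item, now int64) map[string]item {
	merged := make(map[string]item, len(disk)+len(local))
	for _, src := range []map[string]item{disk, local} {
		for key, candidate := range src {
			if candidate.ExpiresAt <= now {
				continue // expired entries are not written back
			}
			current, exists := merged[key]
			if !exists || candidate.ExpiresAt > current.ExpiresAt {
				merged[key] = candidate
			}
		}
	}
	return merged
}

func main() {
	now := int64(1000)
	// Written earlier by another instance sharing the same state file.
	disk := map[string]item{
		"198.51.100.0/24": {Count: 4, ExpiresAt: 1100},
		"203.0.113.0/24":  {Count: 9, ExpiresAt: 900}, // already expired
	}
	// Held in memory by the instance that is about to save.
	local := map[string]item{
		"198.51.100.0/24": {Count: 6, ExpiresAt: 1150},
		"192.0.2.0/24":    {Count: 1, ExpiresAt: 1120},
	}

	merged := mergeState(disk, local, now)

	keys := make([]string, 0, len(merged))
	for k := range merged {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s count=%d expiresAt=%d\n", k, merged[k].Count, merged[k].ExpiresAt)
	}
}
```

In a real save path the merged map would be serialized and written while the lock file is held, so that two instances cannot interleave their read and write steps.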

### Good Bot Detection

1 change: 1 addition & 0 deletions README.md
@@ -119,6 +119,7 @@ services:
| `enableStatsPage` | `string` | `"false"` | Allows `exemptIps` to access `/captcha-protect/stats` to monitor the rate limiter. |
| `logLevel` | `string` | `"INFO"` | Log level for the middleware. Options: `ERROR`, `WARNING`, `INFO`, or `DEBUG`. |
| `persistentStateFile` | `string` | `""` | File path to persist rate limiter state across Traefik restarts. In Docker, mount this file from the host. |
| `enableStateReconciliation` | `string` | `"false"` | When `"true"`, reads and merges disk state before each save to prevent multiple instances from overwriting data. Adds extra I/O overhead. Only enable for multi-instance deployments sharing state. |


### Good Bots
2 changes: 1 addition & 1 deletion ci/.env
@@ -1,4 +1,4 @@
TRAEFIK_TAG=v3.3.3
TRAEFIK_TAG=v3.5
NGINX_TAG=1.27.4-alpine3.21
TURNSTILE_SITE_KEY=1x00000000000000000000AA
TURNSTILE_SECRET_KEY=1x0000000000000000000000000000000AA
32 changes: 31 additions & 1 deletion ci/docker-compose.yml
@@ -21,14 +21,44 @@ services:
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.goodBots: ""
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.protectRoutes: "/"
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.persistentStateFile: "/tmp/state.json"
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.enableStateReconciliation: "true"
healthcheck:
test: curl -fs http://localhost/healthz | grep -q OK || exit 1
volumes:
- ./conf/nginx/default.conf:/etc/nginx/conf.d/default.conf:r
networks:
default:
aliases:
- nginx
- nginx
nginx2:
image: nginx:${NGINX_TAG}
labels:
traefik.enable: true
traefik.http.routers.nginx2.entrypoints: http
traefik.http.routers.nginx2.service: nginx2
traefik.http.routers.nginx2.rule: Host(`localhost`) && PathPrefix(`/app2`)
traefik.http.services.nginx2.loadbalancer.server.port: 80
traefik.http.routers.nginx2.middlewares: captcha-protect@docker
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.captchaProvider: turnstile
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.window: 120
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.rateLimit: ${RATE_LIMIT}
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.siteKey: ${TURNSTILE_SITE_KEY}
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.secretKey: ${TURNSTILE_SECRET_KEY}
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.enableStatsPage: "true"
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.ipForwardedHeader: "X-Forwarded-For"
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.logLevel: "DEBUG"
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.goodBots: ""
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.protectRoutes: "/"
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.persistentStateFile: "/tmp/state.json"
traefik.http.middlewares.captcha-protect.plugin.captcha-protect.enableStateReconciliation: "true"
healthcheck:
test: curl -fs http://localhost/healthz | grep -q OK || exit 1
volumes:
- ./conf/nginx/default.conf:/etc/nginx/conf.d/default.conf:r
networks:
default:
aliases:
- nginx2
traefik:
image: traefik:${TRAEFIK_TAG}
command: >-
64 changes: 42 additions & 22 deletions ci/test.go
@@ -25,7 +25,6 @@ var (

const numIPs = 100
const parallelism = 10
const expectedRedirectURL = "http://localhost/challenge?destination=%2F"

func main() {
_ips := []string{
@@ -48,24 +47,19 @@ func main() {
fmt.Println("Bringing traefik/nginx online")
runCommand("docker", "compose", "up", "-d")
waitForService("http://localhost")
waitForService("http://localhost/app2")

fmt.Printf("Making sure %d attempt(s) pass\n", rateLimit)
runParallelChecks(ips, rateLimit)
runParallelChecks(ips, rateLimit, "http://localhost")

fmt.Printf("Making sure attempt #%d causes a redirect to the challenge page\n", rateLimit+1)
ensureRedirect(ips)
time.Sleep(cp.StateSaveInterval + cp.StateSaveJitter + (1 * time.Second))
runCommand("jq", ".", "tmp/state.json")

fmt.Println("Sleeping for 2m")
time.Sleep(125 * time.Second)
fmt.Println("Making sure one attempt passes after 2m window")
runParallelChecks(ips, 1)
fmt.Println("All good 🚀")
fmt.Printf("Making sure attempt #%d causes a redirect to the challenge page\n", rateLimit+1)
ensureRedirect(ips, "http://localhost")

// make sure the state has time to save
fmt.Println("Waiting for state to save")
runCommand("jq", ".", "tmp/state.json")
time.Sleep(80 * time.Second)
runCommand("jq", ".", "tmp/state.json")
fmt.Println("\nTesting state sharing between nginx instances...")
testStateSharing(ips)

runCommand("docker", "container", "stats", "--no-stream")

@@ -138,7 +132,7 @@ func waitForService(url string) {
}
}

func runParallelChecks(ips []string, rateLimit int) {
func runParallelChecks(ips []string, rateLimit int, url string) {
var wg sync.WaitGroup
sem := make(chan struct{}, parallelism)

@@ -151,7 +145,7 @@ func runParallelChecks(ips []string, rateLimit int) {
defer func() { <-sem }()

fmt.Printf("Checking %s\n", ip)
output := httpRequest(ip)
output := httpRequest(ip, url)
if output != "" {
slog.Error("Unexpected output", "ip", ip, "output", output)
os.Exit(1)
@@ -164,21 +158,47 @@ func runParallelChecks(ips []string, rateLimit int) {
wg.Wait()
}

func ensureRedirect(ips []string) {
func ensureRedirect(ips []string, url string) {
expectedURL := url + "/challenge?destination=%2F"
if url != "http://localhost" {
// For /app2, the destination should be the app2 path
expectedURL = "http://localhost/challenge?destination=%2Fapp2"
}

for _, ip := range ips {
fmt.Printf("Checking %s\n", ip)
output := httpRequest(ip)
output := httpRequest(ip, url)

if output != expectedRedirectURL {
slog.Error("Unexpected output", "ip", ip, "output", output)
if output != expectedURL {
slog.Error("Unexpected output", "ip", ip, "output", output, "expected", expectedURL)
os.Exit(1)
}

fmt.Printf("Got a redirect! %s\n", output)
}
}

func httpRequest(ip string) string {
func testStateSharing(ips []string) {
// Use first IP to test state sharing
testIP := ips[0]

fmt.Printf("Testing with IP: %s\n", testIP)

// The IP should already be at rate limit from previous tests on localhost/
// Now verify it's also rate limited on localhost/app2 (shared state)
fmt.Println("Verifying IP is rate limited on /app2 (state should be shared)...")
output := httpRequest(testIP, "http://localhost/app2")
expectedURL := "http://localhost/challenge?destination=%2Fapp2"

if output != expectedURL {
slog.Error("State NOT shared between instances!", "ip", testIP, "output", output, "expected", expectedURL)
os.Exit(1)
}

fmt.Println("✓ State is correctly shared between nginx instances!")
}

func httpRequest(ip, url string) string {
client := &http.Client{
CheckRedirect: func(req *http.Request, via []*http.Request) error {
// Capture the redirect URL and stop following it
@@ -189,7 +209,7 @@ func httpRequest(ip string) string {
},
}

req, err := http.NewRequest("GET", "http://localhost", nil)
req, err := http.NewRequest("GET", url, nil)
if err != nil {
slog.Error("Failed to create request", "err", err)
os.Exit(1)