From 6d0617c2db1084c8a53be376e96c5ffae9898933 Mon Sep 17 00:00:00 2001
From: Thanatat Tamtan
Date: Sun, 24 May 2026 13:55:44 +0700
Subject: [PATCH] Document the internal GC endpoint and Cloud Scheduler setup

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 README.md | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/README.md b/README.md
index 35f8610..05a7656 100644
--- a/README.md
+++ b/README.md
@@ -27,6 +27,7 @@ Docker image is built and pushed automatically on push to `main`. See `.github/w
 | `bucket_name` | GCS bucket name |
 | `sign_key` | HMAC key for signing download tokens. Rotating invalidates every outstanding URL. |
 | `base_url` | Download URL prefix (default: `https://dropbox.deploys.app/files/`) |
+| `internal_secret` | Bearer token guarding `POST /internal/gc`. When unset, the endpoint is unauthenticated. |
 | `PORT` | Listen port (default: `8080`) |
 
 ---
@@ -109,3 +110,38 @@ $ http POST https://dropbox.deploys.app/ \
   param-project:my-project \
   < file
 ```
+
+---
+
+## Garbage collection
+
+`POST /internal/gc`
+
+Deletes every file whose `expires_at` is in the past from both GCS and PostgreSQL, then returns `204 No Content`. Storage objects that are already gone are ignored, so re-running is safe. This is the only mechanism that reclaims expired files — nothing runs it on a timer inside the service, so schedule an external caller.
+
+When `internal_secret` is set, the request must carry `Authorization: Bearer `; otherwise it returns `401`. Leave `internal_secret` unset only if the route is unreachable from outside the cluster.
+
+### Schedule with Cloud Scheduler
+
+Run it hourly with a [Cloud Scheduler](https://cloud.google.com/scheduler) HTTP job:
+
+```shell
+$ gcloud scheduler jobs create http dropbox-gc \
+  --location=asia-southeast1 \
+  --schedule="0 * * * *" \
+  --time-zone="Etc/UTC" \
+  --uri="https://dropbox.deploys.app/internal/gc" \
+  --http-method=POST \
+  --headers="Authorization=Bearer "
+```
+
+- `--location` must be a region Cloud Scheduler supports; it does not have to match where the service runs.
+- `--schedule` is a standard cron expression — `0 * * * *` fires at the top of every hour.
+- `--uri` must resolve to a host the job can reach. If `/internal/gc` is only exposed inside the cluster, target the in-cluster address instead and the job will need network access to it.
+- Set `--headers` to match `internal_secret`. Update the job with `gcloud scheduler jobs update http dropbox-gc --headers=...` whenever the secret rotates.
+
+Trigger a one-off run to verify the job:
+
+```shell
+$ gcloud scheduler jobs run dropbox-gc --location=asia-southeast1
+```
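
The patch documents two outcomes for `POST /internal/gc`: `204` on success and `401` when the bearer token is missing or wrong. A quick way to check the endpoint by hand, before relying on the scheduler job, might look like the sketch below. It is not part of the patch. The function names and the `GC_URL` and `INTERNAL_SECRET` environment variables are hypothetical, chosen only for illustration, and the sketch assumes `curl` is installed.

```shell
# Hypothetical helper: map the HTTP status returned by POST /internal/gc
# to a message and an exit code. Only 204 and 401 come from the README text;
# every other status is reported as unexpected.
interpret_gc_status() {
  case "$1" in
    204) echo "gc ok"; return 0 ;;
    401) echo "unauthorized: token does not match internal_secret" >&2; return 1 ;;
    *)   echo "unexpected status: $1" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper: call the endpoint once and interpret the result.
# GC_URL and INTERNAL_SECRET are assumed environment variables.
gc_once() {
  status=$(curl -sS -o /dev/null -w '%{http_code}' -X POST \
    -H "Authorization: Bearer ${INTERNAL_SECRET}" "${GC_URL}")
  interpret_gc_status "$status"
}
```

With `GC_URL` set to the same address passed to `--uri` and `INTERNAL_SECRET` set to the configured secret, running `gc_once` should print `gc ok` if the endpoint behaves as the README describes. Because the README states that re-running is safe, repeating the call should be harmless.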