…116)

## What I'm changing

Prompted by a recent spike of high ALB egress bills, this PR swaps out the data proxy for a data proxy written with [multistore](https://github.com/developmentseed/multistore). This allows us to deploy the data proxy onto Cloudflare Workers, thereby pushing all egress charges directly to S3, in line with the AWS Open Data Program. This is a read-only proxy; write operations will be added at a later date.

## How I did it

Deployed to Cloudflare Workers. We're currently serving ~4M requests per day and are seeing an error rate of ~0.001%.

### Custom URLs

Obtaining custom URLs without migrating all of the source.coop DNS settings over to Cloudflare was a bit of a challenge. I opted to host the proxy workers on `coolnewgeo.com` (an unused domain owned by Radiant Earth):

* `data.coolnewgeo.com` - prod
* `staging.data.coolnewgeo.com` - staging

Custom Hostnames have been set up under `coolnewgeo.com` for `data.source.coop` and `data.staging.source.coop`, both pointing to a null fallback origin of `fallback.coolnewgeo.com` (which has a DNS A record pointing to `192.0.2.1`). Configured routes on the workers connect these custom hostnames to each worker environment:

https://github.com/source-cooperative/data.source.coop/blob/0a44d6bd70d6f132f0c519f0d9367d82ed79a2dd/wrangler.toml#L5-L11

https://github.com/source-cooperative/data.source.coop/blob/0a44d6bd70d6f132f0c519f0d9367d82ed79a2dd/wrangler.toml#L36-L40

## How to test it

This has been running in production for the past week and is used by https://source.coop.

## PR Checklist

- [ ] ~This PR has **no** breaking changes.~
- [ ] I have updated or added new tests to cover the changes in this PR.
- [ ] This PR affects the [Source Cooperative Frontend & API](https://github.com/source-cooperative/source.coop), and I have opened issue/PR #XXX to track the change.

## Related Issues

* #115
* #1

## TODO

- [x] Set up autodeploy for staging on merges to `main`
- [x] Set up autodeploy for production on releases
- [x] Set up autodeploy for PRs

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Covers federated identity via OIDC, user-defined Roles and IdPs, claim constraint language, permission model, credential issuance, request-time authorization, and client tooling integration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… token exchange

ADR-001: Replace embedded SecretAccessKey with HMAC derivation, add ES256 signing, revocation via jti deny-list, and updated SessionToken JWT structure.

ADR-004: Rewrite for two-tier IdP model (platform + account-registered), user-defined Roles with claim constraints and permission statements, AWS STS-compatible request/response format, and removal of SC Credential Tokens.

ADR-005: Replace fixed 3-role model with user-defined Roles as permission ceiling. Resolve grant schema with concrete permission statement format (read/write actions, URN resource patterns with prefix scoping). Update authorization flow to use Role ceiling from SessionToken intersected with dynamic account permissions.

RFC-001: Update sections 4, 7, 8, 13, and 14 to reflect new design. Mark open question 7 (grant schema) as resolved. Add new open questions for org permission model, HMAC secret rotation, and multipart upload credential expiry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
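The ADR-005 authorization rule (Role ceiling intersected with dynamic account permissions) could look roughly like the sketch below. This is only an illustration under assumptions: the `Statement` shape, the `urn:sc:...` resource strings, and the trailing `*` as the prefix wildcard are all hypothetical and are not taken from the ADR's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical permission statement: a set of actions plus one resource pattern."""
    actions: frozenset  # subset of {"read", "write"}
    resource: str       # exact URN, or a prefix pattern ending in "*"


def matches(pattern: str, resource: str) -> bool:
    """Prefix scoping: a trailing '*' matches any resource sharing that prefix."""
    if pattern.endswith("*"):
        return resource.startswith(pattern[:-1])
    return pattern == resource


def allows(statements, action: str, resource: str) -> bool:
    """True if at least one statement grants the action on the resource."""
    return any(action in s.actions and matches(s.resource, resource) for s in statements)


def authorize(role_ceiling, account_grants, action: str, resource: str) -> bool:
    """Intersection: both the Role ceiling and the account grants must allow the request."""
    return allows(role_ceiling, action, resource) and allows(account_grants, action, resource)


# Invented example values, for illustration only.
role_ceiling = [
    Statement(frozenset({"read", "write"}), "urn:sc:example-account:example-product:raw/*"),
]
account_grants = [
    Statement(frozenset({"read"}), "urn:sc:example-account:*"),
]

target = "urn:sc:example-account:example-product:raw/a.tif"
can_read = authorize(role_ceiling, account_grants, "read", target)
can_write = authorize(role_ceiling, account_grants, "write", target)
```

Here the Role would permit writes, but the account grants only reads, so the intersection leaves `can_write` false. The Role acts as an upper bound and never widens what the account permissions already allow.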
Force-pushed from 600e47a to 2fd17f2
🚀 Latest commit deployed to https://source-data-proxy-pr-115.source-coop.workers.dev
The Workers deployment hosts an STS endpoint at `/.sts` for credential exchange.
| ```mermaid |
It would be useful to visualize both the current (non-Cloudflare) data flow and the new Cloudflare data flow.
Force-pushed from 4b90274 to 70fb0ec
**Costs / Risks**

- WASM compilation constrains library choices (no `std` features that don't work in WASM)
- In-region, high-throughput workflows (e.g. bulk ETL in `us-west-2`) route through the edge rather than staying within the region; this adds latency and may incur upstream egress fees that an in-region proxy would avoid
1. Parse `RoleArn` → extract `account_id` and `role_name`
2. Load Role definition from policy store (cached, 30–60s TTL)
3. Extract `iss` from JWT (without verification)
4. Match `iss` against the Role's allowed IdPs; reject immediately if no match
5. Fetch JWKS from the matched IdP (cached, 1hr TTL, 3s timeout, stale-while-revalidate on fetch failure)
6. Verify JWT signature, `exp`, `nbf` (60s clock skew tolerance), and `aud`
7. Evaluate claim constraints for the matched IdP binding
8. Validate `DurationSeconds` ≤ Role's `max_session_duration`
9. Generate credentials (see ADR-001 for token structure) and return response
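The JWKS caching in step 5 could be sketched as below. This is a hedged illustration, not the proxy's implementation: `JwksCache`, the `fetch` callable, and the injectable `clock` are hypothetical names. The sketch reads "stale-while-revalidate on fetch failure" as "serve the expired copy if the refresh fails"; it does not model background revalidation, and it leaves the 3s timeout to the fetcher.

```python
import time
from typing import Callable, Dict, Tuple

JWKS_TTL_SECONDS = 3600  # the 1hr TTL from step 5


class JwksCache:
    """Caches one JWKS document per issuer and serves a stale copy if a refresh fails."""

    def __init__(self, fetch: Callable[[str], dict], clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch  # hypothetical fetcher; expected to enforce the 3s timeout itself
        self._clock = clock  # injectable so the TTL can be exercised without sleeping
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, issuer: str) -> dict:
        now = self._clock()
        entry = self._entries.get(issuer)
        if entry is not None and now - entry[0] < JWKS_TTL_SECONDS:
            return entry[1]  # fresh cache hit, no network call
        try:
            jwks = self._fetch(issuer)
        except Exception:
            if entry is not None:
                return entry[1]  # refresh failed: fall back to the stale copy
            raise  # nothing cached yet, so the failure must surface
        self._entries[issuer] = (now, jwks)
        return jwks
```

Falling back to stale keys keeps token exchange working through a brief IdP outage, at the cost of accepting keys the IdP may have rotated out in the meantime.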
```mermaid
flowchart TD
    Start["Receive AssumeRoleWithWebIdentity request"] --> Parse["1. Parse RoleArn<br/>→ account_id + role_name"]
    Parse --> LoadRole["2. Load Role from policy store<br/>(cached, 30–60s TTL)"]
    LoadRole --> ExtractIss["3. Extract iss from JWT<br/>(without verification)"]
    ExtractIss --> MatchIdP{"4. Does iss match<br/>Role's allowed IdPs?"}
    MatchIdP -- No --> RejectIdP["Reject:<br/>IDPRejectedClaim"]
    MatchIdP -- Yes --> FetchJWKS["5. Fetch JWKS from IdP<br/>(cached 1hr TTL, 3s timeout,<br/>stale-while-revalidate)"]
    FetchJWKS --> VerifyJWT{"6. Verify JWT<br/>signature, exp, nbf, aud"}
    VerifyJWT -- Invalid --> RejectJWT["Reject:<br/>InvalidIdentityToken"]
    VerifyJWT -- Valid --> EvalClaims{"7. Evaluate claim<br/>constraints"}
    EvalClaims -- Fail --> RejectClaims["Reject:<br/>IDPRejectedClaim"]
    EvalClaims -- Pass --> ValidateDuration{"8. DurationSeconds ≤<br/>max_session_duration?"}
    ValidateDuration -- No --> RejectDuration["Reject:<br/>ValidationError"]
    ValidateDuration -- Yes --> GenCreds["9. Generate short-lived<br/>SigV4 credentials"]
    GenCreds --> Return["Return credentials response"]
```
I'm not sure the mermaid diagram is worth the extra space, but I do find it easier to understand the decision flow.
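For comparison, the same decision flow can be sketched in code. This is a rough illustration under stated assumptions, not the proxy's implementation: the AWS-style ARN layout, the exact-match form of claim constraints, applying the 60s skew to both `exp` and `nbf`, and the `load_role` / `verify_signature` callables (the latter standing in for the JWKS fetch and signature check) are all hypothetical. Credential generation is left out.

```python
import base64
import json
import time
from dataclasses import dataclass, field
from typing import Dict, Tuple

CLOCK_SKEW_SECONDS = 60


class StsError(Exception):
    """Rejection carrying one of the error codes named in the flow."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


@dataclass
class IdpBinding:
    audience: str
    # Hypothetical constraint form: claim name -> required exact value.
    constraints: Dict[str, str] = field(default_factory=dict)


@dataclass
class Role:
    allowed_idps: Dict[str, IdpBinding]  # keyed by issuer URL
    max_session_duration: int            # seconds


def parse_role_arn(role_arn: str) -> Tuple[str, str]:
    """Step 1. Assumes an AWS-style ARN: arn:aws:iam::<account_id>:role/<role_name>."""
    parts = role_arn.split(":")
    if len(parts) != 6 or not parts[4] or not parts[5].startswith("role/") or parts[5] == "role/":
        raise StsError("ValidationError", "malformed RoleArn")
    return parts[4], parts[5][len("role/"):]


def unverified_claims(jwt: str) -> dict:
    """Step 3. Decode the payload segment without checking the signature."""
    try:
        payload = jwt.split(".")[1]
        claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    except (IndexError, ValueError):
        raise StsError("InvalidIdentityToken", "malformed JWT")
    if not isinstance(claims, dict):
        raise StsError("InvalidIdentityToken", "JWT payload is not an object")
    return claims


def assume_role_with_web_identity(role_arn, jwt, duration_seconds,
                                  load_role, verify_signature, now=time.time):
    account_id, role_name = parse_role_arn(role_arn)   # step 1
    role = load_role(account_id, role_name)            # step 2 (caching left to the caller)
    claims = unverified_claims(jwt)                    # step 3
    iss = claims.get("iss")
    binding = role.allowed_idps.get(iss) if isinstance(iss, str) else None
    if binding is None:                                # step 4
        raise StsError("IDPRejectedClaim", "issuer not allowed for this Role")
    if not verify_signature(jwt, iss):                 # steps 5-6 (signature part)
        raise StsError("InvalidIdentityToken", "signature verification failed")
    t = now()
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if (claims.get("exp", 0) + CLOCK_SKEW_SECONDS < t
            or claims.get("nbf", 0) - CLOCK_SKEW_SECONDS > t
            or binding.audience not in audiences):     # step 6 (exp, nbf, aud)
        raise StsError("InvalidIdentityToken", "exp, nbf, or aud check failed")
    for name, required in binding.constraints.items():  # step 7
        if claims.get(name) != required:
            raise StsError("IDPRejectedClaim", f"claim {name!r} failed its constraint")
    if duration_seconds > role.max_session_duration:   # step 8
        raise StsError("ValidationError", "DurationSeconds exceeds max_session_duration")
    # Step 9 (credential generation, see ADR-001) is omitted; return its inputs instead.
    return {"account_id": account_id, "role_name": role_name,
            "subject": claims.get("sub"), "duration_seconds": duration_seconds}
```

The ordering mirrors the list: the cheap issuer check on the unverified `iss` happens before any signature work, so tokens from unknown IdPs are rejected without a JWKS lookup.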
| "failure_reason": null | ||
| } | ||
| ``` | ||
Note that the STS log entries may include PII (e.g. assumed_by and client_ip), and that a future logging ADR will need to address retention/redaction policies.
Data providers register their upstream storage (their own S3 bucket, GCS bucket, etc.) with Source Cooperative. The proxy serves as an access control, metering, and distribution layer in front of their data.

Data providers get:
Does the current version of object_store compile to WASM? Or is the risk that future versions of object_store may not be compatible?
Co-authored-by: Tyler Erickson <tylerickson@gmail.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## What I'm changing

This PR adds a new RFC and accompanying ADRs to the codebase.

## How I did it

This RFC (`adrs/rfc-001.md`) describes a new architecture for our data proxy, some of which has been partially implemented in #109. Accompanying the RFC are several ADRs that go into greater detail about components of the architecture. The new architecture is forward-looking (i.e. it includes designs for new features) but attempts to stay constrained: it thoroughly explores only a first wave of near-term features and merely acknowledges later-stage features like metering and rate-limiting.

## How to test it

Review and comment inline on the PR.