From a1dd9b5e9880111fbb88baeeafafca2c12ccf18b Mon Sep 17 00:00:00 2001
From: emeraldleaf
Date: Wed, 3 Jun 2026 21:35:24 -0600
Subject: [PATCH 1/2] docs+infra: refresh stale refs from VSA-collapse + add broken-link CI guard
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The /check-rules audit + the doc-currency sweep flagged multiple stale
references inherited from PR #31 (CatalogService Clean → VSA collapse) and
from the IPaymentRepository deletion. Most of these had been sitting in the
repo for months because nothing fails when a markdown link rots.

Drift fixed:

- CLAUDE.md:341 — outbox-atomic example linked to PaymentRepository.cs
  (deleted in simplicity refactor). Repointed at PaymentRecoveryJob, the
  current inline implementation site. CLAUDE.md:27 already correctly
  described the move; only the deep-link was stale.
- docs/demo-deployment.md + docs/demo-deployment-story.md — multiple
  references to CatalogService.Api/, CatalogService.Infrastructure/, the
  pre-collapse 4-project Clean layout. All paths refreshed to the current
  single-project structure (CatalogService/Program.cs,
  CatalogService/Infrastructure/Data/CatalogDbContext.cs, etc.). Two
  `dotnet ef migrations add` snippets that referenced the old
  --project / --startup-project pairing were simplified to
  `--project CatalogService`.
- docs/performance-and-data-correctness.md:129,326 — handler citations used
  the *Handler.cs file naming convention (Clean) but VSA co-locates command
  + validator + handler in a single use-case file. Links repointed at
  GetProductById.cs / UpdateProduct.cs / ReserveStock.cs. The dead
  GetProductByIdHandlerTests.cs citation was replaced with the existing
  ProductCachingTests.cs (integration tier — the right tier for
  cache-projection behavior).
- Dockerfile.catalog — rewrote the build stage for the single-project
  structure. The old Dockerfile COPYed
  CatalogService/CatalogService.Api/*.csproj and 3 sibling projects that
  haven't existed since PR #31. Anyone triggering the Fly.io deploy workflow
  today would have hit a build failure on the very first COPY. Verified
  locally with
  `docker build --platform=linux/amd64 -f Dockerfile.catalog -t catalog-api .`
  → produces a 116 MB runtime image.

Mechanical guard added:

- .github/workflows/ci.yml — broken-link audit step in the build job. Scans
  every markdown file for relative links to
  .cs/.csproj/.props/.sh/.yml/.yaml/.svg/.excalidraw/.cls/.md files, fails
  the build if any don't resolve. Skips http(s)://, // bare URLs, and
  anchors-only. Process substitution throughout so the failure flag survives
  the loop (a piped while runs in a subshell and loses its updates — this is
  the trap that bit the first draft). Same shape as the existing
  "Concurrency audit" step: grep + fail. Would have caught the broken
  markdown file links in this PR mechanically (not the Dockerfile COPY
  paths, which are not markdown links). Smoke-tested locally against the
  post-fix tree (exit 0).

Why option B (full fix) over option A (doc-only): the deployed Fly demo at
https://catalog-api-demo.fly.dev/ is still running pre-collapse code (last
successful deploy was SHA 73388e8, before #31 merged). Any redeploy attempt
would have failed — the broken Dockerfile + a workflow nobody had triggered
in 2 weeks meant the failure would surface as a confusing CI error on the
next demo refresh, not when the actual drift landed. Knocking out the
Dockerfile + the docs that describe how to use it in the same PR keeps the
system internally consistent.

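
For reference, a minimal sketch of the subshell trap mentioned above, assuming
bash (process substitution is not POSIX sh). The function names and the sample
input are hypothetical and are not taken from ci.yml; the sketch only shows why
a piped loop would leave the failure flag at 0.

```shell
#!/usr/bin/env bash
# Illustrative only: hypothetical helper names, not code from ci.yml.

# Piped form: the while loop is the last stage of a pipeline, so bash runs it
# in a subshell (lastpipe is off by default in scripts). The assignment to
# `fail` happens in that subshell and is lost when the pipeline finishes.
flag_after_pipe() {
  local fail=0
  printf '%s\n' "broken-link.md" | while IFS= read -r _line; do
    fail=1
  done
  echo "$fail"
}

# Process-substitution form: the loop runs in the current shell and only its
# stdin comes from the producer command, so the assignment persists.
flag_after_process_substitution() {
  local fail=0
  while IFS= read -r _line; do
    fail=1
  done < <(printf '%s\n' "broken-link.md")
  echo "$fail"
}

echo "piped: fail=$(flag_after_pipe)"
echo "process substitution: fail=$(flag_after_process_substitution)"
```

The audit step uses the second form for both of its loops, which is why the
final `exit "$fail"` can see a flag that was set inside the inner loop.
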
Co-Authored-By: Claude Opus 4.7 --- .github/workflows/ci.yml | 29 ++++++++++++++++++++++++ CLAUDE.md | 2 +- Dockerfile.catalog | 26 ++++++++++++--------- docs/demo-deployment-story.md | 27 +++++++++------------- docs/demo-deployment.md | 10 ++++---- docs/performance-and-data-correctness.md | 4 ++-- 6 files changed, 63 insertions(+), 35 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 791acb5d..964ab0d4 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -109,6 +109,35 @@ jobs: exit 1 fi + # Broken-link audit: relative markdown links to local source files (.cs/.csproj/.props/ + # .yml/.svg/.excalidraw/.md) that don't resolve. Catches drift from file renames and + # the kind of fallout the simplicity refactor produced — CLAUDE.md citations pointing at + # deleted PaymentRepository.cs, demo docs pointing at the pre-VSA-collapse 4-project + # layout. Same shape as the static-mutable-collections check above: grep + fail. Skips + # external URLs (https?://) and anchors-only (#section); resolves paths relative to + # the markdown file's directory. Uses process substitution everywhere so the failure + # flag survives the loop (a piped while runs in a subshell and loses its updates). + - name: Broken-link audit — markdown citations to local files + run: | + fail=0 + while IFS= read -r mdfile; do + dir=$(dirname "$mdfile") + while IFS= read -r link; do + case "$link" in http*|//*) continue ;; esac + candidate="${link%%#*}" + candidate="${candidate%%\?*}" + if [ "$dir" = "." ]; then target="$candidate"; else target="$dir/$candidate"; fi + if [ ! -e "$target" ] && [ ! -e "$candidate" ]; then + echo "::error file=$mdfile::broken link → $link" + fail=1 + fi + done < <(grep -oE '\[[^]]+\]\(([^)#]+\.(cs|csproj|props|sh|yml|yaml|svg|excalidraw|cls|md))[^)#]*\)' "$mdfile" \ + | sed -E 's/.*\(([^)]+)\)/\1/') + done < <(find . 
-type f -name '*.md' \ + -not -path './bin/*' -not -path './obj/*' \ + -not -path '*/node_modules/*' -not -path '*/.git/*') + exit "$fail" + # Testcontainers-based integration tests, in their own job: they need Docker (the # ubuntu-latest runner ships it at the standard /var/run/docker.sock, so Testcontainers # auto-detects — no DOCKER_HOST override, unlike macOS Docker Desktop locally). Kept diff --git a/CLAUDE.md b/CLAUDE.md index dc61fd81..12480b15 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -338,7 +338,7 @@ public async Task ExecuteInTransactionAsync(Func work, await tx.CommitAsync(ct); } ``` -Reference: [PaymentRepository.ExecuteInTransactionAsync](PaymentService/Infrastructure/PaymentRepository.cs) (fixed in the commit captured by docs/STATUS.md). When adding a non-handler code path that publishes events, **either** wrap it in this pattern **or** factor the publish back into a Wolverine handler triggered by an internal scheduled message. +Reference: [PaymentRecoveryJob](PaymentService/Infrastructure/PaymentRecoveryJob.cs) — the canonical inline implementation of this wrapper (the previous `IPaymentRepository.ExecuteInTransactionAsync` wrapper was deleted in the simplicity refactor; the pattern itself is unchanged, just inlined). When adding a non-handler code path that publishes events, **either** wrap it in this pattern **or** factor the publish back into a Wolverine handler triggered by an internal scheduled message. ### Event Replay diff --git a/Dockerfile.catalog b/Dockerfile.catalog index 09099765..dcc363d6 100644 --- a/Dockerfile.catalog +++ b/Dockerfile.catalog @@ -1,6 +1,6 @@ -# Multi-stage build for CatalogService.Api targeting single-service demo deploys (App Runner, -# Container Apps, Lightsail, etc.). Build context is the repo root because the API project -# transitively references ServiceDefaults, Contracts, Application, Infrastructure, Domain. 
+# Multi-stage build for CatalogService targeting single-service demo deploys (Fly.io, +# App Runner, Container Apps, Lightsail, etc.). Build context is the repo root because +# CatalogService transitively references NextAurora.ServiceDefaults and NextAurora.Contracts. # # Build: docker build -f Dockerfile.catalog -t catalog-api . # Run: docker run --rm -p 8080:8080 \ @@ -9,6 +9,11 @@ # -e DemoMode=true \ # -e ConnectionStrings__catalog-db="Host=...;Database=...;Username=...;Password=..." \ # catalog-api +# +# Single-project layout (post the VSA-collapse refactor — see CLAUDE.md "Project Structure"): +# CatalogService is one Web SDK project under CatalogService/, not the four-project Clean +# layout that existed up to PR #31. If you're rolling back to a pre-#31 commit, this Dockerfile +# won't build — that's intentional, the pre-#31 Dockerfile is preserved in git history. # ─── Build stage ────────────────────────────────────────────────────────────── FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build @@ -19,15 +24,14 @@ WORKDIR /src # their stricter defaults — CA2007/CA1062/CA1724 etc. become errors under TreatWarningsAsErrors). COPY .editorconfig Directory.Build.props Directory.Packages.props BannedSymbols.txt NextAurora.slnx ./ -# Copy only the project files needed for the Catalog dependency graph. -COPY CatalogService/CatalogService.Api/CatalogService.Api.csproj CatalogService/CatalogService.Api/ -COPY CatalogService/CatalogService.Application/CatalogService.Application.csproj CatalogService/CatalogService.Application/ -COPY CatalogService/CatalogService.Domain/CatalogService.Domain.csproj CatalogService/CatalogService.Domain/ -COPY CatalogService/CatalogService.Infrastructure/CatalogService.Infrastructure.csproj CatalogService/CatalogService.Infrastructure/ +# Copy only the project files needed for the Catalog dependency graph. 
csproj-then-source is +# the standard Docker .NET layer-caching pattern: the restore layer only invalidates when a +# csproj actually changes, not on every source edit. +COPY CatalogService/CatalogService.csproj CatalogService/ COPY NextAurora.ServiceDefaults/NextAurora.ServiceDefaults.csproj NextAurora.ServiceDefaults/ COPY NextAurora.Contracts/NextAurora.Contracts.csproj NextAurora.Contracts/ -RUN dotnet restore CatalogService/CatalogService.Api/CatalogService.Api.csproj +RUN dotnet restore CatalogService/CatalogService.csproj # Copy the actual source after restore — keeps the dependency-graph layer cached unless a # .csproj actually changes. @@ -35,7 +39,7 @@ COPY CatalogService/ CatalogService/ COPY NextAurora.ServiceDefaults/ NextAurora.ServiceDefaults/ COPY NextAurora.Contracts/ NextAurora.Contracts/ -RUN dotnet publish CatalogService/CatalogService.Api/CatalogService.Api.csproj \ +RUN dotnet publish CatalogService/CatalogService.csproj \ --configuration Release \ --output /app/publish \ --no-restore \ @@ -56,4 +60,4 @@ EXPOSE 8080 USER app COPY --from=build /app/publish . -ENTRYPOINT ["dotnet", "CatalogService.Api.dll"] +ENTRYPOINT ["dotnet", "CatalogService.dll"] diff --git a/docs/demo-deployment-story.md b/docs/demo-deployment-story.md index 1deda982..bf8f3397 100644 --- a/docs/demo-deployment-story.md +++ b/docs/demo-deployment-story.md @@ -1,6 +1,6 @@ # Deployment Story — Getting CatalogService Live on Fly.io -A step-by-step narrative of what we actually did to deploy [CatalogService.Api](../CatalogService/CatalogService.Api/) to a public URL, including the dead ends and why we ended up where we did. Useful for walking someone through the deployment story or refreshing your own memory. +A step-by-step narrative of what we actually did to deploy [CatalogService](../CatalogService/) to a public URL, including the dead ends and why we ended up where we did. Useful for walking someone through the deployment story or refreshing your own memory. 
For the reusable checklist (do this from scratch), see [demo-deployment.md](demo-deployment.md). This doc is the *story*; that doc is the *recipe*. @@ -8,7 +8,7 @@ For the reusable checklist (do this from scratch), see [demo-deployment.md](demo ## Goal -Get a working public URL serving `CatalogService.Api` with the Scalar API documentation reachable, on as little budget and complexity as possible — without breaking any of the existing local development paths (Aspire, integration tests, future production deploy). +Get a working public URL serving `CatalogService` with the Scalar API documentation reachable, on as little budget and complexity as possible — without breaking any of the existing local development paths (Aspire, integration tests, future production deploy). ## What "done" looks like @@ -39,9 +39,9 @@ Get a working public URL serving `CatalogService.Api` with the Scalar API docume ## Step 1 — Make the code deploy-aware (`DemoMode` flag) -**Problem**: `CatalogService.Api` was wired for two environments — local development (where Scalar/OpenAPI are exposed) and a hypothetical production (where they're hidden because OpenAPI specs are reconnaissance gold). For the demo we needed a *third* mode: Production-environment behavior PLUS Scalar visibility, because the whole point is showing the API documentation. +**Problem**: `CatalogService` was wired for two environments — local development (where Scalar/OpenAPI are exposed) and a hypothetical production (where they're hidden because OpenAPI specs are reconnaissance gold). For the demo we needed a *third* mode: Production-environment behavior PLUS Scalar visibility, because the whole point is showing the API documentation. -**Solution**: a `DemoMode` configuration flag in [Program.cs](../CatalogService/CatalogService.Api/Program.cs). When set, it: +**Solution**: a `DemoMode` configuration flag in [Program.cs](../CatalogService/Program.cs). When set, it: 1. 
Exposes `/openapi/v1.json`, `/openapi/v1.yaml`, `/scalar/v1` even outside Development 2. Skips `UseHttpsRedirection()` (PaaS hosts terminate TLS at the edge — would cause redirect loops) 3. Runs EF Core migrations on startup (so we don't need a separate "deploy migrations" step) @@ -50,7 +50,7 @@ Get a working public URL serving `CatalogService.Api` with the Scalar API docume ## Step 2 — Make Redis optional -`CatalogService.Infrastructure` registers Redis via HybridCache's L2 tier. For a single-replica demo we don't want to pay for managed Redis. The registration is now conditional: if no `cache` connection string is configured, Redis isn't registered, and HybridCache gracefully degrades to L1-only (in-process MemoryCache). When run via Aspire locally, Redis IS registered because `WithReference(cache)` provides the connection string — so local dev is unchanged. +`CatalogService.Infrastructure` (the `Infrastructure/` folder inside the single CatalogService project) registers Redis via HybridCache's L2 tier. For a single-replica demo we don't want to pay for managed Redis. The registration is now conditional: if no `cache` connection string is configured, Redis isn't registered, and HybridCache gracefully degrades to L1-only (in-process MemoryCache). When run via Aspire locally, Redis IS registered because `WithReference(cache)` provides the connection string — so local dev is unchanged. ## Step 3 — Containerize @@ -102,7 +102,7 @@ Postgres provisioning prints the connection details once — **password is unrec **Problem**: Fly's secret names only allow `[A-Z0-9_]` — hyphens are rejected. But our app reads `GetConnectionString("catalog-db")` (kebab-case, set by Aspire's `WithReference()` convention). The corresponding env var name would be `ConnectionStrings__catalog-db`, which Fly bounces. 
-**Solution**: a tiny adapter in [Program.cs](../CatalogService/CatalogService.Api/Program.cs) that, only when `DemoMode=true`, reads from a Fly-compatible secret name (`CATALOG_DB_CONNECTION_STRING`) and copies it into the `ConnectionStrings:catalog-db` slot the Infrastructure layer reads from. 5 lines, fully gated behind the demo flag, doesn't touch Aspire wiring. +**Solution**: a tiny adapter in [Program.cs](../CatalogService/Program.cs) that, only when `DemoMode=true`, reads from a Fly-compatible secret name (`CATALOG_DB_CONNECTION_STRING`) and copies it into the `ConnectionStrings:catalog-db` slot the Infrastructure layer reads from. 5 lines, fully gated behind the demo flag, doesn't touch Aspire wiring. Then set the secret: @@ -138,7 +138,7 @@ This is the one decision in the demo deploy that deliberately *violates* a produ ### What actually happens -[Program.cs](../CatalogService/CatalogService.Api/Program.cs) ends its startup with: +[Program.cs](../CatalogService/Program.cs) ends its startup with: ```csharp if (app.Environment.IsDevelopment() || isDemoMode) @@ -194,9 +194,7 @@ The `xmin` system column is Postgres-specific — it's the transaction ID of the If we change a domain entity later (e.g. add a `Sku` field to `Product`): ```bash -dotnet ef migrations add AddProductSku \ - --project CatalogService/CatalogService.Infrastructure \ - --startup-project CatalogService/CatalogService.Api +dotnet ef migrations add AddProductSku --project CatalogService ``` This generates a new `.cs` file in `Migrations/`. Commit it. Next `fly deploy --remote-only` ships the new code + new migration, the Machine reboots, `Migrate()` notices `AddProductSku` is unapplied, runs the `ALTER TABLE` it contains, and the new boot is serving with the new schema. Zero downtime if the change is backward-compatible (additive columns, new indexes, new tables). 
Forward-incompatible changes (drop column, rename, NOT NULL on existing column) need the multi-step plan described in [ef-core.md "The immutable-once-applied rule"](ef-core.md#67-the-immutable-once-applied-rule). @@ -207,17 +205,14 @@ After the first deploy worked, the catalog was empty (`GET /api/v1/products` ret ### Adding the seed -In [CatalogDbContext.cs](../CatalogService/CatalogService.Infrastructure/Data/CatalogDbContext.cs), `OnModelCreating` calls a private `SeedDemoData` method that uses `modelBuilder.Entity().HasData(...)` to declaratively register 3 categories and 7 products. Fixed GUIDs and a fixed `CreatedAt` (not `Guid.NewGuid()` / `DateTime.UtcNow`) so the generated migration is **deterministic** — re-running the model snapshot wouldn't emit a diff. +In [CatalogDbContext.cs](../CatalogService/Infrastructure/Data/CatalogDbContext.cs), `OnModelCreating` calls a private `SeedDemoData` method that uses `modelBuilder.Entity().HasData(...)` to declaratively register 3 categories and 7 products. Fixed GUIDs and a fixed `CreatedAt` (not `Guid.NewGuid()` / `DateTime.UtcNow`) so the generated migration is **deterministic** — re-running the model snapshot wouldn't emit a diff. `HasData` writes via reflection, which **bypasses** the entity's factory method (`Product.Create`) and private setters. That's the right trade for curated design-time data — validation is unnecessary because we control the values. We still set `IsAvailable` explicitly to match the `StockQuantity > 0` invariant the factory would have enforced. Then generate the migration: ```bash -dotnet ef migrations add SeedDemoCatalog \ - --project CatalogService/CatalogService.Infrastructure \ - --startup-project CatalogService/CatalogService.Api \ - --context CatalogDbContext +dotnet ef migrations add SeedDemoCatalog --project CatalogService --context CatalogDbContext ``` EF Core produced two files: @@ -323,7 +318,7 @@ In rough order: 2. 
**Local Docker daemon was corrupted** from an earlier disk-full event. → `--remote-only` builds on Fly's builder, sidestepping local Docker entirely. 3. **Fly removed dashboard-level spending caps**; only soft alerts remain. → Bought $25 prepaid credits and didn't save a card. When credits hit $0, Fly suspends instead of charging. Effective hard cap. 4. **Fly's `fly postgres create` warns it's "unmanaged"** and pushes Managed Postgres ($15+/mo). → Legacy unmanaged is fine for throwaway demo data; ignored the nudge. -5. **Fly secret names reject hyphens** (`[A-Z0-9_]` only). → Added a `DemoMode`-only bridge in [Program.cs](../CatalogService/CatalogService.Api/Program.cs) that copies `CATALOG_DB_CONNECTION_STRING` into `ConnectionStrings:catalog-db`. Aspire wiring untouched. +5. **Fly secret names reject hyphens** (`[A-Z0-9_]` only). → Added a `DemoMode`-only bridge in [Program.cs](../CatalogService/Program.cs) that copies `CATALOG_DB_CONNECTION_STRING` into `ConnectionStrings:catalog-db`. Aspire wiring untouched. 6. **Docker build failed: analyzer errors (CA1062/CA2007/CA1724/MA0004) under `TreatWarningsAsErrors=true`.** → The `.editorconfig` at the repo root suppresses these; wasn't being copied into the build context. Added to the COPY line in [Dockerfile.catalog](../Dockerfile.catalog). 7. **First deploy crashed: `Exception while performing SSL handshake / Received an unexpected EOF`** on the EF Core migration's first Postgres connection. → Fly's legacy unmanaged Postgres on `.flycast` doesn't speak SSL. Npgsql's default `SSL Mode=Prefer` crashes hard instead of falling back to plain. Fix: append `SSL Mode=Disable` to the connection string. Flycast is already a private encrypted network, so disabling Postgres-layer SSL is safe inside that perimeter. 8. **Health-check grace period was too short** for first boot (20s default vs ~30-60s for migration + Postgres connect). → Bumped to 120s in fly.toml. 
Subsequent boots are fast because `Migrate()` finds the migration already applied and returns in ms. diff --git a/docs/demo-deployment.md b/docs/demo-deployment.md index fd17040c..f778eec8 100644 --- a/docs/demo-deployment.md +++ b/docs/demo-deployment.md @@ -1,6 +1,6 @@ # Demo Deployment — CatalogService -One-time setup to get [CatalogService.Api](../CatalogService/CatalogService.Api/) running on a public URL with the Scalar UI exposed for a public demo. +One-time setup to get [CatalogService](../CatalogService/) running on a public URL with the Scalar UI exposed for a public demo. For a narrative walkthrough of the actual deploy session (what we did, why we made each call, dead ends along the way), see [demo-deployment-story.md](demo-deployment-story.md). This doc is the *recipe*; that one is the *story*. @@ -21,20 +21,20 @@ The demo scaffolding is fully additive. Local Aspire development, the test suite | Surface | When `DemoMode` is absent | Why | |---|---|---| -| `dotnet run --project NextAurora.AppHost` (local Aspire) | Unchanged | All three `DemoMode` branches in [Program.cs](../CatalogService/CatalogService.Api/Program.cs) short-circuit: `IsDevelopment() \|\| false` → `IsDevelopment()`. | +| `dotnet run --project NextAurora.AppHost` (local Aspire) | Unchanged | All three `DemoMode` branches in [Program.cs](../CatalogService/Program.cs) short-circuit: `IsDevelopment() \|\| false` → `IsDevelopment()`. | | Redis registration | Unchanged | Aspire's `WithReference(cache)` sets `ConnectionStrings__cache`, so the new conditional still registers `AddStackExchangeRedisCache`. Skipping only triggers when no `cache` conn string is wired at all. | | `dotnet build` | Unchanged | Zero new warnings under `TreatWarningsAsErrors`. | | Integration tests | Unchanged | Testcontainers provides Redis via the same `ConnectionStrings__cache` path. 
| | Existing CI workflows ([ci.yml](../.github/workflows/ci.yml), [codeql.yml](../.github/workflows/codeql.yml)) | Unchanged | New workflows are `workflow_dispatch` only — never fire on push or PR. | | Production posture (if/when we deploy real prod) | Unchanged | `DemoMode` defaults to `false`. OpenAPI + Scalar stay hidden. HTTPS redirection stays on. Migrate-on-startup stays off. | -**Watch-out**: if you ever export `ConnectionStrings__catalog-db=` in your local shell, a bare `dotnet run --project CatalogService/CatalogService.Api` would try to talk to the remote DB. This is self-inflicted-only — `dotnet run --project NextAurora.AppHost` overrides connection strings before child processes inherit them, so Aspire-driven local runs are immune. +**Watch-out**: if you ever export `ConnectionStrings__catalog-db=` in your local shell, a bare `dotnet run --project CatalogService` would try to talk to the remote DB. This is self-inflicted-only — `dotnet run --project NextAurora.AppHost` overrides connection strings before child processes inherit them, so Aspire-driven local runs are immune. The [Dockerfile.catalog](../Dockerfile.catalog) and [.dockerignore](../.dockerignore) at the repo root are pure opt-in — Aspire runs the .NET services as `dotnet` processes (only infra deps like Postgres/SQL/Redis/Keycloak/ASB-emulator run in containers), so nothing in the local workflow invokes `docker build`. ## What gets deployed (either path) -- **CatalogService.Api** as a single replica, scale-to-zero when idle +- **CatalogService** as a single replica, scale-to-zero when idle - **Managed Postgres** for product/stock data (Fly Postgres or AWS RDS depending on path) - **No Redis** — HybridCache degrades to L1-only (in-process MemoryCache). Real prod would add a managed Redis for L2. - **No Service Bus / no other services** — single-service demo. 
Cross-service choreography (Order → Payment → Shipping saga) doesn't fit a free-tier budget; flag this as a "would need ASB + ≥2 services" caveat when walking through the deployment. @@ -161,7 +161,7 @@ Use this path if you want AWS specifically — slower setup, ~$5/mo. Requires a ## What gets deployed (AWS specifics) -- **CatalogService.Api** as an App Runner service +- **CatalogService** as an App Runner service - **RDS Postgres** (`db.t4g.micro`, free tier 12 mo) ## Architecture diff --git a/docs/performance-and-data-correctness.md b/docs/performance-and-data-correctness.md index 35cb1ff3..e8f5e497 100644 --- a/docs/performance-and-data-correctness.md +++ b/docs/performance-and-data-correctness.md @@ -126,7 +126,7 @@ See [decision: optimistic concurrency tokens](#decision-optimistic-concurrency-t **Spec:** the same handler that owns the change owns the invalidation. For domain events that affect cached entities cross-service (e.g., `ProductPriceChanged` invalidating product cache), the event handler invalidates. -**Where it applies:** [CatalogService.Domain.IProductCache](../CatalogService/Domain/IProductCache.cs), backed by `HybridCache` ([HybridProductCache.cs](../CatalogService/Infrastructure/Caching/HybridProductCache.cs)). [GetProductByIdHandler](../CatalogService/Features/GetProductByIdHandler.cs) reads through it; [UpdateProductHandler](../CatalogService/Features/UpdateProductHandler.cs) and [ReserveStockHandler](../CatalogService/Features/ReserveStockHandler.cs) call `InvalidateAsync` after their save in the same unit of work. Tag-based invalidation clears L1 (in-process) and L2 (Redis) atomically. Full rationale: [decision: distributed read caching with HybridCache](#decision-distributed-read-caching-with-hybridcache). +**Where it applies:** [CatalogService.Domain.IProductCache](../CatalogService/Domain/IProductCache.cs), backed by `HybridCache` ([HybridProductCache.cs](../CatalogService/Infrastructure/Caching/HybridProductCache.cs)). 
[GetProductByIdHandler](../CatalogService/Features/GetProductById.cs) reads through it; [UpdateProductHandler](../CatalogService/Features/UpdateProduct.cs) and [ReserveStockHandler](../CatalogService/Features/ReserveStock.cs) call `InvalidateAsync` after their save in the same unit of work. Tag-based invalidation clears L1 (in-process) and L2 (Redis) atomically. Full rationale: [decision: distributed read caching with HybridCache](#decision-distributed-read-caching-with-hybridcache). ### 13. Migrations are immutable once applied @@ -323,7 +323,7 @@ This is broken in two specific ways: 1. **The `Get` then `Set` sequence cannot dedupe concurrent misses.** By the time the second caller calls `Get` and sees a miss, the first caller is already between `Get` and `Set`. Stampede protection requires the cache to know about the in-flight load — it has to hand back the same `Task` to all concurrent miss-callers and `await` it once. That's only possible if the cache *owns* the factory call. 2. **The handler is the wrong owner of the policy.** Every new cached entity in a new handler reinvents the same five lines, and small differences (forgetting to filter null on `Set`, not propagating `CancellationToken`, swallowing exceptions from the load) are how staleness bugs ship. -The factory-based shape pushes all of that into the cache. The handler describes *intent* ("how to load on miss"); the cache owns the *flow* (try L1, try L2, dedupe, run factory, populate both layers, return). Test surface drops to the projection logic — see [GetProductByIdHandlerTests.cs](../tests/CatalogService.Tests.Unit/Application/GetProductByIdHandlerTests.cs). +The factory-based shape pushes all of that into the cache. The handler describes *intent* ("how to load on miss"); the cache owns the *flow* (try L1, try L2, dedupe, run factory, populate both layers, return). 
Test surface drops to the projection logic — see [ProductCachingTests.cs](../tests/CatalogService.Tests.Integration/ProductCachingTests.cs) (integration tier, exercises the real HybridCache against Testcontainers Postgres + Redis — the right tier for cache-projection behavior).

 ### What we cache, and why

From 844d21b38a1c25e47e2c3e8c8c6fd3e4472d8409 Mon Sep 17 00:00:00 2001
From: emeraldleaf
Date: Wed, 3 Jun 2026 21:41:05 -0600
Subject: [PATCH 2/2] ci: exclude .claude/audits/INDEX.md from broken-link audit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

INDEX.md ships in the repo but its links point at per-article audit files
that are gitignored for copyright reasons (verbatim quoted prose). See
.claude/commands/article-audit.md step 5 'Copyright note' — contract is
'INDEX ships, per-article files don't.' On a contributor's machine the links
resolve; on the CI runner they don't, by design.

Surfaced by #112's first run — guard correctly flagged 14 broken refs, but
they're all from this one intentionally-excluded file.

Co-Authored-By: Claude Opus 4.7
---
 .github/workflows/ci.yml | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 964ab0d4..75c945ce 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -135,8 +135,14 @@ jobs:
  | sed -E 's/.*\(([^)]+)\)/\1/')
  done < <(find . -type f -name '*.md' \
  -not -path './bin/*' -not -path './obj/*' \
- -not -path '*/node_modules/*' -not -path '*/.git/*')
+ -not -path '*/node_modules/*' -not -path '*/.git/*' \
+ -not -path './.claude/audits/INDEX.md')
  exit "$fail"
+ # `.claude/audits/INDEX.md` is intentionally excluded — it links to per-article audit
+ # files under `.claude/audits/*.md` that are gitignored for copyright reasons (they
+ # contain verbatim quoted prose from external articles). See `.claude/commands/article-audit.md`
+ # step 5 "Copyright note" — the contract is "INDEX ships, per-article files don't."
+ # On a contributor's machine the links resolve; on the CI runner they don't, by design.

  # Testcontainers-based integration tests, in their own job: they need Docker (the
  # ubuntu-latest runner ships it at the standard /var/run/docker.sock, so Testcontainers