refactor: reorganize infrastructure/ by adapted technology, not by domain

## Problem

`infrastructure/` currently mixes two taxonomies:

- **By technology**: `persistence/` (Postgres), `s3/`, `http/`, `k8s/`, `oci/`
- **By domain**: `auth/`, `data/`, `event/`, `ingest/`

The result is that the same kind of thing lands in different places:

- The data read store (a Postgres adapter) sits in `data/` while every other Postgres adapter sits in `persistence/`.
- The role repository (a Postgres repo) sits in `auth/` rather than in `persistence/` with the other repositories.
- The **filesystem** storage adapters live under `persistence/adapter/` while their **S3 twins** live under `s3/` — two implementations of the same ports in unrelated corners of the tree.
- `persistence/adapter/spreadsheet.py` is openpyxl template generation, not persistence.
- `messaging/` is an empty package (0 lines).
- `persistence/` itself is a grab-bag: engine setup, migrations, seeding, static table definitions (one 387-line `tables.py`), dynamic-table builders, dynamic-table stores, repositories, mappers, read adapters, and query utilities all at one level.

## Target layout

Principle: **top-level packages name an infrastructure concern — the external system being adapted, or the port when multiple backends implement it — and never a bounded context.** Within each package, organize **by adapter role** (setup / table definitions / write repos / read queries). Domain names appear at the file level, never the directory level.

Two concern shapes are both valid at the top level:

- **Single-technology concerns** get the technology's name: `postgres/`, `http/`.
- **Ports with multiple backends** get the port's name with technology subdirectories: `storage/` (fs + S3), `runner/` (OCI + K8s) — keeping interchangeable implementations as siblings is the point.

What is *not* valid is a bounded-context name: the current `auth/` and `event/` packages are renamed to `idp/` and `worker/` so a reader can't mistake them for "adapters owned by the auth/event domains".

```
infrastructure/
├── logging.py
├── postgres/                      # ← renamed persistence/ (it's all PG-specific)
│   ├── setup/                     #   lifecycle & wiring
│   │   ├── database.py            #   engine/session factory
│   │   ├── migrate.py
│   │   ├── seed.py
│   │   └── di.py                  #   PersistenceProvider
│   ├── tables/                    #   ALL table shapes — the one place schemas live
│   │   ├── records.py, events.py, auth.py, ...   # tables.py split by domain
│   │   ├── feature_table.py       #   dynamic builders
│   │   ├── metadata_table.py
│   │   ├── column_mapper.py       #   ColumnDef → sa.Column
│   │   └── naming.py              #   ← api_naming.py
│   ├── repository/                #   write side: aggregate repositories
│   │   ├── (existing 8 repos)
│   │   ├── role.py                #   ← from auth/role_repository.py
│   │   └── mappers/               #   row ↔ aggregate (only repos use them)
│   ├── store/                     #   dynamic-table DDL + bulk writes
│   │   ├── feature_store.py
│   │   └── metadata_store.py
│   └── query/                     #   read side (CQRS: queries ≠ repositories)
│       ├── data_read_store.py     #   ← from infrastructure/data/
│       ├── feature_reader.py      #   ← from persistence/adapter/
│       ├── readers.py             #   ← cross-domain read ports
│       └── keyset.py
├── storage/                       # file-storage port — both backends together
│   ├── layout.py
│   ├── fs/                        #   ← persistence/adapter/{storage,ingest_storage}.py
│   └── s3/                        #   ← s3/{client,storage,ingest_storage}.py
├── runner/                        # validator/ingester execution backends
│   ├── shared.py                  #   ← runner_utils.py
│   ├── oci/
│   └── k8s/
├── idp/                           # ← renamed auth/ — external identity providers (orcid, provider_registry, di)
├── worker/                        # ← renamed event/ — APScheduler WorkerPool / outbox drainer
├── http/                          # outbound HTTP (ontology fetcher; unchanged)
└── spreadsheet/                   # openpyxl adapter — it was never persistence
```

Deleted outright: `messaging/` (empty), `infrastructure/data/` (absorbed into `postgres/query/`), `persistence/adapter/` (disbanded — a 'miscellaneous' folder is how this drift started). `ingest/di.py` moves next to whatever it actually provides (likely `runner/` or `storage/`).

The `repository/` vs `query/` split mirrors the CQRS layering: repositories serve aggregates to command handlers; the query package serves read models and streams. `store/` sits apart because the dynamic-table stores are neither — they are DDL + projection writers driven by events.
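
A minimal sketch of that distinction, under stated assumptions: `Record`, `RecordSummary`, both port names, and the in-memory classes are hypothetical and not taken from the codebase. The write side loads and saves whole aggregates; the read side yields flat read models and never hands out the aggregate.

```python
from dataclasses import dataclass, field
from typing import Iterator, Protocol


@dataclass
class Record:
    """Hypothetical aggregate: command handlers change it via a repository."""
    id: str
    title: str
    tags: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class RecordSummary:
    """Hypothetical read model: flat, immutable, shaped for one query."""
    id: str
    title: str


class RecordRepository(Protocol):
    """Write-side port; a real adapter would sit in postgres/repository/."""
    def get(self, record_id: str) -> Record: ...
    def save(self, record: Record) -> None: ...


class RecordSummaryQuery(Protocol):
    """Read-side port; a real adapter would sit in postgres/query/."""
    def stream(self) -> Iterator[RecordSummary]: ...


class InMemoryRecordRepository:
    """Stand-in write adapter over a shared dict, only to make the sketch run."""
    def __init__(self, rows: dict[str, Record]) -> None:
        self._rows = rows

    def get(self, record_id: str) -> Record:
        return self._rows[record_id]

    def save(self, record: Record) -> None:
        self._rows[record.id] = record


class InMemoryRecordSummaryQuery:
    """Stand-in read adapter: projects stored rows into read models."""
    def __init__(self, rows: dict[str, Record]) -> None:
        self._rows = rows

    def stream(self) -> Iterator[RecordSummary]:
        for rec in self._rows.values():
            yield RecordSummary(id=rec.id, title=rec.title)
```

The two adapters share storage but expose disjoint interfaces, which is the property the directory split is meant to make visible.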

## Guardrail: directory-scoped CLAUDE.md

Add `server/osa/infrastructure/CLAUDE.md` recording the placement rule so the drift doesn't recur:

> Top-level packages under `infrastructure/` name an infrastructure concern: the external system being adapted (`postgres/`, `http/`), or the port when multiple backends implement it (`storage/`, `runner/`). Never create a package named after a bounded context. Within a package, separate setup, table definitions, write-side repositories, and read-side queries. Domain names appear at the file level only.

Also update the repository-structure section of the root CLAUDE.md to match the new tree.
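
The written rule could optionally be backed by an automated check. The sketch below is not part of this proposal: `unexpected_packages` and `ALLOWED_PACKAGES` are hypothetical names, and the allowlist is simply the set of top-level packages from the target layout above.

```python
from pathlib import Path

# Top-level package names copied from the target layout in this issue.
ALLOWED_PACKAGES = frozenset(
    {"postgres", "storage", "runner", "idp", "worker", "http", "spreadsheet"}
)


def unexpected_packages(infrastructure_root: Path) -> list[str]:
    """Return top-level directory names that are not on the allowlist."""
    return sorted(
        entry.name
        for entry in infrastructure_root.iterdir()
        if entry.is_dir()
        and not entry.name.startswith((".", "__"))  # skip __pycache__ and dot-dirs
        and entry.name not in ALLOWED_PACKAGES
    )
```

In a test suite this might be used as `assert unexpected_packages(Path("server/osa/infrastructure")) == []`. Because it is an allowlist, it flags any new top-level directory, so adding a legitimate new concern means extending `ALLOWED_PACKAGES` on purpose.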

## Execution notes

- Purely mechanical move: `git mv` plus import rewrites, **zero logic changes**. The existing test suites are the safety net.
- The move touches imports across the whole server, including Alembic's `target_metadata` import path, so do it as a standalone PR **after #139 merges** to avoid conflicting with open review threads.
- Splitting `tables.py` per domain is the only step requiring judgment; it can trail in a second commit within the same PR.
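
The import rewrites could be scripted along these lines. This is a sketch under assumptions: the `osa.infrastructure` module root is inferred from the `server/osa/infrastructure/CLAUDE.md` path, `rewrite_imports` and `RENAMES` are hypothetical names, and only the package renames plus the one file-level move spelled out in the target layout are listed. Files that move deeper (for example into `setup/` or `query/`) would need their own entries.

```python
import re

# Old dotted prefix -> new dotted prefix. The module root is an assumption;
# the renames themselves come from the target layout.
RENAMES = {
    "osa.infrastructure.auth.role_repository": "osa.infrastructure.postgres.repository.role",
    "osa.infrastructure.persistence": "osa.infrastructure.postgres",
    "osa.infrastructure.auth": "osa.infrastructure.idp",
    "osa.infrastructure.event": "osa.infrastructure.worker",
}

# Longest prefixes first, so a file-level move wins over its package rename.
# The lookarounds stop matches inside longer names (e.g. `...events_x`).
_PATTERN = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(old) for old in sorted(RENAMES, key=len, reverse=True))
    + r")(?!\w)"
)


def rewrite_imports(source: str) -> str:
    """Replace old dotted module paths in `source` with their new locations."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], source)
```

This only handles fully dotted paths. Forms such as `from osa.infrastructure import persistence` and relative imports would still need a manual pass or a syntax-aware tool.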

