Rule
As a rule, we do NOT allow top-level functions. We organise things into repos, services, etc.
A module-level function that carries behaviour — especially one taking a session/engine/conn — is a repository/store/service method wearing a function costume. It hides where behaviour lives, can't be injected/swapped, and erodes the entry-point → service → repository layering.
Precedent (already done for /data/)
PR #137 set the pattern for the /data/ adapters:
- Shared schema→feature resolution (was
schema_features.py, four loose functions, two doing session.execute) → SchemaFeatureReader (a session-holding class) composed into both read stores.
- Per-file helpers (
statement_timeout_sql, _escape_like, _invalid_cursor, _apply_scalar_op, _feature_column_specs) → @staticmethods on their store classes.
grep '^def\|^async def' osa/infrastructure/data/*.py now returns nothing. This issue is to apply the same standard everywhere else (~170 module-level functions remain across osa/).
Decision needed first
Two categories are structurally forced to be module-level by the frameworks we use. Decide whether they are sanctioned exceptions before sweeping:
- FastAPI route handlers —
@router.get(...)-decorated functions (e.g. routes/data/catalog.py::get_node_catalog). FastAPI registers free functions; they can't trivially become methods.
- Dishka
@provide DI factories — provider functions in util/di / infrastructure/persistence/di.py.
My recommendation: explicitly exempt these two (document the exemption in CLAUDE.md), and enforce the rule on everything else.
Where to look (explicitly non-exhaustive — a map, not a plan)
Highest priority — functions taking a session/engine/conn (clearest repository smell):
infrastructure/persistence/seed.py::ensure_system_user(engine)
infrastructure/runner_utils.py::parse_session_from_s3(s3, prefix), parse_session_file(output_dir)
infrastructure/persistence/database.py::{create_db_engine, create_session_factory, get_session} — bootstrap factories; may warrant a Database class (or be exempted alongside DI).
Domain parse/codec functions that belong on their value objects:
domain/data/model/filter.py::parse_field_ref → classmethod on FieldRef
domain/data/model/query_plan.py::{encode_cursor, decode_cursor} → likely onto PaginationCursor / Keyset
Service-private recursion/helpers that could be methods:
domain/data/service/data_query.py::{_tree_depth, _iter_predicates}
domain/data/query/read_table.py::_pagination
Route-wiring helpers (in routes/data/, plus the rest of routes/):
routes/data/tables.py::{format_key, path_for, register_table_routes, _existing_operation_ids}
routes/data/_params.py::parse_sort, _streaming.py::{build_table_response, _next_cursor, ...}, serializers/csv.py::_stringify
Broader counts by area (excluding the now-clean infrastructure/data/): application/api/v1/routes ~18, infrastructure/persistence (+repository/mappers/adapter) ~20, domain/shared* ~8, plus scattered singles in infrastructure/{k8s,s3,oci,event,auth}, domain/{validation,semantics,ingest,deposition}. Run grep -rn '^def \|^async def ' osa --include='*.py' for the live list.
Related
Rule
A module-level function that carries behaviour — especially one taking a
session/engine/conn— is a repository/store/service method wearing a function costume. It hides where behaviour lives, can't be injected/swapped, and erodes the entry-point → service → repository layering.Precedent (already done for
/data/)PR #137 set the pattern for the
/data/adapters:schema_features.py, four loose functions, two doingsession.execute) →SchemaFeatureReader(a session-holding class) composed into both read stores.statement_timeout_sql,_escape_like,_invalid_cursor,_apply_scalar_op,_feature_column_specs) →@staticmethods on their store classes.grep '^def\|^async def' osa/infrastructure/data/*.pynow returns nothing. This issue is to apply the same standard everywhere else (~170 module-level functions remain acrossosa/).Decision needed first
Two categories are structurally forced to be module-level by the frameworks we use. Decide whether they are sanctioned exceptions before sweeping:
@router.get(...)-decorated functions (e.g.routes/data/catalog.py::get_node_catalog). FastAPI registers free functions; they can't trivially become methods.@provideDI factories — provider functions inutil/di/infrastructure/persistence/di.py.My recommendation: explicitly exempt these two (document the exemption in
CLAUDE.md), and enforce the rule on everything else.Where to look (explicitly non-exhaustive — a map, not a plan)
Highest priority — functions taking a
session/engine/conn(clearest repository smell):infrastructure/persistence/seed.py::ensure_system_user(engine)infrastructure/runner_utils.py::parse_session_from_s3(s3, prefix),parse_session_file(output_dir)infrastructure/persistence/database.py::{create_db_engine, create_session_factory, get_session}— bootstrap factories; may warrant aDatabaseclass (or be exempted alongside DI).Domain parse/codec functions that belong on their value objects:
domain/data/model/filter.py::parse_field_ref→ classmethod onFieldRefdomain/data/model/query_plan.py::{encode_cursor, decode_cursor}→ likely ontoPaginationCursor/KeysetService-private recursion/helpers that could be methods:
domain/data/service/data_query.py::{_tree_depth, _iter_predicates}domain/data/query/read_table.py::_paginationRoute-wiring helpers (in
routes/data/, plus the rest ofroutes/):routes/data/tables.py::{format_key, path_for, register_table_routes, _existing_operation_ids}routes/data/_params.py::parse_sort,_streaming.py::{build_table_response, _next_cursor, ...},serializers/csv.py::_stringifyBroader counts by area (excluding the now-clean
infrastructure/data/):application/api/v1/routes~18,infrastructure/persistence(+repository/mappers/adapter) ~20,domain/shared*~8, plus scattered singles ininfrastructure/{k8s,s3,oci,event,auth},domain/{validation,semantics,ingest,deposition}. Rungrep -rn '^def \|^async def ' osa --include='*.py'for the live list.Related
infrastructure/persistence/.