Skip to content

refactor: eliminate module-level functions repo-wide (organise into repos/stores/services/methods) #144

@rorybyrne

Description

@rorybyrne

Rule

As a rule, we do NOT allow top-level functions. We organise things into repos, services, etc.

A module-level function that carries behaviour — especially one taking a session/engine/conn — is a repository/store/service method wearing a function costume. It hides where behaviour lives, can't be injected/swapped, and erodes the entry-point → service → repository layering.

Precedent (already done for /data/)

PR #137 set the pattern for the /data/ adapters:

  • Shared schema→feature resolution (was schema_features.py, four loose functions, two doing session.execute) → SchemaFeatureReader (a session-holding class) composed into both read stores.
  • Per-file helpers (statement_timeout_sql, _escape_like, _invalid_cursor, _apply_scalar_op, _feature_column_specs) → @staticmethods on their store classes.

grep '^def\|^async def' osa/infrastructure/data/*.py now returns nothing. This issue is to apply the same standard everywhere else (~170 module-level functions remain across osa/).

Decision needed first

Two categories are structurally forced to be module-level by the frameworks we use. Decide whether they are sanctioned exceptions before sweeping:

  1. FastAPI route handlers@router.get(...)-decorated functions (e.g. routes/data/catalog.py::get_node_catalog). FastAPI registers free functions; they can't trivially become methods.
  2. Dishka @provide DI factories — provider functions in util/di / infrastructure/persistence/di.py.

My recommendation: explicitly exempt these two (document the exemption in CLAUDE.md), and enforce the rule on everything else.

Where to look (explicitly non-exhaustive — a map, not a plan)

Highest priority — functions taking a session/engine/conn (clearest repository smell):

  • infrastructure/persistence/seed.py::ensure_system_user(engine)
  • infrastructure/runner_utils.py::parse_session_from_s3(s3, prefix), parse_session_file(output_dir)
  • infrastructure/persistence/database.py::{create_db_engine, create_session_factory, get_session} — bootstrap factories; may warrant a Database class (or be exempted alongside DI).

Domain parse/codec functions that belong on their value objects:

  • domain/data/model/filter.py::parse_field_ref → classmethod on FieldRef
  • domain/data/model/query_plan.py::{encode_cursor, decode_cursor} → likely onto PaginationCursor / Keyset

Service-private recursion/helpers that could be methods:

  • domain/data/service/data_query.py::{_tree_depth, _iter_predicates}
  • domain/data/query/read_table.py::_pagination

Route-wiring helpers (in routes/data/, plus the rest of routes/):

  • routes/data/tables.py::{format_key, path_for, register_table_routes, _existing_operation_ids}
  • routes/data/_params.py::parse_sort, _streaming.py::{build_table_response, _next_cursor, ...}, serializers/csv.py::_stringify

Broader counts by area (excluding the now-clean infrastructure/data/): application/api/v1/routes ~18, infrastructure/persistence (+repository/mappers/adapter) ~20, domain/shared* ~8, plus scattered singles in infrastructure/{k8s,s3,oci,event,auth}, domain/{validation,semantics,ingest,deposition}. Run grep -rn '^def \|^async def ' osa --include='*.py' for the live list.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    design-neededNeeds architectural discussion before implementationrefactorInternal restructuring, no behavior changetech-debtKnown shortcuts to address later

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions