Refactor registry, handler, and config builder for auditability and static typing #1

@Michael628

Target branch: feature/issue-1-registry-refactor (based off develop)
All PRs for this work should target this branch, not main.

Problem Statement

The current config handler system (ConfigHandler, HandlerRegistry) uses dynamic setattr/inspect.signature/functools.partial machinery to attach functions to handler objects at runtime. This makes handlers opaque to static analysis and type checkers — you cannot know what methods a handler has without tracing registration calls. The handler also couples data and behavior by storing a mutable config property that gets injected into method calls via partial, creating hidden state dependencies. The distinction between configs and config handlers is ad-hoc: configs carry key ClassVars that only serve handler identity, and the same config type can end up registered under multiple scopes (nanny, a2a) with unclear ownership of build lifecycle hooks (preprocess, postprocess, validate).

For AI-driven development workflows, this design is difficult to audit — an AI assistant cannot reliably determine handler capabilities, and the dynamic dispatch makes refactoring risky.
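To make the auditability problem concrete, here is a contrived sketch of the kind of dynamic attachment described above. It is not the project's actual code: DynamicHandler, attach, and the build_input_params body are hypothetical, and a plain dict stands in for a config object.

```python
import functools
import inspect


class DynamicHandler:
    """Hypothetical stand-in for a handler that gains methods at runtime."""

    def __init__(self, config):
        self.config = config  # state that attached functions read implicitly

    def attach(self, func):
        name = func.__name__
        # If the function declares a `config` parameter, bind the stored config
        # with partial so callers never pass it themselves.
        if "config" in inspect.signature(func).parameters:
            func = functools.partial(func, config=self.config)
        setattr(self, name, func)


def build_input_params(config, extra=0):
    return {"size": config["size"] + extra}


handler = DynamicHandler({"size": 4})
handler.attach(build_input_params)

# The call works, but config is injected invisibly...
result = handler.build_input_params(extra=1)
# ...and nothing on the class tells a reader or type checker the method exists.
declared = "build_input_params" in vars(DynamicHandler)
```

Here result is {"size": 5} while declared is False: the only way to learn that the handler has build_input_params is to find the attach call.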

Solution

Split the current monolithic ConfigHandler/HandlerRegistry system into three cleanly separated concerns:

  1. Build Hooks Registry — a global singleton mapping config types to their build lifecycle functions (preprocess, postprocess, validate). Enforced 1:1 at registration time. This is what the config builder uses during recursive construction.

  2. Handler Registry — a global singleton mapping explicit string keys to TaskHandler frozen dataclasses containing a config type and optional domain callable fields. No scopes, no dynamic method attachment. Protocol checks at lookup time determine handler capabilities.

  3. Revised config builder: build_config(config_type, params, file_params) with no handler awareness. Recurses on composite config structure and looks up build hooks from the singleton registry by config type.

Config types lose their key ClassVar. Handler identity is an explicit string provided at registration. The register_a2a function is eliminated since the a2a module only needs the builder and build hooks, not handlers. Config is always passed explicitly to handler functions, never stored on the handler.

User Stories

  1. As a developer adding a new task, I want to register my config type's build hooks and handler functions in a single call, so that registration is simple but the two concerns are stored separately.
  2. As a developer, I want handler keys to be explicit strings at registration, so that I don't need to put identity on my config classes.
  3. As a developer, I want the build hooks registry to error if I accidentally register hooks for a config type that already has them, so that I catch conflicts at import time.
  4. As a developer, I want TaskHandler to be a frozen dataclass with typed callable fields, so that I can see what domain functions a handler provides without tracing runtime registration.
  5. As a developer, I want to check handler capabilities via Protocol (TaskHandlerProtocol, AggregatorProtocol, etc.) at lookup time, so that callers declare what they need and get clear errors if the handler doesn't satisfy it.
  6. As a developer, I want config passed explicitly to every handler function call, so that there's no hidden mutable state on the handler.
  7. As a developer, I want build_config to take only (config_type, params, file_params) with no handler parameter, so that the builder is decoupled from the handler system.
  8. As a developer, I want composite configs to keep their subconfig field declarations, so that the config's data structure remains self-describing.
  9. As a developer, I want the build hooks registry to be a global singleton populated at import time, so that the builder can find hooks without them being threaded through recursive calls.
  10. As a developer using the a2a module, I want to build configs using only the builder and build hooks registry, so that I don't need to register handlers for configs that have no domain functions.
  11. As a developer, I want handler and build hooks registries to have clear() methods, so that tests can run in isolation.
  12. As a developer, I want to reuse the same config type across different handlers without conflicts, since build hooks are keyed by config type (1:1) and handlers are keyed by explicit string.
  13. As a developer reading the codebase, I want the separation between "how to build a config" (build hooks) and "what to do with a built config" (handler) to be obvious from the module structure.
  14. As a developer, I want no scope system in the handler registry, since handler keys are already unique strings and scopes added indirection without value.
  15. As a developer, I want the dynamic format_string registration in nanny/core.py eliminated, since ConfigBase already has format_string as a method and config is passed explicitly to all calls.
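Stories 5 and 6 could look roughly like the sketch below. The protocol and field names come from this issue; the get(key, requires) signature, the error types, and SolverConfig are assumptions. It relies on CPython's behavior that isinstance() against a runtime-checkable protocol treats a method member set to None as missing, which is what lets optional fields defaulting to None signal "capability not provided".

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Protocol, Type, runtime_checkable


@runtime_checkable
class InputBuilderProtocol(Protocol):
    def build_input_params(self, config: Any) -> Dict[str, Any]: ...


@runtime_checkable
class AggregatorProtocol(Protocol):
    def build_aggregator_params(self, config: Any) -> Dict[str, Any]: ...


@dataclass(frozen=True)
class TaskHandler:
    config_type: type
    build_input_params: Optional[Callable[..., Any]] = None
    build_aggregator_params: Optional[Callable[..., Any]] = None


_HANDLERS: Dict[str, TaskHandler] = {}


def get(key: str, requires: Type[Any]) -> TaskHandler:
    """Look up a handler and check that it provides the requested capability."""
    try:
        handler = _HANDLERS[key]
    except KeyError:
        raise KeyError(f"no handler registered under {key!r}") from None
    # A field left at None fails the protocol check, so callers get a clear
    # error at lookup time rather than a "NoneType is not callable" later.
    if not isinstance(handler, requires):
        raise TypeError(f"handler {key!r} does not satisfy {requires.__name__}")
    return handler


# Usage with a hypothetical config type.
@dataclass
class SolverConfig:
    tolerance: float = 1e-8


def solver_inputs(config):
    return {"tol": config.tolerance}


_HANDLERS["solver"] = TaskHandler(SolverConfig, build_input_params=solver_inputs)

handler = get("solver", InputBuilderProtocol)
# Config is passed explicitly; nothing is stored on the handler.
params = handler.build_input_params(SolverConfig(tolerance=1e-6))
```

Asking for get("solver", AggregatorProtocol) raises TypeError here, because that handler was registered without build_aggregator_params.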

Implementation Decisions

  • Build Hooks Registry: New module. Global singleton dict[type[ConfigBase], BuildHooks]. BuildHooks is a frozen dataclass with optional preprocess, postprocess, validate callable fields. register() raises ValueError if the config type already has hooks. get() returns BuildHooks | None. clear() for test isolation.
  • Handler Registry: Rewrite of existing registry module. Global singleton dict[str, TaskHandler]. TaskHandler is a frozen dataclass with config_type: type[ConfigBase] plus optional callable fields (build_input_params, create_outfile_catalog, build_aggregator_params). No ConfigHandler class. No setattr, inspect, or partial. Interface: register(key, config_type, **callables), get(key), clear().
  • Protocols: Rewrite of existing protocols module. InputBuilderProtocol, OutfileCatalogProtocol, AggregatorProtocol as individual capability protocols. TaskHandlerProtocol composing the required ones. All protocol methods expect config as an explicit first parameter (not self-injected). Runtime-checkable for isinstance checks at lookup time.
  • Config Builder: Rewrite of build_config function. Signature becomes build_config(config_type, params, file_params). Looks up build hooks from the global singleton. Recurses on composite config fields. ConfigBuilder class internals are unchanged in this refactor (noted for future cleanup — it has too many responsibilities).
  • Registration convenience functions: register_task(key, config_type, ...) splits registration across both registries. Module-specific wrappers can prepend prefixes to the key (e.g., hadrons tasks prepend "hadrons_"). register_a2a is eliminated.
  • Config types: Remove key: ClassVar[str] from all config dataclasses. Composite configs keep their subconfig field declarations unchanged.
  • Nanny core: Remove dynamic format_string registration from nanny/core.py. Callers use config.format_string() directly.
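A rough sketch of the revised builder's control flow. The hook call order (preprocess on raw params, recursive subconfig construction, postprocess, then validate), the hook signatures, and the pass-through treatment of file_params are all assumptions, since this issue does not pin them down. The real function delegates to ConfigBuilder, which is not modelled here, and GridConfig and JobConfig are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Callable, Dict, Optional, Type


class ConfigBase:
    """Stand-in for the project's ConfigBase."""


@dataclass(frozen=True)
class BuildHooks:
    preprocess: Optional[Callable[[Dict[str, Any]], Dict[str, Any]]] = None
    postprocess: Optional[Callable[[Any], Any]] = None
    validate: Optional[Callable[[Any], None]] = None


_BUILD_HOOKS: Dict[Type[ConfigBase], BuildHooks] = {}


def build_config(config_type, params, file_params):
    """Build config_type from params, recursing into subconfig fields."""
    # Hooks come from the global registry, keyed by config type.
    hooks = _BUILD_HOOKS.get(config_type) or BuildHooks()
    params = dict(params)
    if hooks.preprocess:
        params = hooks.preprocess(params)

    kwargs = {}
    for f in fields(config_type):
        if f.name not in params:
            continue  # let the dataclass default apply
        value = params[f.name]
        # Composite field: the annotation is itself a ConfigBase subclass.
        # (Requires real class annotations, not string annotations.)
        if (isinstance(f.type, type) and issubclass(f.type, ConfigBase)
                and isinstance(value, dict)):
            value = build_config(f.type, value, file_params)
        kwargs[f.name] = value

    config = config_type(**kwargs)
    if hooks.postprocess:
        config = hooks.postprocess(config)  # assumed to return the config
    if hooks.validate:
        hooks.validate(config)
    return config


@dataclass
class GridConfig(ConfigBase):
    size: int = 8


@dataclass
class JobConfig(ConfigBase):
    name: str = ""
    grid: GridConfig = field(default_factory=GridConfig)


calls = []


def double_size(params):
    calls.append("preprocess:GridConfig")
    return {**params, "size": params["size"] * 2}


def check_size(config):
    calls.append("validate:GridConfig")
    if config.size <= 0:
        raise ValueError("size must be positive")


_BUILD_HOOKS[GridConfig] = BuildHooks(preprocess=double_size, validate=check_size)

job = build_config(JobConfig, {"name": "run1", "grid": {"size": 4}}, {})
```

JobConfig has no hooks of its own, yet the GridConfig hooks still run during recursion because the lookup is by config type rather than by anything threaded through the call.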

Testing Decisions

Good tests for this refactor should test external behavior through the public interfaces, not implementation details like internal dict structure. Tests should verify registration semantics, lookup behavior, error conditions, and end-to-end config building.

Modules to test:

  1. Build Hooks Registry — test registration, 1:1 enforcement (error on duplicate), lookup for registered and unregistered types, clear() isolation.
  2. Handler Registry — test registration with explicit keys, lookup, Protocol satisfaction checks at lookup time, missing handler errors, clear() isolation.
  3. Config Builder — test end-to-end building of simple and composite configs, verify preprocess/postprocess/validate hooks are called in correct order, test recursive subconfig construction with per-type hooks.

Approach: TDD on the new branch, using the old branch's integration tests as the behavioral baseline.
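As an illustration of testing through the public interface, the sketch below exercises register(), get(), and clear() on a stand-in build hooks registry. BuildHooksRegistry, ExampleConfig, and the use of unittest are assumptions; the real tests would import the project's registry instead, and plain object() instances stand in for BuildHooks values.

```python
import unittest


class BuildHooksRegistry:
    """Minimal stand-in exposing the register/get/clear interface."""

    def __init__(self):
        self._hooks = {}

    def register(self, config_type, hooks):
        if config_type in self._hooks:
            raise ValueError(f"hooks already registered for {config_type.__name__}")
        self._hooks[config_type] = hooks

    def get(self, config_type):
        return self._hooks.get(config_type)

    def clear(self):
        self._hooks.clear()


registry = BuildHooksRegistry()  # module-level singleton


class ExampleConfig:
    """Hypothetical config type used only by the tests."""


class BuildHooksRegistryTest(unittest.TestCase):
    def setUp(self):
        # The singleton is shared, so each test starts from an empty registry.
        registry.clear()

    def test_lookup_returns_registered_hooks(self):
        hooks = object()
        registry.register(ExampleConfig, hooks)
        self.assertIs(registry.get(ExampleConfig), hooks)

    def test_duplicate_registration_raises(self):
        registry.register(ExampleConfig, object())
        with self.assertRaises(ValueError):
            registry.register(ExampleConfig, object())

    def test_unregistered_type_returns_none(self):
        self.assertIsNone(registry.get(ExampleConfig))
```

None of the tests touch the internal dict, so they would keep passing if the storage changed. Without the clear() call in setUp, the outcome would depend on test execution order.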

Out of Scope

  • ConfigBuilder internal refactoring (too many responsibilities — noted for future work)
  • Changes to the nanny job submission system beyond removing handler coupling
  • Changes to the a2a contraction engine beyond eliminating register_a2a
  • New CLI or user-facing API changes
  • Migration of task domain logic (e.g., build_input_params implementations) — only the registration and invocation patterns change

Further Notes

  • The refactor should be done incrementally on a new branch, migrating handlers one at a time rather than as a big-bang rewrite.
  • The ConfigBuilder has been identified as having too many responsibilities (type coercion, string formatting, Outfile resolution, field assignment). This is a separate refactor to tackle after the registry/handler/builder boundary is clean.
  • Registration at import time is intentional and must be preserved — the system is designed for extensibility where users add their own configs and handlers.
  • The data/behavior separation (handlers are bags of plain functions, not OOP classes) is a core design value to preserve.
