refactor(e2e): decompose E2ERunner (998 LoC) into 4 collaborators (#1941) #1990

Merged: mvillmow merged 2 commits into main from 1941-decompose-runner-core on May 17, 2026

@mvillmow
Collaborator

Summary

The prior decomposition (PR #27034cf0) moved runner.py's 998-line E2ERunner class into runner_internals/runner_core.py without actually splitting responsibilities. This PR does the real decomposition:

| Collaborator | Responsibility | LoC |
| --- | --- | --- |
| RunnerSetup | init, config validation, workspace prep | 106 |
| RunnerExecution | execution loop, tier dispatch, subtest fan-out | 261 |
| RunnerFinalization | result writing, summary, cleanup | 154 |
| RunnerResume | resume-from-checkpoint logic | 143 |

E2ERunner is now a thin orchestrator (runner_core.py = 440 lines, was 998). The remaining size is the public run() method, the _action_exp_* state-machine callbacks, and back-compat delegating wrappers required by existing tests that use patch.object(runner, "_method", ...). Public API unchanged — all 1697 e2e unit tests pass without modification.
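The orchestrator-plus-delegating-wrapper arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the code in runner_core.py: the method names `run_all`, `run_tier`, and `_run_tier`, and the string return values, are assumptions made for the example. The point it shows is that a collaborator which calls back through the runner's wrapper still honours `patch.object(runner, "_method", ...)`.

```python
from unittest.mock import patch


class RunnerExecution:
    """Hypothetical collaborator that owns the execution loop."""

    def __init__(self, runner: "E2ERunner") -> None:
        # Back-reference so calls can route through the runner's
        # delegating wrappers, which tests may have patched.
        self._runner = runner

    def run_all(self, tiers: list[str]) -> list[str]:
        # Look the method up on the runner at call time, so an
        # instance-level patch of runner._run_tier takes effect.
        return [self._runner._run_tier(tier) for tier in tiers]

    def run_tier(self, tier: str) -> str:
        return f"ran {tier}"


class E2ERunner:
    """Thin orchestrator: owns collaborators, exposes the public API."""

    def __init__(self) -> None:
        self._execution = RunnerExecution(self)

    def run(self, tiers: list[str]) -> list[str]:
        return self._execution.run_all(tiers)

    def _run_tier(self, tier: str) -> str:
        # Back-compat delegating wrapper kept for tests that patch it.
        return self._execution.run_tier(tier)


runner = E2ERunner()
assert runner.run(["T0", "T1"]) == ["ran T0", "ran T1"]

# An existing-style test can still stub the private method on the runner.
with patch.object(runner, "_run_tier", return_value="stubbed"):
    assert runner.run(["T0"]) == ["stubbed"]
```

The wrappers add lines to the orchestrator, which is consistent with the note above that part of the remaining 440 lines is back-compat delegation.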

Deferred (tracked in #1941)

  • tier_manager_internals/workspace.py: the 432-line prepare_workspace (currently # noqa: C901)
  • TierManager(WorkspaceMixin, ResourcesMixin, BaselineMixin, TierManagerBase) — mixins → composition
  • Other >650-line files: stages.py, executor/runner.py, checkpoint.py, subtest_executor.py, cli/main.py, judge/evaluator.py

Test plan

  • pixi run pytest tests/unit/e2e/ — 1697 passed, 1 skipped
  • New smoke tests in tests/unit/e2e/test_runner_collaborators.py exercise each collaborator
  • pre-commit run --files <changed> (ruff, mypy strict, C901) passes

Refs #1941

🤖 Generated with Claude Code

@mvillmow mvillmow enabled auto-merge (squash) May 12, 2026 04:00
@mvillmow mvillmow force-pushed the 1941-decompose-runner-core branch from 60a84bc to 43e5ef3 May 16, 2026 15:52
runner_internals/runner_core.py was 998 lines in a single E2ERunner class
— exactly what the prior "cosmetic" decomposition (PR #27034cf0) was
supposed to address. This PR splits responsibility into 4 collaborators:

- RunnerSetup — initialization, config validation, workspace prep
- RunnerExecution — experiment-execution loop, tier dispatch
- RunnerFinalization — result writing, summary, cleanup
- RunnerResume — resume-from-checkpoint logic

E2ERunner is now a thin orchestrator. Public API unchanged.

Deferred to follow-ups (tracked in #1941):
- tier_manager_internals/workspace.py decomposition
- TierManager mixins → composition
- The other 5 oversized files (stages.py, executor/runner.py, etc.)

Semantic rebase note: rebased onto post-#1940 main, which extracted the
scylla.persistence package. Migrated runner_internals collaborators from
scylla.e2e.checkpoint / scylla.e2e.experiment_result_writer /
scylla.e2e.rehydrate to their scylla.persistence equivalents (the e2e
paths are now back-compat shims; internal e2e modules should import
directly from the implementation package per #1940 policy).
load_checkpoint and validate_checkpoint_config remain module-scope
imports in runner_core so existing tests that patch them via
scylla.e2e.runner_internals.runner_core.<name> continue to work.
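Why the module-scope imports matter for patching can be shown with a small self-contained sketch. The module names `fake_persistence` and `fake_runner_core` and the function `resume` are hypothetical stand-ins built in memory, not the real scylla modules. The sketch relies only on standard `unittest.mock` behaviour: `patch` replaces the name in the module where it is looked up.

```python
import sys
import types
from unittest.mock import patch

# Stand-in for the implementation package (hypothetical name).
impl = types.ModuleType("fake_persistence")
impl.load_checkpoint = lambda path: f"real:{path}"
sys.modules["fake_persistence"] = impl

# Stand-in for runner_core: it binds load_checkpoint at module scope
# and its functions look the name up in their own module globals.
core = types.ModuleType("fake_runner_core")
sys.modules["fake_runner_core"] = core
exec(
    "from fake_persistence import load_checkpoint\n"
    "def resume(path):\n"
    "    return load_checkpoint(path)\n",
    core.__dict__,
)

assert core.resume("ckpt.json") == "real:ckpt.json"

# Patching the name on the module that uses it takes effect.
with patch("fake_runner_core.load_checkpoint", return_value="stub"):
    assert core.resume("ckpt.json") == "stub"

# Patching only the defining module does not, because fake_runner_core
# holds its own binding to the original function.
with patch("fake_persistence.load_checkpoint", return_value="stub"):
    assert core.resume("ckpt.json") == "real:ckpt.json"
```

If the import were moved into a collaborator module instead, a test patching the runner_core path would presumably either fail to find the attribute or patch a binding that nothing reads, which is the breakage the rebase note avoids.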

Refs #1941

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@mvillmow mvillmow force-pushed the 1941-decompose-runner-core branch from 43e5ef3 to cd5fa71 May 17, 2026 01:42
…_body dispatch

After the #1941 decomposition, ``RunnerExecution.run_tier`` called the
collaborator's own ``run_tier_body`` and dropped the per-tier duration
gauge that main's ``E2ERunner._run_tier`` emits. Two regressions resulted:

1. ``scylla_tier_duration_seconds`` was never emitted, breaking
   ``test_tier_duration_emitted_in_run_tier``.
2. The same test monkeypatches ``runner._run_tier_body``; dispatching
   through the collaborator bypassed that patch and let the real body
   run, which tripped ``save_checkpoint`` on a ``/dev/null`` path.

Restore both: dispatch via ``runner._run_tier_body`` (delegate stays in
runner_core) and emit ``scylla_tier_duration_seconds`` in the finally
block, exactly as main's pre-decomp implementation did.

Also pick up ruff's auto-applied import-order/whitespace fixes across
the five decomposed runner_internals modules.
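The dispatch and gauge fix described in this commit message can be sketched as below. This is an illustration under assumptions rather than the merged code: `emit_gauge` and the `EMITTED` list are hypothetical stand-ins for whatever metrics API the project uses, and the method bodies are reduced to string returns. Only the metric name `scylla_tier_duration_seconds` and the `_run_tier_body` delegate come from the text above.

```python
import time
from unittest.mock import patch

# Hypothetical metrics sink: records (name, value) pairs.
EMITTED: list[tuple[str, float]] = []


def emit_gauge(name: str, value: float) -> None:
    EMITTED.append((name, value))


class RunnerExecution:
    def __init__(self, runner: "E2ERunner") -> None:
        self._runner = runner

    def run_tier(self, tier: str) -> str:
        start = time.monotonic()
        try:
            # Dispatch via the runner's delegate, not self.run_tier_body,
            # so a test's patch of runner._run_tier_body is honoured.
            return self._runner._run_tier_body(tier)
        finally:
            # Runs on success and on failure alike.
            emit_gauge("scylla_tier_duration_seconds", time.monotonic() - start)

    def run_tier_body(self, tier: str) -> str:
        return f"body:{tier}"


class E2ERunner:
    def __init__(self) -> None:
        self._execution = RunnerExecution(self)

    def _run_tier(self, tier: str) -> str:
        return self._execution.run_tier(tier)

    def _run_tier_body(self, tier: str) -> str:
        # Delegate stays on the runner so it remains patchable.
        return self._execution.run_tier_body(tier)


runner = E2ERunner()
with patch.object(runner, "_run_tier_body", return_value="stubbed"):
    assert runner._run_tier("T0") == "stubbed"  # real body never ran
assert EMITTED[-1][0] == "scylla_tier_duration_seconds"
```

Placing the emit in `finally` means the gauge is recorded even when the tier body raises, which matches the "exactly as main's pre-decomp implementation did" description only to the extent that main also used a finally block, as the message states.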
@mvillmow mvillmow disabled auto-merge May 17, 2026 03:52
@mvillmow mvillmow merged commit 605559f into main May 17, 2026
23 checks passed
@mvillmow mvillmow deleted the 1941-decompose-runner-core branch May 17, 2026 03:52