refactor(e2e): decompose E2ERunner (998 LoC) into 4 collaborators (#1941)#1990
Merged
Conversation
60a84bc to
43e5ef3
Compare
runner_internals/runner_core.py was 998 lines in a single E2ERunner class — exactly what the prior "cosmetic" decomposition (PR #27034cf0) was supposed to address. This PR splits responsibility into 4 collaborators: - RunnerSetup — initialization, config validation, workspace prep - RunnerExecution — experiment-execution loop, tier dispatch - RunnerFinalization — result writing, summary, cleanup - RunnerResume — resume-from-checkpoint logic E2ERunner is now a thin orchestrator. Public API unchanged. Deferred to follow-ups (tracked in #1941): - tier_manager_internals/workspace.py decomposition - TierManager mixins → composition - The other 5 oversized files (stages.py, executor/runner.py, etc.) Semantic rebase note: rebased onto post-#1940 main, which extracted the scylla.persistence package. Migrated runner_internals collaborators from scylla.e2e.checkpoint / scylla.e2e.experiment_result_writer / scylla.e2e.rehydrate to their scylla.persistence equivalents (the e2e paths are now back-compat shims; internal e2e modules should import directly from the implementation package per #1940 policy). load_checkpoint and validate_checkpoint_config remain module-scope imports in runner_core so existing tests that patch them via scylla.e2e.runner_internals.runner_core.<name> continue to work. Refs #1941 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
43e5ef3 to
cd5fa71
Compare
…_body dispatch After the #1941 decomposition, ``RunnerExecution.run_tier`` called the collaborator's own ``run_tier_body`` and dropped the per-tier duration gauge that main's ``E2ERunner._run_tier`` emits. Two regressions resulted: 1. ``scylla_tier_duration_seconds`` was never emitted, breaking ``test_tier_duration_emitted_in_run_tier``. 2. The same test monkeypatches ``runner._run_tier_body``; dispatching through the collaborator bypassed that patch and let the real body run, which tripped ``save_checkpoint`` on a ``/dev/null`` path. Restore both: dispatch via ``runner._run_tier_body`` (delegate stays in runner_core) and emit ``scylla_tier_duration_seconds`` in the finally block, exactly as main's pre-decomp implementation did. Also pick up ruff's auto-applied import-order/whitespace fixes across the five decomposed runner_internals modules.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The prior decomposition (PR #27034cf0) moved
runner.py's 998-lineE2ERunnerclass intorunner_internals/runner_core.pywithout actually splitting responsibilities. This PR does the real decomposition:RunnerSetupRunnerExecutionRunnerFinalizationRunnerResumeE2ERunneris now a thin orchestrator (runner_core.py= 440 lines, was 998). The remaining size is the publicrun()method, the_action_exp_*state-machine callbacks, and back-compat delegating wrappers required by existing tests that usepatch.object(runner, "_method", ...). Public API unchanged — all 1697 e2e unit tests pass without modification.Deferred (tracked in #1941)
tier_manager_internals/workspace.py432-lineprepare_workspace(currently# noqa: C901)TierManager(WorkspaceMixin, ResourcesMixin, BaselineMixin, TierManagerBase)— mixins → compositionstages.py,executor/runner.py,checkpoint.py,subtest_executor.py,cli/main.py,judge/evaluator.pyTest plan
pixi run pytest tests/unit/e2e/— 1697 passed, 1 skippedtests/unit/e2e/test_runner_collaborators.pyexercise each collaboratorpre-commit run --files <changed>(ruff, mypy strict, C901) passesRefs #1941
🤖 Generated with Claude Code