Run standard venv operator tests DB-free so they parallelize under xdist#68533
Run standard venv operator tests DB-free so they parallelize under xdist#68533potiuk wants to merge 1 commit into
Conversation
PythonVirtualenvOperator/ExternalPythonOperator tests dominate the `Providers[standard]` suite (~450s of ~510s) because they build real virtualenvs, and they were forced to run serially: they were marked `db_test`, so pytest-xdist could not parallelize them. Their only real DB coupling was the test harness, not the metadata DB — but the venv subprocess reconnects to the supervisor over a socket, so they cannot use the plain `run_task` mock (which has no real socket and fails with `OSError: Socket operation on non-socket`). Add `tests_common.test_utils.in_process_taskrun.run_task_no_db`, which drives `InProcessTestSupervisor`'s real socketpair with an in-memory stub Execution-API client, so a venv operator runs with a working supervisor socket and no metadata DB. Convert `TestPythonVirtualenvOperator` and `TestExternalPythonOperator` onto it via a `_DBFreeVenvRun` mixin and drop their `db_test` mark, so they parallelize under `--skip-db-tests --use-xdist`. The module-level `db_test`/`need_serialized_dag` mark moves to the remaining DB classes. Branch venv classes keep `db_test` (their multi-task state assertions need a real DAG run). Measured: the two classes run ~3x faster at `-n 4` (157 tests, 89s vs ~266s serial); larger gains at CI worker counts.
jscheffl
left a comment
There was a problem hiding this comment.
One comment to this PR - I remember back when VirtualEnvCaching was added that we made it a way that all tests use a cached Venv - have no oversight - is this still true? Or do we "waste" time in the venv tests because the venv is re-installed multiple times?
| return ErrorResponse(error=ErrorType.CONNECTION_NOT_FOUND) | ||
|
|
||
|
|
||
| class _InMemoryExecutionClient: |
There was a problem hiding this comment.
Most of this in-memory client already exists, just not collected in one place:
Client(dry_run=True)installsnoop_handler, a no-network/no-DB client that already backsInProcessTestSupervisor(its_api_clientbuilds the client withdry_run=True). The only thing it lacks is remembering writes:noop_handlerdiscards them ("It doesn't make sense for returning connections etc.").run_taskalready round-trips XCom in-memory by spyingXCom.set/XCom.get_one(pytest_plugin.py), which is what_StubXComsre-implements.
So the genuinely new capability is "a dry-run client that remembers." Teaching noop_handler/dry-run to round-trip from an in-memory dict would let this client and the three _Stub* classes collapse into the existing one instead of a test-only parallel.
Separately, the __getattr__ MagicMock fallback (L113) returns a MagicMock for any unmodeled resource (assets, dag_runs, task_store), so a future venv test that round-trips one of those would pass against a mock instead of real data. Raising on unmodeled names (keeping the __-dunder guard) makes that fail loudly.
| return mock.MagicMock(name=f"stub_client.{name}") | ||
|
|
||
|
|
||
| class TaskRunResultNoDB: |
There was a problem hiding this comment.
This duplicates the existing TaskRunResult, which exposes the same state/msg/error and is exactly what _StubBackendSupervisor.start() returns (captured as result at L189). Only xcom_get is new. Returning the existing TaskRunResult plus a small xcom accessor avoids a second result type that has to be kept in sync with the first.
|
|
||
| client = _InMemoryExecutionClient(ti_context, variables=variables, connections=connections) | ||
|
|
||
| class _StubBackendSupervisor(InProcessTestSupervisor): |
There was a problem hiding this comment.
The "override _api_client to skip the DB" seam is already an established pattern (test_supervisor.py does exactly this with a fake client), and run_task_in_process already wraps .start(). Rather than a subclass per call, consider an optional client= param on InProcessTestSupervisor.start() (it currently hardcodes client=cls._api_client(task.dag) at #L2038). Callers inject the in-memory client, the subclass goes away, and run_task_no_db becomes a thin wrapper over run_task_in_process.
PythonVirtualenvOperator/ExternalPythonOperatortests dominate theProviders[standard]suite (~450s of ~510s) because they build real virtualenvs, yet they were forced to run serially: they were markeddb_test, so pytest-xdist could not parallelize them.Their only real DB coupling was the test harness, not the metadata DB — but the venv subprocess reconnects to the supervisor over a socket, so they can't use the plain
run_taskmock (no real socket →OSError: Socket operation on non-socket).This adds
tests_common.test_utils.in_process_taskrun.run_task_no_db, which drivesInProcessTestSupervisor's real socketpair with an in-memory stub Execution-API client — so a venv operator runs with a working supervisor socket and no metadata DB.TestPythonVirtualenvOperatorandTestExternalPythonOperatorare converted onto it via a_DBFreeVenvRunmixin and drop theirdb_testmark, so they parallelize under--skip-db-tests --use-xdist. The module-leveldb_test/need_serialized_dagmark moves to the remaining DB classes; branch venv classes keepdb_test(their multi-task state assertions need a real DAG run).Measured: the two classes run ~3× faster at
-n 4(157 tests, 89s vs ~266s serial); larger gains at CI worker counts. Validated DB-free runs (External 43, Virtualenv 114, 0 failures) and confirmed no regression in the remaining DB classes.Was generative AI tooling used to co-author this PR?
Generated-by: Claude Code (Opus 4.8) following the guidelines