Stabilize backend unit smoke tests on Windows#7790
Conversation
Greptile SummaryThis PR stabilizes the backend unit test suite on Windows by patching several categories of import-time and runtime fragility, without changing any production code (except aligning documentation and test assertions with an already-live
Confidence Score: 4/5Safe to merge — all changes are confined to test infrastructure and documentation; no production logic is altered. The changes correctly address real portability gaps (UTF-8 encoding, missing stubs, timezone data, Jinja2 skipping). The only concern is the catch-all backend/tests/unit/test_action_item_date_validation.py — the Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[pytest collects test module] --> B{Existing stubs in sys.modules?}
B -- yes --> C[Save _preserved_stub_modules\nRemove from sys.modules]
B -- no --> D[Skip save step]
C --> E[Install _Finder in sys.meta_path]
D --> E
E --> F[Import module under test\n_Finder intercepts stub-path imports]
F --> G[finally block]
G --> H[sys.meta_path.remove _finder]
H --> I[Remove stubs created during import]
I --> J[sys.modules.update preserved stubs]
J --> K[Test functions run\nwith clean stub state]
subgraph Windows-specific fixes
W1[ZoneInfo fails — no IANA data] --> W2[_zoneinfo_for_test returns\nfixed-offset datetime.timezone]
W3[open without encoding] --> W4[open with encoding utf-8]
W5[jinja2 not installed] --> W6[pytest.importorskip skips test]
end
Reviews (1): Last reviewed commit: "test(backend): stabilize unit smoke on W..." | Re-trigger Greptile |
| def _zoneinfo_for_test(key): | ||
| try: | ||
| return ZoneInfo(key) | ||
| except Exception: | ||
| if key == "UTC": | ||
| return timezone.utc | ||
| if key == "Asia/Kolkata": | ||
| return timezone(timedelta(hours=5, minutes=30), key) | ||
| raise |
There was a problem hiding this comment.
Overly-broad exception catch in
_zoneinfo_for_test
except Exception swallows any error that ZoneInfo might raise, not just the expected ZoneInfoNotFoundError/KeyError. If someone passes a known key like "UTC" but ZoneInfo fails for an unrelated reason (e.g., corrupted module state), the function silently returns timezone.utc instead of surfacing the real problem. The fallback raise at the bottom only helps for unknown keys. Using except (ZoneInfoNotFoundError, KeyError) (both from zoneinfo) would constrain the catch to the specific Windows/no-tzdata scenario this helper targets.
There was a problem hiding this comment.
Addressed in fb41d75ab. _zoneinfo_for_test now imports ZoneInfoNotFoundError and only catches (ZoneInfoNotFoundError, KeyError), so unexpected ZoneInfo failures are no longer swallowed by the Windows fallback.
Revalidated on the Windows backend venv:
python -m pytest tests\unit\test_action_item_date_validation.py -q-> 26 passedpython -m black --line-length 120 --skip-string-normalization tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py --checkpython -m py_compile tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py
| async def _run_blocking(_executor, fn, *args, **kwargs): | ||
| return fn(*args, **kwargs) | ||
|
|
||
|
|
||
| sys.modules["utils.executors"].run_blocking = _run_blocking |
There was a problem hiding this comment.
_run_blocking stub executes fn in the calling thread, not in the executor
The stub bypasses executor dispatch entirely. If fn is itself synchronous but relies on the GIL being released (e.g., for lock-ordering), or if future production code passes an async callable by mistake, the behaviour under the stub diverges silently from production. A slightly safer alternative that still avoids real executor overhead is loop.run_in_executor(None, functools.partial(fn, *args, **kwargs)) — that preserves executor semantics in the event loop without spawning threads. Minor for unit tests, but worth noting since run_blocking is a shared abstraction used across many production paths.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Addressed in fb41d75ab. The stub now preserves executor semantics with:
loop = _asyncio.get_running_loop()
return await loop.run_in_executor(_executor, functools.partial(fn, *args, **kwargs))It also uses real ThreadPoolExecutor instances for the executor stubs so run_in_executor can call submit() normally.
Revalidated on the Windows backend venv:
python -m pytest tests\unit\test_async_app_integrations.py -q-> 9 passedpython -m black --line-length 120 --skip-string-normalization tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py --checkpython -m py_compile tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py
Summary
sys.modulespollution.jinja2is not installed, while still running when the dependency is present.storage_executortest/docs expectations with the current production configuration of 128 workers.Why
While running backend unit tests from a minimal Windows venv, several tests failed during collection or import before reaching their actual assertions. Most failures came from test-local stubs leaking into later tests, Windows lacking system IANA timezone data, Windows defaulting
open()to a non-UTF-8 code page, or optional template dependencies not being installed.Testing
python -m pytest tests\unit\test_action_item_date_validation.py tests\unit\test_action_item_dedup.py tests\unit\test_apps_review_reply_validation.py -q->39 passedpython -m pytest tests\unit\test_async_app_integrations.py tests\unit\test_async_geocoding.py tests\unit\test_async_http_infrastructure.py tests\unit\test_async_webhooks.py -q->83 passedpython -m pytest tests\unit\test_auth_redirect_uri.py tests\unit\test_available_plans_resilience.py tests\unit\test_sync_v2.py::TestBulkheadExecutors::test_executor_worker_counts -q->49 passed, 3 skippedpython -m black --line-length 120 --skip-string-normalization ... --checkpython -m py_compile ...git diff --check -- ...