Stabilize backend unit smoke tests on Windows by tianmind-studio · Pull Request #7790 · BasedHardware/omi

tianmind-studio · 2026-06-10T10:49:23Z

Summary

Make backend unit tests more robust in lightweight Windows environments by tightening test stubs and avoiding cross-test sys.modules pollution.
Add UTF-8 source reads for tests that inspect Python files, avoiding Windows default-codepage decode failures.
Let auth callback template rendering tests skip when optional jinja2 is not installed, while still running when the dependency is present.
Align storage_executor test/docs expectations with the current production configuration of 128 workers.
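The optional-dependency skip in the third bullet can be illustrated with a standard-library analogue of `pytest.importorskip`. This is a minimal sketch, not the PR's test code: `import_or_skip`, `TemplateRenderingTest`, and the template string are hypothetical names chosen for illustration, and `unittest` is swapped in for pytest so the snippet needs nothing beyond the standard library.

```python
import importlib
import unittest


def import_or_skip(test_case, module_name):
    """Return the imported module, or skip the calling test if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        test_case.skipTest(f"{module_name} is not installed")


class TemplateRenderingTest(unittest.TestCase):  # hypothetical test class
    def test_renders_redirect_uri(self):
        # Skips on minimal environments, runs normally when jinja2 is present.
        jinja2 = import_or_skip(self, "jinja2")
        template = jinja2.Template("redirect to {{ uri }}")
        self.assertEqual(template.render(uri="app://callback"), "redirect to app://callback")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TemplateRenderingTest)
result = unittest.TestResult()
suite.run(result)
```

Either way the run counts as successful: the test is reported as skipped when jinja2 is absent and as passed when it is installed.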

Why

While running backend unit tests from a minimal Windows venv, several tests failed during collection or import before reaching their actual assertions. Most failures came from test-local stubs leaking into later tests, Windows lacking system IANA timezone data, Windows defaulting open() to a non-UTF-8 code page, or optional template dependencies not being installed.
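The code-page issue mentioned above comes from `open()` falling back to the locale's preferred encoding when none is given. A minimal sketch of the explicit-encoding read, assuming a hypothetical helper `read_source` and a throwaway sample file (neither is from the PR):

```python
import tempfile
from pathlib import Path


def read_source(path):
    """Read a Python source file as UTF-8 regardless of the platform default."""
    return Path(path).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample_module.py"  # hypothetical file for the demo
    sample.write_text('GREETING = "héllo → wörld"\n', encoding="utf-8")
    source = read_source(sample)
```

Passing `encoding="utf-8"` on both the write and the read keeps the non-ASCII characters intact on any platform.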

Testing

python -m pytest tests\unit\test_action_item_date_validation.py tests\unit\test_action_item_dedup.py tests\unit\test_apps_review_reply_validation.py -q -> 39 passed
python -m pytest tests\unit\test_async_app_integrations.py tests\unit\test_async_geocoding.py tests\unit\test_async_http_infrastructure.py tests\unit\test_async_webhooks.py -q -> 83 passed
python -m pytest tests\unit\test_auth_redirect_uri.py tests\unit\test_available_plans_resilience.py tests\unit\test_sync_v2.py::TestBulkheadExecutors::test_executor_worker_counts -q -> 49 passed, 3 skipped
python -m black --line-length 120 --skip-string-normalization ... --check
python -m py_compile ...
git diff --check -- ...

greptile-apps · 2026-06-10T10:54:25Z

Greptile Summary

This PR stabilizes the backend unit test suite on Windows by patching several categories of import-time and runtime fragility, without changing any production code (except aligning documentation and test assertions with an already-live storage_executor worker count increase from 96 → 128).

Windows compatibility fixes: adds encoding='utf-8' to every source-file read, introduces a _zoneinfo_for_test helper that falls back to fixed-offset datetime.timezone objects when IANA timezone data is absent, and stubs fastapi.templating before importing routers.auth so Jinja2 is never required at collection time.
Cross-test sys.modules pollution fixes: saves and restores pre-existing stubs around guarded import blocks (including adding the previously-missing sys.meta_path.remove(_finder) in test_apps_review_reply_validation.py), sets __path__ on stub packages so they don't silently block real-submodule imports in later test files, and eagerly pops stale google-auth and subscription entries before importing real modules.
Missing stub attributes: adds database.webhook_health, utils.subscription, utils.executors.db_executor, get_maps_semaphore, and several notification functions that new production code now references at import time.
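The `__path__` point in the second bullet can be reproduced in isolation. The sketch below is an illustration under assumptions: the package name `hypo_utils` and its submodule are hypothetical, and pointing `__path__` at the real directory is one plausible way to unblock submodule imports, not necessarily what the PR's stubs do.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Build a throwaway on-disk package: hypo_utils/real_helper.py (hypothetical names).
tmp = tempfile.TemporaryDirectory()
pkg_dir = Path(tmp.name) / "hypo_utils"
pkg_dir.mkdir()
(pkg_dir / "real_helper.py").write_text("VALUE = 42\n", encoding="utf-8")

# A bare stub has no __path__, so the import system does not treat it as a package.
stub = types.ModuleType("hypo_utils")
sys.modules["hypo_utils"] = stub
try:
    importlib.import_module("hypo_utils.real_helper")
    blocked = False
except ModuleNotFoundError:
    blocked = True

# Pointing __path__ at the real directory lets real submodules resolve again.
stub.__path__ = [str(pkg_dir)]
real_helper = importlib.import_module("hypo_utils.real_helper")

# Remove the demo modules so nothing leaks into later imports.
for name in ("hypo_utils", "hypo_utils.real_helper"):
    sys.modules.pop(name, None)
tmp.cleanup()
```

The first import fails because the stub is "not a package"; the second succeeds once the stub advertises a search path.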

Confidence Score: 4/5

Safe to merge — all changes are confined to test infrastructure and documentation; no production logic is altered.

The changes correctly address real portability gaps (UTF-8 encoding, missing stubs, timezone data, Jinja2 skipping). The only concern is the catch-all except Exception in _zoneinfo_for_test, which could silently mask unexpected errors from ZoneInfo for the hard-coded key names; narrowing it to the specific ZoneInfoNotFoundError/KeyError types would make failures more visible if something unrelated goes wrong.

backend/tests/unit/test_action_item_date_validation.py — the _zoneinfo_for_test exception catch scope is worth tightening before this pattern is copied elsewhere.

Important Files Changed

Filename	Overview
backend/tests/unit/test_action_item_date_validation.py	Adds `_zoneinfo_for_test` fallback for Windows IANA-less environments; adds `__path__` to stub packages to prevent blocking real-submodule imports in later test files; expands notification/embeddings stubs to cover attributes discovered missing on Windows. One P2: catch-all `except Exception` in `_zoneinfo_for_test` should be narrowed to `ZoneInfoNotFoundError`/`KeyError`.
backend/tests/unit/test_apps_review_reply_validation.py	Adds save/restore of pre-existing stub modules around the `_Finder`-guarded import block, and correctly calls `sys.meta_path.remove(_finder)` (which was missing before). This is a clean fix for cross-test `sys.modules` pollution.
backend/tests/unit/test_async_app_integrations.py	Adds missing stubs for `database.webhook_health`, `utils.subscription`, `utils.executors.db_executor`, and `get_maps_semaphore`. Introduces a synchronous `_run_blocking` stub that skips actual executor dispatch — acceptable for unit tests but behaviorally diverges from production's async executor handoff.
backend/tests/unit/test_async_geocoding.py	Adds module-level attribute bindings (`utils.conversations`, `utils.conversations.location`) after the import so attribute-access style references work if those modules were already stubbed with `__path__`. Guarded by presence checks; safe.
backend/tests/unit/test_async_http_infrastructure.py	Clears stale stubs for `utils.http_client` / `utils.executors` at module level so the real implementations are imported fresh; adds `encoding='utf-8'` to source-reading tests; fixes flaky `time.sleep(0.01)` timing by spinning until `time.monotonic()` advances; updates worker count assertion from 96 to 128.
backend/tests/unit/test_async_webhooks.py	Adds `database.webhook_health` stub with the three functions now called from `utils.webhooks`; adds `encoding='utf-8'` to source-reading helpers. Straightforward additions.
backend/tests/unit/test_auth_redirect_uri.py	Adds careful save/restore of parent package attributes for stubs (including `fastapi.templating`) so Jinja2 is never needed at import time; refactors Jinja2 template tests to use `pytest.importorskip("jinja2")` so they skip cleanly on minimal Windows envs while still running when Jinja2 is installed.
backend/tests/unit/test_available_plans_resilience.py	Eagerly pops `utils.subscription` and all `google.*` modules before importing `utils.subscription` for real; prevents stale google-auth stubs from leaking into the subscription module import path.
backend/tests/unit/test_sync_v2.py	Updates `storage_executor` worker count assertion from 96 to 128 to match the current production configuration.
backend/AGENTS.md	Documentation-only: updates `storage_executor` description from 96w to 128w to stay in sync with the production executor configuration.
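The timing fix noted for test_async_http_infrastructure.py (spinning until `time.monotonic()` advances instead of relying on `time.sleep(0.01)`) might look roughly like the following. This is a sketch of the general technique, not the PR's code: `wait_for_monotonic_tick` and its `timeout` parameter are hypothetical, and the safety deadline is an addition for illustration.

```python
import time


def wait_for_monotonic_tick(timeout=1.0):
    """Busy-wait until time.monotonic() returns a strictly larger value."""
    start = time.monotonic()
    # Use a separate clock for the safety deadline so a stuck monotonic
    # clock cannot also stall the timeout check.
    deadline = time.perf_counter() + timeout
    while True:
        now = time.monotonic()
        if now > start:
            return now
        if time.perf_counter() > deadline:
            raise TimeoutError("time.monotonic() did not advance")


before = time.monotonic()
after = wait_for_monotonic_tick()
```

A fixed sleep can return before a coarse clock ticks, so two readings may compare equal; waiting for an observed change removes that dependence on clock resolution.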

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[pytest collects test module] --> B{Existing stubs in sys.modules?}
    B -- yes --> C[Save _preserved_stub_modules\nRemove from sys.modules]
    B -- no --> D[Skip save step]
    C --> E[Install _Finder in sys.meta_path]
    D --> E
    E --> F[Import module under test\n_Finder intercepts stub-path imports]
    F --> G[finally block]
    G --> H[sys.meta_path.remove _finder]
    H --> I[Remove stubs created during import]
    I --> J[sys.modules.update preserved stubs]
    J --> K[Test functions run\nwith clean stub state]

    subgraph Windows-specific fixes
        W1[ZoneInfo fails — no IANA data] --> W2[_zoneinfo_for_test returns\nfixed-offset datetime.timezone]
        W3[open without encoding] --> W4[open with encoding utf-8]
        W5[jinja2 not installed] --> W6[pytest.importorskip skips test]
    end

Reviews (1): Last reviewed commit: "test(backend): stabilize unit smoke on W..."

greptile-apps · 2026-06-10T10:54:29Z

+def _zoneinfo_for_test(key):
+    try:
+        return ZoneInfo(key)
+    except Exception:
+        if key == "UTC":
+            return timezone.utc
+        if key == "Asia/Kolkata":
+            return timezone(timedelta(hours=5, minutes=30), key)
+        raise


Overly-broad exception catch in _zoneinfo_for_test

except Exception swallows any error that ZoneInfo might raise, not just the expected ZoneInfoNotFoundError/KeyError. If someone passes a known key like "UTC" but ZoneInfo fails for an unrelated reason (e.g., corrupted module state), the function silently returns timezone.utc instead of surfacing the real problem. The fallback raise at the bottom only helps for unknown keys. Using except (ZoneInfoNotFoundError, KeyError) would constrain the catch to the specific Windows/no-tzdata scenario this helper targets; ZoneInfoNotFoundError comes from zoneinfo and is itself a subclass of the built-in KeyError.

Addressed in fb41d75ab. _zoneinfo_for_test now imports ZoneInfoNotFoundError and only catches (ZoneInfoNotFoundError, KeyError), so unexpected ZoneInfo failures are no longer swallowed by the Windows fallback.

Revalidated on the Windows backend venv:

python -m pytest tests\unit\test_action_item_date_validation.py -q -> 26 passed

python -m black --line-length 120 --skip-string-normalization tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py --check

python -m py_compile tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py

greptile-apps · 2026-06-10T10:54:29Z

+async def _run_blocking(_executor, fn, *args, **kwargs):
+    return fn(*args, **kwargs)
+
+
+sys.modules["utils.executors"].run_blocking = _run_blocking


_run_blocking stub executes fn in the calling thread, not in the executor

The stub bypasses executor dispatch entirely. If fn is itself synchronous but relies on the GIL being released (e.g., for lock-ordering), or if future production code passes an async callable by mistake, the behaviour under the stub diverges silently from production. A slightly safer alternative is loop.run_in_executor(None, functools.partial(fn, *args, **kwargs)): it preserves executor semantics in the event loop by running fn on the loop's default thread pool, without wiring up the production executors. Minor for unit tests, but worth noting since run_blocking is a shared abstraction used across many production paths.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Addressed in fb41d75ab. The stub now preserves executor semantics with:

loop = _asyncio.get_running_loop()
return await loop.run_in_executor(_executor, functools.partial(fn, *args, **kwargs))

It also uses real ThreadPoolExecutor instances for the executor stubs so run_in_executor can call submit() normally.

Revalidated on the Windows backend venv:

python -m pytest tests\unit\test_async_app_integrations.py -q -> 9 passed

python -m black --line-length 120 --skip-string-normalization tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py --check

python -m py_compile tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py
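A self-contained version of the revised stub described in the reply above, as a sketch: the two-line body follows the reply, while the plain `asyncio` import name, the `stub-db` thread prefix, and the `_thread_name` and `_demo` helpers are hypothetical additions that show the call really runs on an executor thread.

```python
import asyncio
import functools
import threading
from concurrent.futures import ThreadPoolExecutor


async def _run_blocking(_executor, fn, *args, **kwargs):
    # Hand the call to the given executor instead of running it inline.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, functools.partial(fn, *args, **kwargs))


def _thread_name():
    return threading.current_thread().name


async def _demo():
    # A real ThreadPoolExecutor so run_in_executor can call submit() normally.
    with ThreadPoolExecutor(max_workers=1, thread_name_prefix="stub-db") as pool:
        return await _run_blocking(pool, _thread_name)


worker_name = asyncio.run(_demo())
```

The returned thread name carries the pool's prefix, which confirms that `fn` ran on a worker thread rather than on the event-loop thread.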

test(backend): stabilize unit smoke on Windows

b893ad0

greptile-apps Bot reviewed Jun 10, 2026

test(backend): tighten Windows unit stubs

fb41d75

tianmind-studio wants to merge 2 commits into BasedHardware:main from tianmind-studio:codex/windows-backend-unit-smoke