Stabilize backend unit smoke tests on Windows #7790

Open
tianmind-studio wants to merge 2 commits into BasedHardware:main from tianmind-studio:codex/windows-backend-unit-smoke
@tianmind-studio

Contributor

Summary

  • Make backend unit tests more robust in lightweight Windows environments by tightening test stubs and avoiding cross-test sys.modules pollution.
  • Add UTF-8 source reads for tests that inspect Python files, avoiding Windows default-codepage decode failures.
  • Let auth callback template rendering tests skip when optional jinja2 is not installed, while still running when the dependency is present.
  • Align storage_executor test/docs expectations with the current production configuration of 128 workers.

Why

While running backend unit tests from a minimal Windows venv, several tests failed during collection or import before reaching their actual assertions. Most failures came from test-local stubs leaking into later tests, Windows lacking system IANA timezone data, Windows defaulting open() to a non-UTF-8 code page, or optional template dependencies not being installed.
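
As a minimal sketch of the UTF-8 read fix described above: when `encoding` is omitted, Python picks the locale's preferred encoding (a legacy code page on many Windows setups, unless UTF-8 mode is enabled), so source files containing non-ASCII characters can fail to decode. The helper name `read_source` and the sample file are hypothetical and not taken from the PR.

```python
import tempfile
from pathlib import Path


def read_source(path):
    """Read a Python source file with an explicit encoding.

    Passing encoding="utf-8" makes the result independent of the
    platform's default code page.
    """
    return Path(path).read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file containing non-ASCII characters.
    sample = Path(tmp) / "sample_module.py"
    sample.write_text('GREETING = "h\u00e9llo \u2192 w\u00f6rld"\n', encoding="utf-8")
    source = read_source(sample)
```

The same idea applies to `open(path, encoding="utf-8")` for tests that read files through `open()` directly.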

Testing

  • python -m pytest tests\unit\test_action_item_date_validation.py tests\unit\test_action_item_dedup.py tests\unit\test_apps_review_reply_validation.py -q -> 39 passed
  • python -m pytest tests\unit\test_async_app_integrations.py tests\unit\test_async_geocoding.py tests\unit\test_async_http_infrastructure.py tests\unit\test_async_webhooks.py -q -> 83 passed
  • python -m pytest tests\unit\test_auth_redirect_uri.py tests\unit\test_available_plans_resilience.py tests\unit\test_sync_v2.py::TestBulkheadExecutors::test_executor_worker_counts -q -> 49 passed, 3 skipped
  • python -m black --line-length 120 --skip-string-normalization ... --check
  • python -m py_compile ...
  • git diff --check -- ...

@greptile-apps

greptile-apps Bot commented Jun 10, 2026

Contributor

Greptile Summary

This PR stabilizes the backend unit test suite on Windows by patching several categories of import-time and runtime fragility, without changing any production code (except aligning documentation and test assertions with an already-live storage_executor worker count increase from 96 → 128).

  • Windows compatibility fixes: adds encoding='utf-8' to every source-file read, introduces a _zoneinfo_for_test helper that falls back to fixed-offset datetime.timezone objects when IANA timezone data is absent, and stubs fastapi.templating before importing routers.auth so Jinja2 is never required at collection time.
  • Cross-test sys.modules pollution fixes: saves and restores pre-existing stubs around guarded import blocks (including adding the previously-missing sys.meta_path.remove(_finder) in test_apps_review_reply_validation.py), sets __path__ on stub packages so they don't silently block real-submodule imports in later test files, and eagerly pops stale google-auth and subscription entries before importing real modules.
  • Missing stub attributes: adds database.webhook_health, utils.subscription, utils.executors.db_executor, get_maps_semaphore, and several notification functions that new production code now references at import time.
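
The `__path__` point in the summary above can be illustrated with a small, self-contained sketch. A bare `types.ModuleType` stub placed in `sys.modules` is not treated as a package, so importing any submodule under it fails; giving the stub a `__path__` lets the import system search that directory for real submodules. The package name `demo_stub_pkg` and the file `real_sub.py` are hypothetical and only exist for this demonstration.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

PKG = "demo_stub_pkg"  # hypothetical package name, not from the PR


def try_import_submodule():
    """Return the imported submodule, or None if the import is blocked."""
    try:
        return importlib.import_module(f"{PKG}.real_sub")
    except ModuleNotFoundError:
        return None


with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / PKG
    pkg_dir.mkdir()
    (pkg_dir / "real_sub.py").write_text("VALUE = 42\n", encoding="utf-8")

    # A stub without __path__ is "not a package": submodule imports fail.
    stub = types.ModuleType(PKG)
    sys.modules[PKG] = stub
    blocked = try_import_submodule() is None

    # With __path__ set, the real submodule on disk becomes importable.
    stub.__path__ = [str(pkg_dir)]
    real_sub = try_import_submodule()
    loaded_value = real_sub.VALUE

    # Clean up so the demo does not leak into later imports.
    sys.modules.pop(f"{PKG}.real_sub", None)
    sys.modules.pop(PKG, None)
```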

Confidence Score: 4/5

Safe to merge — all changes are confined to test infrastructure and documentation; no production logic is altered.

The changes correctly address real portability gaps (UTF-8 encoding, missing stubs, timezone data, Jinja2 skipping). The only concern is the catch-all except Exception in _zoneinfo_for_test, which could silently mask unexpected errors from ZoneInfo for the hard-coded key names; narrowing it to the specific ZoneInfoNotFoundError/KeyError types would make failures more visible if something unrelated goes wrong.

backend/tests/unit/test_action_item_date_validation.py — the _zoneinfo_for_test exception catch scope is worth tightening before this pattern is copied elsewhere.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/tests/unit/test_action_item_date_validation.py | Adds _zoneinfo_for_test fallback for Windows IANA-less environments; adds __path__ to stub packages to prevent blocking real-submodule imports in later test files; expands notification/embeddings stubs to cover attributes discovered missing on Windows. One P2: catch-all except Exception in _zoneinfo_for_test should be narrowed to ZoneInfoNotFoundError/KeyError. |
| backend/tests/unit/test_apps_review_reply_validation.py | Adds save/restore of pre-existing stub modules around the _Finder-guarded import block, and correctly calls sys.meta_path.remove(_finder) (which was missing before). This is a clean fix for cross-test sys.modules pollution. |
| backend/tests/unit/test_async_app_integrations.py | Adds missing stubs for database.webhook_health, utils.subscription, utils.executors.db_executor, and get_maps_semaphore. Introduces a synchronous _run_blocking stub that skips actual executor dispatch; acceptable for unit tests, but behaviorally diverges from production's async executor handoff. |
| backend/tests/unit/test_async_geocoding.py | Adds module-level attribute bindings (utils.conversations, utils.conversations.location) after the import so attribute-access style references work if those modules were already stubbed with __path__. Guarded by presence checks; safe. |
| backend/tests/unit/test_async_http_infrastructure.py | Clears stale stubs for utils.http_client / utils.executors at module level so the real implementations are imported fresh; adds encoding='utf-8' to source-reading tests; fixes flaky time.sleep(0.01) timing by spinning until time.monotonic() advances; updates worker count assertion from 96 to 128. |
| backend/tests/unit/test_async_webhooks.py | Adds database.webhook_health stub with the three functions now called from utils.webhooks; adds encoding='utf-8' to source-reading helpers. Straightforward additions. |
| backend/tests/unit/test_auth_redirect_uri.py | Adds careful save/restore of parent package attributes for stubs (including fastapi.templating) so Jinja2 is never needed at import time; refactors Jinja2 template tests to use pytest.importorskip("jinja2") so they skip cleanly on minimal Windows envs while still running when Jinja2 is installed. |
| backend/tests/unit/test_available_plans_resilience.py | Eagerly pops utils.subscription and all google.* modules before importing utils.subscription for real; prevents stale google-auth stubs from leaking into the subscription module import path. |
| backend/tests/unit/test_sync_v2.py | Updates storage_executor worker count assertion from 96 to 128 to match the current production configuration. |
| backend/AGENTS.md | Documentation-only: updates storage_executor description from 96w to 128w to stay in sync with the production executor configuration. |
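
The timing fix noted for test_async_http_infrastructure.py (spinning until time.monotonic() advances instead of relying on a fixed time.sleep(0.01)) might look roughly like the sketch below. The function name `wait_for_monotonic_advance` and the `max_spins` bound are assumptions for illustration; the PR's actual helper is not shown on this page.

```python
import time


def wait_for_monotonic_advance(max_spins=10_000):
    """Block until time.monotonic() reports a strictly later value.

    A fixed short sleep can finish within one tick of a coarse clock,
    leaving two monotonic readings equal. Polling until the reading
    changes removes that dependence on timer resolution.
    """
    start = time.monotonic()
    for _ in range(max_spins):
        now = time.monotonic()
        if now > start:
            return now - start
        time.sleep(0.001)  # yield briefly between polls
    raise TimeoutError("monotonic clock did not advance")


elapsed = wait_for_monotonic_advance()
```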

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[pytest collects test module] --> B{Existing stubs in sys.modules?}
    B -- yes --> C[Save _preserved_stub_modules\nRemove from sys.modules]
    B -- no --> D[Skip save step]
    C --> E[Install _Finder in sys.meta_path]
    D --> E
    E --> F[Import module under test\n_Finder intercepts stub-path imports]
    F --> G[finally block]
    G --> H[sys.meta_path.remove _finder]
    H --> I[Remove stubs created during import]
    I --> J[sys.modules.update preserved stubs]
    J --> K[Test functions run\nwith clean stub state]

    subgraph Windows-specific fixes
        W1[ZoneInfo fails — no IANA data] --> W2[_zoneinfo_for_test returns\nfixed-offset datetime.timezone]
        W3[open without encoding] --> W4[open with encoding utf-8]
        W5[jinja2 not installed] --> W6[pytest.importorskip skips test]
    end
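
The save, import, and restore sequence in the flowchart can be sketched as follows. This is an illustration under assumptions, not the PR's code: `_StubFinder`, `STUB_NAMES`, `hypothetical_heavy_dep`, and `import_with_temporary_stubs` are invented names, and the real `_Finder` in the test files may behave differently.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys
import types

# Hypothetical name of a dependency that should be stubbed during import.
STUB_NAMES = {"hypothetical_heavy_dep"}


class _StubFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Meta-path finder that serves empty stub modules for STUB_NAMES."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname in STUB_NAMES:
            return importlib.machinery.ModuleSpec(fullname, self)
        return None

    def create_module(self, spec):
        return types.ModuleType(spec.name)

    def exec_module(self, module):
        module.IS_STUB = True  # marker so the demo can recognise the stub


def import_with_temporary_stubs(import_fn):
    """Run import_fn with stubs installed, then restore sys.modules."""
    # Save and remove stubs left behind by earlier test modules.
    preserved = {n: sys.modules.pop(n) for n in STUB_NAMES if n in sys.modules}
    finder = _StubFinder()
    sys.meta_path.insert(0, finder)
    try:
        return import_fn()
    finally:
        sys.meta_path.remove(finder)       # the step that was previously missing
        for name in STUB_NAMES:
            sys.modules.pop(name, None)    # drop stubs created during import
        sys.modules.update(preserved)      # put the earlier stubs back


# Demo: an earlier test left its own stub in sys.modules.
earlier_stub = types.ModuleType("hypothetical_heavy_dep")
sys.modules["hypothetical_heavy_dep"] = earlier_stub
imported = import_with_temporary_stubs(
    lambda: importlib.import_module("hypothetical_heavy_dep")
)
restored = sys.modules["hypothetical_heavy_dep"]
finder_left_behind = any(isinstance(f, _StubFinder) for f in sys.meta_path)
sys.modules.pop("hypothetical_heavy_dep", None)
```

Doing the cleanup in `finally` means the finder and temporary stubs are removed even when the guarded import raises.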

Reviews (1): Last reviewed commit: "test(backend): stabilize unit smoke on W..."

Comment on lines +65 to +73
def _zoneinfo_for_test(key):
    try:
        return ZoneInfo(key)
    except Exception:
        if key == "UTC":
            return timezone.utc
        if key == "Asia/Kolkata":
            return timezone(timedelta(hours=5, minutes=30), key)
        raise

Contributor

P2 Overly-broad exception catch in _zoneinfo_for_test

except Exception swallows any error that ZoneInfo might raise, not just the expected ZoneInfoNotFoundError/KeyError. If someone passes a known key like "UTC" but ZoneInfo fails for an unrelated reason (e.g., corrupted module state), the function silently returns timezone.utc instead of surfacing the real problem. The fallback raise at the bottom only helps for unknown keys. Using except (ZoneInfoNotFoundError, KeyError) would constrain the catch to the specific Windows/no-tzdata scenario this helper targets. ZoneInfoNotFoundError is imported from zoneinfo and is itself a subclass of the built-in KeyError.

Contributor Author

Addressed in fb41d75ab. _zoneinfo_for_test now imports ZoneInfoNotFoundError and only catches (ZoneInfoNotFoundError, KeyError), so unexpected ZoneInfo failures are no longer swallowed by the Windows fallback.

Revalidated on the Windows backend venv:

  • python -m pytest tests\unit\test_action_item_date_validation.py -q -> 26 passed
  • python -m black --line-length 120 --skip-string-normalization tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py --check
  • python -m py_compile tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py

Comment on lines +157 to +161
async def _run_blocking(_executor, fn, *args, **kwargs):
    return fn(*args, **kwargs)


sys.modules["utils.executors"].run_blocking = _run_blocking

Contributor

P2 _run_blocking stub executes fn in the calling thread, not in the executor

The stub bypasses executor dispatch entirely. If fn is itself synchronous but relies on the GIL being released (e.g., for lock-ordering), or if future production code passes an async callable by mistake, the behaviour under the stub diverges silently from production. A slightly safer alternative is loop.run_in_executor(None, functools.partial(fn, *args, **kwargs)), which preserves executor semantics by dispatching to the event loop's default thread pool rather than the production executors. Minor for unit tests, but worth noting since run_blocking is a shared abstraction used across many production paths.

Contributor Author

Addressed in fb41d75ab. The stub now preserves executor semantics with:

loop = _asyncio.get_running_loop()
return await loop.run_in_executor(_executor, functools.partial(fn, *args, **kwargs))

It also uses real ThreadPoolExecutor instances for the executor stubs so run_in_executor can call submit() normally.

Revalidated on the Windows backend venv:

  • python -m pytest tests\unit\test_async_app_integrations.py -q -> 9 passed
  • python -m black --line-length 120 --skip-string-normalization tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py --check
  • python -m py_compile tests\unit\test_action_item_date_validation.py tests\unit\test_async_app_integrations.py
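
For context, the two lines quoted in the reply could sit in a stub like the one below. This is a sketch under assumptions: the reply's `_asyncio` alias is written as plain `asyncio`, and the `stub-db` executor and `_demo` helper are hypothetical, added only to show that the callable runs on a pool thread.

```python
import asyncio
import functools
import threading
from concurrent.futures import ThreadPoolExecutor


async def _run_blocking(_executor, fn, *args, **kwargs):
    """Test stub that still hands fn to the given executor."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, functools.partial(fn, *args, **kwargs))


def _demo():
    # A real ThreadPoolExecutor, so run_in_executor can call submit() normally.
    executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="stub-db")
    try:
        # Report which thread actually executed the callable.
        return asyncio.run(_run_blocking(executor, lambda: threading.current_thread().name))
    finally:
        executor.shutdown(wait=True)


worker_name = _demo()
```

`functools.partial` is needed because `run_in_executor` forwards only positional arguments to the callable.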
