ci: extend quality-gate matrix to Windows; fix codex backend test isolation#44
Merged
…lation

Linux-only quality-gate hid three Windows-only test failures on main. The runner-test bug was fixed in #43; the remaining two were `Path.home()` isolation gaps in `tests/test_evaluator_backends.py`: setting `HOME` does not redirect `Path.home()` on Windows (which reads `USERPROFILE`), so `_has_chatgpt_oauth()` leaked into the dev's real `~/.codex/auth.json` and flipped the strip/promote branches in `_build_env`.

- `tests/test_evaluator_backends.py`: replace `monkeypatch.setenv("HOME", ...)` with `monkeypatch.setattr("pathlib.Path.home", lambda: tmp_path)` in the five codex-oauth tests. Matches the pattern already used in `tests/test_runner.py:315,398`. Cross-platform; tests no longer touch real env state.
- `.github/workflows/quality-gate.yml`: pytest job now runs on `[ubuntu-latest, windows-latest] × ["3.12", "3.13"]` with `fail-fast: false`. Lint/typecheck/audit stay single-OS (pure-Python checks with no platform variance).

Verified locally on Windows: all 147 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
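The isolation pattern described in this commit can be sketched as a small pytest-style test. This is an illustration only, not the project's code: `has_codex_oauth` is a hypothetical stand-in for `_has_chatgpt_oauth()`, and the test name is invented. The one detail taken from the PR is the `monkeypatch.setattr("pathlib.Path.home", lambda: tmp_path)` call.

```python
from pathlib import Path


def has_codex_oauth() -> bool:
    """Hypothetical stand-in for `_has_chatgpt_oauth()`.

    Assumes OAuth presence is detected by the file ~/.codex/auth.json,
    resolved through Path.home().
    """
    return (Path.home() / ".codex" / "auth.json").is_file()


def test_oauth_detection_uses_fake_home(monkeypatch, tmp_path):
    # Patch Path.home() directly. Setting HOME would only redirect it on
    # POSIX; on Windows Path.home() is derived from USERPROFILE instead.
    monkeypatch.setattr("pathlib.Path.home", lambda: tmp_path)

    # The fake home is empty, so no OAuth file is visible, whatever the
    # developer's real home directory contains.
    assert has_codex_oauth() is False

    # Creating the file inside the fake home flips the result.
    auth_file = tmp_path / ".codex" / "auth.json"
    auth_file.parent.mkdir()
    auth_file.write_text("{}")
    assert has_codex_oauth() is True
```

Patching the function rather than an environment variable means the test does not depend on which variable each platform consults, which is presumably why the same call works on both runners in the matrix.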
This was referenced May 9, 2026
Merged
szjanikowski added a commit that referenced this pull request on May 9, 2026
* docs(changelog): record Windows compat + skill bundling under [Unreleased]

  Captures everything that landed since v0.3.2 (PRs #42, #43, #44, #45) so the release notes are queued up. Keeps the section under [Unreleased] deliberately: more fixes are still expected before cutting v0.3.3, and release docs say the version header should be added at release-cut time.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(changelog): record .gitattributes LF enforcement (#47)

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(changelog): record harbor[cloud] dependency switch (#48)

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: release v0.3.3

  Cuts v0.3.3 from the accumulated [Unreleased] section. Patch bump (pre-1.0 policy): no breaking changes to user-facing CLI, nasde.toml schema, or benchmark project layout. Windows compat, OAuth-script bundling, and the harbor[cloud] dep switch are all transparent to users.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Szymon Janikowski <szymon.janikowski@itlibrium.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Follow-up to #43. Extends the quality-gate to actually catch Windows-only breakage and fixes the two remaining Windows-only test isolation bugs on `main`.

- `.github/workflows/quality-gate.yml`: pytest job now runs on `[ubuntu-latest, windows-latest] × [3.12, 3.13]` with `fail-fast: false`. Lint/typecheck/audit stay single-OS (pure-Python checks, no platform variance).
- `tests/test_evaluator_backends.py`: replace `monkeypatch.setenv("HOME", str(tmp_path))` with `monkeypatch.setattr("pathlib.Path.home", lambda: tmp_path)` in the five codex-oauth tests. `HOME` doesn't drive `Path.home()` on Windows (which reads `USERPROFILE`), so `_has_chatgpt_oauth()` was leaking into the dev's real `~/.codex/auth.json` and flipping the strip/promote branches in `_build_env`. The new pattern matches what `tests/test_runner.py:315,398` already uses and is cross-platform.

The ticket flagged a missing `monkeypatch.delenv("CODEX_API_KEY")` for `test_codex_backend_env_strips_api_keys_when_oauth_present`, but the test already does `monkeypatch.setenv("CODEX_API_KEY", ...)`, which overrides the dev shell. The actual cause was the same `Path.home()` issue: `_has_chatgpt_oauth()` returned `False` (real `~/.codex` not visible), so the OAuth-strip branch never fired and `CODEX_API_KEY` survived in `env`.

No production-code change.
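To make the failure mode above concrete, here is a minimal sketch of an env builder whose output depends on whether an OAuth file is visible under the home directory. Everything in it is an assumption made for illustration: `build_env_sketch` and `oauth_present` are hypothetical names, the home directory is passed in explicitly rather than read from `Path.home()`, and the real `_build_env` (including its promote branch) is not shown in this PR and may differ.

```python
from pathlib import Path


def oauth_present(home: Path) -> bool:
    # Hypothetical check: OAuth counts as present when
    # <home>/.codex/auth.json exists.
    return (home / ".codex" / "auth.json").is_file()


def build_env_sketch(base_env: dict[str, str], home: Path) -> dict[str, str]:
    """Hypothetical strip branch: drop CODEX_API_KEY when OAuth is present."""
    env = dict(base_env)  # copy so the caller's mapping is not mutated
    if oauth_present(home):
        # Strip branch: OAuth wins, so the API key is removed.
        env.pop("CODEX_API_KEY", None)
    return env
```

With the same `base_env`, the key survives when `home` has no `.codex/auth.json` and is stripped when it does. That is why explicitly setting `CODEX_API_KEY` in the test was not enough on its own: the outcome was decided by which home directory the OAuth check resolved.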
Test plan
- `uv run pytest tests/ -v` on local Windows: 147 passed, including the three previously failing tests.
- `uv run ruff check tests/test_evaluator_backends.py`: clean.
- `uv run ruff format --check tests/test_evaluator_backends.py`: clean.
- `{ubuntu, windows} × {3.12, 3.13}` matrix green on this PR.

Out of scope
🤖 Generated with Claude Code