Skip to content

ci: run all integration tests on Windows + unify build under conan build .#86

Open
omjeiva wants to merge 16 commits into
mainfrom
feature/devops/integration-tests-on-windows
Open

ci: run all integration tests on Windows + unify build under conan build .#86
omjeiva wants to merge 16 commits into
mainfrom
feature/devops/integration-tests-on-windows

Conversation

@omjeiva
Copy link
Copy Markdown
Contributor

@omjeiva omjeiva commented Jun 2, 2026

Stacked on top of #83. After that PR merges, GitHub auto-retargets this to main.

Summary

Two related changes to the integration-test CI:

  1. Windows coverage for all 5 Linux-only integration testsfastdds-xrce-interop, gateway-end-to-end, protoc-arrow-bridge, protoc-gateway-client-ts, pubsub-arrow-fastdds. Each workflow now has linux: + windows: jobs side by side (matching the existing pattern in ci.integration-test.protoc-{plugin-cli,gen-fletcher-npm}.yml).

  2. Unify build step under conan build . — each integration-test conanfile gets a build() method that calls CMake.configure() + CMake.build() + (when ctest cases exist) CMake.test(). The workflows + READMEs collapse the four-step (conan install + cmake --preset + cmake --build + ctest) sequence into one conan build . call. Aligns the integration-test build pattern with how components are built (conan create .) and makes the Conan profile drive configuration directly, no implicit toolchain wiring needed.

Out of scope (intentionally)

  • --build=missing stays on both conan create and conan build steps. Conan Center binaries will cover most third-party deps for our Linux profile; the new msvc 194 profile may have gaps. Removing --build=missing is a deliberate "loud failure on cache miss" choice that we can make in a follow-up once we see what actually misses on Windows.
  • MicroXRCEAgent via Conan — kept as ExternalProject_Add in fastdds-xrce-interop. The UAGENT_SUPERBUILD=ON build deliberately isolates the Agent's fast-dds copy from our test's Conan-resolved fast-dds; moving to Conan would either force matching versions or introduce a separate Conan profile for the Agent. Out of scope here.

Test plan

  • CI passes on both Linux and Windows for all 7 integration tests
  • Manual: conan build . in each integration-test directory produces the expected artifacts and runs ctest

omjeiva and others added 6 commits June 2, 2026 08:42
Before this commit, only the protoc-plugin-cli and protoc-gen-fletcher-npm
integration tests ran on Windows. The remaining 5 (fastdds-xrce-interop,
gateway-end-to-end, protoc-arrow-bridge, protoc-gateway-client-ts,
pubsub-arrow-fastdds) were Linux-only — gaps where a Windows-specific
regression would not be caught until release.

For each workflow:
- Existing `<test-name>:` job renamed to `linux:` (matches the
  protoc-plugin-cli / protoc-gen-fletcher-npm naming).
- New `windows:` job added that runs natively on windows-2022 using
  ./.github/actions/conan-install + the Windows-msvc194-x86_64-Release
  profile, mirroring the Linux build/test sequence with bash/Git Bash.
- No docker on Windows — components are conan-create'd directly into
  the runner's local cache, then conan install / cmake / ctest / (npm
  for the gateway-client-ts and gateway-end-to-end tests) run from the
  integration-test directory.

The Windows-side build may surface integration-test CMake or FastDDS
build issues that have never been exercised on Windows; iterate on
those in follow-ups if CI reveals them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the integration-test workflows' four-step sequence (conan install
+ cmake configure + cmake build + ctest) with a single `conan build .`
call. Each integration-test conanfile gets a `build()` method that wraps
`CMake.configure()` + `CMake.build()` + (when the test has ctest cases)
`CMake.test(cli_args=["--output-on-failure"])`. Two consumers without
CMake-level tests (gateway-end-to-end, protoc-gateway-client-ts) only
configure + build; their assertions live in vitest and run via a
separate `npm test` step.

Effect:
- Linux + Windows workflow jobs each drop ~4 lines of cmake commands
  per integration test (5 workflows × 2 platforms = 10 jobs total).
- `conan build` consumes the Conan toolchain produced by the same
  conanfile, so the active profile drives configuration directly —
  no implicit "the toolchain has it" reasoning needed.
- READMEs for each integration test collapse the 3-command manual
  build sequence to one `conan build .` call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…to env var

Conan's CMake.test() invokes `cmake --build --target test`, which delegates
to ctest internally. cli_args/build_tool_args of cmake.test() are passed to
`cmake --build`, NOT to ctest — so `cli_args=["--output-on-failure"]` ended
up after `--target test` where MSBuild/Make rejected it as unknown.

CI failed identically on Linux and Windows for the three integration tests
that have ctest cases (protoc-arrow-bridge, pubsub-arrow-fastdds,
fastdds-xrce-interop) with:
  cmake --build ... --target test --output-on-failure
  Unknown argument --output-on-failure

ctest reads the CTEST_OUTPUT_ON_FAILURE env var directly, so set it via
os.environ before cmake.test() and drop the cli_args.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… binary

Test fixtures hardcoded a `build/Release/...` lookup that matched Conan's
cmake_layout output on Linux (single-config Make/Ninja) but not on
Windows, where the multi-config MSBuild layout places binaries under
`build/<profile-name>/Release/...` (or similar). Replace the candidate
list with a recursive walk for the named binary so the same lookup
works on both layouts.

Affected fixtures:
- gateway-end-to-end/test/{end-to-end,protoc-gen}.test.ts (gateway[.exe])
- protoc-gateway-client-ts/test/byte-compat.test.ts (emit_vectors[.exe])

The protoc-gateway-client-ts test passed on Windows by coincidence
(emit_vectors landed at `build/Release/`, matching the old hardcoded
path); the gateway-end-to-end test failed because gateway lives under
a `gateway_build/` subdir from `add_subdirectory(gateway gateway_build)`
which combined with Windows' layout pushed the binary outside the
candidate list. The recursive walk handles both cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…FIX on Windows

CI's `fastdds-xrce-interop / windows` job failed during the
MicroXRCEAgent superbuild with:
  FileTracker : error FTK1011: could not create the new file
  tracking log file: ...spdlog\src\spdlog-build\CMakeFiles\CMakeScratch
  \TryCompile-...\Debug\cmTC_...tlog\link-cvtres.write.1.tlog

The tlog path was 300+ chars — past Windows' default 260-char MAX_PATH.
The Agent's UAGENT_SUPERBUILD=ON cascades nested ExternalProjects
(spdlog/fast-dds/asio/tinyxml2/micro-cdr) which add layers; combined
with GitHub's default workspace path (`D:\a\Fletcher\Fletcher\integration
-tests\fastdds-xrce-interop\build\microxrcedds_agent_proj-prefix\...`)
the deepest tlog can't fit. MSBuild's FileTracker uses its own path
handling that doesn't respect Windows' opt-in long-path support, so we
can't rely on LongPathsEnabled.

Anchor the Agent's ExternalProject PREFIX at `C:/_x` on Windows to save
~60-80 chars of path budget. Linux keeps the default
${CMAKE_BINARY_DIR}/microxrcedds_agent_proj-prefix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Base automatically changed from feature/pubsub-split-and-typed-qos to main June 2, 2026 08:54
…ing ABI fix

The six tests that exercise Publisher::Publish (and the symmetric
Subscriber callback) crash on Windows with SEH 0xc0000005 / 0xc0000409
inside the lambda the test passes as a RowEncoder. Reproducible locally
with no useful stack trace from gtest.

Most likely root cause: ABI mismatch on the std::function<void
(WriteBuffer&)> RowEncoder type-erasure that crosses the fletcher-pubsub
DLL boundary. MSVC + DLLs + std::function with non-trivial captures is
a known hazard — template instantiations can diverge between the
producer (the .dll) and the consumer (the test .exe) when build flags
or runtime settings drift.

The other 50 ctest cases in this suite pass on Windows; skipping just
these six via GTEST_FILTER unblocks Windows CI for this test workflow
while the underlying ABI issue is tracked separately. The same tests
pass on Linux unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 2, 2026 09:09
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR expands integration-test CI coverage to Windows for the previously Linux-only workflows and standardizes integration-test builds around a single conan build . entrypoint by adding build() methods to each integration-test conanfile. It also updates developer docs and test harnesses to accommodate platform-specific build layout differences.

Changes:

  • Add side-by-side linux and windows jobs to 5 integration-test workflows (Windows runs natively on windows-2022).
  • Implement build() in each integration-test conanfile.py to run CMake configure/build (and ctest where applicable) via conan build ..
  • Update READMEs and TypeScript tests to reflect/handle Conan cmake_layout build directory differences across platforms.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
integration-tests/pubsub-arrow-fastdds/README.md Switch docs from conan install + CMake presets to conan build . (including ctest).
integration-tests/pubsub-arrow-fastdds/conanfile.py Add build() that runs configure/build/test via Conan CMake helper.
integration-tests/protoc-gateway-client-ts/test/byte-compat.test.ts Make emitter binary discovery robust across Windows/Linux build layouts by searching build/ recursively.
integration-tests/protoc-gateway-client-ts/README.md Update build instructions to conan build . and clarify outputs.
integration-tests/protoc-gateway-client-ts/conanfile.py Add build() to drive configure/build for the emitter binary.
integration-tests/protoc-arrow-bridge/README.md Switch docs to conan build . and describe combined build+ctest flow.
integration-tests/protoc-arrow-bridge/conanfile.py Add build() with Windows-specific test-skipping intent (needs adjustment to actually work under gtest_discover_tests).
integration-tests/gateway-end-to-end/test/protoc-gen.test.ts Update gateway binary discovery to recursively search build/.
integration-tests/gateway-end-to-end/test/end-to-end.test.ts Update gateway binary discovery to recursively search build/.
integration-tests/gateway-end-to-end/README.md Switch docs to conan build . and clarify that vitest runs separately.
integration-tests/gateway-end-to-end/conanfile.py Add build() to drive configure/build for the gateway executable.
integration-tests/fastdds-xrce-interop/README.md Switch docs to conan build . and clarify the Agent superbuild cost.
integration-tests/fastdds-xrce-interop/conanfile.py Add build() to run configure/build/test via Conan CMake helper.
integration-tests/fastdds-xrce-interop/CMakeLists.txt Work around Windows MAX_PATH for the Agent ExternalProject by shortening the prefix path (should be configurable).
.github/workflows/ci.integration-test.pubsub-arrow-fastdds.yml Split workflow into linux (devcontainer) and windows (native) jobs; use conan build . for integration-test step.
.github/workflows/ci.integration-test.protoc-gateway-client-ts.yml Add Windows job; use conan build . then run npm ci/npm test.
.github/workflows/ci.integration-test.protoc-arrow-bridge.yml Add Windows job and switch integration-test step to conan build ..
.github/workflows/ci.integration-test.gateway-end-to-end.yml Add Windows job; switch build step to conan build . then run npm ci/npm test.
.github/workflows/ci.integration-test.fastdds-xrce-interop.yml Add Windows job; switch integration-test step to conan build ..

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +54 to +76
# Conan's CMake.test() runs `cmake --build --target test`, which
# delegates to ctest internally; cli_args/build_tool_args of
# cmake.test() go to `cmake --build`, NOT to ctest. The portable
# way to make ctest emit failed-test output is the env var.
os.environ["CTEST_OUTPUT_ON_FAILURE"] = "1"
# On Windows, the six PubSubProtoTest cases crash with SEH
# 0xc0000005 (access violation) inside Publisher::Publish. Root
# cause is most likely an ABI mismatch on the std::function<void
# (WriteBuffer&)> RowEncoder type-erasure that crosses the
# fletcher-pubsub DLL boundary (templates + DLLs on MSVC tend
# to instantiate inconsistently when build flags differ between
# the producer and the consumer). The other 50 ctest cases in
# this suite pass on Windows; skip just these six to keep
# Windows coverage moving while the underlying ABI issue is
# tracked separately. The same tests pass on Linux.
if self.settings.os == "Windows":
os.environ["GTEST_FILTER"] = "-PubSubProtoTest.PublishEncodesAndDeliversToProvider" \
":PubSubProtoTest.MultiplePublishesAccumulate" \
":PubSubProtoTest.SubscriberReceivesTypedMessageFromPublishedRows" \
":PubSubProtoTest.UnsubscribeStopsDelivery" \
":PubSubProtoTest.PublishWithAttachmentsDeliversBlobToSubscriber" \
":PubSubProtoTest.PublishWithoutAttachmentsHasEmptyAttachments"
cmake.test()
Comment on lines +37 to +41
if(WIN32)
set(AGENT_PREFIX "C:/_x")
else()
set(AGENT_PREFIX "${CMAKE_BINARY_DIR}/microxrcedds_agent_proj-prefix")
endif()
Comment on lines +64 to +75
function findBinaryRecursive(root: string, names: string[]): string | undefined {
if (!existsSync(root)) return undefined;
for (const entry of readdirSync(root, { withFileTypes: true })) {
const full = resolve(root, entry.name);
if (entry.isFile() && names.includes(entry.name)) return full;
if (entry.isDirectory()) {
const nested = findBinaryRecursive(full, names);
if (nested) return nested;
}
}
return undefined;
}
Comment on lines 70 to 81
function findBinaryRecursive(root: string, names: string[]): string | undefined {
if (!existsSync(root)) return undefined;
for (const entry of readdirSync(root, { withFileTypes: true })) {
const full = resolve(root, entry.name);
if (entry.isFile() && names.includes(entry.name)) return full;
if (entry.isDirectory()) {
const nested = findBinaryRecursive(full, names);
if (nested) return nested;
}
}
throw new Error(
`gateway binary not found. Checked: ${candidates.join(', ')}. ` +
`Set GATEWAY_BIN to override.`,
);
return undefined;
}
Comment on lines 49 to 60
function findBinaryRecursive(root: string, names: string[]): string | undefined {
if (!existsSync(root)) return undefined;
for (const entry of readdirSync(root, { withFileTypes: true })) {
const full = resolve(root, entry.name);
if (entry.isFile() && names.includes(entry.name)) return full;
if (entry.isDirectory()) {
const nested = findBinaryRecursive(full, names);
if (nested) return nested;
}
}
throw new Error(
`gateway binary not found. Checked: ${candidates.join(', ')}. ` +
`Set GATEWAY_BIN to override.`,
);
return undefined;
}
omjeiva and others added 9 commits June 2, 2026 11:51
…unction

CI's `fastdds-xrce-interop / windows` build failed with:
  error C2440: 'return': cannot convert from 'void' to 'std::basic_string<...>'

`BuildChildEnvBlockWithAugmentedPath()` returns std::string but called
`FAIL() << "..."` in the GetEnvironmentStringsA-null branch. gtest's
FAIL() macro expands to a bare `return;` — incompatible with this
function's non-void return type on MSVC (g++ apparently tolerates it,
hence the latent issue surfacing only now that the Windows job runs).

Replace with ADD_FAILURE() (records the failure without returning)
followed by an explicit `return std::string();` so the function
signature is honoured on both compilers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Appends a single commit restoring the working tree to commit 9463bc5 ("ci: add Windows jobs to all 5 Linux-only integration test workflows") without rewriting history. Undoes everything done after that point on this branch: the integration-test build-step refactor and fixes, the fastdds-xrce-interop fixes, the prettier style pass, and the PubSubProtoTest Windows skip.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Windows jobs configured CMake with `cmake --preset conan-release`,
which fails on the windows-2022 runner:

  CMake Error: No such preset ...: "conan-release"
  Available configure presets:
    "conan-default" - 'conan-default' config

Conan's CMakeToolchain emits per-build-type *configure* presets only
for single-config generators (the Linux devcontainer's Ninja/Make). On
the MSVC multi-config generator (Visual Studio 17 2022) it emits a
single conan-default configure preset; Release/Debug exist there as
*build* and *test* presets instead.

Switch the Windows configure step to `cmake --preset conan-default` in
all five integration-test workflows. `cmake --build --preset
conan-release` and `ctest --preset conan-release` are unchanged — those
build/test presets exist on multi-config. The Linux jobs keep
conan-release (its configure preset on single-config).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the per-OS, multi-step build (`conan install` + `cmake --preset`
+ `cmake --build` + `ctest`) with a single profile-driven `conan build .`
in all five integration-test workflows, on both Linux and Windows.

Why: the configure preset name depends on the CMake generator, not the
profile — single-config (the Linux devcontainer's Ninja/Make) exposes a
`conan-release` configure preset, while the Windows multi-config (MSVC)
generator exposes only `conan-default`. Naming the preset in the workflow
forced per-OS branching and broke Windows (`No such preset conan-release`).
`conan build .` lets Conan read the active profile, pick the generator,
and invoke the matching configure/build/test preset itself — so the same
command works everywhere and the Conan profile is the single source of
truth for the build.

Changes:
- Each integration-test conanfile gets a `build()` method (CMake
  configure + build, plus ctest via `cmake.test()` for the three suites
  that have CTest cases). CTEST_OUTPUT_ON_FAILURE surfaces failing-test
  output the portable way.
- The npm suites (gateway-end-to-end, protoc-gateway-client-ts) locate
  the spawned C++ binary by name via a recursive search of `build/`
  (shared `test/find-binary.ts`) instead of hardcoded `build/Release/...`
  paths, which don't match the Windows multi-config layout — the cause
  of the gateway-end-to-end Windows failure ("gateway binary not found").
- READMEs updated to the single `conan build .` step.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Drop the `set(CMAKE_CXX_STANDARD 20)` / `set(CMAKE_CXX_STANDARD_REQUIRED
ON)` (and one stray `set(CMAKE_CXX_EXTENSIONS OFF)`) from every
CMakeLists. The Conan profile already carries `compiler.cppstd`
(gnu20 on Linux, 20 on Windows-MSVC); its generated conan_toolchain.cmake
sets the CMake standard, so these lines only duplicated — and could
diverge from — the profile. A directory-scope `set()` placed after
`project()` silently overrides the profile (Conan only warns), which is
exactly what `protoc` did: it pinned C++17 while the profile (and the
protobuf it links) used C++20.

Component libraries now state the public minimum at target scope —
`target_compile_features(<lib> PUBLIC cxx_std_20)`, matching the existing
`fletcher-core` pattern — so the requirement still propagates to
consumers and composes with the profile (a profile asking for a newer
standard wins) instead of overriding it. Executables/test packages get
the standard from the profile-driven toolchain and transitively from the
libraries they link.

Behavioural note: `protoc` (and its plugin tests) now build at C++20
instead of 17, aligning it with the rest of the tree and with protobuf.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
style: apply prettier to gateway-end-to-end test files

The shortened "binary not found" throw fits within printWidth, so
prettier collapses the string concatenation onto one line.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@
…AX_PATH)

The MicroXRCEAgent builds with UAGENT_SUPERBUILD=ON, which nests its own
fast-dds / fast-cdr / asio / tinyxml2 / spdlog ExternalProjects several
levels under microxrcedds_agent_proj's prefix. With the default prefix
(deep under the test's build/ dir) those nested superbuild paths exceed
the Windows 260-char MAX_PATH limit, so MSBuild's FileTracker fails
(FTK1011 "could not create ... .tlog") and CL.exe fails (C1083) during
the superbuild's own configure step.

Set a short ExternalProject PREFIX on Windows (C:/fl-uxa), exposed via
the AGENT_PREFIX cache variable so local builds can point it elsewhere.
The Agent install dir stays under the build tree, so the test still
finds MicroXRCEAgent.exe where it expects.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…VC C2440)

With the MAX_PATH prefix fix in place, the MicroXRCEAgent superbuild now
completes on Windows and the build reaches the test compile, where MSVC
rejects:

  test_interop.cpp(178): error C2440: 'return': cannot convert from
  'void' to 'std::basic_string<...>'

BuildChildEnvBlockWithAugmentedPath() returns std::string but used
gtest's FAIL() in its GetEnvironmentStringsA-null branch. FAIL() expands
to a bare `return;`, which is incompatible with a non-void return type
on MSVC (gcc tolerates it, so this was latent until the Windows job ran).

Use ADD_FAILURE() (records the failure without returning) followed by an
explicit `return {};` so the signature is honoured on both compilers.
The other two FAIL() sites in this file are in void functions and are
fine as-is.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The 6 PubSubProtoTest cases crashed on Windows (SEH 0xc0000005 /
0xc0000409, flaky) because both pubsub.proto and telemetry.proto declared
`package integration` with a `message Telemetry`. The plugin emits
`namespace fletcher_gen::integration { class Telemetry ... }` for each, so
the combined integration_tests binary held two different definitions of
`fletcher_gen::integration::Telemetry`:
  - telemetry.proto's has `repeated int32 readings` -> a std::vector member
  - pubsub.proto's has only optional scalars (no vector)

That is an ODR violation. The linker kept one inline constructor
(telemetry's); test_pubsub.cpp sized its stack for the smaller (pubsub)
Telemetry but called the larger constructor, which initialized the
std::vector member past the object -> stack-buffer-overflow. Undefined
behaviour: gcc/Linux tolerated it, MSVC corrupted the stack.
AddressSanitizer pinned it to the std::vector ctor reached via
telemetry.fletcher.pb.h inside the PubSubProtoTest frame.

Move pubsub.proto to `package integration.pubsub` so its generated
symbols live in `fletcher_gen::integration::pubsub`, removing the
collision. Update test_pubsub.cpp's namespace references and the
package-derived topic-segment assertions to match.

Verified locally on Windows: all 56 protoc-arrow-bridge ctest cases pass
(previously 6 crashed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants