[HIPIFY][Infra] Enable dynamic drive allocation to Windows build #2383

Merged
kirthana14m merged 4 commits into amd-staging from
amd/dev/kirthana14m/windows-path-shrink-staging
Mar 17, 2026

Collaborator

@kirthana14m kirthana14m commented Feb 11, 2026

The following changes are accumulated in this PR:

  1. Dynamic check for an available drive letter on the Windows machine.
  2. Test cluster name is reverted to linux-mi325-1gpu-ossci-rocm.
  3. Base ROCm CI is updated to de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
  4. aomp smoke tests are added (similar to the llvm-project update).
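
The dynamic drive check in item 1 could look roughly like the sketch below. It is only an illustration, not the code in this PR: the names `drives_in_use` and `pick_free_drive`, the preference for `L`, and the Z-to-D scan order are all assumptions.

```python
import os
import string


def drives_in_use():
    """Return the drive letters whose root currently exists.

    On a non-Windows host no such root exists, so the set is empty.
    """
    return {c for c in string.ascii_uppercase if os.path.exists(f"{c}:\\")}


def pick_free_drive(in_use, preferred="L"):
    """Pick a drive letter that is not in `in_use`.

    `preferred` is tried first (assumption: L, the letter mentioned in the
    discussion); otherwise letters are scanned from Z down to D. A, B and C
    are skipped here purely as a conservative choice of this sketch.
    """
    taken = {c.upper() for c in in_use}
    candidates = [preferred.upper()] + list(reversed(string.ascii_uppercase[3:]))
    for letter in candidates:
        if letter not in taken:
            return letter
    raise RuntimeError("no free drive letter available")


if __name__ == "__main__":
    print(pick_free_drive(drives_in_use()))
```

The chosen letter could then be handed to `subst` and exposed as a step output, which is how `steps.subst.outputs.drive` appears to be consumed later in the workflow.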

Collaborator

@emankov emankov left a comment

Please see my review of #2384.

@emankov emankov added the Windows (Windows only) label Feb 11, 2026
@kirthana14m kirthana14m force-pushed the amd/dev/kirthana14m/windows-path-shrink-staging branch from 6ef1e8b to a656dfa March 11, 2026 10:30
@kirthana14m kirthana14m changed the title from "[HIPIFY][Infra] workaround for long paths on Windows build" to "[HIPIFY][Infra] Enable dynamic drive allocation to Windows build" Mar 11, 2026
Collaborator

@emankov emankov left a comment

Thanks for addressing some of my comments in the already closed #2384; however, there are still some issues to address in this PR.

[IMPORTANT]
I'd still suggest that you revise the coupled PRs approach for amd-staging and amd-mainline for the following reasons:

  1. error-prone
  2. common source base vs code duplication
  3. difficulty reviewing both together while keeping the different target branches in mind

@searlmc1, @kzhuravl, jfyi.

run: |
./build_tools/health_status.py

- name: Test build_tools
Collaborator

In both the Linux and Windows build workflows, the "Test build_tools" step was removed.

Unless these tests were moved to a separate dedicated workflow not included in this patch, you are losing CI coverage for them.

Collaborator Author

Yes, the tests now live in a dedicated workflow and will be re-enabled in a new patch once the testing phase is complete.

These tests were temporarily removed to accommodate infrastructure changes.

# clear cache before build and after download
cd "${{ steps.subst.outputs.drive }}/"
ccache -z
cmake -B "B:\build" -GNinja -S "${{ steps.subst.outputs.drive }}/" --preset windows-release -DTHEROCK_AMDGPU_FAMILIES=gfx1151 "-DCMAKE_C_COMPILER=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207/bin/Hostx64/x64/cl.exe" "-DCMAKE_CXX_COMPILER=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207/bin/Hostx64/x64/cl.exe" "-DCMAKE_LINKER=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.44.35207/bin/Hostx64/x64/link.exe" -DTHEROCK_BACKGROUND_BUILD_JOBS=4
Collaborator

I already saw and reviewed that in 2384: #2384 (comment).
Those comments were left unaddressed:

  1. You should always use "" for paths instead of '', as in Windows CMD, single quotes are not string delimiters.
  2. Targeting a very specific MSVS version, like 14.44.35207, is a bad solution. You should use vcvars64.bat beforehand.
  3. Could you explain why and where the drive letter B has appeared?
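
As a rough illustration of points 1 and 2, the sketch below builds a cmd.exe command line that calls vcvars64.bat first, so that `cl` can be referenced by name instead of by a versioned path. The helper name `vcvars_wrapped` is hypothetical, and the install root is simply copied from the command under review, so treat both as assumptions about the runner image.

```python
import ntpath

# Install root copied from the cmake command under review (assumption about
# the runner image, not a value defined by this sketch).
VS_ROOT = r"C:\Program Files\Microsoft Visual Studio\2022\Community"


def vcvars_wrapped(command, vs_root=VS_ROOT):
    """Return a cmd.exe command line that runs vcvars64.bat, then `command`."""
    vcvars = ntpath.join(vs_root, "VC", "Auxiliary", "Build", "vcvars64.bat")
    # Double quotes are required: cmd.exe does not treat '...' as a string.
    return f'call "{vcvars}" && {command}'


if __name__ == "__main__":
    # With the environment set up by vcvars64.bat, the compiler is found on
    # PATH, so no toolset version number appears in the command.
    cmake_cmd = 'cmake -B "B:\\build" -GNinja -DCMAKE_C_COMPILER=cl -DCMAKE_CXX_COMPILER=cl'
    print(vcvars_wrapped(cmake_cmd))
```

The resulting string is meant to be run through `cmd /c`. Whether that fits the dedicated runner image is a separate question, which the reply below addresses.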

Collaborator Author

  1. Paths: Already using "" for paths on Line 217; happy to fix any remaining spots you flag.

  2. MSVS: We’re using a dedicated Windows runner image that’s built and maintained by the ROCK CI DevOps team. That image has a fixed, supported MSVC install (hence the 14.44.35207 path). Relying on that image is a deliberate choice for consistency and supportability in our environment. We’re not invoking a loose MSVC install on a generic host; the compiler path reflects what’s actually on the image. If we move to a different image or setup (e.g. using vcvars64.bat on a generic runner), we’ll revisit this and document it.

  3. Drive letter B: B:\ is the standard build drive on our Windows runner images. For example, BUILD_DIR is set to B:\build in this workflow at line 71 so all builds use that path. It’s defined by the runner image/infrastructure, not by this repo, and keeps paths consistent across runs and easier to support.

As briefly described in the 2384 comment (Point 3)

Collaborator

"hence the 14.44.35207 path"

FMPOV, it is even worse than magic numbers. All environment-related variables and constants should be set in one place: the higher, the better. Otherwise, any environment change, cloning, or anything else will require fixing those paths, drive letters, and constants in multiple places.
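
One way to picture the "set in one place" suggestion is the sketch below, where every environment-specific value lives in a single object and all tool paths are derived from it. The class `WindowsBuildEnv` and its field names are hypothetical; the default values are copied from the command under review.

```python
from dataclasses import dataclass
import ntpath


@dataclass(frozen=True)
class WindowsBuildEnv:
    """Single source of truth for environment-specific values (hypothetical)."""

    vs_root: str = r"C:\Program Files\Microsoft Visual Studio\2022\Community"
    msvc_version: str = "14.44.35207"
    build_dir: str = r"B:\build"

    def tool(self, exe):
        """Full path to an MSVC host-x64/target-x64 tool such as cl.exe."""
        return ntpath.join(self.vs_root, "VC", "Tools", "MSVC",
                           self.msvc_version, "bin", "Hostx64", "x64", exe)

    def cmake_args(self):
        """Derive every compiler-related CMake argument from the fields above."""
        return [
            "-B", self.build_dir,
            f"-DCMAKE_C_COMPILER={self.tool('cl.exe')}",
            f"-DCMAKE_CXX_COMPILER={self.tool('cl.exe')}",
            f"-DCMAKE_LINKER={self.tool('link.exe')}",
        ]


if __name__ == "__main__":
    env = WindowsBuildEnv()
    print(env.cmake_args())
    # A different runner image needs only one override
    # (the version number here is made up for illustration).
    newer = WindowsBuildEnv(msvc_version="14.99.0")
    print(newer.cmake_args())
```

In a workflow, the same idea would presumably take the form of top-level environment variables or inputs rather than a Python class; the point is only that the version and drive letter appear once.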

runs-on: ${{ inputs.test_runs_on }}
# Running docker with cap-add and -v /lib/modules, by recommendation of Github: https://rocm.docs.amd.com/projects/amdsmi/en/amd-staging/how-to/setup-docker-container.html
container:
image: ${{ inputs.platform == 'linux' && 'ghcr.io/rocm/no_rocm_image_ubuntu24_04@sha256:4150afe4759d14822f0e3f8930e1124f26e11f68b5c7b91ec9a02b20b1ebbb98' || null }}
Collaborator

Windows: null.

Just ensure that the script aomp/bin/run_theRockCI.sh includes fail-safes in case it accidentally executes directly on a Windows runner host.
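
The kind of fail-safe meant here can be sketched as follows. The real script is a shell script, so this Python version only illustrates the logic; the function name `require_non_windows` and the error message are hypothetical.

```python
import platform


def require_non_windows(system=None):
    """Abort early if the host is Windows; return the detected system name.

    `system` can be passed explicitly for testing; by default it is taken
    from platform.system(), which returns "Windows" on Windows hosts.
    """
    system = system or platform.system()
    if system == "Windows":
        # SystemExit with a string prints the message and exits with status 1.
        raise SystemExit("this script is not meant to run on a Windows host")
    return system


if __name__ == "__main__":
    require_non_windows()
    # ... the Linux-only test steps would follow here ...
```

Placing the check at the very top keeps a mis-scheduled job from getting partway through before failing.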

Collaborator Author

Thanks for the note.
We’ll check and add the fail-safes to aomp/bin/run_theRockCI.sh as suggested in a follow-up change.

Collaborator

@skganesan008 skganesan008 left a comment

Please refer to the comments on ROCm/SPIRV-LLVM-Translator#52

@skganesan008
Collaborator

@emankov
Thanks for taking the time to review the GitHub Actions files.

Just to give you some background, we use the CI framework from the ROCK. The backend, which is the Python framework from the ROCK, is used as-is. Our goal is not to diverge from what they are doing on the backend.

The front-end, which is all these YAML files, was taken from ROCK CI and adapted for our needs. We also allow temporary workarounds and changes into these YAML files. For example, the Windows code has a hardcoded drive (L:) mapped to the source code in order to address a command-line length issue seen during MIOpen compilation. The Windows builds run in containers where we know from the infrastructure side that drive L: is not already taken. A subst command inside a container is completely invisible to the host and to other containers. The "B:" comes from the ROCK CI end, and that should not be a problem either, as it runs inside a container and these containers are managed internally. When containers change, the change itself goes through a psdb. At a high level, from the standpoint of good software development practice, this hardcoding looks bad.

Also, please note that merging changes from Rock would become hard if we combine amd-staging and amd-mainline yaml files. There is always room for improvement and abstraction.
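
To make the L: mapping described above concrete, here is a small sketch of how a path shrinks once the source root is mapped to a drive letter with subst. The function `remap_to_drive` and the checkout location are hypothetical and serve only to show the effect on path length.

```python
import ntpath


def remap_to_drive(path, mapped_root, drive="L:"):
    """Rewrite `path` as it would appear after mapping `mapped_root` to `drive`."""
    norm_path = ntpath.normpath(path)
    norm_root = ntpath.normpath(mapped_root).rstrip("\\")
    # Windows paths are case-insensitive, so compare case-folded forms.
    if not ntpath.normcase(norm_path).startswith(ntpath.normcase(norm_root) + "\\"):
        raise ValueError(f"{path!r} is not under {mapped_root!r}")
    return drive + norm_path[len(norm_root):]


# Hypothetical checkout location, for illustration only.
root = r"C:\runner\_work\HIPIFY\HIPIFY"
long_path = root + r"\external\MIOpen\src\kernels\conv.cpp"
short_path = remap_to_drive(long_path, root)

print(short_path)                        # L:\external\MIOpen\src\kernels\conv.cpp
print(len(long_path) - len(short_path))  # characters saved per path argument
```

The saving applies to every path on the compiler command line, which is why a long checkout prefix can push a command past the length limit while the mapped form does not.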

Collaborator

skganesan008 commented Mar 13, 2026

@emankov, Let's get this PR landed and Kirthana can help follow up on improvements to this PR.

Collaborator

emankov commented Mar 17, 2026

Hello @skganesan008,

"please note that merging changes from Rock would become hard if we combine amd-staging and amd-mainline yaml files."

Does coupled reviewing and merging of two almost-but-not-identical changes look any easier or simpler?
Or will all the upcoming changes in the dependent theROC components, including HIPIFY tools, inherit theROC issues implicitly and without review?

[Conclusion]
As long as the PR is marked as do not merge, you can merge it without my approval or anyone else's. I don't mind.
I reviewed it by mistake. It has to be reviewed and fixed, not in HIPIFY, but higher in theROC.

Collaborator

emankov commented Mar 17, 2026

"@emankov, Let's get this PR landed and Kirthana can help follow up on improvements to this PR."

I don't mind, but the first place to start improvements and fixes looks like theROC itself.

@kirthana14m kirthana14m merged commit df71038 into amd-staging Mar 17, 2026
34 checks passed

Labels

infrastructural (related to the Repo infrastructure), Windows (Windows only)

3 participants