Use hints in CCL and X by mawad-amd · Pull Request #432 · ROCm/iris

mawad-amd · 2026-03-06T23:31:34Z

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

Copilot

Pull request overview

Adds a `hint` argument to the `iris.store` calls in the Iris collective communication kernels, in both the `ccl` and `x` implementations, so that each store is told the shape of the tile it writes.

Changes:

Pass hint=(BLOCK_SIZE_M, BLOCK_SIZE_N) to iris.store(...) across CCL collectives.
Pass hint=(tile.block_m, tile.block_n) to iris.store(...) in X collectives.
Reformat one iris.store(...) call site while adding the new hint argument.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| iris/x/all_reduce.py | Adds `hint` to `iris.store` call in `all_reduce_two_shot`. |
| iris/x/all_gather.py | Adds `hint` to `iris.store` call in `all_gather`. |
| iris/ccl/all_to_all.py | Adds `hint` to `iris.store` in both fast and masked paths. |
| iris/ccl/all_reduce.py | Adds `hint` to multiple `iris.store` sites; reformats one call. |
| iris/ccl/all_gather.py | Adds `hint` to `iris.store` for gather implementations. |

iris/ccl/all_reduce.py

iris/ccl/all_to_all.py

iris/ccl/all_gather.py

Copilot · 2026-03-06T23:40:17Z

@mawad-amd I've opened a new pull request, #433, to work on those changes. Once the pull request is ready, I'll request review from you.

mawad-amd · 2026-03-07T13:33:53Z

@copilot Please investigate this bug: the Test x (all_gather) jobs are failing on PR 432. In tests/x/test_all_gather.py line 163, Iris x.all_gather output doesn't match PyTorch's all_gather for shape 128-64-64-32 (max difference 1.0). All 6 failing cases are that shape (dtype float16/float32/bfloat16); larger shapes pass. You have access to AMD GPUs, so you should be able to run the tests and debug locally. Focus on why this specific shape triggers the mismatch (e.g. hints, rank layout, or kernel path for small dims).
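One way to reason about why a partial-width shape could be sensitive to the hint is to look at the element offsets a single tile touches. The sketch below is a NumPy model only, not Iris or Triton code. It assumes a row-major matrix, reads the shape string 128-64-64-32 as M-N-BLOCK_M-BLOCK_N, and assumes the hint is a per-axis contiguity claim. `tile_offsets`, `unit_stride_run`, and `is_flat_contiguous` are hypothetical helper names.

```python
import numpy as np


def tile_offsets(n_cols, block_m, block_n, pid_m=0, pid_n=0):
    """Element offsets of one (block_m, block_n) tile in a row-major matrix."""
    rows = pid_m * block_m + np.arange(block_m)
    cols = pid_n * block_n + np.arange(block_n)
    return rows[:, None] * n_cols + cols[None, :]


def unit_stride_run(offsets, axis):
    """Length of the leading run along `axis` where offsets grow by exactly 1."""
    line = np.moveaxis(offsets, axis, -1).reshape(-1, offsets.shape[axis])[0]
    breaks = np.nonzero(np.diff(line) != 1)[0]
    return int(breaks[0] + 1) if breaks.size else len(line)


def is_flat_contiguous(offsets):
    """True when the whole tile is one contiguous span of memory."""
    flat = offsets.ravel()
    return bool(np.array_equal(flat, flat[0] + np.arange(flat.size)))


# Assumed reading of 128-64-64-32: N = 64 columns, 64 x 32 tile (partial width).
partial = tile_offsets(n_cols=64, block_m=64, block_n=32)
# Hypothetical full-width comparison: the tile spans the entire row.
full = tile_offsets(n_cols=32, block_m=64, block_n=32)

print(unit_stride_run(partial, axis=0), unit_stride_run(partial, axis=1))
print(is_flat_contiguous(partial), is_flat_contiguous(full))
```

Under these assumptions, only the column axis has a unit-stride run (of length BLOCK_N), and the tile is a single contiguous span only when it covers the full row width. A hint that claims BLOCK_M contiguous elements along the row axis would overstate what the layout provides in the partial-width case.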

Copilot · 2026-03-07T13:33:59Z

@mawad-amd I've opened a new pull request, #434, to work on those changes. Once the pull request is ready, I'll request review from you.

mawad-amd · 2026-03-07T15:35:38Z

@copilot I merged #434. Can you use gh to read the comments there and suggest tests to make sure all APIs are tested against that behavior? The existing tests may be enough, but we probably need some new shapes in the parametrized tests.

Copilot · 2026-03-07T15:35:43Z

@mawad-amd I've opened a new pull request, #435, to work on those changes. Once the pull request is ready, I'll request review from you.

mawad-amd · 2026-03-07T16:07:05Z

@copilot Can you add additional tests for the other APIs we added hints to, in a similar fashion to what PR #435 did for x.all_gather?

PR #435 extended tests/x/test_all_gather.py with partial-width tile shapes that exercise the hint behavior:

(128, 128, 64, 32) — BLOCK_N < N/world_size (multi-block partial-width)
(256, 128, 64, 16) — minimum BLOCK_N=16 (16-bit vectorization path)

We added hints in this PR (432) to:

iris/ccl/all_gather.py
iris/ccl/all_reduce.py
iris/ccl/all_to_all.py
iris/x/all_gather.py (already got extra tests in #435, "Add partial-width tile shapes to x.all_gather tests")
iris/x/all_reduce.py

Please add analogous partial-width / edge-case test coverage for all_reduce and all_to_all (and any other CCL/X tests that exist for these ops), so we have the same kind of regression coverage for the hint usage there.
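A small helper can make explicit which parametrized shapes actually exercise the partial-width path. This is a sketch under assumptions: the tuples are read as (M, N, BLOCK_M, BLOCK_N), N is taken as the row width seen by the kernel (the real tests may instead split it across ranks), and `column_tiles` and `is_partial_width` are hypothetical names, not part of the Iris test suite.

```python
# Shapes mentioned across this thread, read as (M, N, BLOCK_M, BLOCK_N).
SHAPES = [
    (128, 64, 64, 32),
    (128, 128, 64, 32),
    (256, 128, 64, 16),
    (256, 128, 32, 16),
]


def column_tiles(n, block_n):
    """Number of tiles needed to cover one row of width n."""
    return -(-n // block_n)  # ceiling division


def is_partial_width(shape):
    """True when a single tile does not span the whole row."""
    _, n, _, block_n = shape
    return column_tiles(n, block_n) > 1


for shape in SHAPES:
    print(shape, column_tiles(shape[1], shape[3]), is_partial_width(shape))
```

A check like this could sit next to the parametrize list so that a newly added shape is known to cover the multi-tile-per-row case rather than silently falling back to full-width tiles.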

Copilot · 2026-03-07T16:07:11Z

@mawad-amd I've opened a new pull request, #436, to work on those changes. Once the pull request is ready, I'll request review from you.

mawad-amd · 2026-03-07T20:28:10Z

@copilot Please address the following:

Other hints may need another look. CI is failing on CCL and X multi-rank tests.

Errors we saw:

Failing jobs: Test ccl (2/4/8 ranks) and Test x (2/4/8 ranks), git install.
All failures are in tests/ccl/test_all_gather.py for the 256-128-32-16 shape/block only (all dtypes), in both test_all_gather and test_all_gather_partitioned.
Rank 1: Iris output has a row of 0s where PyTorch has 1s — [1., 0., 0., ...] vs [1., 1., 1., ...] (max diff 1.0).
Rank 0: Iris output has rows of 0s where PyTorch has 2s — [2., 0., 0., ...] vs [2., 2., 2., ...] (max diff 2.0).

So for shape 256×128, block 32×16, Iris all_gather doesn't match PyTorch's all_gather_into_tensor; some elements stay zero. Likely a hint/block indexing or masking bug for this config — please check other hints for similar edge cases.
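When triaging a mismatch like this, it can help to report where each bad row first diverges rather than only the maximum difference. The sketch below uses NumPy and synthetic arrays shaped to resemble the Rank 1 symptom described above. It is not CI output, and `summarize_mismatch` is a hypothetical helper.

```python
import numpy as np


def summarize_mismatch(actual, expected):
    """Return (max abs diff, {row index: first mismatching column})."""
    diff = np.abs(actual.astype(np.float64) - expected.astype(np.float64))
    bad_rows = np.nonzero(diff.max(axis=1) > 0)[0]
    first_bad_col = {int(r): int(np.argmax(diff[r] > 0)) for r in bad_rows}
    return float(diff.max()), first_bad_col


# Synthetic stand-in: expected rows of 1s, one row written only in column 0.
expected = np.ones((4, 16), dtype=np.float32)
actual = expected.copy()
actual[2, 1:] = 0.0

max_diff, first_bad = summarize_mismatch(actual, expected)
print(max_diff, first_bad)
```

If every bad row first diverges at column 1, that would suggest stores that land only the first element of a row, which is the kind of pattern worth checking against the hint and mask logic.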

Copilot · 2026-03-07T20:28:17Z

@mawad-amd I've opened a new pull request, #437, to work on those changes. Once the pull request is ready, I'll request review from you.

Use hints in CCL and X

8273d16

mawad-amd requested review from BKP and neoblizz as code owners March 6, 2026 23:31

Copilot AI review requested due to automatic review settings March 6, 2026 23:31

github-actions bot added the in-progress (We are working on it) and iris (Iris project issue) labels Mar 6, 2026

Apply Ruff auto-fixes

f5eb7fd

Copilot AI reviewed Mar 6, 2026

iris/ccl/all_reduce.py (outdated)

iris/ccl/all_reduce.py

iris/ccl/all_to_all.py

iris/ccl/all_gather.py (outdated)

Copilot AI mentioned this pull request Mar 6, 2026

[WIP] Update implementation to simplify hints in CCL and X #433

Closed

Trigger CI

fc42f05

Copilot AI mentioned this pull request Mar 7, 2026

Fix x.all_gather hint: use (1, block_n) for correct vectorization with partial-width tiles #434

Merged

Fix x.all_gather hint: use (1, block_n) for correct vectorization with partial-width tiles (#434)

c50f442

Initial plan

626fa7b

Copilot AI mentioned this pull request Mar 7, 2026

Add partial-width tile shapes to x.all_gather tests #435

Merged

Copilot AI and others added 2 commits March 7, 2026 15:40

Add partial-width tile shapes to all_gather tests

177aec9

Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>

Add partial-width tile shapes to x.all_gather tests (#435)

4324247

Initial plan

b034f46

Copilot AI mentioned this pull request Mar 7, 2026

Add partial-width tile shape tests for hint-exercising paths in CCL and X collectives #436

Merged

Copilot AI and others added 2 commits March 7, 2026 16:15

Add partial-width tile shapes to all_reduce and all_to_all tests

a2b6f90

Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>

Add partial-width tile shape tests for hint-exercising paths in CCL and X collectives (#436)

fb84e10

Copilot AI mentioned this pull request Mar 7, 2026

Fix iris.store hint: use (1, BLOCK_N) instead of (BLOCK_M, BLOCK_N) across all CCL/X collectives #437

Merged

Fix iris.store hint: use (1, BLOCK_N) instead of (BLOCK_M, BLOCK_N) across all CCL/X collectives (#437)

1c55396

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>

mawad-amd merged commit 2afde85 into main Mar 8, 2026
75 checks passed

mawad-amd deleted the muhaawad/hints branch March 8, 2026 04:41

mawad-amd restored the muhaawad/hints branch March 8, 2026 04:42

mawad-amd merged 11 commits into main from muhaawad/hints

3 participants
