Pull request overview
Adds hint parameters to IRIS collective communication store operations in both ccl and x implementations to enable hint-aware behavior.
Changes:
- Pass `hint=(BLOCK_SIZE_M, BLOCK_SIZE_N)` to `iris.store(...)` across CCL collectives.
- Pass `hint=(tile.block_m, tile.block_n)` to `iris.store(...)` in X collectives.
- Reformat one `iris.store(...)` call site while adding the new `hint` argument.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| iris/x/all_reduce.py | Adds hint to iris.store call in all_reduce_two_shot. |
| iris/x/all_gather.py | Adds hint to iris.store call in all_gather. |
| iris/ccl/all_to_all.py | Adds hint to iris.store in both fast and masked paths. |
| iris/ccl/all_reduce.py | Adds hint to multiple iris.store sites; reformats one call. |
| iris/ccl/all_gather.py | Adds hint to iris.store for gather implementations. |
@mawad-amd I've opened a new pull request, #433, to work on those changes. Once the pull request is ready, I'll request review from you.
@copilot Please investigate this bug: the Test x (all_gather) jobs are failing on PR 432. In
@mawad-amd I've opened a new pull request, #434, to work on those changes. Once the pull request is ready, I'll request review from you.
…h partial-width tiles (#434)
@mawad-amd I've opened a new pull request, #435, to work on those changes. Once the pull request is ready, I'll request review from you.
Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>
@copilot Can you add additional tests for the other APIs we added hints to, in a similar fashion to what PR #435 did for PR #435 extended
We added hints in this PR (432) to:
Please add analogous partial-width / edge-case test coverage for all_reduce and all_to_all (and any other CCL/X tests that exist for these ops), so we have the same kind of regression coverage for the hint usage there.
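One way to choose shapes for this kind of partial-width coverage is to enumerate which tiles come out smaller than a full block. The sketch below is plain Python and makes several assumptions: `tile_extents` and `partial_tiles` are hypothetical helpers that are not part of Iris, and the 64×40 shape is only an illustrative example. It does not model what `iris.store` does with the hint; it only shows the tiling arithmetic.

```python
def tile_extents(size, block):
    """Return (start, length) for each tile along one dimension."""
    return [(start, min(block, size - start)) for start in range(0, size, block)]


def partial_tiles(m, n, block_m, block_n):
    """Return (row_start, col_start, height, width) for tiles smaller than a full block.

    Hypothetical helper for picking test shapes; not an Iris API.
    """
    tiles = []
    for row_start, height in tile_extents(m, block_m):
        for col_start, width in tile_extents(n, block_n):
            if height < block_m or width < block_n:
                tiles.append((row_start, col_start, height, width))
    return tiles


# A shape that divides evenly by the block size has no partial tiles.
print(partial_tiles(256, 128, 32, 16))

# Assumed example shape: 40 columns with 16-wide blocks leaves an 8-wide last column of tiles.
print(partial_tiles(64, 40, 32, 16))
```

A parametrized test could then include at least one shape for which `partial_tiles` returns a non-empty list, alongside an evenly divisible shape as a baseline.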
@mawad-amd I've opened a new pull request, #436, to work on those changes. Once the pull request is ready, I'll request review from you.
Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>
…nd X collectives (#436)
@copilot Please address the following: Other hints may need another look. CI is failing on CCL and X multi-rank tests. Errors we saw:
So for shape 256×128, block 32×16, Iris all_gather doesn't match PyTorch's
@mawad-amd I've opened a new pull request, #437, to work on those changes. Once the pull request is ready, I'll request review from you.
…cross all CCL/X collectives (#437)
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mawad-amd <112003944+mawad-amd@users.noreply.github.com>