-
Notifications
You must be signed in to change notification settings - Fork 73
Skip certain configs in LowerCollectiveCudaAndNcclTest #5743
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
|
||
| TEST_P(LowerCollectiveCudaAndNcclTest, Allgather) { | ||
| const auto& [msg_size_bytes, protocol_enum] = GetParam(); | ||
| const int64_t kMsgSize = msg_size_bytes / sizeof(float); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a real constant: https://google.github.io/styleguide/cppguide.html#Constant_Names
| at::Tensor runBenchmark( | ||
| MultiDeviceExecutor& executor, | ||
| const std::vector<c10::IValue>& inputs, | ||
| int64_t msg_size_bytes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| } | ||
|
|
||
| if (message_size_bytes > 32LL * 1024 * 1024) { | ||
| GTEST_SKIP() << "Takes >30 seconds to run in CI: http://nv/e.)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skips here
| } | ||
|
|
||
| if (message_size_bytes > 32LL * 1024 * 1024) { | ||
| GTEST_SKIP() << "Takes >5 seconds to run in CI: http://nv/e.)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... and here
Greptile SummaryThis PR reduces CI runtime by skipping large message size tests (128MB and 256MB) in
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Test as Test Harness
participant Allgather as Allgather Test
participant Broadcast as Broadcast Test
participant Skip as GTEST_SKIP
Test->>Allgather: Execute with params (message_size, protocol)
Allgather->>Allgather: Check message_size_bytes > 32MB?
alt message_size > 32MB
Allgather->>Skip: Skip test (128MB, 256MB configs)
Note over Skip: "Takes >30 seconds in CI"
else message_size <= 32MB
Allgather->>Allgather: Run test (2MB, 8MB, 32MB)
end
Test->>Broadcast: Execute with params (message_size, protocol)
Broadcast->>Broadcast: Check message_size_bytes > 32MB?
alt message_size > 32MB
Broadcast->>Skip: Skip test (128MB, 256MB configs)
Note over Skip: "Takes >5 seconds in CI"
else message_size <= 32MB
Broadcast->>Broadcast: Run test (2MB, 8MB, 32MB)
end
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 file reviewed, 2 comments
| } | ||
|
|
||
| if (message_size_bytes > 32LL * 1024 * 1024) { | ||
| GTEST_SKIP() << "Takes >30 seconds to run in CI: http://nv/e.)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syntax: malformed URL in skip message - http://nv/e.) appears incomplete
| GTEST_SKIP() << "Takes >30 seconds to run in CI: http://nv/e.)"; | |
| GTEST_SKIP() << "Takes >30 seconds to run in CI"; |
| } | ||
|
|
||
| if (message_size_bytes > 32LL * 1024 * 1024) { | ||
| GTEST_SKIP() << "Takes >5 seconds to run in CI: http://nv/e.)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syntax: malformed URL in skip message - http://nv/e.) appears incomplete
| GTEST_SKIP() << "Takes >5 seconds to run in CI: http://nv/e.)"; | |
| GTEST_SKIP() << "Takes >5 seconds to run in CI"; |
Description
|
| Relevant files |
|---|
PR Reviewer Guide
Here are some key observations to aid the review process:
| 🧪 PR contains tests |
| ⚡ Recommended focus areas for review |
Incomplete skip reason URL
|
... to speed up CI and local runs
The way forward could be to reduce
warmup_itersandtiming_itersand move this tobenchmarks/cppso it doesn't run by default.