feat(scheduler): Add gpu dra accounting #900
base: main
aacfcd7 to 82689aa
bc294e1 to 71fb510
Merging this branch changes the coverage (5 decrease, 2 increase)
Coverage by file
Changed files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.
Changed unit test files
71fb510 to aacb5bd
… claims counter section
2- Added a DRA claim lister to the scheduler. DID NOT update the snapshot accordingly.
3- Added the GPU DRA claim counters to the GetGpusQuota function, so that GPU DRA claims are reflected in the GPU accounting calculations but not in the extended resource comparison.
4- Added the option to set up an e2e cluster with fake DRA GPUs for testing.
1- Add extra integration tests for DRA reclaim and allocate.
2- Fix the integration test utils to properly support DRA.
3- Count GPU resourceSlices for fairShare.
4- Resource structs contain GPU DRA counters as a separate field.
5- Small refactor of the common DRA utils.
This DOES NOT support:
1- GPU DRA claims with FirstAvailable
2- GPU DRA claims shared by several pods
aacb5bd to e1ea189
Merging this branch changes the coverage (6 decrease, 1 increase)
2- Do not add a draGpuCounts count to the resource object (count as float GPUs).
3- Update the integration test utils to incorporate node slices.
Merging this branch changes the coverage (4 decrease, 2 increase)
pkg/scheduler/actions/integration_tests/integration_tests_utils/integration_tests_utils.go
2- Update fake gpu operator in hack/setup-e2e-cluster.sh
Description
1- Add extra integration tests for DRA reclaim and allocate.
2- Fix the integration test utils to properly support DRA.
3- Count GPU resourceSlices for fairShare.
4- Resource structs contain GPU DRA counters as a separate field.
5- Small refactor of the common DRA utils.
6- Add the GPU DRA claim counters to the GetGpusQuota function, so that GPU DRA claims are reflected in the GPU accounting calculations but not in the extended resource comparison.
7- Add the option to set up an e2e cluster with fake DRA GPUs for testing.
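To illustrate item 6, here is a minimal sketch of the accounting split, not the code from this PR. The type `podGPURequest` and the functions `gpusQuota` and `extendedResourceGPUs` are hypothetical names invented for this example; the real GetGpusQuota signature is not shown in this description. The idea is that GPUs requested through DRA claims count toward the quota figure, while the value compared against the node's extended resources ignores them.

```go
package main

import "fmt"

// podGPURequest is a hypothetical, simplified view of one pod's GPU demand.
type podGPURequest struct {
	WholeGPUs    int64 // GPUs requested as an extended resource
	DRAClaimGPUs int64 // GPU devices requested through DRA ResourceClaims
}

// gpusQuota returns the GPU amount charged against the quota.
// DRA claim GPUs are included so that they show up in GPU accounting.
func gpusQuota(r podGPURequest) float64 {
	return float64(r.WholeGPUs) + float64(r.DRAClaimGPUs)
}

// extendedResourceGPUs returns the amount used for the extended resource
// comparison. DRA claim GPUs are deliberately left out here.
func extendedResourceGPUs(r podGPURequest) int64 {
	return r.WholeGPUs
}

func main() {
	r := podGPURequest{WholeGPUs: 2, DRAClaimGPUs: 1}
	fmt.Println("quota GPUs:", gpusQuota(r))
	fmt.Println("extended resource GPUs:", extendedResourceGPUs(r))
}
```

Keeping the two values in separate functions makes it harder to mix the accounting figure into the extended resource check by accident.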
This DOES NOT support:
1- GPU DRA claims with FirstAvailable
2- GPU DRA claims shared by several pods
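The two unsupported cases above could be detected with a guard like the following sketch. The structs are simplified, hypothetical stand-ins that loosely mirror the `FirstAvailable` and `ReservedFor` fields of the Kubernetes ResourceClaim API; `unsupportedReason` is an invented helper, and whether the scheduler performs such a check at all is an assumption.

```go
package main

import "fmt"

// deviceRequest is a hypothetical, simplified stand-in for a DRA device request.
type deviceRequest struct {
	Name           string
	FirstAvailable []string // alternative sub-requests; non-empty means FirstAvailable is used
}

// resourceClaim is a hypothetical, simplified stand-in for a ResourceClaim.
type resourceClaim struct {
	Name        string
	Requests    []deviceRequest
	ReservedFor []string // names of the pods currently reserving the claim
}

// unsupportedReason returns a non-empty message when the claim falls into
// one of the two cases listed as unsupported, and "" otherwise.
func unsupportedReason(c resourceClaim) string {
	if len(c.ReservedFor) > 1 {
		return "claim is shared by several pods"
	}
	for _, r := range c.Requests {
		if len(r.FirstAvailable) > 0 {
			return "request " + r.Name + " uses FirstAvailable"
		}
	}
	return ""
}

func main() {
	shared := resourceClaim{Name: "gpu-claim", ReservedFor: []string{"pod-a", "pod-b"}}
	fmt.Println(unsupportedReason(shared))
}
```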
Related Issues
Fixes #
Checklist
Breaking Changes
Additional Notes