refactor(BA-4143): Remove slot-based device partitioning methods #8433
base: main
Pull request overview
Refactors multi-agent ResourceAllocator to remove slot-based/mode-specific device partitioning so that SHARED, AUTO_SPLIT, and MANUAL all expose full device resources as a baseline ahead of BEP-1041.
Changes:
- Removes AUTO_SPLIT/MANUAL partitioning logic in `ResourceAllocator`, making all modes behave like SHARED.
- Simplifies agent partition/scaling-factor computation accordingly.
- Skips mode-specific tests and adds new tests asserting all modes fall back to SHARED behavior.
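As a rough illustration of the baseline behavior summarized above, the sketch below models an allocator whose results no longer depend on the allocation mode. `AllocationMode`, `SketchAllocator`, its field and method names, the slot names, and the `float` return type are all hypothetical stand-ins chosen for this example; none of them is the actual `ResourceAllocator` API.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class AllocationMode(Enum):
    # Hypothetical stand-in for the real allocation-mode enum.
    SHARED = "shared"
    AUTO_SPLIT = "auto-split"
    MANUAL = "manual"


@dataclass(frozen=True)
class SketchAllocator:
    """Illustrative only; not the real ResourceAllocator."""

    mode: AllocationMode
    available_slots: dict[str, Decimal]

    def device_slots(self) -> dict[str, Decimal]:
        # The mode is intentionally ignored: every mode exposes the
        # full available slots, which is the SHARED behavior.
        return dict(self.available_slots)

    def scaling_factor(self) -> float:
        # No mode-specific calculation is left; the factor is constant.
        return 1.0


total = {"cpu": Decimal(8), "cuda.device": Decimal(2)}  # assumed slot names
for mode in AllocationMode:
    allocator = SketchAllocator(mode=mode, available_slots=total)
    assert allocator.device_slots() == total
    assert allocator.scaling_factor() == 1.0
```

Keeping the mode on the object while ignoring it in the calculations mirrors the stated intent: the configuration surface stays in place, and only the per-mode arithmetic is removed until the BEP-1041 design lands.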
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `src/ai/backend/agent/resources.py` | Removes mode-specific slot partitioning and makes partition/scaling-factor SHARED-only baseline. |
| `tests/unit/agent/test_resource_allocation.py` | Skips AUTO_SPLIT/MANUAL partitioning tests and adds fallback-to-SHARED tests. |
Simplify ResourceAllocator by removing mode-specific partitioning logic. All allocation modes (SHARED, AUTO_SPLIT, MANUAL) now behave like SHARED, where agents see all devices with no reserved slots between them. This establishes a baseline for implementing BEP-1041's device-centric partitioning design.

Removed methods: `_calculate_device_slot`, `_calculate_device_slot_shared`, `_calculate_device_slot_auto_split`, `_calculate_device_slot_manual`, and `_ensure_slots_are_not_overallocated`.

Scaling factor now always returns 1.0 instead of mode-specific calculations.

Test changes: Skipped tests for AUTO_SPLIT and MANUAL modes with explanations. Added `TestAllocationModesFallbackToShared` class to verify all modes behave identically as SHARED.
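The test changes described in this commit message could look roughly like the sketch below. It is written with the standard-library `unittest` module so that it runs standalone, whereas the PR's own tests use pytest. The helper `visible_devices`, the device names, and the `...Sketch` class names are hypothetical and only mimic the shape of a skipped mode-specific class next to a fallback-to-SHARED check.

```python
import unittest
from enum import Enum


class AllocationMode(Enum):
    # Hypothetical stand-in for the real allocation-mode enum.
    SHARED = "shared"
    AUTO_SPLIT = "auto-split"
    MANUAL = "manual"


def visible_devices(mode, devices, num_agents):
    """Hypothetical helper: each agent sees every device, whatever the mode."""
    return [list(devices) for _ in range(num_agents)]


@unittest.skip("BA-4143: mode-specific partitioning removed pending BEP-1041")
class TestAutoSplitModeSketch(unittest.TestCase):
    def test_devices_are_split_between_agents(self):
        # Skipped at class level, so this body never executes.
        self.fail("not expected to run while skipped")


class TestAllocationModesFallbackToSharedSketch(unittest.TestCase):
    def test_all_modes_match_shared(self):
        devices = ["cuda0", "cuda1"]  # assumed device names
        expected = visible_devices(AllocationMode.SHARED, devices, num_agents=2)
        for mode in AllocationMode:
            # subTest reports a failure per mode instead of stopping at the first.
            with self.subTest(mode=mode):
                actual = visible_devices(mode, devices, num_agents=2)
                self.assertEqual(actual, expected)
```

Skipping the mode-specific class with a reason string, rather than deleting it, keeps the old expectations visible as a reference for when partitioning is reintroduced.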
- Remove docstrings from private methods
- Inline `_calculate_device_slots()` and `_calculate_resource_scaling_factor()`
- Fix implicit string concatenation in pytest skip reasons
- Add Background section explaining SHARED/AUTO_SPLIT/MANUAL modes
- Add Design Overview section for high-level narrative flow
- Restructure Proposed Design for organic flow instead of feature list
- Update to match actual implementation (ResourcePartitioner, Partition types)
- Update GitHub PR numbers to correct values (#8433, #8440, #8447, #8463)
- Add Implementation Notes section (scaling factors, memory handling)
- Clarify slot-based design was incorrect implementation, not deliberate
- Update config examples to show actual format (cpu, mem, devices fields)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
resolves #8425 (BA-4143)
Overview
Simplify `ResourceAllocator` by removing mode-specific partitioning logic. All allocation modes (SHARED, AUTO_SPLIT, MANUAL) now behave like SHARED, where agents see all devices with no reserved slots between them. This establishes a clean baseline before implementing BEP-1041's device-centric partitioning design.
Problem Statement
`_calculate_device_slot_*` methods that tightly couple slot calculation with allocation modes
Implementation
Removed methods from ResourceAllocator:
- `_calculate_device_slot()` - the mode dispatch method
- `_calculate_device_slot_shared()` - SHARED mode calculation
- `_calculate_device_slot_auto_split()` - AUTO_SPLIT mode calculation
- `_calculate_device_slot_manual()` - MANUAL mode calculation
- `_ensure_slots_are_not_overallocated()` - MANUAL mode validation

Simplified methods:
- `_calculate_device_slots()` - now always returns full available slots (SHARED behavior)
- `_calculate_resource_scaling_factor()` - now always returns 1.0
- `_calculate_agent_partition()` - removed unused parameters

Test changes:
- Skipped `TestAutoSplitMode` and `TestManualMode` classes with BA-4143/BEP-1041 explanations
- `TestMultiDeviceScenarios`
- Added `TestAllocationModesFallbackToShared` class with 3 tests verifying all modes behave identically

Checklist: (if applicable)
- `ai.backend.test`
- `docs` directory