@hhoikoo hhoikoo commented Dec 2, 2025

resolves #7046 (BA-3212)

Overview

Implements BEP-1016 compliant device partitioning for multi-agent resource allocation. Devices are now assigned as whole units to agents using divmod distribution, ensuring each physical device belongs to exactly one agent (device mutual exclusivity).

Problem Statement

  • Device contention: Multiple agents on the same host could allocate kernels using the same physical device, causing oversubscription
  • BEP-1016 violation: The previous "fill-from-front" approach allowed partial device sharing, where the same DeviceId could appear in multiple agents' partitions
  • Resource isolation: No clear boundary between agents' device ownership made debugging and capacity planning difficult

Architecture

flowchart TB
    subgraph "Device Pool (N=5 devices)"
        D0[cuda0]
        D1[cuda1]
        D2[cuda2]
        D3[cuda3]
        D4[cuda4]
    end

    subgraph "divmod(5, 3) = (1, 2)"
        direction LR
        Q["q=1 base devices"]
        R["r=2 remainder"]
    end

    subgraph "Agent Partitions (M=3 agents)"
        A1[Agent1<br/>q+1 = 2 devices]
        A2[Agent2<br/>q+1 = 2 devices]
        A3[Agent3<br/>q = 1 device]
    end

    D0 --> A1
    D1 --> A1
    D2 --> A2
    D3 --> A2
    D4 --> A3

Distribution algorithm:

  • For N devices across M agents: q, r = divmod(N, M)
  • First r agents receive q + 1 devices each
  • Remaining M - r agents receive q devices each
  • Edge case: When M > N, first N agents get 1 device each, remaining agents get empty device masks
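The distribution rules above can be sketched as follows. This is a minimal illustration, not the PR's code: `partition_devices` and the plain-string device and agent IDs are hypothetical stand-ins, and agent IDs are assumed to be unique.

```python
from collections.abc import Sequence


def partition_devices(
    device_ids: Sequence[str],
    agent_ids: Sequence[str],
) -> dict[str, frozenset[str]]:
    """Assign whole devices to agents (hypothetical helper, not from the PR).

    The first r agents receive q + 1 devices, the rest receive q,
    where q, r = divmod(number of devices, number of agents).
    """
    if not agent_ids:
        return {}
    q, r = divmod(len(device_ids), len(agent_ids))
    masks: dict[str, frozenset[str]] = {}
    start = 0
    for index, agent_id in enumerate(agent_ids):
        count = q + 1 if index < r else q
        # Consecutive, non-overlapping slices keep devices mutually exclusive.
        masks[agent_id] = frozenset(device_ids[start:start + count])
        start += count
    return masks


# N=5, M=3: divmod(5, 3) == (1, 2)
five_over_three = partition_devices(
    ["cuda0", "cuda1", "cuda2", "cuda3", "cuda4"],
    ["agent1", "agent2", "agent3"],
)

# M > N: divmod(3, 5) == (0, 3), so the last two agents get empty masks
three_over_five = partition_devices(
    ["cuda0", "cuda1", "cuda2"],
    ["agent1", "agent2", "agent3", "agent4", "agent5"],
)
```

Because every agent takes a consecutive slice from a single running offset, no device can land in two masks, and the M > N case needs no special branch: q is 0, so only the first r = N agents receive a device.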

Implementation Notes

  • _compute_device_partitions() now assigns whole devices instead of partial slot amounts
  • _compute_device_partition() calculates per-agent device count using divmod
  • SHARED mode unchanged - all agents still see all devices
  • AUTO_SPLIT and MANUAL modes now enforce device mutual exclusivity
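How the three modes differ can be sketched roughly as below. The mode names come from the notes above; everything else (the `AllocationMode` enum and its string values, `resolve_device_masks`, the `manual` mapping, and the choice to raise `ValueError`) is an assumption made for illustration and may not match the actual implementation.

```python
from __future__ import annotations

from collections.abc import Mapping, Sequence
from enum import Enum


class AllocationMode(Enum):  # hypothetical enum; member names follow the PR text
    SHARED = "shared"
    AUTO_SPLIT = "auto-split"
    MANUAL = "manual"


def resolve_device_masks(
    mode: AllocationMode,
    device_ids: Sequence[str],
    agent_ids: Sequence[str],
    manual: Mapping[str, Sequence[str]] | None = None,
) -> dict[str, frozenset[str]]:
    """Return the devices each agent may use (illustrative only)."""
    if not agent_ids:
        return {}
    if mode is AllocationMode.SHARED:
        # SHARED: every agent sees every device.
        everything = frozenset(device_ids)
        return {agent_id: everything for agent_id in agent_ids}
    if mode is AllocationMode.AUTO_SPLIT:
        q, r = divmod(len(device_ids), len(agent_ids))
        # Agent i starts after i*q base devices plus the min(i, r)
        # remainder devices already handed to earlier agents.
        return {
            agent_id: frozenset(
                device_ids[i * q + min(i, r):(i + 1) * q + min(i + 1, r)]
            )
            for i, agent_id in enumerate(agent_ids)
        }
    # MANUAL: accept the operator's assignment only if no device repeats.
    claimed: set[str] = set()
    masks: dict[str, frozenset[str]] = {}
    for agent_id in agent_ids:
        mask = frozenset((manual or {}).get(agent_id, ()))
        overlap = mask & claimed
        if overlap:
            raise ValueError(
                f"devices assigned to multiple agents: {sorted(overlap)}"
            )
        claimed |= mask
        masks[agent_id] = mask
    return masks
```

Only SHARED hands the same device to more than one agent; the other two branches either construct disjoint slices or reject an overlapping manual assignment.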

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Update of end-to-end CLI integration tests in ai.backend.test
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation
  • Documentation
    • Contents in the docs directory
    • docstrings in public interfaces and type annotations

@github-actions bot added the size:XL (500~ LoC) and comp:agent (Related to Agent component) labels on Dec 2, 2025
@hhoikoo force-pushed the feat/BA-3212/multi-agent-device-split branch 2 times, most recently from 0a23c75 to 9c8fb64, on December 2, 2025 08:08
Base automatically changed from fix/BA-3199 to main on December 2, 2025 14:25
@HyeockJinKim force-pushed the main branch 2 times, most recently from 9552aac to 4af738e, on December 31, 2025 15:41
@hhoikoo marked this pull request as draft on January 19, 2026 04:16

hhoikoo commented Jan 26, 2026

BEP-1016 Compatibility Review

After reviewing this PR against BEP-1016 (Accelerator Interface v2), one key fix is needed so that the V1 implementation doesn't violate V2 assumptions.

Issue: Device Mutual Exclusivity

BEP-1016 Requirement:

"(NOTE: partitioning here means per-device, not within-device!)"

"Decision: Cross-agent device sharing is not supported. Devices are always mutually exclusive between agents within a node."

In BEP-1016, each agent's allowed devices will be expressed as a device_mask: frozenset[DeviceId] — a simple allowlist. This design fundamentally requires that each DeviceId belongs to exactly ONE agent.

Current PR Behavior:
The fill-from-front algorithm in _compute_device_partition() currently allows the same DeviceId to appear in multiple agents' partitions when devices have fractional capacity. For example, the test shows:

# 3 agents all accessing cuda0 with amount=1 each
for i in range(1, 4):
    ctx = allocator.get_computers(AgentId(f"agent{i}"))[DeviceName("cuda")]
    assert ctx.alloc_map.device_slots[DeviceId("cuda0")].amount == Decimal("1")

This means cuda0 is shared across all 3 agents, which cannot be expressed as a simple frozenset[DeviceId] allowlist per agent (since the same device would need to be in multiple sets).
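The conflict can be made concrete with a small check over per-agent allowlists. This is a sketch under assumptions: `find_shared_devices` and the plain-string IDs are hypothetical and are not part of the PR or of BEP-1016.

```python
from collections import Counter
from collections.abc import Iterable, Mapping


def find_shared_devices(
    masks: Mapping[str, Iterable[str]],
) -> dict[str, int]:
    """Return devices claimed by more than one agent, with the claim count.

    Hypothetical helper for illustration; an empty result means the
    masks are mutually exclusive.
    """
    claims = Counter(
        device_id for mask in masks.values() for device_id in set(mask)
    )
    return {device_id: n for device_id, n in claims.items() if n > 1}


# The layout exercised by the test above: all three agents hold cuda0.
shared_layout = {f"agent{i}": {"cuda0"} for i in range(1, 4)}

# A whole-device layout in which each device has exactly one owner.
exclusive_layout = {"agent1": {"cuda0"}, "agent2": {"cuda1"}, "agent3": set()}

print(find_shared_devices(shared_layout))     # {'cuda0': 3}
print(find_shared_devices(exclusive_layout))  # {}
```

A non-empty result flags exactly the situation that a per-agent `frozenset[DeviceId]` allowlist cannot represent without repeating a device.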

Recommended Fix

Modify _compute_device_partition() to assign whole devices to agents rather than fractional shares. Use divmod distribution:

  • For N devices across M agents: q, r = divmod(N, M)
  • First r agents get q + 1 devices each
  • Remaining agents get q devices each

Edge case (more agents than devices):
When M > N, the first N agents get 1 device each, and the remaining M - N agents get an empty device mask. This is acceptable — better to have some agents with no devices than to violate mutual exclusivity.

Example

5 devices, 3 agents → divmod(5, 3) = (1, 2)

  • Agent 1: devices 0, 1 (1 + 1 = 2 devices)
  • Agent 2: devices 2, 3 (1 + 1 = 2 devices)
  • Agent 3: device 4 (1 device)

3 devices, 5 agents → divmod(3, 5) = (0, 3)

  • Agent 1: device 0
  • Agent 2: device 1
  • Agent 3: device 2
  • Agents 4, 5: empty device mask

This ensures each agent's device allocation can be cleanly represented as frozenset[DeviceId] in the future BEP-1016 implementation.

@hhoikoo force-pushed the feat/BA-3212/multi-agent-device-split branch from 9c8fb64 to 136430a on January 27, 2026 09:14

@hhoikoo hhoikoo left a comment

Generally your change is extremely bloated with unnecessarily long implementations of simple algorithms.


@hhoikoo hhoikoo left a comment

Submitting pending review

Add device partitioning logic to allocate GPU resources across
multiple agents using fill-from-front assignment strategy. Each
agent receives a calculated share of device slots based on the
resource split configuration.

Key changes include adding set_device_slot_amounts method to
AbstractAllocMap for per-device slot updates, implementing
DevicePartition class to track device shares per agent, and
updating ResourceAllocator to calculate and distribute device
allocations across agents. Tests verify multi-agent scenarios.
@hhoikoo force-pushed the feat/BA-3212/multi-agent-device-split branch from 7725685 to 396472a on January 29, 2026 06:39
Tests were using old SlotName + Decimal format but devices now use
DeviceName + Sequence[DeviceId] format.
Update resource allocation mode tests to use the new
DeviceName/Sequence[DeviceId] format instead of the legacy
SlotName/Decimal format. This includes renaming slot-related
test methods to device-related names and updating error message
assertions.

Changes include test method renames from "slots" to "device
names" terminology and adding assertions for the second agent
in the empty device list test to ensure consistent validation
across all agents.
@hhoikoo closed this on Feb 2, 2026
@hhoikoo deleted the feat/BA-3212/multi-agent-device-split branch on February 2, 2026 01:00
Labels

comp:agent Related to Agent component size:XL 500~ LoC

Development

Successfully merging this pull request may close these issues.

Divide devices among multi agents on a fill-from-front basis based on resource split

2 participants