[None][refactor] Modularize resource_manager.py into a package by eopXD · Pull Request #12883 · NVIDIA/TensorRT-LLM

eopXD · 2026-04-09T07:01:39Z

Summary

Split the 2,956-line monolithic resource_manager.py into a resource_manager/ package with 7 focused submodules
All 36+ existing importers work unchanged via __init__.py re-exports — zero external changes required
Separates v1 (KVCacheManager) and v2 (KVCacheManagerV2) into independent files, reducing merge conflict surface
Extracts VSWA calculation utilities, spec-dec KV relocation ops, PeftCacheManager, and simple managers into dedicated modules

Module breakdown

| Module | Contents | Lines |
| --- | --- | --- |
| `base.py` | `BaseResourceManager` ABC, `ResourceManager` coordinator, enums | ~150 |
| `kv_cache_manager.py` | `KVCacheManager` (v1, C++ binding-backed) | ~910 |
| `kv_cache_manager_v2.py` | `KVCacheManagerV2` (Python runtime-backed) | ~1000 |
| `vswa.py` | Variable Sliding Window Attention calculation utilities | ~350 |
| `kv_cache_spec_ops.py` | Spec-dec × KV-cache cross-cutting operations | ~140 |
| `peft_cache_manager.py` | `PeftCacheManager` for LoRA adapters | ~170 |
| `simple_managers.py` | `SlotManager`, `BlockManager` | ~140 |
| `__init__.py` | Full re-exports (backward compat) | ~60 |
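The re-export approach in `__init__.py` can be illustrated with a throwaway package. This is a minimal sketch, not the PR's actual code: the package name `demo_resource_manager` and the empty class bodies are assumptions made for illustration, and only the pattern (submodules plus an `__init__.py` that re-imports their public names) reflects what the description states.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk. All names below are illustrative.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_resource_manager"
pkg.mkdir()

(pkg / "base.py").write_text("class BaseResourceManager:\n    pass\n")
(pkg / "simple_managers.py").write_text("class SlotManager:\n    pass\n")

# The package __init__ re-exports the public names, so code written against
# the old single module (`from demo_resource_manager import SlotManager`)
# keeps working after the split.
(pkg / "__init__.py").write_text(
    "from .base import BaseResourceManager\n"
    "from .simple_managers import SlotManager\n"
    "\n"
    "__all__ = ['BaseResourceManager', 'SlotManager']\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

package = importlib.import_module("demo_resource_manager")
submodule = importlib.import_module("demo_resource_manager.simple_managers")

# The package-level name and the submodule-level name are the same object.
same_object = package.SlotManager is submodule.SlotManager
```

Because the package-level name is bound to the very same class object, `isinstance` checks and identity comparisons behave the same whichever import path a caller uses.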

Motivation

See docs/rfcs/resource-manager-modularization.md in this repo for the full RFC with problem analysis, design rationale, and migration strategy.

Test plan

All 8 new files pass Python syntax validation (py_compile)
All relative import depths verified correct (4-dot for tensorrt_llm, 3-dot for _torch, 2-dot for pyexecutor)
Every symbol imported by the 36+ external consumers is re-exported in __init__.py
CacheTypeCpp and DataType binding aliases re-exported for mamba_cache_manager.py compatibility
Full CI validation needed
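The relative import depths in the test plan can be sanity-checked with the standard library's `importlib.util.resolve_name`. The sketch below assumes the new package lives at `tensorrt_llm._torch.pyexecutor.resource_manager`; that dotted path is inferred from the depths listed above rather than stated outright in the PR.

```python
from importlib.util import resolve_name

# Assumed dotted path of the new package (inferred, not quoted from the PR).
PACKAGE = "tensorrt_llm._torch.pyexecutor.resource_manager"


def resolve(relative: str) -> str:
    """Resolve a relative import as seen from a module inside PACKAGE."""
    return resolve_name(relative, PACKAGE)


# One leading dot is the package itself; each extra dot climbs one level.
resolved = {
    "....": resolve("...."),  # 4 dots
    "...": resolve("..."),    # 3 dots
    "..": resolve(".."),      # 2 dots
}

for dots, target in resolved.items():
    print(f"{len(dots)}-dot -> {target}")
```

Under that assumption the three depths resolve to `tensorrt_llm`, `tensorrt_llm._torch`, and `tensorrt_llm._torch.pyexecutor` respectively, matching the test plan.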

Split the 2,956-line monolithic resource_manager.py into a resource_manager/ package with focused submodules:

- base.py: BaseResourceManager ABC, ResourceManager coordinator, enums
- kv_cache_manager.py: KVCacheManager (v1, C++ binding-backed)
- kv_cache_manager_v2.py: KVCacheManagerV2 (Python runtime-backed)
- vswa.py: Variable Sliding Window Attention calculation utilities
- kv_cache_spec_ops.py: Spec-dec x KV-cache cross-cutting operations
- peft_cache_manager.py: PeftCacheManager for LoRA adapters
- simple_managers.py: SlotManager and BlockManager utilities
- __init__.py: Full re-exports for backward compatibility

All 36+ existing importers work unchanged. No runtime behavior changes.

Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>

…ation

- Fix P0: Restore `mpi_rank() == 0` check (was incorrectly changed to `mpi_disabled()`) for KV cache event manager creation on rank 0
- Fix P0: Remove stale `model_config=` kwarg in vswa.py call to `adjust_window_sizes_for_vswa` (would cause TypeError at runtime)
- Fix P1: Update copyright year to 2026 on all new files
- Fix P1: Remove `_locate_accepted_draft_tokens` from __init__.py re-exports (private helper, no external consumers)

Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>
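The stale-kwarg fix is a P0 because Python rejects unknown keyword arguments only when the call executes, so a syntax check such as py_compile would not catch it. A minimal sketch follows, using a hypothetical stand-in function (`adjust_window_sizes` and its parameters are invented here and do not mirror the real `adjust_window_sizes_for_vswa` signature):

```python
def adjust_window_sizes(window_sizes, max_tokens):
    """Hypothetical stand-in: clamp each window size to max_tokens."""
    return [min(size, max_tokens) for size in window_sizes]


# A call site that still passes a keyword the function no longer accepts
# compiles fine but fails when the line is actually executed.
try:
    adjust_window_sizes([4096, 8192], max_tokens=6000, model_config=None)
    stale_call_error = None
except TypeError as exc:
    stale_call_error = str(exc)

# Dropping the stale keyword restores the call.
fixed = adjust_window_sizes([4096, 8192], max_tokens=6000)
```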

Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>

Add DecodingBaseConfig and AttentionMetadata under TYPE_CHECKING to fix F821 (undefined name) errors in kv_cache_manager.py and kv_cache_manager_v2.py. Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>
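The `TYPE_CHECKING` pattern referred to here looks roughly like the sketch below. `Decimal` from the standard library stands in for names such as `DecodingBaseConfig`, whose real import paths are not shown in this PR page; the function is likewise illustrative.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Evaluated by type checkers and linters only, never at runtime, so it
    # adds no import cost and cannot introduce a circular import.
    from decimal import Decimal


def scale(value: "Decimal", factor: int) -> "Decimal":
    # The quoted annotation is a plain string at runtime; linters resolve it
    # against the guarded import above, which silences F821.
    return value * factor


# The guarded name is not bound at runtime.
decimal_bound_at_runtime = "Decimal" in globals()
```

Annotations must be quoted (or the module must use `from __future__ import annotations`) for this to work, since the name does not exist when the function is defined.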

Break long log/error message strings to comply with 120-char line limit enforced by ruff in CI. Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>

These files are not in the legacy-files list, so CI runs ruff-format instead of yapf. Apply ruff-format as the authoritative formatter. Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>

eopXD · 2026-04-09T07:39:09Z

/bot run

tensorrt-cicd · 2026-04-09T07:45:49Z

PR_Github #42499 [ run ] triggered by Bot. Commit: 348cff3 Link to invocation

tensorrt-cicd · 2026-04-09T12:59:38Z

PR_Github #42499 [ run ] completed with state SUCCESS. Commit: 348cff3
/LLM/main/L0_MergeRequest_PR pipeline #33246 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

…nager The method was removed during the VSWA extraction refactor but is still called by disaggregated serving code (kv_extractor, test_mamba_transfer). Re-add it as a thin wrapper around the extracted standalone function. Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>
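The "thin wrapper" approach can be sketched as follows. The method name is truncated in the commit title above, so every name here (`calculate_block_budget_for_windows`, `DemoKVCacheManager`, `calculate_block_budget`) is hypothetical, and the proportional-split logic is a placeholder rather than the actual VSWA calculation.

```python
from typing import Dict, List


def calculate_block_budget_for_windows(window_sizes: List[int], total_blocks: int) -> Dict[int, int]:
    """Hypothetical standalone utility extracted from the manager class.

    Splits total_blocks across the distinct window sizes in proportion
    to each window's size (placeholder logic for illustration only).
    """
    distinct = sorted(set(window_sizes))
    total = sum(distinct)
    return {size: total_blocks * size // total for size in distinct}


class DemoKVCacheManager:
    """Hypothetical manager that keeps the old method for existing callers."""

    def __init__(self, window_sizes: List[int], total_blocks: int) -> None:
        self.window_sizes = window_sizes
        self.total_blocks = total_blocks

    def calculate_block_budget(self) -> Dict[int, int]:
        # Thin wrapper: no logic of its own, it only forwards instance
        # state to the standalone function.
        return calculate_block_budget_for_windows(self.window_sizes, self.total_blocks)


manager = DemoKVCacheManager([1024, 3072, 1024], total_blocks=8)
budget = manager.calculate_block_budget()
```

Keeping the logic in one standalone function means callers of the method and callers of the function cannot drift apart.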

eopXD · 2026-04-10T03:32:12Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-04-10T03:39:12Z

PR_Github #42646 [ run ] triggered by Bot. Commit: 7bfc3d0 Link to invocation

github-actions bot assigned eopXD Apr 9, 2026

eopXD added 5 commits April 9, 2026 15:10

[None][refactor] Apply yapf/ruff formatting to resource_manager package

88493f1

Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>

[None][refactor] Add missing TYPE_CHECKING imports for ruff F821

9635005

Add DecodingBaseConfig and AttentionMetadata under TYPE_CHECKING to fix F821 (undefined name) errors in kv_cache_manager.py and kv_cache_manager_v2.py. Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>

[None][refactor] Fix E501 line-too-long and apply formatting

852bbb2

Break long log/error message strings to comply with 120-char line limit enforced by ruff in CI. Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>

[None][refactor] Apply ruff-format (not yapf) for non-legacy files

348cff3

These files are not in the legacy-files list, so CI runs ruff-format instead of yapf. Apply ruff-format as the authoritative formatter. Signed-off-by: Yueh-Ting Chen <yueh.ting.chen@gmail.com>

eopXD force-pushed the yuehtingc/modularize-resource-manager branch from 3e6802d to 7bfc3d0 on April 10, 2026 03:25

eopXD wants to merge 7 commits into NVIDIA:main from eopXD:yuehtingc/modularize-resource-manager
