Commit 9d41318
feat(kv_cache): enable asymmetric store/retrieve storages in PD backend (LMCache#2509)
* feat(kv_cache): enable asymmetric save/remote storage in PD backend
Remove the restriction that prevented using `save_decode_cache` and
`remote_backend` simultaneously in Prefill-Decode (PD) separation scenarios.
This change introduces `pd_retrieve_locations` and `pd_store_location`
parameters to decouple the KV cache retrieval and storage logic. This
enables an asymmetric cache flow:
1. Prefill nodes transmit KV cache to Decode nodes via the PDBackend.
2. Decode nodes write back their generated KV cache to a remote backend
for subsequent prefill reuse.
3. In multi-turn dialogue scenarios, subsequent prefill requests retrieve
historical KV cache from the remote backend, raising the prefix cache
hit rate and reducing time to first token (TTFT).
This decoupling provides greater flexibility for cross-instance cache
management and improves overall pipeline efficiency in distributed
inference.
                      [ Compute Layer ]
+----------------------+                 +------------------+
|     Prefill Node     | ===============>|   Decode Node    |
| (Hit-Remote & GenKV) |  (1) PDBackend  | (Hit-PD & GenKV) |
+-------^--------------+                 +-------+----------+
        |                                        |
        :                                        :
--------|----------------------------------------|-----------
[ Storage Layer ]                                |
        |                                        | (2) pd_store_location
        | (3) pd_retrieve_locations              |     (Decode -> Pool)
        |     (Pool -> Prefill)                  |
        |                                        v
+-------+--------------------------------------------+
|              Distributed Storage Pool              |
|     [Node A]   [Node B]   [Node C]   [Node D]      |
|   <======= (Object Storage / NFS / DFS) =======>   |
+----------------------------------------------------+
Workflow:
1. Prefill -> Decode (PDBackend): Initial KV transfer for the current turn.
2. Decode -> Remote (Store): Decode saves updated KV to the remote pool (e.g. NFS) for persistence.
3. Remote -> Prefill (Retrieve): Next-turn prefill pulls from Remote,
drastically increasing Prefix Cache hit rate for multi-turn dialogues.
Signed-off-by: Tony Lin <tony.lin@intel.com>
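
The three-step workflow above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the LMCache implementation: plain dictionaries stand in for the PDBackend channel and the remote pool, and `CHUNK`, `chunk_keys`, `prefill`, and `decode` are hypothetical names invented for illustration.

```python
"""Toy model of the asymmetric PD cache flow (all names are hypothetical)."""

CHUNK = 4  # tokens per cache chunk; an arbitrary value for this sketch


def chunk_keys(tokens):
    """Return one key per full-chunk prefix of the token list."""
    return [tuple(tokens[:end]) for end in range(CHUNK, len(tokens) + 1, CHUNK)]


def prefill(tokens, remote_pool, pd_channel):
    """Step 3 then step 1: count the leading chunks already in the remote
    pool, then hand every chunk to the decode node via the PD channel."""
    keys = chunk_keys(tokens)
    hits = 0
    for key in keys:
        if key not in remote_pool:
            break
        hits += 1
    for key in keys:
        pd_channel[key] = f"kv{len(key)}"  # placeholder for real KV tensors
    return hits * CHUNK  # prompt tokens served from the remote pool


def decode(prompt, generated, remote_pool, pd_channel):
    """Step 2: consume KV from the PD channel and write the KV for
    prompt + generated tokens back to the remote pool."""
    full = prompt + generated
    for key in chunk_keys(full):
        remote_pool[key] = pd_channel.pop(key, f"kv{len(key)}")
    return full


remote_pool, pd_channel = {}, {}

turn1 = list(range(8))
reused_turn1 = prefill(turn1, remote_pool, pd_channel)  # cold start
history = decode(turn1, [100, 101, 102, 103], remote_pool, pd_channel)

turn2 = history + [200, 201, 202, 203]  # follow-up prompt reuses the history
reused_turn2 = prefill(turn2, remote_pool, pd_channel)

print(reused_turn1, reused_turn2)  # prints: 0 12
```

In this model the second turn reuses all 12 tokens of the first turn's prompt and response, because the decode node wrote them back to the pool.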
* small refactor
Signed-off-by: Tony Lin <tony.lin@intel.com>
* config examples for pd + remote backends
Signed-off-by: Tony Lin <tony.lin@intel.com>
* refactor: rename pd_retrieve_locations/pd_store_location to retrieve_locations/store_location
Remove the PD-specific prefix to make the retrieve/store locations
generic instead of being limited to PD only.
This breaks the PD-only feature restriction and allows the mechanism
to be reused by other roles/components.
Signed-off-by: Tony Lin <tony.lin@intel.com>
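
One way to picture the generic `retrieve_locations` / `store_location` pair is as a small router over named backends. The sketch below is an assumption-laden illustration, not the code from this commit: the `LocationRouter` class, the `"RemoteBackend"` label, and the choice that `None` means "use every backend" are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class LocationRouter:
    """Hypothetical router: reads probe `retrieve_locations` in order,
    writes go only to `store_location`."""

    backends: Dict[str, Dict[str, bytes]]
    retrieve_locations: Optional[List[str]] = None  # assumption: None = all backends
    store_location: Optional[str] = None            # assumption: None = all backends

    def retrieve(self, key: str) -> Optional[Tuple[str, bytes]]:
        """Return (backend name, value) for the first configured hit."""
        if self.retrieve_locations is not None:
            names = self.retrieve_locations
        else:
            names = list(self.backends)
        for name in names:
            value = self.backends[name].get(key)
            if value is not None:
                return name, value
        return None

    def store(self, key: str, value: bytes) -> List[str]:
        """Write the value and return the backend names written to."""
        if self.store_location is not None:
            names = [self.store_location]
        else:
            names = list(self.backends)
        for name in names:
            self.backends[name][key] = value
        return names


# Shared in-memory stand-ins for the two storage locations.
backends = {"PDBackend": {}, "RemoteBackend": {}}

# A decode role reads from the PD backend and writes to the remote one;
# a prefill role does the opposite.
decoder = LocationRouter(backends, ["PDBackend"], "RemoteBackend")
prefiller = LocationRouter(backends, ["RemoteBackend"], "PDBackend")

written = decoder.store("chunk-0", b"kv")
found = prefiller.retrieve("chunk-0")
```

Because neither field mentions PD in this sketch, any role can be given its own pair of locations, which is the reuse the rename is meant to allow.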
* move retrieve & store locations from storage manager to cache engine
Signed-off-by: Tony Lin <tony.lin@intel.com>
* add parameter validation check
Signed-off-by: Tony Lin <tony.lin@intel.com>
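
The commit message does not say which checks were added. As a hedged sketch of what validating these two parameters could look like, the hypothetical `validate_locations` helper below rejects malformed values and names that are not among the configured backends; the rules themselves are assumptions.

```python
def validate_locations(retrieve_locations, store_location, known_backends):
    """Raise ValueError if a location setting is malformed or unknown.

    Hypothetical rules: `None` leaves a setting unset, `retrieve_locations`
    must otherwise be a non-empty list or tuple of known backend names, and
    `store_location` must be a single known backend name.
    """
    if retrieve_locations is not None:
        if not isinstance(retrieve_locations, (list, tuple)) or not retrieve_locations:
            raise ValueError("retrieve_locations must be a non-empty list or None")
        unknown = [name for name in retrieve_locations if name not in known_backends]
        if unknown:
            raise ValueError(f"unknown retrieve_locations: {unknown}")
    if store_location is not None:
        if not isinstance(store_location, str):
            raise ValueError("store_location must be a string or None")
        if store_location not in known_backends:
            raise ValueError(f"unknown store_location: {store_location!r}")


known = {"PDBackend", "RemoteBackend"}  # "RemoteBackend" is an illustrative label
validate_locations(["PDBackend"], "RemoteBackend", known)  # passes silently
```

Failing at configuration time, rather than on the first cache lookup, keeps a mistyped backend name from silently disabling reuse.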
* config: replace hardcoded IP with placeholder in decoder remote configs
Signed-off-by: Tony Lin <tony.lin@intel.com>
* resolve conflicts and rebase to the latest
Signed-off-by: Tony Lin <tony.lin@intel.com>
* address review comments
Signed-off-by: Tony Lin <tony.lin@intel.com>
* add description in configurations.rst
Signed-off-by: Tony Lin <tony.lin@intel.com>
---------
Signed-off-by: Tony Lin <tony.lin@intel.com>
Co-authored-by: deng451e <57919305+deng451e@users.noreply.github.com>
1 parent d6661f1 commit 9d41318
8 files changed
Lines changed: 149 additions & 11 deletions
File tree
- docs/source/api_reference
- examples/disagg_prefill
- 1p1d/configs
- xpyd/configs
- lmcache/v1