[Feat] Adapt Deepseek-v4-Flash on CUDA and Ascend #950
Conversation
| """ | ||
|
|
||
| group_ucm_block_ids: list[list[bytes]] = field(default_factory=list) | ||
| group_vllm_block_ids: list[list[int]] = field(default_factory=list) |
This `_build_layout` method duplicates significant logic from `HMAKVCacheLayout._build_layout`. Consider extracting the common ptr/tensor_size extraction into a shared helper method to reduce duplication; a sketch follows.
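A minimal sketch of the kind of helper being suggested, assuming the layout classes hold their per-layer KV caches as `torch.Tensor`s; the helper name and signature here are hypothetical, not the PR's actual code:

```python
import torch


def _extract_ptrs_and_sizes(
    kv_caches: list[torch.Tensor],
) -> tuple[list[int], list[int]]:
    """Hypothetical shared helper: collect (data_ptr, tensor_size) pairs
    from the per-layer KV cache tensors so both _build_layout variants can
    reuse one loop instead of duplicating it."""
    ptrs: list[int] = []
    sizes: list[int] = []
    for tensor in kv_caches:
        ptrs.append(tensor.data_ptr())
        sizes.append(tensor.numel() * tensor.element_size())
    return ptrs, sizes
```

Each `_build_layout` could then call this helper and keep only its layout-specific arithmetic.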
for (size_t i = 0, offset = 0; i < number; i++) {
    auto pHost = (void*)(((int8_t*)host) + offset);
    auto pDevice = device[i];
    if (!pDevice) { continue; }
The `if (!pDevice)` check is good, but should `pHost` also be checked for null? Mirroring the symmetric guard in `dump_queue.cc` would improve safety.
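A sketch of the symmetric guard being suggested; the loop shape follows the diff above, while the function wrapper, the `sizes` array, and where `offset` advances are assumptions made only to keep the example self-contained:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical standalone version of the loop with both endpoints checked.
void TransferAll(void* host, void** device, const size_t* sizes, size_t number)
{
    for (size_t i = 0, offset = 0; i < number; i++) {
        auto pHost = (void*)(((int8_t*)host) + offset);
        offset += sizes[i];  // assumption: host buffer is packed per entry
        auto pDevice = device[i];
        // pHost is null when `host` is null and offset == 0; checking it
        // mirrors the guard the review attributes to dump_queue.cc.
        if (!pHost || !pDevice) { continue; }
        // ... copy sizes[i] bytes between pHost and pDevice ...
    }
}
```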
load_tok_end = total_hit_tokens
start_blk = load_tok_start // group.block_size
end_blk = load_tok_end // group.block_size
if start_blk >= end_blk:
For SW groups, `load_tok_start = total_hit_tokens - group.sliding_window`. Consider adding a comment explaining why this calculation ensures the sliding-window tail is loaded correctly on resume.
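A hedged sketch of that explanatory comment and the arithmetic behind it; the function wrapper is hypothetical, and the `max(..., 0)` clamp is a defensive assumption not present in the diff:

```python
def sw_load_block_range(
    total_hit_tokens: int, sliding_window: int, block_size: int
) -> tuple[int, int]:
    """Only the last `sliding_window` tokens of the hit prefix are ever
    attended to by a sliding-window group, so on resume it suffices to
    load the window tail instead of the whole prefix."""
    load_tok_start = max(total_hit_tokens - sliding_window, 0)
    load_tok_end = total_hit_tokens
    # Floor division yields the half-open block range [start_blk, end_blk);
    # when the tail fits inside one block (start_blk >= end_blk), there is
    # nothing to load.
    return load_tok_start // block_size, load_tok_end // block_size
```

For example, `sw_load_block_range(1000, 128, 16)` yields the block range `[54, 62)`.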
class AscendDSV4Layout(HMAKVCacheLayout):
    def __init__(
`AscendDSV4Layout` duplicates significant logic from `HMAKVCacheLayout._build_layout`; the ptr/tensor_size extraction loop is nearly identical. Consider extracting it into a shared helper method, as in the sketch below.
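Building on the hypothetical `_extract_ptrs_and_sizes` helper sketched earlier, the subclass could then shrink to its layout-specific pieces. Everything below, including the base-class stub, is illustrative rather than the PR's actual code:

```python
import torch


class HMAKVCacheLayout:
    """Stub of the base class, for illustration only."""

    def __init__(self, kv_caches: list[torch.Tensor]) -> None:
        self.kv_caches = kv_caches

    def _extract_ptrs_and_sizes(
        self, kv_caches: list[torch.Tensor]
    ) -> tuple[list[int], list[int]]:
        # The shared loop from the earlier sketch.
        ptrs = [t.data_ptr() for t in kv_caches]
        sizes = [t.numel() * t.element_size() for t in kv_caches]
        return ptrs, sizes


class AscendDSV4Layout(HMAKVCacheLayout):
    """Sketch: only the DSV4-on-Ascend geometry would live here."""

    def _build_layout(self) -> None:
        ptrs, sizes = self._extract_ptrs_and_sizes(self.kv_caches)
        # ... build the Ascend/DSV4-specific block layout from ptrs/sizes ...
```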
inherited ``ucm_block_ids``.
- ``group_vllm_block_ids[gid]``: per-group VLLM physical block ids; this
  is initialised as an empty list per group here and populated later by
  the dispatch path (still a TODO for HMA dump/load).
This TODO indicates an incomplete implementation for HMA dump/load. Should it be tracked as a separate issue, or addressed before merge?
Purpose
Modifications
Test