Pull requests: alibaba/rtp-llm

Pull requests list

update: update kvcm client
#918 opened Apr 21, 2026 by lucky-zzz Collaborator
feat: refactor py model device
#917 opened Apr 21, 2026 by JackTan25 Collaborator
Defer engine and RPC loop start until after full server init
#916 opened Apr 21, 2026 by xinfei-shi Collaborator
Support batch_prefill && TPS bench mode
#914 opened Apr 21, 2026 by alibaba-miji Collaborator
6 tasks done
fix: split CI timeout logic for PENDING and RUNNING states
#913 opened Apr 21, 2026 by guoj14 Contributor
Feature/p2p connector complete
#910 opened Apr 17, 2026 by ZhihanYan Collaborator
refactor: refactor codes
#909 opened Apr 17, 2026 by JackTan25 Collaborator
perf: optimize MoE model weight loading (8.6x speedup)
#908 opened Apr 17, 2026 by netaddi Collaborator
3 tasks
Feat/hybrid cp gdn
#906 opened Apr 17, 2026 by yang1556 Collaborator
feat: support input_embeddings in inference pipeline
#905 opened Apr 17, 2026 by KrisCheng9 Collaborator
feat: upgrade rocm6.4.3 to rocm7.2.0
#904 opened Apr 16, 2026 by liaocz Collaborator
optimize beam search
#903 opened Apr 16, 2026 by parkerpang
feat: support xgrammer
#902 opened Apr 16, 2026 by wanglining97 Collaborator
[ROCm] Optimize Qwen3.5 with fused kernel and allreduce merging
#900 opened Apr 16, 2026 by chengshu-lcc Collaborator
feat: add Kimi Linear (KDA) model support
#899 opened Apr 16, 2026 by theNiemand Collaborator
feat: Qwen3.5 Blackwell GDN prefill optimization
#897 opened Apr 15, 2026 by netaddi Collaborator
3 tasks
Constrained decoding modifications
#893 opened Apr 14, 2026 by Glen11111Z
feature: support tpsize > kv heads
#891 opened Apr 14, 2026 by ZhangZhiPku Collaborator
Gb200 Qwen3.5 NVFP4
#888 opened Apr 14, 2026 by qqbbiu Collaborator
fix: fix nvfp4 dp2 cuda graph smoke crash bug
#887 opened Apr 14, 2026 by JackTan25 Collaborator
feat: more production robust
#885 opened Apr 13, 2026 by yyhclimacool
Implement true EP (Expert Parallelism) mode for Qwen3 ROCm MoE
#884 opened Apr 13, 2026 by Xu-Sheng-lin Collaborator
feat: [ROCm] support FP8 PTPC/PerBlock quantization for Qwen3.5
#882 opened Apr 13, 2026 by chengshu-lcc Collaborator
feat - optimize gemm weights load logic
#880 opened Apr 13, 2026 by alibaba-miji Collaborator