Commit 24ead8c

Merge pull request #6 from MLSys-UCSD/hao_ai_lab_updates
haoailab project update: ltr
2 parents: f714c2c + 87405ea

2 files changed

Lines changed: 7 additions & 1 deletion

File tree

data/projectsData.ts

@@ -41,7 +41,13 @@ const projectsData: Project[] = [
     description: `DistServe is a goodput-optimized LLM serving system that supports prefill-decode disaggregation, i.e., splitting prefill and decode onto different GPUs, to account for both cost and user satisfaction. DistServe achieves up to 4.48x goodput or 10.2x tighter SLO compared to existing state-of-the-art serving systems, while staying within tight latency constraints.`,
     imgSrc: '/static/images/projects/distserve_anime-crop.gif',
     href: 'https://hao-ai-lab.github.io/blogs/distserve',
-  }
+  },
+  {
+    title: 'Efficient LLM Scheduling by Learning to Rank',
+    description: `Traditional Large Language Model (LLM) serving systems use first-come-first-serve (FCFS) scheduling, leading to delays when longer requests block shorter ones. The unpredictability of LLM workloads and output lengths further complicates scheduling. We introduced a learning-to-rank method to predict output length rankings, enabling a Shortest Job First-like policy and reducing chatbot latency by 6.9x under high load compared to FCFS.`,
+    imgSrc: '/static/images/projects/llm-ltr-cover.jpg',
+    href: 'https://hao-ai-lab.github.io/blogs/vllm-ltr',
+  },
 ]
 
 export default projectsData
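The new project's description explains the core idea: rank pending requests by predicted output length and serve them Shortest-Job-First-style instead of FCFS. A minimal sketch of that scheduling step, assuming a hypothetical `predictRank` ranker (the names here are illustrative, not the actual vLLM implementation):

```typescript
// Sketch of SJF-like scheduling driven by a learned output-length ranking.
// `Ranker` stands in for the learned model: a lower score means a shorter
// predicted output, so the request should be served earlier.
interface PendingRequest {
  id: string;
  prompt: string;
}

type Ranker = (req: PendingRequest) => number;

// Order the queue by predicted rank rather than arrival order, so short
// requests are not stuck behind long ones as they would be under FCFS.
function scheduleByPredictedRank(
  queue: PendingRequest[],
  predictRank: Ranker,
): PendingRequest[] {
  return [...queue].sort((a, b) => predictRank(a) - predictRank(b));
}

// Toy stand-in ranker: use prompt length as a proxy score for demonstration.
const toyRanker: Ranker = (req) => req.prompt.length;

const ordered = scheduleByPredictedRank(
  [
    { id: 'long', prompt: 'write a detailed essay about distributed systems' },
    { id: 'short', prompt: 'hi' },
  ],
  toyRanker,
);
```

Under this toy ranker the `short` request is scheduled first; the real system replaces the proxy score with a model trained to predict relative output-length rankings.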
static/images/projects/llm-ltr-cover.jpg: 265 KB (binary image, not rendered)
