
llm-d

llm-d enables high-performance distributed inference in production on Kubernetes

Welcome to llm-d: a Kubernetes-native high-performance distributed LLM inference framework


llm-d is a Kubernetes-native high-performance distributed LLM inference framework that provides the fastest time-to-value and competitive performance per dollar. Built on vLLM, Kubernetes, and Inference Gateway, llm-d offers modular solutions for distributed inference with features like KV-cache aware routing and disaggregated serving.

🚀 Quick Start Guide

New to llm-d? Here's how to get started:

  1. Join our Slack 💬: Get your invite and visit llm-d.slack.com
  2. Explore our code 📂: GitHub Organization
  3. Join a meeting 📅: Add calendar
  4. Pick your area 🎯: Browse Special Interest Groups

📚 Key Resources

💬 Communication Channels

🗓️ Regular Meetings

All meetings are open to the public! 🌟

  • 📅 Weekly Standup: Every Wednesday at 12:30pm ET - Project updates and open discussion
  • 🎯 SIG Meetings: Various times throughout the week - See SIG details for schedules

Join to participate, ask questions, or just listen and learn!

🎯 Special Interest Groups (SIGs)

Want to dive deeper into specific areas? Our Special Interest Groups are focused teams working on different aspects of llm-d:

  • Inference Scheduler - Intelligent request routing and load balancing
  • Benchmarking - Performance testing and optimization
  • PD-Disaggregation - Prefill/decode separation patterns
  • KV-Disaggregation - KV caching and distributed storage
  • Installation - Kubernetes integration and deployment
  • Autoscaling - Traffic-aware autoscaling and resource management
  • Observability - Monitoring, logging, and metrics

View more SIG Details →

🤝 How to Contribute

Getting Involved

Contributing Code

  1. Read Guidelines: Review our Code of Conduct and contribution process
  2. Sign Commits: All commits require DCO sign-off (git commit -s)
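The sign-off in step 2 is just a one-line trailer that `git commit -s` appends for you. A minimal sketch in a throwaway repository (the path, name, and email below are hypothetical placeholders; use your real identity in practice):

```shell
# Demonstrate DCO sign-off in a scratch repository (path and identity are
# hypothetical examples, not llm-d conventions).
git init -q /tmp/dco-demo
cd /tmp/dco-demo
git config user.name "Jane Developer"
git config user.email "jane@example.com"

echo "fix" > fix.txt
git add fix.txt

# -s (--signoff) appends a "Signed-off-by" trailer built from your git identity.
git commit -q -s -m "Fix scheduler race condition"

# Inspect the full commit message; it ends with the DCO trailer:
git log -1 --format=%B
# Fix scheduler race condition
#
# Signed-off-by: Jane Developer <jane@example.com>
```

If you forget the flag on your latest commit, `git commit --amend -s` adds the trailer without changing the message.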

Ways to Contribute

  • 🐛 Bug fixes and small features - Submit PRs directly to component repos
  • 🚀 New features with APIs - Require project proposals
  • 📚 Documentation - Help improve guides and examples
  • 🧪 Testing & Benchmarking - Contribute to our test coverage
  • 💡 Experimental features - Start in llm-d-incubation org

🔒 Security & Safety

🌐 Connect With Us

Follow llm-d across social platforms for updates, discussions, and community highlights:

❓ Need Help?

Questions? Ideas? Just want to chat? We're here to help! The llm-d community team is friendly and responsive.


License: Apache 2.0

Pinned Repositories

  1. llm-d

     Achieve state-of-the-art inference performance with modern accelerators on Kubernetes

     Shell · 2.6k stars · 342 forks

  2. llm-d-inference-scheduler

     Inference scheduler for llm-d

     Go · 140 stars · 139 forks

  3. llm-d-kv-cache

     Distributed KV cache scheduling & offloading libraries

     Go · 111 stars · 98 forks

  4. llm-d-benchmark

     llm-d benchmark scripts and tooling

     Python · 49 stars · 53 forks

Repositories

Showing 10 of 16 repositories

  • llm-d-inference-scheduler

    Inference scheduler for llm-d

    Go · 140 stars · 139 forks · Apache-2.0 · Updated Mar 10, 2026

  • llm-d-kv-cache

    Distributed KV cache scheduling & offloading libraries

    Go · 111 stars · 98 forks · Apache-2.0 · Updated Mar 10, 2026

  • llm-d

    Achieve state-of-the-art inference performance with modern accelerators on Kubernetes

    Shell · 2,592 stars · 342 forks · Apache-2.0 · Updated Mar 10, 2026

  • llm-d-prism

    Makefile · 0 stars · 1 fork · Apache-2.0 · Updated Mar 10, 2026

  • llm-d-infra

    Repo for CI and infrastructure required to maintain llm-d org member repos

    Shell · 0 stars · 2 forks · Apache-2.0 · Updated Mar 10, 2026

  • llm-d-benchmark

    llm-d benchmark scripts and tooling

    Python · 49 stars · 53 forks · Apache-2.0 · Updated Mar 10, 2026

  • llm-d-pd-utils

    Makefile · 5 stars · 5 forks · Updated Mar 10, 2026

  • llm-d-go-template (template)

    Go microservice template for llm-d repos. Use "Use this template" to create a new Go project with standard CI, linting, Prow, and governance.

    Makefile · 0 stars · 3 forks · Apache-2.0 · Updated Mar 10, 2026

  • llm-d-workload-variant-autoscaler

    Variant optimization autoscaler for distributed inference workloads

    Go · 33 stars · 41 forks · Apache-2.0 · Updated Mar 10, 2026

  • llm-d.github.io

    Website for llm-d: this repository builds the website seen at llm-d.ai

    JavaScript · 11 stars · 24 forks · Apache-2.0 · Updated Mar 9, 2026