    Repositories list

    • A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
      Python
      Apache License 2.0
      1.3k17k4281 Updated May 3, 2026
    • Mooncake

      Public
      Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
      C++
      Apache License 2.0
      7205.2k293125 Updated May 3, 2026
    • sglang

      Public
      SGLang is a fast serving framework for large language models and vision language models.
      Python
      Apache License 2.0
      5.7k1106 Updated May 3, 2026
    • accelerate

      Public
      🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-…
      Python
      Apache License 2.0
      1.3k100 Updated Apr 29, 2026
    • 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference an…
      Python
      Apache License 2.0
      33k000 Updated Apr 29, 2026
    • SGLang is a fast serving framework for large language models and vision language models.
      Python
      Apache License 2.0
      5.7k200 Updated Apr 22, 2026
    • evalscope

      Public
      A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
      Python
      Apache License 2.0
      322000 Updated Apr 13, 2026
    • HTML
      MIT License
      2200 Updated Mar 20, 2026
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      16k1500 Updated Feb 18, 2026
    • DeepEP_fault_tolerance

      Public
      DeepEP: an efficient expert-parallel communication library that supports fault tolerance
      Cuda
      MIT License
      1.2k300 Updated Jan 5, 2026
    • gpustack

      Public
      GPU cluster manager for optimized AI model deployment
      Python
      Apache License 2.0
      516000 Updated Dec 7, 2025
    • TrEnv-X

      Public
      Go
      Apache License 2.0
      68100 Updated Sep 15, 2025
    • SGLang is a fast serving framework for large language models and vision language models.
      Python
      Apache License 2.0
      5.7k000 Updated Aug 12, 2025
    • FlashInfer: Kernel Library for LLM Serving
      Cuda
      Apache License 2.0
      949700 Updated Jul 24, 2025