    Repositories list

    • LightLLM

      Public
      LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed perf…
      Python
      Apache License 2.0
      3204k8548 Updated Apr 19, 2026
    • LightX2V

      Public
      Light Image Video Generation Inference Framework
      Python
      Apache License 2.0
      1872.2k1463 Updated Apr 18, 2026
    • LightX2V-BLOG

      Public
      CSS
      0000 Updated Apr 14, 2026
    • LightTTS

      Public
      LightTTS is a lightweight TTS inference framework optimized for CosyVoice2 and CosyVoice3, enabling fast and scalable speech synthesis in Python and supports st…
      Python
      Apache License 2.0
      73820 Updated Apr 14, 2026
    • general-sam-py

      Public
      Python bindings for general-sam and some utilities
      Python
      Apache License 2.0
      0500 Updated Apr 14, 2026
    • general-sam

      Public
      A general suffix automaton implementation in Rust with Python bindings
      Rust
      Apache License 2.0
      01001 Updated Apr 13, 2026
    • mtc-token-healing

      Public
      Token healing implementation in Rust
      Rust
      Apache License 2.0
      0402 Updated Apr 13, 2026
    • lightx2v_examples

      Public
      0000 Updated Apr 9, 2026
    • ComfyUI-Lightx2vWrapper

      Public
      ComfyUI custom node for lightx2v
      Python
      MIT License
      78140 Updated Apr 8, 2026
    • LightCompress

      Public
      [EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.
      Python
      Apache License 2.0
      78704422 Updated Apr 1, 2026
    • [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across la…
      Cuda
      Apache License 2.0
      399300 Updated Mar 27, 2026
    • MoDES

      Public
      [CVPR 2026] This is the official PyTorch implementation of "MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping"…
      Python
      Apache License 2.0
      02600 Updated Mar 16, 2026
    • LightMem

      Public
      Python
      Apache License 2.0
      0401 Updated Mar 4, 2026
    • Qwen-Image-Edit-Causal

      Public
      In our implementation of Qwen-Image-Edit, we employ block causal attention to improve inference speed.
      Python
      Apache License 2.0
      24800 Updated Feb 16, 2026
    • GenRL

      Public
      Reinforcement Learning Framework for Visual Generation
      Python
      Apache License 2.0
      410500 Updated Feb 13, 2026
    • QVGen

      Public
      [ICLR 2026] This is the official PyTorch implementation of "QVGen: Pushing the Limit of Quantized Video Generative Models".
      Python
      Apache License 2.0
      02900 Updated Feb 11, 2026
    • Python
      MIT License
      1200 Updated Feb 10, 2026
    • HPSv3

      Public
      Python
      MIT License
      1100 Updated Feb 10, 2026
    • SageAttention

      Public
      Quantized Attention achieves speedup of 2-5x and 3-11x compared to FlashAttention and xformers, without losing end-to-end metrics across language, image, and v…
      Cuda
      Apache License 2.0
      399401 Updated Feb 10, 2026
    • HTML
      Apache License 2.0
      0400 Updated Feb 4, 2026
    • Prototype

      Public
      Python
      Apache License 2.0
      31400 Updated Feb 3, 2026
    • HTML
      0000 Updated Jan 14, 2026
    • SpargeAttn

      Public
      [ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
      Cuda
      Apache License 2.0
      91000 Updated Jan 12, 2026
    • Qwen-Image-Lightning

      Public
      Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
      Python
      Apache License 2.0
      441.3k300 Updated Jan 1, 2026
    • verl

      Public
      verl: Volcano Engine Reinforcement Learning for LLMs
      Python
      Apache License 2.0
      3.7k100 Updated Dec 15, 2025
    • slime

      Public
      slime is an LLM post-training framework for RL Scaling.
      Python
      Apache License 2.0
      733000 Updated Dec 8, 2025
    • SCSS
      MIT License
      1101 Updated Nov 26, 2025
    • greedy-tokenizer

      Public
      Greedily tokenize strings with the longest tokens iteratively.
      Python
      Apache License 2.0
      0003 Updated Nov 24, 2025
    • Wan2.2-Lightning: Speed up wan2.2 model with distillation
      Python
      Apache License 2.0
      1.9k283220 Updated Nov 7, 2025
    • Python
      18000 Updated Nov 6, 2025