73 repositories

LightLLM
Public
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed perf…
nlp deep-learning llama gpt model-serving llm openai-triton
Python
•
Apache License 2.0
•320•4k•85•48•Updated Apr 19, 2026
LightX2V
Public
Light Image Video Generation Inference Framework
video-generation diffusion-models wan-video auto-regressive-diffusion-model
Python
•
Apache License 2.0
•187•2.2k•146•3•Updated Apr 18, 2026
LightX2V-BLOG
Public
CSS
•0•0•0•0•Updated Apr 14, 2026
LightTTS
Public
LightTTS is a lightweight TTS inference framework optimized for CosyVoice2 and CosyVoice3, enabling fast and scalable speech synthesis in Python and supports st…
text-to-speech real-time tts speech-synthesis low-latency tensorrt inference-optimization audio-generation cosyvoice cosyvoice2
Python
•
Apache License 2.0
•7•38•2•0•Updated Apr 14, 2026
general-sam-py
Public
Python bindings for general-sam and some utilities
Python
•
Apache License 2.0
•0•5•0•0•Updated Apr 14, 2026
general-sam
Public
A general suffix automaton implementation in Rust with Python bindings
Rust
•
Apache License 2.0
•0•10•0•1•Updated Apr 13, 2026
mtc-token-healing
Public
Token healing implementation in Rust
Rust
•
Apache License 2.0
•0•4•0•2•Updated Apr 13, 2026
lightx2v_examples
Public
•0•0•0•0•Updated Apr 9, 2026
ComfyUI-Lightx2vWrapper
Public
ComfyUI custom node for lightx2v
comfyui comfyui-nodes
Python
•
MIT License
•7•81•4•0•Updated Apr 8, 2026
LightCompress
Public
[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLMs, VLMs, and video generative models.
benchmark deployment tool evaluation pruning quantization wan awq large-language-models llm
Python
•
Apache License 2.0
•78•704•42•2•Updated Apr 1, 2026
SageAttention3-sparse
Public
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across la…
Cuda
•
Apache License 2.0
•399•3•0•0•Updated Mar 27, 2026
MoDES
Public
[CVPR 2026] This is the official PyTorch implementation of "MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping"…
moe cvpr vlm multimodal mixture-of-experts kimi-vl qwen3-vl cvpr-2026
Python
•
Apache License 2.0
•0•26•0•0•Updated Mar 16, 2026
LightMem
Public
Python
•
Apache License 2.0
•0•4•0•1•Updated Mar 4, 2026
Qwen-Image-Edit-Causal
Public
In our implementation of Qwen-Image-Edit, we employ block causal attention to improve inference speed.
Python
•
Apache License 2.0
•2•48•0•0•Updated Feb 16, 2026
GenRL
Public
Reinforcement Learning Framework for Visual Generation
reinforcement-learning infra rl wan dpo imagegeneration videogeneration grpo wan-video wan21
Python
•
Apache License 2.0
•4•105•0•0•Updated Feb 13, 2026
QVGen
Public
[ICLR 2026] This is the official PyTorch implementation of "QVGen: Pushing the Limit of Quantized Video Generative Models".
wan iclr qat video-generation diffusion-models videogen model-quantization quantization-aware-training generative-ai text-to-video-generation
Python
•
Apache License 2.0
•0•29•0•0•Updated Feb 11, 2026
VideoAlign
Public
Python
•
MIT License
•1•2•0•0•Updated Feb 10, 2026
HPSv3
Public
Python
•
MIT License
•1•1•0•0•Updated Feb 10, 2026
SageAttention
Public
Quantized Attention achieves speedup of 2-5x and 3-11x compared to FlashAttention and xformers, without losing end-to-end metrics across language, image, and v…
Cuda
•
Apache License 2.0
•399•4•0•1•Updated Feb 10, 2026
LightKernel
Public
HTML
•
Apache License 2.0
•0•4•0•0•Updated Feb 4, 2026
Prototype
Public
Python
•
Apache License 2.0
•3•14•0•0•Updated Feb 3, 2026
modeltc.github.io
Public
HTML
•0•0•0•0•Updated Jan 14, 2026
SpargeAttn
Public
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
Cuda
•
Apache License 2.0
•91•0•0•0•Updated Jan 12, 2026
Qwen-Image-Lightning
Public
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
Python
•
Apache License 2.0
•44•1.3k•30•0•Updated Jan 1, 2026
verl
Public
verl: Volcano Engine Reinforcement Learning for LLMs
Python
•
Apache License 2.0
•3.7k•1•0•0•Updated Dec 15, 2025
slime
Public
slime is an LLM post-training framework for RL Scaling.
Python
•
Apache License 2.0
•733•0•0•0•Updated Dec 8, 2025
lightllm-blog
Public
SCSS
•
MIT License
•1•1•0•1•Updated Nov 26, 2025
greedy-tokenizer
Public
Greedily tokenize strings with the longest tokens iteratively.
Python
•
Apache License 2.0
•0•0•0•3•Updated Nov 24, 2025
Wan2.2-Lightning
Public
Wan2.2-Lightning: Speed up wan2.2 model with distillation
Python
•
Apache License 2.0
•1.9k•283•22•0•Updated Nov 7, 2025
LTX-Video-Q8-Kernels
Public
Python
•18•0•0•0•Updated Nov 6, 2025

ModelTC