- Kentucky
Popular repositories
- sm120-moe-bench (Public)
  SM120 MoE Inference Benchmark: Qwen3.5-397B on RTX PRO 6000 Blackwell; K=64 CUTLASS kernel fix + real-world legal prompt benchmarks
  Cuda 3
- flashinfer (Public)
  Forked from flashinfer-ai/flashinfer
  FlashInfer: Kernel Library for LLM Serving
  Python 1
- vllm (Public)
  Forked from vllm-project/vllm
  A high-throughput and memory-efficient inference and serving engine for LLMs
  Python 1
- cutlass (Public)
  Forked from NVIDIA/cutlass
  CUDA Templates and Python DSLs for High-Performance Linear Algebra
  C++ 1
- free-code (Public)
  Forked from paoloanzn/free-code
  The free build of Claude Code. All telemetry removed, security-prompt guardrails stripped, all experimental features enabled.
  TypeScript 1
- verdict-warp-decode (Public)
  Neuron-centric fused MoE kernel for SM120 NVFP4: 17.5μs/layer, 1.02x faster than VerdictMoE, 5.6x faster than CUTLASS
  Cuda 1
