Popular repositories
- ParallelBench (Public): [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs
- eta-inversion (Public): [ECCV 2024] Official PyTorch implementation of "Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing"
- draft-based-approx-llm (Public): [ICLR 2026] Draft-based Approximate Inference for LLMs
Repositories
- furiosa-ai/krew-index (Public)
- furiosa-ai/kubectl-view-rngd (Public)
- furiosa-ai/unicorn (Public, forked from unicorn-engine/unicorn): Unicorn CPU emulator framework (ARM, AArch64, M68K, Mips, Sparc, PowerPC, RiscV, S390x, TriCore, X86)
- furiosa-ai/llm-compressor-compression-part (Public, forked from vllm-project/llm-compressor): Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
- furiosa-ai/hf-hub (Public, forked from huggingface/hf-hub): Rust client for the Hugging Face Hub aiming for a minimal subset of features over the `huggingface-hub` Python package
- furiosa-ai/K-EXAONE-evaluation-public (Public)
- furiosa-ai/K-EXAONE-quantization-public (Public)
- furiosa-ai/vllm_compressed_tensor_custom (Public, forked from vllm-project/vllm): A high-throughput and memory-efficient inference and serving engine for LLMs