carlosfundora
🎯 Focusing

Pinned

  1. llama.cpp-1-bit-turbo Public

    Forked from ggml-org/llama.cpp

    HIP/ROCm fork optimized for AMD RDNA2 (gfx1030) with PrismML Q1_0_G128 1-bit quant support, RotorQuant, TurboQuant, EAGLE3 and P-EAGLE speculative decoding, and full Wave32 kernel optimizations.

    C++ 8

  2. sglang-1-bit-turbo Public

    Forked from sgl-project/sglang

    AMD ROCm (gfx1030) inference fork with RotorQuant/TurboQuant KV compression, PHANTOM-X zero-copy draft speculation, EAGLE3 speculative decoding, 12 RDNA2 crash fixes, and PrismML Bonsai Q1_0_G128 1…

    Python 5

  3. vllm-1-bit-turbo Public

    Forked from vllm-project/vllm

    HIP/ROCm fork optimized for AMD RDNA2 (gfx1030) with EAGLE3 speculative decoding, TurboQuant KV compression, PrismML Bonsai Q1_0_G128 1-bit GGUF support, and gfx1031 compatibility enablement.

    Python 1

  4. SpecForge Public

    Forked from sgl-project/SpecForge

    Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

    Python 1

  5. gfxGRAPH Public

    CUDA Graph → HIP Graph translation layer for AMD gfx1030 (RDNA2). Bridges all 4 CUDA Graph parity gaps on ROCm.

    Python 1

  6. ATLAS Public

    Forked from itigges22/ATLAS

    Adaptive Test-time Learning and Autonomous Specialization

    Python 1