I'm part of Nayhein AI, an open research org. Most of my work is around LLM compression, quantization pipelines, and the tooling that makes running large models on consumer hardware less painful.
| Repo | Description |
|---|---|
| GGUF Forge | Web app that automates the full HuggingFace → GGUF pipeline: download, convert, quantize, upload. Hosted free at gguforge.com (hosted instance temporarily offline). |
| REAM-MoE | Generic REAM/REAP expert compression for MoE LLMs. Supports 15+ families, including Qwen3, DeepSeek V3, Kimi K2, MiniMax M2, and Mixtral. |
| rl-coding-agent | RL loop that trains an LLM into a coding agent. Self-generated problems, sandboxed multi-language execution, zero human labels. |
| hf-local-hub | Local HuggingFace Hub alternative written in Go. |
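The download → convert → quantize → upload pipeline that GGUF Forge automates can be sketched with standard llama.cpp and HuggingFace CLI tooling. This is a hypothetical outline, not GGUF Forge's actual implementation; the repo id and output names are placeholders.

```shell
# Example model id — a placeholder, not from the source
MODEL=some-org/some-model

# 1. Download the model weights from the HuggingFace Hub
huggingface-cli download "$MODEL" --local-dir ./hf-model

# 2. Convert safetensors → GGUF (convert_hf_to_gguf.py ships with llama.cpp)
python convert_hf_to_gguf.py ./hf-model --outfile model-f16.gguf --outtype f16

# 3. Quantize; Q4_K_M is a common size/quality tradeoff
./llama-quantize model-f16.gguf model-q4_k_m.gguf Q4_K_M

# 4. Upload the quantized file back to the Hub
huggingface-cli upload some-org/some-model-GGUF model-q4_k_m.gguf
```

A web frontend essentially queues these four steps per model and streams the logs back to the user.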
29 models on HuggingFace — REAP/REAM-compressed MoEs and GGUF quants for models such as MiniMax M2, Qwen3 235B, and Solar Open 100B.
Python · Go · FastAPI · llama.cpp · PyTorch · Docker



