Local AI anywhere, for everyone — LLM inference, chat UI, voice, agents, workflows, RAG, and image generation. No cloud, no subscriptions.
This is a mirror of the Strix Halo HomeLab wiki; to browse the wiki, click the link below.
Experimental support for many TTS/STT models wrapped in a Wyoming API for consumption via Home Assistant
Strix Halo local LLM guide: 65-87 t/s on Ryzen AI Max+ 395 128GB mini PCs. Benchmarks, setup, backend comparisons, and failure cases.
Sixunited AXB35 EC control & monitoring for Windows
A comprehensive guide to running Linux (Omarchy/Arch) on the 2025 ASUS ROG Flow Z13 (AMD Strix Halo). Includes CachyOS Kernel setup, Tablet Mode fixes, and Power Management for the Ryzen AI Max
Local, ternary-weight LLM inference on AMD Strix Halo. Rust above the kernels, HIP below, zero Python at runtime. https://discord.gg/EhQgmNePg
vLLM + Qwen3.6-27B (BF16) OpenAI-compatible inference server on AMD Strix Halo (Ryzen AI Max+ 395, gfx1151). Vision input, 256K context, /v1/responses with separated reasoning, via TheRock ROCm.
Tools and documentation related to the AMD Strix Halo APU family (Ryzen AI Max 395) of systems. Tested on GMKtec EVO-2
llama.cpp + Qwen3.6-27B (Q8_0 GGUF) OpenAI-compatible inference server on AMD Strix Halo (Ryzen AI Max+ 395, gfx1151). 256K context, ~7.5 t/s decode via TheRock ROCm Docker.
llama.cpp setup on dedicated AMD Strix Halo machine
Simple installer script that takes a download (if newer) and installs it globally. Sets up Vulkan support
ComfyUI on AMD Strix Halo (RDNA 3.5 / gfx1151) via Docker. Ubuntu Rolling + UV-managed Python 3.12 + ROCm preview wheels. Solves the silent CPU fallback Debian/Python 3.13 images hit on gfx1151.
Ansible playbook to configure AMD Strix Halo machines (e.g. Framework Desktop or GMKtec EVO-X2) as local AI inference servers running Fedora 43. Sets up llama.cpp with llama-swap and Open WebUI and downloads GGUF models. With NGINX reverse proxy and TLS via ACME or self-signed certificate.
Claude Code skill for AMD Strix Halo (Ryzen AI MAX+ 395) ML setup. Handles PyTorch installation (official wheels don't work with gfx1151), GTT memory config, and environment setup. Enables 30B parameter models.
Stable Diffusion image generation on AMD Ryzen AI NPUs for Linux
Talos-O (Omni): A sovereign, embodied agentic organism forged on AMD Strix Halo. Integrating the Chimera Kernel (Linux 7.0), Zero-Copy Introspection, and the Phronesis Engine. Built from First Principles.
Production-oriented Docker Compose stack serving openai/gpt-oss-20b via vLLM on AMD Strix Halo (gfx1151, ROCm 7.2). OpenAI Responses API, host-mounted weights, hard-capped KV cache. Verified, no source build.
Native ROCm C++ kernels for Strix Halo (gfx1151): ternary BitNet GEMV, RMSNorm, RoPE, split-KV Flash-Decoding attention. Zero hipBLAS, zero Python.
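Several of the projects above (the llama.cpp and vLLM stacks) expose OpenAI-compatible inference endpoints. A minimal sketch of building a request against one, using only the Python standard library; the base URL, port, and model name here are assumptions, adjust them to your own server:

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To send it against a running server (hypothetical endpoint, adjust to yours):
#   with urllib.request.urlopen(build_chat_request("http://localhost:8080", "qwen3", "hi")) as r:
#       print(json.loads(r.read())["choices"][0]["message"]["content"])
```

The same request shape works whether llama.cpp, vLLM, or llama-swap is serving the model, which is what makes these stacks interchangeable behind Open WebUI or an NGINX reverse proxy.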