Skip to content
@Hal0ai

hal0.dev - Homelab AI Inference & Agent Control

Self-hosted home AI inference platform — multi-backend, multi-slot, extensible. Built for AMD Strix Halo (iGPU + NPU).
hal0

Self-hosted home AI inference platform

Multi-backend, multi-slot, extensible. Built for AMD Strix Halo (iGPU + NPU).

hal0.dev · Install · Docs


What we're building

hal0 is a polished, reliable inference platform for running LLMs at home — a one-command install for any modern Linux box, with model slots, an OpenAI-compatible API, a built-in dashboard, and a prewired chat UI.

It targets AMD "Strix Halo" APUs (Ryzen AI Max, gfx1151) but runs anywhere llama.cpp/llama-server runs. On Strix Halo, hal0 takes full advantage of the iGPU (ROCm + Vulkan) and the XDNA NPU via Foundry Local Manager — model swap, mixed-backend slots, and unified memory addressing all the way up to 124 GiB.

Repos

Repo What it is
hal0 The platform: orchestrator, slot lifecycle, dispatcher, dashboard, installer. Python (FastAPI) + Vue 3 + systemd. Apache-2.0.
amd-strix-halo-toolboxes Friendly fork of kyuz0/amd-strix-halo-toolboxes — adds *-server images (ENTRYPOINT=llama-server) so SlotManager can run them as systemd services. Published to GHCR at ghcr.io/hal0ai/.

Design tenets

  • One-shot installcurl … | bash lands a working, idempotent stack. Re-running converges.
  • Slot lifecycle as a state machine — every inference workload (chat, embed, rerank, STT, TTS, image) is a single-flight, systemd-managed unit with a known port and health probe.
  • Capability-first UX — flat slots are real; the dashboard groups them into user-facing capabilities (Embed / Voice / Image / NPU rollup) so config stays legible.
  • Provider pluralityllama-server, vLLM, Foundry Local Manager (XDNA NPU), moonshine (STT), kokoro/VibeVoice (TTS), Stable-Diffusion-WebUI all behind one OpenAI-compatible gateway.
  • No vendor lock-in — Apache-2.0, open registry, OpenAI-compatible API surface. Bring your own models from HF.

Status

v0.1.0-alpha — shipping. The cosign-keyless-OIDC release pipeline is wired end-to-end (signed tarball + Fulcio cert + manifest, self-verified before publish), and the one-line install actually installs. Expect rough edges: APIs may shift across 0.1.x alpha tags, no upgrade compatibility promised yet. See hal0/PLAN.md for the path to v1.0.

Community & contact

  • Websitehal0.dev
  • Emailhello@hal0.dev
  • Issues — file in the relevant repo above

Popular repositories Loading

  1. hal0 hal0 Public

    Open-source self-hosted home AI inference platform for AMD Strix Halo — multi-backend slots, OpenAI-compatible gateway, Vue 3 + FastAPI + systemd.

    Python 8 1

  2. amd-strix-halo-toolboxes amd-strix-halo-toolboxes Public

    Forked from kyuz0/amd-strix-halo-toolboxes

    OCI container toolboxes for AMD Strix Halo — ROCm, HIP, llama.cpp, FLM, moonshine, kokoro, and other inference runtimes prebuilt for the iGPU/NPU.

    Python

  3. hal0-web hal0-web Public

    hal0.dev — marketing site + Starlight docs for the hal0 home AI inference platform

    MDX

  4. .github .github Public

    Hal0ai organization profile

  5. pi-mono pi-mono Public

    Forked from earendil-works/pi

    AI agent toolkit: coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, vLLM pods

    TypeScript

  6. recipes recipes Public

    Forked from lemonade-sdk/recipes

    Repository of Server model recipes: custom model load settings for specific use cases

    Python

Repositories

Showing 6 of 6 repositories

People

This organization has no public members. You must be a member to see who’s a part of this organization.

Top languages

Loading…

Most used topics

Loading…