
v0.3 wire-up: 7 bugs surfaced by real-hardware CRUD sweep 2026-05-23 #275

@thinmintdev

Background

After PRs #270 (Slots wire-up) + #271 (Models wire-up) + #274 (unload-noop fix) landed, a systematic CRUD sweep on the hal0 LXC against the live dashboard surfaced 7 stacking bugs beyond the unload-404 case that #274 already fixed.

Bugs (independent, file as separate sub-issues or fix in one cleanup PR)

  1. POST /api/slots schema mismatch. The handler accepts a top-level `{"model": "Qwen3-Embedding-0.6B-GGUF"}` body (per audit + Lemonade-shape SlotConfig) and writes it to TOML as a top-level `model = "..."`, but the serializer in `slots.py:191-195` reads `cfg["model"]["default"]` (nested dict). Result: `model_default` is missing from the /api/slots response for any slot created via POST. Workaround: hand-write the TOML in the v0.1 nested shape (`[model] default = "..."`).

  2. POST /api/slots doesn't auto-assign a port. New slots get `port=0` in state.json. Per `config.next_free_port()`, a port should be auto-assigned in 8081-8099. The POST handler at `slots.py:327` calls `sm.create(name, body)` without injecting a port when the body lacks one.

  3. `hal0 slot create` CLI uses v0.1 schema. No `--type` flag (embedding / reranking / transcription / tts not creatable). `--hardware` enum is `[vulkan|rocm|cpu]` (legacy backend), not the Lemonade `[gpu-vulkan|gpu-rocm|cpu|npu]` device enum. CLI needs Lemonade-shape update.

  4. Installer doesn't pre-create `/var/lib/hal0/.cache/huggingface/hub`. Lemonade (running as `hal0:hal0` user) needs this dir writable for /v1/pull. /var/lib/hal0 itself is owned root:root with no group-write. First pull fails with "filesystem error: cannot create directories: Permission denied".

  5. Dispatcher /v1/chat/completions doesn't route to Lemonade-loaded models. Models pulled via `POST /v1/pull` to Lemonade aren't in hal0's model registry. The dispatcher's `/v1/chat/completions` route fires first (before the /v1/* proxy catch-all from PR #248, "feat(api): /v1/* reverse-proxy to lemonade") and returns `dispatch.no_route`, so proxy fallthrough never happens. Options: (a) the dispatcher consults Lemonade /v1/health.loaded[] before failing; (b) on no_route, the dispatcher delegates to the Lemonade proxy instead of 404'ing; or (c) Lemonade-managed routes are removed from the dispatcher in Lemonade mode (cleanest).

  6. Slot watcher flips state to ERROR when Lemonade evicts. Primary slot loaded successfully → watcher saw Lemonade loaded[] empty seconds later → set state=error "model not loaded in lemond". The eviction itself is plausible (memory `hal0_lemonade_gotchas` notes nuclear evict-all on load failure) but the slot state machine should track "model_default_assigned" separately from "model_currently_loaded_in_lemonade" and show OFFLINE + a non-error "not loaded" message instead of ERROR.

  7. Lemonade evicts loaded models spontaneously. Need to confirm whether this is implicit idle-TTL, nuclear-evict-on-other-load, or something else. Memory `hal0_lemonade_gotchas` says no idle TTL by default. May be a different trigger — needs trace.
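
For bugs 1 and 2, one possible shape of the fix is sketched below. This is a minimal illustration, not the actual `slots.py` code: `model_default`, `pick_free_port`, and `with_port` are hypothetical helper names, and `pick_free_port` is only a stand-in for whatever `config.next_free_port()` really does. The sketch assumes the slot config is a plain dict parsed from TOML and that a missing or zero `port` means "unassigned".

```python
from typing import Any, Dict, Iterable, Optional

# 8081-8099 inclusive, the range named in bug 2.
PORT_RANGE = range(8081, 8100)


def model_default(cfg: Dict[str, Any]) -> Optional[str]:
    """Read the default model from either TOML shape (hypothetical helper)."""
    model = cfg.get("model")
    if isinstance(model, dict):
        # v0.1 nested shape: [model] default = "..."
        return model.get("default")
    if isinstance(model, str):
        # Lemonade shape written by POST: model = "..."
        return model
    return None


def pick_free_port(used: Iterable[int]) -> int:
    """Stand-in for config.next_free_port(); its real behavior is assumed."""
    taken = set(used)
    for port in PORT_RANGE:
        if port not in taken:
            return port
    raise RuntimeError("no free slot port in 8081-8099")


def with_port(body: Dict[str, Any], used: Iterable[int]) -> Dict[str, Any]:
    """Return a copy of the POST body, injecting a port if it is missing or 0."""
    if body.get("port"):
        return dict(body)
    return {**body, "port": pick_free_port(used)}
```

With helpers like these, the serializer could call `model_default(cfg)` instead of indexing `cfg["model"]["default"]`, and the POST handler could pass `with_port(body, used_ports)` to `sm.create`. Whether to tolerate both shapes at read time or migrate the TOML on write is a design choice this sketch leaves open.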

Triage

  • Bug 5 (dispatcher routing) is the biggest user-visible gap: chat doesn't work end-to-end through hal0-api despite Lemonade serving fine direct on :13305.
  • Bug 1 (schema mismatch) is the biggest dev-experience gap: anyone creating slots via the API gets broken card data.
  • Bugs 2, 3 are API / CLI polish.
  • Bugs 4, 6, 7 are operational.
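
Since bug 5 is the biggest user-visible gap, here is a rough sketch of option (b): delegating to the Lemonade proxy when the dispatcher has no route. Everything in it is illustrative. `dispatch_chat`, the `registry` dict, and the `lemonade_proxy` callable are hypothetical names, and requests and responses are modeled as plain dicts rather than the real hal0-api types.

```python
from typing import Any, Callable, Dict

Request = Dict[str, Any]
Response = Dict[str, Any]
Handler = Callable[[Request], Response]


def dispatch_chat(
    request: Request,
    registry: Dict[str, Handler],
    lemonade_proxy: Handler,
) -> Response:
    """Route a chat request; fall through to the proxy instead of failing.

    `registry` stands in for hal0's model registry (model name -> handler).
    `lemonade_proxy` stands in for the /v1/* reverse-proxy catch-all.
    """
    handler = registry.get(request.get("model", ""))
    if handler is not None:
        # The model is known to hal0: dispatch as before.
        return handler(request)
    # This is where dispatch.no_route is returned today; delegate instead
    # so models pulled directly into Lemonade remain reachable.
    return lemonade_proxy(request)
```

A usage example with fake handlers:

```python
registry = {"local-model": lambda req: {"via": "hal0"}}
proxy = lambda req: {"via": "lemonade", "model": req["model"]}

dispatch_chat({"model": "local-model"}, registry, proxy)  # handled by hal0
dispatch_chat({"model": "pulled-model"}, registry, proxy)  # falls through to the proxy
```

The trade-off against option (c) is that the dispatcher still owns the route and must be kept in agreement with the proxy's behavior; removing the route in Lemonade mode avoids that coupling.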

Discovered

Direct manual testing of every CRUD path on the live LXC after merging the wire-up PRs. Per memory `feedback_test_ui_on_real_hardware`.

Labels: bug, dashboard-v3, v0.3
