BrettKinny · BrettKinny · Jun 6, 2026 · Jun 6, 2026 · Jun 6, 2026 · Jun 6, 2026
diff --git a/.github/workflows/firmware-release.yml b/.github/workflows/firmware-release.yml
@@ -78,29 +78,6 @@ jobs:
           echo "--- checksums ---"
           cat SHA256SUMS.txt
 
-      # ─── GPG signing (TODO: enable when GPG_PRIVATE_KEY secret is set) ───
-      # See docs/signed-releases.md for the full setup. Required repo secrets:
-      #   GPG_PRIVATE_KEY, GPG_PASSPHRASE, GPG_KEY_ID
-      # Add the *.asc files to the Release `files:` block below once enabled.
-      #
-      # - name: Import GPG signing key
-      #   if: ${{ secrets.GPG_PRIVATE_KEY != '' }}
-      #   run: |
-      #     echo "${{ secrets.GPG_PRIVATE_KEY }}" | gpg --batch --import
-      #     echo "default-key ${GPG_KEY_ID}" >> ~/.gnupg/gpg.conf
-      #   env:
-      #     GPG_KEY_ID: ${{ secrets.GPG_KEY_ID }}
-      #
-      # - name: Sign release artifacts
-      #   if: ${{ secrets.GPG_PRIVATE_KEY != '' }}
-      #   working-directory: firmware/firmware/build
-      #   run: |
-      #     for f in bootloader.bin partition-table.bin ota_data_initial.bin stack-chan.bin generated_assets.bin human_face_detect.espdl SHA256SUMS.txt; do
-      #       gpg --batch --yes --pinentry-mode loopback \
-      #           --passphrase "${{ secrets.GPG_PASSPHRASE }}" \
-      #           --detach-sign --armor "$f"
-      #     done
-
       - name: Create GitHub Release
         uses: softprops/action-gh-release@v2
         with:
@@ -114,14 +91,6 @@ jobs:
             firmware/firmware/build/generated_assets.bin
             firmware/firmware/build/human_face_detect.espdl
             firmware/firmware/build/SHA256SUMS.txt
-            # Add once signing is enabled:
-            # firmware/firmware/build/bootloader.bin.asc
-            # firmware/firmware/build/partition-table.bin.asc
-            # firmware/firmware/build/ota_data_initial.bin.asc
-            # firmware/firmware/build/stack-chan.bin.asc
-            # firmware/firmware/build/generated_assets.bin.asc
-            # firmware/firmware/build/human_face_detect.espdl.asc
-            # firmware/firmware/build/SHA256SUMS.txt.asc
 
   notify-failure:
     name: Notify on Failure

diff --git a/KEYS.txt b/KEYS.txt
diff --git a/README.md b/README.md
@@ -6,9 +6,11 @@
 
 **Your self-hosted [StackChan](https://github.com/m5stack/StackChan) robot assistant — kid-minded by default, hackable by design, private by architecture.**
 
-> ⚠️ **Heads up: this is not a stable project yet.** Dotty is buggy, frequently broken, and actively changing day-to-day. End-to-end behaviour works on the maintainer's hardware but regressions land all the time, the API and config surface shifts without notice, and a fresh deploy on someone else's gear has not been verified. Treat this as a hobby-grade work-in-progress, not a polished product. Bugs, PRs, and "this didn't work for me" issues all very welcome. 🍺☕ If you do try a fresh end-to-end deploy, please get in touch — I'll buy you a beer or a coffee.
+> ⚠️ **Heads up: this is not a stable project yet.** Dotty is buggy, frequently broken, and actively changing day-to-day. End-to-end behaviour works on the maintainer's hardware but regressions land all the time, the API and config surface shifts without notice, and a fresh deploy on someone else's gear has not been verified. Treat this as a hobby-grade work-in-progress, not a polished product. Bugs, PRs, and "this didn't work for me" issues all very welcome.
 >
-> **Known rough edges:** face emoji rendering is missing visual differentiation for 4 of 9 emotions (sad / surprise / love / laughing); sound-direction localizer has a hardware-AEC-related left-bias on M5Stack CoreS3 (energy detection works, direction is unreliable); kid-voice ASR accuracy on SenseVoice has a kid-speech gap that whisper.cpp will close in a follow-up.
+> **Known rough edges:** face emoji rendering is missing visual differentiation for 4 of 9 emotions (sad / surprise / love / laughing); sound-direction localizer has a hardware-AEC-related left-bias on M5Stack CoreS3 (energy detection works, direction is unreliable); kid-voice ASR on the SenseVoice CPU default still garbles some short utterances (WhisperLocal, auto-selected on GPU hosts, handles high-pitched kid speech better).
+>
+> **Where it's at:** the first stretch of this was vibe-coded pretty hard — I moved fast, chased ideas, and let the scope sprawl to prove out the whole end-to-end loop. That phase is over. I'm now deliberately pulling it back: deleting half-built features, trimming the docs, and hardening what's left so the core is solid rather than sprawling. Fewer things, done properly.
 
 Dotty is a fully self-hosted voice stack for the M5Stack StackChan desktop robot. Open-source firmware on the robot, [xiaozhi-esp32-server](https://github.com/xinnan-tech/xiaozhi-esp32-server) for voice I/O, and a local **pi** coding agent as the brain. ASR, TTS, and session state all run on your own hardware. The LLM is pluggable — the shipped default runs a small fast model for plain conversation and escalates hard questions to a more capable model, with [llama-swap](./docs/cookbook/llama-swap-concurrent-models.md) as the recommended local backend. Swap in [Ollama](./docs/cookbook/run-fully-local.md) for the simpler single-binary option, or point at OpenRouter / any OpenAI-compatible API if you'd rather use the cloud.
 
@@ -23,13 +25,14 @@ So Dotty is the version that passes: every component runs on hardware I own, eve
 ## Features
 
 - **Kid Mode (on by default)** — age-appropriate responses via persona + per-turn prompt steering. An output content filter is planned, not yet shipped ([#138](https://github.com/BrettKinny/dotty-stackchan/issues/138)); Kid Mode is not a substitute for supervision. Toggle off for general-purpose use. See [`docs/kid-mode.md`](./docs/kid-mode.md).
-- **Local ASR** — FunASR SenseVoiceSmall runs on your hardware, no cloud transcription.
+- **Local ASR** — FunASR SenseVoiceSmall by default, no cloud transcription. WhisperLocal auto-selects on GPU hosts (better kid-speech accuracy); SenseVoiceOnnx is a lighter low-RAM option.
 - **Local or cloud TTS** — Piper (offline) or EdgeTTS (cloud). Swap with a config change.
 - **Streaming responses** — the bridge streams LLM output to the voice pipeline for lower perceived latency.
 - **Emoji expressions** — every response starts with an emoji that the firmware maps to a face animation (smile, laugh, sad, surprise, thinking, angry, love, sleepy, neutral).
 - **Voice tools** — the pi agent can search its memory, escalate hard questions to a bigger model, take a photo, and play songs, all mid-conversation.
 - **States, toggles & LEDs** — a six-state mutex (`idle / talk / story_time / security / sleep / dance`) plus two orthogonal toggles (`kid_mode`, `smart_mode`), all owned by the firmware StateManager and surfaced on the 12-pixel LED ring. Shipped on the active firmware fork (commit `d78118b`, 2026-04-27); the `firmware/firmware/` submodule pin in this repo lags, so flash from the active fork to get it. See "States, Toggles & LEDs" below and [`docs/modes.md`](./docs/modes.md).
 - **Vision (camera)** — the robot's built-in camera can capture images for multimodal LLM queries.
+- **Privacy LEDs** — hardware-bound mic (green) and camera (red) indicators on the LED ring. They light from the codec/camera enable signals via RAII guards, so a misbehaving server or model can't capture with the lights off.
 - **Calendar context** — optional calendar integration feeds upcoming events into the conversation context.
 - **Hackable** — every seam is swappable: LLM, TTS, ASR, agent framework. Fork it, rip out what you don't want, wire in your own.
 
@@ -52,6 +55,8 @@ The 12-pixel LED ring shows the current state at a glance. **Left ring 0-5 is th
 
 On the right ring, **indices 8-9 are toggle pips** for kid_mode (salmon pink) and smart_mode (orange), and **index 11 (bottom) lights red while you have the turn** (LISTENING). The `idle → talk` transition fires on `face_detected` from the firmware; VLM identity recognition runs in parallel and feeds the LLM context.
 
+> Heads up: that right-ring layout is the active-fork StateManager. On the firmware **submodule pinned in this repo**, pixels 6 and 11 instead drive the **privacy LEDs** — 6 = mic (green), 11 = camera (red) — and the StateManager pips arrive once the submodule catches up to the active fork.
+
 Full state taxonomy, colour palette, transition diagram, and per-state backing architecture: [`docs/modes.md`](./docs/modes.md).
 
 ## Web dashboard (locally hosted)