diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index e959b70b..b80eb4d9 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -6,66 +6,16 @@ The **browser-native robotics dev environment** — vibe-code robots in a tab, r **What's defensible.** Browser is the dev surface — no install, no SDK download. Browser-resident model serving — perception, detection, fiducial pose all client-side, no GPU server, no inference bill. Layered safety — firmware-bounded motors that the IDE-level planner (user code or Pip) can't bypass; ask-human is the terminal cascade rung (openpilot-panda pattern). Static-site deployable — no backend, no accounts, no data leaving the browser. -**Directions worth pursuing:** -- **Capability schema** — JSON manifest + chip handler + auto-rendered card. The IDE's plugin system; lets a user or Pip ship a new hardware capability without touching dashboard code. -- **Multi-robot orchestration.** Two robots in the same room, scripts or Pip planning across them, no central server. - **Anti-drift guards.** Failure modes to refuse: -- *"Yet another teleop dashboard"* — joystick-shaped UI for human pilots. The wedge is planning-shaped. - *"Yet another fleet manager"* — server-resident cloud for N robots. Viam's space; ours is one operator running their own. - *"The LLM does everything autonomously"* — Pip is one surface inside the IDE. User code is co-equal; both are bounded by the same firmware safety floor. -# Developer reference - -`DEV.md` at the repo root is the canonical list of URL flags, `window.*` handles, IndexedDB stores, and common debug paths. - -# Project layout - -- `docs/` is the GitHub Pages publish root — the dashboard's static ES modules live here directly. (Repo-level docs like HARDWARE.md, SMOKE.md, etc. live at the root or inside subsystems, not in `docs/`.) -- `docs/` is flat by design — file count is manageable, naming prefixes carry the subsystem boundary. 
Promote a subsystem to its own folder (like `capabilities/`) once it passes ~5 files whose internals shouldn't leak outside. - -# Subsystem map - -- **Pair layer** — `pairing.js`, `phones.js`, `mobile.js` + `phone.html`. Desktop ↔ phone WebRTC. -- **Perception + detection** — `camera-frame.js`, `mediapipe.js` (closed-vocab COCO reflex, powers `watcher.js`), `aruco.js` (overhead ArUco → `entry.arucoPosition`, wired but unproven). Open-vocab queries route through `view_robot_frame` → Claude vision — no in-browser open-vocab model needed. -- **Pip / assistant** — `assistant.js`, `claude.js`, `pip-tools.js`. -- **Robot ops** — `ble.js`, `ops-response.js`, `capabilities/`. -- **Robot lifecycle** — `prepare.js`, `recovery.js`, `pinout.js`. -- **User code** — `scripts.js`. Mirrors the BLE capability surface; persisted in localStorage. See `USER-CODE.md`. -- **App shell** — `app.js`, `dom.js`, `state.js`, `settings.js`, `log.js`, `auth.js`, `passwords.js`, `index.html`, `styles.css`, `icons.svg`. - -# Smoke testing - -Two layers, kept cheap: - -- `make smoke` — pure-function tests via `node --test tests/*.test.js`. Anything in `format.js` (and future pure helpers) earns a row in `tests/format.test.js`. Runs in <1 s. -- `SMOKE.md` — manual checklist for architectural promises (lifecycle, render patterns, capability behavior, Pip flow, recovery). - -Pattern for new pure helpers: extract from `app.js` / cap runtime into `format.js`, import where used, add a test. - -`make install-hooks` wires `.githooks/` as `core.hooksPath`. Pre-commit runs `make smoke`, the gen-uuids drift check (when `protocol/uuids.json` is staged), and the sw.js VERSION stamp (when `docs/*` excluding firmware bins / sw.js is staged — folds the stamp into the user's commit so the dashboard "Reload to update" banner fires on the right commit instead of a CI follow-up). Bypassable with `--no-verify`; CI is the binding layer. 
- -# Comment discipline - -Default to no comments — every line is context cost in an AI-edited codebase. - -Keep when the comment carries WHY: hidden constraints, kernel/API gotchas, workarounds for past bugs, cross-file invariants ("must match `firmware/pi_robot/pi_robot.py`"), schema/wire-format examples. Cut when it restates WHAT: module preambles, narration, section banners, labels above obvious code. - -# Abstractions earn upstream consumers - -Before adding a logical layer, registry, wrapper, or routing decision, audit who outside its home module will use it. If only one module touches it, it's internal — bar for keeping it is high. The merge layer (item F) shipped with no consumers above the dashboard; deletion (R1) was clean precisely because nothing else depended on it. The cost of an unused abstraction isn't only the lines it adds — it's the explanatory comments, the cross-cutting params plumbed through siblings, and the bug-shaped negative space (the joypad-no-op was a child of the abstraction, not a coincidence). Audit before adding, not before deleting. - -# Dialog vs menu dismiss - -- **Menus + popovers** (robot-menu, avatar-menu, help popovers, Pip's `
`): outside-click + Escape dismiss. -- **Dialogs**: × button or Escape only. Outside-click would nuke session state (recovery terminal, SD prep) for a tiny convenience win. - # Control-loop architecture The "openpilot panda" pattern: safety enforced *below* the intelligent layer, not inside it. - Firmware caps pulse duration and watchdog auto-stop. LLM-issued motion auto-stops at the end of the pulse window (4 s on Pi, same on ESP32); the watchdog cuts persistent commands when the dashboard goes silent. The ultrasonic dist_cm clip stops pure-forward motion at walls regardless of who issued the move. The planner can't bypass these — not even via a malformed tool call. -- Magnitude is *not* capped LLM-side anymore. Joypad and Pip share the same signed-byte range; the time-bound is what bounds a single bad decision. Earlier versions used `LLM_MAX_SPEED = 70` as an extra "reduced envelope for the planner" rung, but the duration cap already bounds the wrong-direction excursion, and the cap was making demos / Pip-driven motion artificially slow vs joypad without buying meaningful safety. +- Magnitude is *not* capped LLM-side. Joypad and Pip share the same signed-byte range; the time-bound is what bounds a single bad decision. Earlier versions used `LLM_MAX_SPEED = 70` as an extra "reduced envelope for the planner" rung, but the duration cap already bounds the wrong-direction excursion, and the cap was making Pip-driven motion artificially slow vs joypad without buying meaningful safety. - LLM-issued motion is pulse-bounded (`duration_ms` mandatory; firmware auto-stops). Persistent speed is reserved for human joystick control where there's a 20 Hz+ decision loop. - `ask_human_via_phone` is the terminal rung of the decision cascade — the planner asks to be overridden rather than waiting for the operator to step in. 
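The pulse-bounded rule in the list above can be sketched from the planner side. This is a minimal illustration, not the project's code: `buildPulseCommand`, the command shape, and the -128..127 bounds are assumptions (the text only says "signed-byte range"), and the 4 s window is taken from the firmware bullet. The firmware remains the enforcing layer; a helper like this only keeps well-formed requests inside the same envelope.

```javascript
// Hypothetical planner-side helper. Names, the command shape, and the
// speed bounds are illustrative assumptions, not the project's API.
const MAX_PULSE_MS = 4000; // pulse window quoted in the text (4 s)
const SPEED_MIN = -128;    // assumed signed-byte lower bound
const SPEED_MAX = 127;     // assumed signed-byte upper bound

function clamp(value, lo, hi) {
  return Math.min(hi, Math.max(lo, value));
}

function buildPulseCommand({ left, right, duration_ms }) {
  if (!Number.isFinite(left) || !Number.isFinite(right)) {
    throw new TypeError("left and right must be finite numbers");
  }
  // duration_ms is mandatory: a missing or non-positive value is rejected
  // instead of silently becoming a persistent speed command.
  const ms = Math.round(duration_ms);
  if (!Number.isFinite(ms) || ms < 1) {
    throw new RangeError("duration_ms is mandatory and must be >= 1");
  }
  return {
    left: clamp(Math.round(left), SPEED_MIN, SPEED_MAX),
    right: clamp(Math.round(right), SPEED_MIN, SPEED_MAX),
    duration_ms: Math.min(ms, MAX_PULSE_MS),
  };
}
```

Magnitude is clamped only to the shared range, not to a reduced planner envelope, which matches the decision to drop `LLM_MAX_SPEED`; the time bound is the part that limits a single bad decision.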
@@ -73,121 +23,55 @@ The "openpilot panda" pattern: safety enforced *below* the intelligent layer, no Different model shapes are good at different jobs — distinct primitives, not interchangeable "AI". Past planner-layer attempts to paper over capability gaps with prompt-engineering have bitten us. -**Detectors and perception:** - - **Closed-vocab reflex detector** (`mediapipe.js`, EfficientDet-Lite0 via MediaPipe Tasks API): 80 COCO classes, ~10–30 ms on GPU. Powers the per-robot Reflex card (`watcher.js`) and user-code `robot.watchFor` / `robot.detections`. Fire-once-and-disable shape — same terminal-rung pattern as `ask_human`. For backend-vision-capable Pip turns, `view_robot_frame` passes the raw frame straight to the planner — no caption step. - -**Unproven / experimental:** Overhead ArUco localization (`aruco.js`), YOLO26n closed-vocab detector (`yolo26.js`, opt-in via `/detector yolo26` — wired, not the default, no field-validation run logged). See `.claude/notes.md` "Wired but unproven." Keep out of user docs until validated. (Grounding DINO was previously the open-vocab fallback; deleted once Claude vision via `view_robot_frame` absorbed that role with scene reasoning the bbox-only detector couldn't do.) - -**Planners (Pip):** - - **Tool-using LLM via API** (`claude.js`): seconds-latency, multi-turn, tool-calling. Strong at goal decomposition, weak at closed-loop visual servo (2–5 s round-trip). Currently Claude; any tool-using LLM with the same tool surface fits here. +- **Unproven / experimental**: Overhead ArUco localization (`aruco.js`), YOLO26n closed-vocab detector (`yolo26.js`, opt-in via `/detector yolo26`). See `.claude/exploration.md` → "Wired but unproven." Keep out of user docs until validated. Grounding DINO was deleted once Claude vision via `view_robot_frame` absorbed the open-vocab role with scene reasoning the bbox-only detector couldn't do. 
# Transport channels -Each transport has a distinct job: +Pattern: control = BLE, observe = wifi/discover, recover = USB. - **BLE** — control plane. Low latency, proximity-authenticated, lossy. Anything that sets motor speed, toggles an LED, commits state. - **Typed ops over BLE** — structured verbs on a single characteristic (`get-log`, `get-config`, `restart-service`, `wifi-scan`, `wifi-join`). Each verb is a deliberate, reviewable decision instead of a real-shell transport. - **WebRTC** — two distinct flows. - *Phone ↔ desktop*: signaled via `wss://signal.neevs.io` (cross-network — operator may not be physically near the phone). Pair-ceremony authenticated (Ed25519 pubkey + signed pair-request). Carries camera frames, ask-human responses, robot-command relays. - - *Robot ↔ desktop* (Pi or ESP32): signaled over the BLE `SIGNAL` characteristic — no internet rendezvous, no Mixed-Content/PNA gate. ESP32 handles signaling in-firmware; Pi forwards the offer to a local aiortc daemon (`pi_robot_rtc.py`) over a Unix socket and chunks the answer back via BLE notify. BLE pair = signal = auth. Carries OTA bundles, log tail, PTY shell (Pi), and camera video (BLE-signaled `camera-signal` char on Pi). -- **Wifi-presence** — Pi exposes `.local:81/health` (pi_robot_health.py); dashboard probes it for the "on wifi" badge + service-crash detection. ESP32 retired its HTTP server in Phase 2.H — its presence shows up only when BLE-paired (wifi-status notify). No internet rendezvous for robot presence (signal.neevs.io stays for cross-network phone-pair only). + - *Robot ↔ desktop* (Pi or ESP32): signaled over the BLE `SIGNAL` characteristic — no internet rendezvous, no Mixed-Content/PNA gate. BLE pair = signal = auth. Carries OTA bundles, log tail, PTY shell (Pi), and camera video. See `firmware/esp32_robot_idf/WEBRTC.md` for the ESP32-side patches. +- **Wifi-presence** — Pi exposes `.local:81/health`; dashboard probes it for the "on wifi" badge + service-crash detection. 
ESP32 presence shows up only when BLE-paired (wifi-status notify). No internet rendezvous for robot presence. - **USB-CDC** — recovery plane. Last-resort serial console, runs as its own systemd unit so a `pi-robot.service` crash doesn't take recovery with it. Bounded by physical access. -Pattern: control = BLE, observe = wifi/discover, recover = USB. +# Connection-first init -# ESP32 WebRTC: chip is the DTLS client, not the server - -Classic ESP32 streams WebRTC video to current Chrome via four coordinated -patches that don't independently make sense — anyone debugging this stack -needs to see them as one shape: - -1. **DTLS role: chip is CLIENT** (forced in `dtls_srtp_init` regardless of - what libpeer's binary blob passes). libpeer always passes ROLE_SERVER, - but mbedTLS's `ssl_parse_client_hello` can't reassemble Chrome's ~1413- - byte fragmented ClientHello — bails immediately with `FEATURE_UNAVAILABLE`. - As CLIENT, chip sends the (small, never-fragmented) ClientHello and - Chrome handles whatever it receives. Chrome 124+ enforces this strictly. -2. **DTLS cert is dashboard-supplied**, ECDSA P-256. The browser generates - the keypair (WebCrypto) and self-signs an X.509 cert (@peculiar/x509), - then pushes both PEMs over the SIGNAL char (opcodes 0x07/0x08/0x09) BEFORE - the offer. Chip's `dtls_srtp_init` refuses to open WebRTC if nothing was - supplied — chip-gen path was removed for ~9 KB flash saved (linker gc on - mbedtls x509write_crt_* + ecp_gen_key). WebRTC standardized on ECDSA; - current Chrome rejects RSA in DTLS-SRTP, so the dashboard cert is built - ECDSA-only too. -3. **All chip-quirk SDP rewriting lives in the dashboard** (webrtc-robot.js). - The browser pre-strips TCP candidates from the offer (chip is UDP-only), - pins offer MID to "0" so libpeer's hardcoded "0" in the answer matches, - and flips `setup:passive`→`setup:active` on the incoming answer (libpeer - always emits passive even though chip is actually CLIENT). 
Used to be - three string-walking functions in webrtc_peer.c (`filter_sdp_for_chip`, - `capture_offer_mid`, `rewrite_answer_mid`); centralizing made the chip - an SDP-agnostic byte pipe. -4. **mbedTLS Kconfig** must enable the WebRTC cipher set explicitly - (DTLS_SRTP, ECDHE_ECDSA, ECDH_C, ECDSA_C, SECP256R1, GCM_C, SHA1_C, - HKDF_C). IDF defaults are tuned for HTTPS-client and lack what DTLS-SRTP - needs. X509_CREATE_C: not needed on v5 (dashboard does the cert - creation, chip only parses); v6 path of esp_peer re-enables it for - upstream cert helpers even though our flow stays dashboard-side. -5. **PSRAM-default malloc** with `RESERVE_INTERNAL=32768` — mbedTLS context - + libpeer SCTP/SRTP buffers go to PSRAM so the camera DMA's 32 KB - contiguous internal block is always available mid-session. - -Removing any one of these reverts the chip to "DTLS handshake never -completes" or "camera_acquire fails after WebRTC opens." Firmware-side -constraints (DTLS role, mbedTLS Kconfig, PSRAM malloc) are documented in -firmware/esp32_robot_idf/components/espressif__esp_peer/src/dtls_srtp.c -and sdkconfig.defaults.esp32; dashboard-side constraints (cert push, SDP -rewriting) in docs/webrtc-cert.js and docs/webrtc-robot.js. - -**Sunset path.** mbedTLS PR #10623 (3.6 backport of the fragmented DTLS- -ClientHello reassembly fix, first released in 3.6.6 / 4.1.0, March 2026) -collapses Patch 1 and the half of Patch 3 that exists because of it. -ESP-IDF v5.5.4 (current pin) ships 3.6.5, v6.0.1 ships 4.0.0 — both -pre-fix. espressif/esp-idf release/v5.5 (now on 3.6.6-idf) and -release/v6.0 (now on 4.1.0-idf) have the fix on their HEAD branches; the -next tagged release in either line is the trigger. - -Prefer v6.0.x. components/espressif__esp_peer/src/dtls_srtp_v6.c is -pre-staged (CMake selects it on `IDF_VERSION_MAJOR >= 6`) and already -encodes the post-sunset shape: role honored from cfg (no CLIENT -override), HelloVerifyRequest cookies enabled, PSA crypto path. 
The -cleanup on a v6.0.2 bump collapses to "delete the v5 dtls_srtp.c sibling -and the IDF major-version CMake selector" rather than reverting patches -in-place. The rest of the firmware migrates clean — NimBLE / WiFi / -esp_netif / esp_http_server / LEDC / GPIO / NVS / esp_timer call sites -all survive v6.0; exposure is `-Werror` flip + gnu23 default surfacing -latent warnings. - -v5.5.5 is the fallback if v6.0.2 is slow. On v5.5.5, the manual cleanup -is: revert chip-as-CLIENT in dtls_srtp.c (lines 75 and 161), restore -HelloVerifyRequest cookies (line 95). In either case, drop the -`setup:passive`→`setup:active` flip from docs/webrtc-robot.js. Patches -2 (dashboard ECDSA cert), 4 (mbedTLS Kconfig) and 5 (PSRAM malloc) stay -— those are WebRTC-spec or chip-shape, not mbedTLS-bug workarounds. - -**Opt-in via `CONFIG_BR_WEBRTC_ESP_PEER`** (main/Kconfig.projbuild, default -y). Set =n to drop all WebRTC code — `select` chain removes the WebRTC-only -mbedTLS bits, all call sites in webrtc_peer / app_main / gatt_svr / telemetry -guard out with `#ifdef`, and the linker's `--gc-sections` strips libpeer.a -from the image (~215 KB smaller binary). Useful for forks that only need -HTTP MJPEG video. esp_peer always *registers* as a component (Kconfig values -aren't visible to IDF's component-registration phase), but produces no live -references when off, so the linker drops it. +Connection infrastructure (BLE, WiFi, USB-CDC) initializes before capability infrastructure (camera, perception, motors). A robot whose BLE stays up with no camera is observable and actionable; the reverse is a brick. ESP32 example: NimBLE host init and `wifi_sta_init` run early in `app_main()` so radio drivers pre-allocate their buffers in fresh internal heap. Camera comes after; if it can't fit its 32 KB DMA buffer in what's left, it fails loudly and `fw_info` hides the cap so the dashboard adapts. 
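The connection-first ordering can be modeled in a few lines. This is a language-neutral sketch written in JavaScript, not the C firmware: `initInOrder`, the step objects, and the returned report shape are hypothetical. It shows only the rule stated above: connection steps run first and must succeed, while a capability that fails to initialize is logged and left out of the advertised list.

```javascript
// Illustrative model only. Step names and the report shape are
// hypothetical; the real firmware does this in C inside app_main().
function initInOrder(connectionSteps, capabilitySteps) {
  // Connection steps run first so they get the fresh resources.
  // An exception here propagates: without a link the robot is a brick.
  for (const step of connectionSteps) {
    step.init();
  }

  // Capability steps run on whatever is left. A failure is reported
  // loudly and the capability is omitted from the advertised list.
  const caps = [];
  for (const step of capabilitySteps) {
    try {
      step.init();
      caps.push(step.name);
    } catch (err) {
      console.error(`${step.name} init failed: ${err.message}`);
    }
  }
  return { caps };
}
```

A dashboard consuming such a report would render cards only for the names in `caps`, which is the "hides the cap so the dashboard adapts" behavior described above.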
-# Connection-first init +# Project layout + +`docs/` is the GitHub Pages publish root — static ES modules live there directly. Repo-level docs (HARDWARE.md, SMOKE.md, etc.) live at the root or inside subsystems, not in `docs/`. + +- **Root holds primitives, subsystems hold vocabularies.** `docs/` root is for (a) HTML entry points, (b) app-shell singletons (`app.js`, `state.js`, `dom.js`, `event-bus.js`, `log.js`, `settings.js`), (c) cross-cutting primitives imported by ≥3 subsystems (`format.js`, `error-capture.js`). Everything else presumed to belong in a subsystem folder. +- **Promotion trigger: vocabulary closure.** Files belong in their own folder when they (1) share a naming prefix, (2) change together for the same reason, (3) expose ≤2 symbols outward. A 3-file sealed vocabulary (`pinout-*`) is more ready than a 6-file loose collection (`mobile-*`). When a prefix collects files that change for *different* reasons, split — don't folder. -Connection infrastructure (BLE, WiFi, USB-CDC) initializes before capability infrastructure (camera, perception, motors). When constrained resources force a tradeoff, connection wins. A robot whose BLE stays up with no camera is observable and actionable; the reverse is a brick. +# Comment discipline -ESP32 example: NimBLE host init and `wifi_sta_init` run early in `app_main()` so radio drivers pre-allocate their buffers in fresh internal heap. Camera comes after; if it can't fit its 32 KB DMA buffer in what's left, it fails loudly via `camera_init_error()` and `fw_info` hides the cap so the dashboard adapts. +Every line is context cost in an AI-edited codebase. Comments earn their place when they carry WHY: hidden constraints, kernel/API gotchas, workarounds for past bugs, cross-file invariants ("must match `firmware/pi_robot/pi_robot.py`"), schema/wire-format examples. Restatement (module preambles, narration, section banners, labels above obvious code) is the cut. 
-# Unit preconditions belong in the script, not in `Condition*` +# Abstractions earn upstream consumers -`ConditionPathExists=`, `ConditionFileNotEmpty=`, etc. evaluate **once** at unit-start time and silently skip the unit when false — no retry, no log noise the operator can search for, no recovery without manual `systemctl start`. When the prerequisite is racy (asynchronous kernel-driver probes, hotplug events, network reachability, anything not synchronously guaranteed by an `After=` ordering), a missed check turns the unit invisibly inert until the next reboot, and even that may race the same way. +Before adding a logical layer, registry, wrapper, or routing decision, audit who outside its home module will use it. If only one module touches it, it's internal. The cost of an unused abstraction isn't only the lines it adds — it's the explanatory comments, the cross-cutting params plumbed through siblings, and the bug-shaped negative space (a one-shot helper turns into a parameter on every sibling, then the sibling that forgot to use it ships a regression). Audit before adding, not before deleting. -Pattern instead: drop the `Condition*` and wait inside the `ExecStart` script with a bounded poll loop. The script makes the timeout legible (logs a clear failure on exhaustion), the unit gets to use `Restart=on-failure` for self-healing, and a future contributor can read the wait-condition next to the work it gates. The `usb-gadget.service` → `usb-gadget-setup.sh` pair is the reference shape: 10 s poll for `/sys/class/udc` to populate, clean exit-1 with a message if dwc2 never publishes. +# Dialog vs menu dismiss -If the precondition really is synchronous and unambiguous (a config file the user wrote, the existence of a hardware feature already enumerated at boot), `Condition*` is fine. The line is "does this become true asynchronously after the unit's `After=` ordering?" — if yes, wait in the script. 
+- **Menus + popovers** (robot-menu, avatar-menu, help popovers, Pip's `
`): outside-click + Escape dismiss. +- **Dialogs**: × button or Escape only. Outside-click would nuke session state (recovery terminal, SD prep) for a tiny convenience win. +# References + +- `DEV.md` — URL flags, `window.*` handles, IndexedDB stores, common debug paths. +- `SMOKE.md` — manual checklist for architectural promises. +- `USER-CODE.md` — surface that `scripts.js` exposes to user-authored code. +- `HARDWARE.md` — wiring, board-specific knobs. +- `.claude/direction.md` — what we're committing to close for the course pilot. +- `.claude/exploration.md` — open architectural directions, design rationale, wired-but-unproven inventory, forks evaluated. +- `.claude/field.md` — positioning analysis vs adjacent work. +- `firmware/esp32_robot_idf/WEBRTC.md` — the four coordinated DTLS/SDP patches. +- `firmware/pi_robot/SYSTEMD.md` — preconditions-belong-in-the-script pattern. +- `make smoke` — pure-function tests (<1 s); `make install-hooks` wires pre-commit (`make smoke` + gen-uuids drift + sw.js VERSION stamp), bypassable with `--no-verify`, CI is the binding layer. diff --git a/.claude/direction.md b/.claude/direction.md index 6e22c4d5..9404252a 100644 --- a/.claude/direction.md +++ b/.claude/direction.md @@ -1,229 +1,34 @@ -# Architectural direction — better-robotics +# Direction -Long-horizon shape decisions. Unlike `working.md` (tactical pending), this file names structural moves the project is committing to. Updated when the shape of the system changes. +What we're committing to close for the Fall 2026 course pilot. Open exploration lives in `exploration.md`; positioning research in `field.md`. -## 1. Generic typed-characteristic runtime (in flight) +## Ranked gaps -**Claim.** Every capability today exists in ~3 places (browser module, Pi -handler, ESP32 handler). 80% of those files are boilerplate isomorphic to -the capability's TYPE, not its identity. A generic runtime keyed on type -eliminates the boilerplate. +1. 
**Live parameters get/set.** Learning step 2 — "tune parameters live without editing code." Capabilities are already structured; add a param characteristic + getter/setter + a tuner row per capability card. Cheapest gap to close. -**The data already exists.** `fw-info.caps` declares typed schemas: +2. **Sensor/motor hot-plug auto-discovery.** Learning step 3 — "add physical components." Today capabilities are firmware-declared, not detected at runtime. The "smart breadboard" promise. Tractable on ESP32 with an i2c scan + a discovery characteristic. -```json -{ "name": "led", "char": "…d92", "type": "toggle" } -{ "name": "motors", "char": "…d99", "type": "signed-pair", "range": [-100, 100] } -{ "name": "wifi", "chars": {...}, "type": "wifi-scan" } -{ "name": "ota", "chars": {...}, "type": "bundle-ota" } -{ "name": "camera", "chars": {...}, "type": "webrtc-installable" } -{ "name": "ops", "char": "…d9c", "type": "command" } -``` +3. **Pub/sub vocabulary for messaging.** Learning step 5 and the explicit ROS2-transition story. Could be a topic layer over BLE notify, with MQTT as the "once WiFi is on" tier. Earns its keep only if the ROS2-prep framing stays. -**The runtime (browser side).** A per-type constructor `makeXxxCap(schema)` -returns `{probe, cleanup, renderSection, wireActions, postRender?}`. Adding -a capability of a known type = one schema entry + zero JS code. +4. **Discovery graph view.** Cheap once (3) lands. Makes pub/sub legible — pedagogical payoff for low effort. -**Firmware-side direction (farther out).** Pi and ESP32 firmware have -identical ceremony: register char, parse read/write, notify on change, -gate on config. A "typed char runtime" on firmware reads the capability -declaration and handles generic typed chars with a small driver binding -per capability (`{ on_write: fn, on_read: fn }`). +5. **Simulation hook.** A `MockRobot` so a student can iterate scripts without a working kit on the table. 
Classroom-critical; not everyone has hardware ready every session. -**Progress so far:** -- fw-info.caps carries the typed schema (shipped) -- Browser reads + stores `entry.capSchema` (shipped) -- Each capability module exports its own `schema` for cross-check (shipped) -- **First type migrated: `toggle` → LED** (this session) -- Future types to migrate: `signed-pair`, `wifi-scan`, `bundle-ota`, - `webrtc-installable`, `command`. Each is ~2–4 hours. +## NFC tap-to-pair -**Migration strategy.** Per-type, not per-capability. When we migrate -`signed-pair`, both motors AND any future 2-axis input use the same -runtime. The compound payoff is the Nth capability, not the first. +The plan's original NFC role (handing the phone the puck's SoftAP creds) is dead post-BLE-first. Tags can still earn their keep as a *tap-to-pair-this-specific-robot* shortcut — collapses "scan → find robot-7 in a list of 12 → confirm" to a single tap. -## 2. AI-maintained documentation (cheap, deferred) +- **Tag content:** NDEF URL → `https://better-robotics.github.io/?pair=`. Dashboard reads `pair` from `location.search`, filters BLE scan to that device. +- **Android Chrome:** tap → URL → filtered scan → confirm. +- **iPhone:** iOS opens the URL but Web Bluetooth is unavailable. Workaround uses the existing phone↔desktop pair layer (`signal.neevs.io`, signed pair-request, `phone.html`): encode `phone.html?pair=`. Phone forwards `{type:"pair-robot", robotId}` over WebRTC; desktop surfaces a "Phone wants to pair robot-7 — click to confirm" banner. Desktop click is required because `navigator.bluetooth.requestDevice` needs a user gesture. Cross-network works for free. +- **Bootstrap caveat:** first-ever use still needs the existing phone↔desktop pair ceremony. -**Claim.** `README.md`, `HARDWARE.md`, `firmware/pi_robot/README.md`, and -per-capability comments all describe what `fw-info.caps` + the code -already know. They drift. 
An AI agent watching the schema + commit log -can regenerate docs per release. +~An afternoon to prototype. Concrete iOS+NFC demo defuses the "BLE-first leaves iPhones out" objection. -**Scope.** ~2 days to wire a pre-commit generator plus a CI check that -fails if docs aren't regenerated. Starts small: capability reference -page auto-generated from the live schema. Expands to change-log -summarization from commit messages. +## Other gaps -**Not urgent.** Doc drift isn't causing failures today. Worth doing -when the project has contributors outside the core, or when we promise -backward-compatibility guarantees that require accurate docs. +- **Visual / block-based authoring tier.** Today: capability cards (drive motors, toggle LED) and `pip.ask` natural language — no block-editor surface for "when distance < 30cm, stop and turn right." XRP and MicroBlocks (see `.claude/field.md`) ship Blockly. Open question: do cards + Pip cover non-coder authoring, or is a drag-drop tier needed? -## 3. Transparent-data-plane OTA (partially in flight) +- **Inter-puck messaging.** Every message fans out through the browser as hub — no puck↔puck path. Tied to (3) but a separate architectural commitment. -**Claim.** Every robot should have three OTA lanes with a clear fallback -order. The dashboard picks the fastest available without user -intervention. Iteration-loop speed is the core dev experience; "how fast -does code get onto the robot" sets the tone for everything else. - -**The three lanes, decreasing friction:** - -1. **BLE-stream** — always works, no WiFi needed, no LAN co-location - required. Baseline for every robot on every network. - Today: `writeValueWithResponse` + ATT ack per 180-byte frame → - 3-10 min for a 1.6 MB bin. Switching to - `writeValueWithoutResponse` + software flow control over - `ota-status` gets it to ~30 sec. **Not yet implemented.** - -2. **PNA direct to target robot** — dashboard fetches - `http:///ota` straight from the browser. 
Chrome/Edge's - Private Network Access (shipped 2022) gates the first request on a - one-time user consent per origin. No TLS on the robot, no cert - ceremony, no crypto IRAM pressure. ~1 sec for a 1.6 MB bin over - LAN. Works whenever the dashboard and robot share a network. - **Not yet implemented on ESP32** (Pi doesn't need this lane — BLE - bundle OTA is already fast enough for Pi-sized updates). - -3. **Pi-as-gateway** — for multi-robot orchestration and offline-first - classroom deployments. Pi runs an `aioquic` WebTransport server - with a self-signed cert; dashboard uses `serverCertificateHashes` - pinning (cert sha256 published in Pi's fw-info) to connect - without PKI ceremony. Pi proxies raw TCP to the target ESP32 on - the LAN. Same ~1 sec speed as PNA direct, with bonus orchestration - surface (mesh multiple ESP32s, serve dashboard offline). - **Not yet implemented.** Earns its slot when multi-robot coord or - offline-first use cases land, not purely for OTA speed. - -**Why the three-lane shape is right:** -- Lane 1 works on BLE only. No WiFi assumption. -- Lane 2 works when browser and robot share a LAN. Most common case. -- Lane 3 works when the fleet has a Pi (most Better Robotics fleets do). - -Dashboard tries fastest available, falls back automatically. User never -picks a lane — it just updates as fast as the topology allows. - -**What's baked in vs what's not:** -- BLE-stream as a baseline works today (for Pi bundle OTA; for ESP32 - single-binary OTA, the WithResponse variant is live and slow). -- ESP32 already runs a raw `WiFiServer` (for MJPEG) — adding a `/ota` - endpoint on the same task is near-zero new code on the firmware side. -- Pi-as-gateway is purely additive to `pi_robot.py` — every Pi ships - with it, no opt-in, just one more capability. -- Dashboard-side lane selection: not yet written. Attempts lanes in - order, falls back on timeout/error. - -**Sequencing:** -1. BLE-WithoutResponse first (universal, smallest change). -2. 
PNA + ESP32 `/ota` endpoint second (big bang for effort). -3. Pi-as-gateway when its orchestration/offline story earns it. - -## 4. ESP32 build-as-a-service (bold, later) - -**Claim.** ESP32 firmware is purely deterministic from `{board, caps}`. -Users currently install `arduino-cli` + core + toolchain to compile. -If a service accepts a config and returns a signed `.bin`, the dashboard's -"Flash firmware" button fetches a per-robot-config binary; no local dev -environment is needed for adding capabilities. - -**Constraint.** The service has to be reliable enough that users aren't -stuck if it's down. Either (a) same-origin build on GitHub Actions, or -(b) a small hosted build service, or (c) in-browser compile via -something like Wokwi's WebAssembly toolchain (the bold option). - -**The compound effect.** Combined with #1, adding an ESP32 capability -becomes: declare schema, bind driver code in a capability driver DSL, -click Flash. No C++, no toolchain, no linker flags. - -**Worth it when.** Project has contributors who want to add capabilities -without learning the ESP32 toolchain. Today the audience is small enough -that `make flash` is fine. - -## 5. Closed-loop visual control: draw-a-path (next, after overhead ArUco validates) - -**Claim.** Once an overhead camera + marker is established (the -`aruco.js` work — overhead localization writing `entry.arucoPosition` -per scan), the natural next layer is closed-loop control driven from -that pose. Operator props the phone (or local webcam) overhead, finger- -draws a path on the phone screen, the robot follows it. New sensor -isn't needed; the pose primitive is already shipping. - -**The hard sub-problem isn't drawing or motor control — it's pose -reliability.** Without knowing where the robot is each frame, the -closed loop doesn't close and the robot drifts within seconds. 
The -overhead ArUco surface gates this — until metric accuracy is -validated against tape-measure ground truth (see `.claude/notes.md` → -"Wired but unproven"), don't build the follower on top. - -**Right primitives, in order of load-bearing-ness:** -- **Pose**: ArUco overhead, already shipped. Producer writes - `entry.arucoPosition`; consumer (this work) must gate on - `Date.now() - updatedAt` for staleness. -- **Where compute lives**: dashboard runs detector + controller + - emits pulse-bounded BLE motor writes. Phone is I/O. Robot - unchanged. Same control-plane / data-plane split as everything else. -- **Tech**: `js-aruco2` already in. Pure-pursuit controller in plain - JS (~50 lines). -- **Control loop budget**: detect (~15 ms) + plan (~1 ms) + BLE - pulse (~50 ms) ≈ 70 ms / iter → ~14 Hz. Each iteration emits a - short pulse (`duration_ms ≈ 100 ms`); firmware watchdog auto-stops - if the next iter doesn't arrive. The existing pulse-bounded-motion - + watchdog invariants are the safety floor — same discipline as - Pip / user scripts. - -**Phases:** -1. **Path source.** `` overlay on `phone.html` viewfinder; - touch listeners build a stroke-point array; send over the existing - WebRTC data channel as a typed message - (`{type: "path", points: [[x, y], ...]}`). Dashboard receives and - renders on the helper's SVG overlay alongside marker outlines. No - motors yet. -2. **Closed-loop follower.** Pure-pursuit drives the most-recent - path; pulse-bounded each iter; safety stops on marker-loss ≥ 1 s, - end-of-path, or tap-to-cancel from the phone. -3. **Pip tool surface.** `get_robot_pose(robot_id)` returns - `{x, y, theta, confidence}` from `entry.arucoPosition`. Optional; - not on the MVP critical path. - -**Validation criterion.** Tape marker on a rover, prop a phone or -webcam overhead, draw a curved path on the phone screen, watch the -robot trace within ~5 cm of the line over 1-2 m. 
If shipping leaves -the rover drifting off-line within seconds, or the loop falls below -5 Hz on target hardware, the primitive isn't load-bearing — redesign -before extending. - -**Scope honesty.** This flips part of CLAUDE.md's "Not spatially -aware" stance: when an overhead camera + marker is present, the -robot has a known 2D pose. Not SLAM, not depth — just fiducial- -bounded planar pose. Do NOT update CLAUDE.md until phase 2 lands and -the validation criterion passes; claiming a capability before it -works is the worst kind of scope drift. - -**Failure modes to watch:** -- **Marker lost.** Phone is hand-held — will shake, tilt, occlude. - > 1 s loss → safety stop. Not optional; this IS the safety story - for this loop. -- **"Phone overhead" geometry assumption.** If held at angle, the - floor isn't co-planar with the image. For short paths and small - angles, marker pixel position is good enough as a proxy. Larger - paths earn a homography (4 known floor points OR phone IMU + - marker scale) — defer until path length actually demands it. -- **Detector latency budget.** Open-vocab "drive toward the yellow - cup" routes through Claude vision (~1–2 s round-trip), too slow - for a 5 Hz loop. A reflex-tier open-vocab detector earns its way - only when that specific use case lands — until then the ArUco- - pose loop is self-contained and MediaPipe COCO handles closed- - vocab reflex needs. - -## What this list doesn't include - -These ideas were considered and rejected or deferred for specific reasons -— recording them here so we don't re-rehash: - -- **Running without Linux on the Pi (bare-metal).** Loses Python, gpiozero, - systemd, apt. Not a simplification; a regression. The Pi being a real - computer is the feature. -- **Replacing BLE GATT with a custom protocol.** GATT is a standard with - tooling, debuggers (`bluetoothctl`, `nRF Connect`), and cross-platform - support. Reinventing would be faster to design and slower forever. 
-- **Making the dashboard a conversational (chat-only) UI.** Visual - feedback for video, logs, and pinout has better throughput than text. - The LLM-orchestrator direction adds chat alongside, doesn't replace. +- **Coordinate frames + time sync.** Required for sensor fusion and any multi-puck localization. Nothing in the scaffold addresses them. diff --git a/.claude/exploration.md b/.claude/exploration.md new file mode 100644 index 00000000..ecf967f7 --- /dev/null +++ b/.claude/exploration.md @@ -0,0 +1,223 @@ +# Exploration + +Open architectural directions, design rationale, runtime state under validation, and forks evaluated but not taken. The thinking-in-progress layer. Committed work lives in `direction.md`; positioning research in `field.md`. + +--- + +# Architectural directions + +Long-horizon shape decisions. Updated when the shape of the system changes. + +## 1. Generic typed-characteristic runtime (in flight) + +**Claim.** Every capability today exists in ~3 places (browser module, Pi handler, ESP32 handler). 80% of those files are boilerplate isomorphic to the capability's TYPE, not its identity. A generic runtime keyed on type eliminates the boilerplate. + +**The data already exists.** `fw-info.caps` declares typed schemas: + +```json +{ "name": "led", "char": "…d92", "type": "toggle" } +{ "name": "motors", "char": "…d99", "type": "signed-pair", "range": [-100, 100] } +{ "name": "wifi", "chars": {...}, "type": "wifi-scan" } +{ "name": "ota", "chars": {...}, "type": "bundle-ota" } +{ "name": "camera", "chars": {...}, "type": "webrtc-installable" } +{ "name": "ops", "char": "…d9c", "type": "command" } +``` + +**The runtime (browser side).** A per-type constructor `makeXxxCap(schema)` returns `{probe, cleanup, renderSection, wireActions, postRender?}`. Adding a capability of a known type = one schema entry + zero JS code. 
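
A minimal sketch of what the per-type constructor described above might look like for the `toggle` type. Only `makeXxxCap(schema)` and the returned method names come from the notes; the `io` wrapper, the `CAP_TYPES` registry, `makeCap`, and the one-byte 0/1 wire encoding are assumptions made for illustration, not the project's actual code.

```javascript
// Sketch only: `io`, `CAP_TYPES`, and `makeCap` are hypothetical names.
// `io` stands in for whatever wraps the BLE characteristic:
//   io.read(char)         -> Promise<Uint8Array>
//   io.write(char, bytes) -> Promise<void>
// Assumed encoding: a toggle is a single byte, 0 = off, 1 = on.

function makeToggleCap(schema, io) {
  let on = null; // null until the first successful probe

  return {
    // Read the current value once so the card can render real state.
    async probe() {
      const bytes = await io.read(schema.char);
      on = bytes[0] === 1;
      return on;
    },

    // Forget cached state, e.g. when the robot disconnects.
    cleanup() {
      on = null;
    },

    // Markup is derived from the schema, not from capability-specific code.
    renderSection() {
      const label = on === null ? "unknown" : on ? "on" : "off";
      return `<section data-cap="${schema.name}">` +
        `<button data-action="toggle">${schema.name}: ${label}</button></section>`;
    },

    // Returns handlers keyed by data-action; the caller binds them to the DOM.
    wireActions() {
      return {
        toggle: async () => {
          const next = !on;
          await io.write(schema.char, Uint8Array.of(next ? 1 : 0));
          on = next;
          return on;
        },
      };
    },
  };
}

// One constructor per type; a new capability of a known type adds no entry here.
const CAP_TYPES = { toggle: makeToggleCap };

function makeCap(schema, io) {
  const ctor = CAP_TYPES[schema.type];
  if (!ctor) throw new Error(`no runtime for type "${schema.type}"`);
  return ctor(schema, io);
}
```

Under these assumptions, a second toggle-typed capability would need only its own schema entry passed to `makeCap`; the registry grows per type, which is the compound payoff the notes describe.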
+ +**Firmware-side direction (farther out).** Pi and ESP32 firmware have identical ceremony: register char, parse read/write, notify on change, gate on config. A "typed char runtime" on firmware reads the capability declaration and handles generic typed chars with a small driver binding per capability (`{ on_write: fn, on_read: fn }`). + +**Progress so far:** +- fw-info.caps carries the typed schema (shipped) +- Browser reads + stores `entry.capSchema` (shipped) +- Each capability module exports its own `schema` for cross-check (shipped) +- First type migrated: `toggle` → LED +- Future types to migrate: `signed-pair`, `wifi-scan`, `bundle-ota`, `webrtc-installable`, `command`. Each is ~2–4 hours. + +**Migration strategy.** Per-type, not per-capability. When we migrate `signed-pair`, both motors AND any future 2-axis input use the same runtime. The compound payoff is the Nth capability, not the first. + +## 2. AI-maintained documentation (cheap, deferred) + +**Claim.** `README.md`, `HARDWARE.md`, `firmware/pi_robot/README.md`, and per-capability comments all describe what `fw-info.caps` + the code already know. They drift. An AI agent watching the schema + commit log can regenerate docs per release. + +**Scope.** ~2 days to wire a pre-commit generator plus a CI check that fails if docs aren't regenerated. Starts small: capability reference page auto-generated from the live schema. Expands to change-log summarization from commit messages. + +**Not urgent.** Doc drift isn't causing failures today. Worth doing when the project has contributors outside the core, or when we promise backward-compatibility guarantees that require accurate docs. + +## 3. Transparent-data-plane OTA (partially in flight) + +**Claim.** Every robot should have three OTA lanes with a clear fallback order. The dashboard picks the fastest available without user intervention. Iteration-loop speed is the core dev experience; "how fast does code get onto the robot" sets the tone for everything else. 
+ +**The three lanes, decreasing friction:** + +1. **BLE-stream** — always works, no WiFi needed, no LAN co-location required. Baseline for every robot on every network. Today: `writeValueWithResponse` + ATT ack per 180-byte frame → 3-10 min for a 1.6 MB bin. Switching to `writeValueWithoutResponse` + software flow control over `ota-status` gets it to ~30 sec. **Not yet implemented.** + +2. **PNA direct to target robot** — dashboard fetches `http:///ota` straight from the browser. Chrome/Edge's Private Network Access (shipped 2022) gates the first request on a one-time user consent per origin. No TLS on the robot, no cert ceremony, no crypto IRAM pressure. ~1 sec for a 1.6 MB bin over LAN. Works whenever the dashboard and robot share a network. **Not yet implemented on ESP32** (Pi doesn't need this lane — BLE bundle OTA is already fast enough for Pi-sized updates). + +3. **Pi-as-gateway** — for multi-robot orchestration and offline-first classroom deployments. Pi runs an `aioquic` WebTransport server with a self-signed cert; dashboard uses `serverCertificateHashes` pinning (cert sha256 published in Pi's fw-info) to connect without PKI ceremony. Pi proxies raw TCP to the target ESP32 on the LAN. Same ~1 sec speed as PNA direct, with bonus orchestration surface (mesh multiple ESP32s, serve dashboard offline). **Not yet implemented.** Earns its slot when multi-robot coord or offline-first use cases land, not purely for OTA speed. + +**Why the three-lane shape:** +- Lane 1 works on BLE only. No WiFi assumption. +- Lane 2 works when browser and robot share a LAN. Most common case. +- Lane 3 works when the fleet has a Pi (most Better Robotics fleets do). + +Dashboard tries fastest available, falls back automatically. User never picks a lane. + +**Sequencing.** BLE-WithoutResponse first (universal, smallest change). PNA + ESP32 `/ota` second (big bang for effort). Pi-as-gateway when its orchestration/offline story earns it. + +## 4. 
ESP32 build-as-a-service (bold, later) + +**Claim.** ESP32 firmware is fully determined by `{board, caps}`. Users currently install `arduino-cli` + core + toolchain to compile. If a service accepts a config and returns a signed `.bin`, the dashboard's "Flash firmware" button fetches a per-robot-config binary; no local dev environment is needed for adding capabilities. + +**Constraint.** The service has to be reliable enough that users aren't stuck if it's down. Either (a) same-origin build on GitHub Actions, or (b) a small hosted build service, or (c) in-browser compile via something like Wokwi's WebAssembly toolchain (the bold option). + +**The compound effect.** Combined with #1, adding an ESP32 capability becomes: declare schema, bind driver code in a capability driver DSL, click Flash. No C++, no toolchain, no linker flags. + +**Worth it when.** Project has contributors who want to add capabilities without learning the ESP32 toolchain. Today the audience is small enough that `make flash` is fine. + +## 5. Closed-loop visual control: draw-a-path (next, after overhead ArUco validates) + +**Claim.** Once an overhead camera + marker is established (the `aruco.js` work — overhead localization writing `entry.arucoPosition` per scan), the natural next layer is closed-loop control driven from that pose. Operator props the phone (or local webcam) overhead, finger-draws a path on the phone screen, the robot follows it. No new sensor is needed; the pose primitive is already shipping. + +**The hard sub-problem isn't drawing or motor control — it's pose reliability.** Without knowing where the robot is each frame, the closed loop doesn't close and the robot drifts within seconds. The overhead ArUco surface gates this — until metric accuracy is validated against tape-measure ground truth (see "Wired but unproven" below), don't build the follower on top. + +**Right primitives, in order of load-bearing-ness:** +- **Pose**: ArUco overhead, already shipped. 
Producer writes `entry.arucoPosition`; consumer (this work) must gate on `Date.now() - updatedAt` for staleness. +- **Where compute lives**: dashboard runs detector + controller + emits pulse-bounded BLE motor writes. Phone is I/O. Robot unchanged. Same control-plane / data-plane split as everything else. +- **Tech**: `js-aruco2` already in. Pure-pursuit controller in plain JS (~50 lines). +- **Control loop budget**: detect (~15 ms) + plan (~1 ms) + BLE pulse (~50 ms) ≈ 70 ms / iter → ~14 Hz. Each iteration emits a short pulse (`duration_ms ≈ 100 ms`); firmware watchdog auto-stops if the next iter doesn't arrive. The existing pulse-bounded-motion + watchdog invariants are the safety floor — same discipline as Pip / user scripts. + +**Phases:** +1. **Path source.** `` overlay on `phone.html` viewfinder; touch listeners build a stroke-point array; send over the existing WebRTC data channel as a typed message (`{type: "path", points: [[x, y], ...]}`). Dashboard receives and renders on the helper's SVG overlay alongside marker outlines. No motors yet. +2. **Closed-loop follower.** Pure-pursuit drives the most-recent path; pulse-bounded each iter; safety stops on marker-loss ≥ 1 s, end-of-path, or tap-to-cancel from the phone. +3. **Pip tool surface.** `get_robot_pose(robot_id)` returns `{x, y, theta, confidence}` from `entry.arucoPosition`. Optional; not on the MVP critical path. + +**Validation criterion.** Tape marker on a rover, prop a phone or webcam overhead, draw a curved path on the phone screen, watch the robot trace within ~5 cm of the line over 1-2 m. If shipping leaves the rover drifting off-line within seconds, or the loop falls below 5 Hz on target hardware, the primitive isn't load-bearing — redesign before extending. + +**Scope honesty.** This flips part of CLAUDE.md's "Not spatially aware" stance: when an overhead camera + marker is present, the robot has a known 2D pose. Not SLAM, not depth — just fiducial-bounded planar pose. 
CLAUDE.md updates only after phase 2 lands and the validation criterion passes; claiming a capability before it works is the worst kind of scope drift. + +**Failure modes to watch.** Marker lost > 1 s → safety stop (this IS the safety story for this loop). Phone-held-at-angle breaks the co-planar assumption: small angles tolerable, larger paths earn a homography. Open-vocab "drive toward the yellow cup" routes through Claude vision (~1–2 s) — too slow for a 5 Hz loop; ArUco-pose stays self-contained until reactive open-vocab earns its way. + +## Rejected / deferred + +- **Running without Linux on the Pi (bare-metal).** Loses Python, gpiozero, systemd, apt. Not a simplification; a regression. The Pi being a real computer is the feature. +- **Replacing BLE GATT with a custom protocol.** GATT is a standard with tooling, debuggers (`bluetoothctl`, `nRF Connect`), and cross-platform support. Reinventing would be faster to design and slower forever. +- **Making the dashboard a conversational (chat-only) UI.** Visual feedback for video, logs, and pinout has better throughput than text. The LLM-orchestrator direction adds chat alongside, doesn't replace. + +--- + +# Pip's proactive messages come from project state, not external feeds + +No scheduled pipeline scraping external robotics sources (X, Reddit, HN, ArXiv, Hackaday RSS). No notification backend, no content channel. + +## What we do instead + +Situational observations from state the dashboard already has. A colleague leaning over your desk saying *"hey, I notice X,"* not a newsletter. + +Inputs, all same-origin: + +- Robot telemetry — firmware version drift, last-seen timestamps, which robots are `firmware-down` vs `connected`, which capabilities have never been exercised. +- User scripts (`scripts.js` + localStorage) — saved but never run, errored on last run, related to a stalled goal. +- Project intent — `.claude/CLAUDE.md` and `.claude/working.md` when present. 
+ +One short observation, tied to a user-activity boundary (session start, session end, robot reconnect after > 24h), not a wall-clock cron. Dismissable without consequence. + +Shape examples: + +``` +Your "line-follow" script errored on BLE drop last Thursday. +Heartbeat shipped — worth retrying? + +You've paired Pi-03 twice but never opened the camera capability. +Want me to walk through it? + +Firmware on Pi-01 is 4 versions behind. New pulse caps landed in +between — OTA when convenient? +``` + +Each names a *specific* thing *this user* did or didn't do — signal a generic feed can't carry. + +## Why this shape + +Pip runs in the browser; every input that would change what Pip says is also in the browser, or one `fetch()` away in `.claude/*.md`. Putting signal source on a schedule outside the browser separates thinking from data and pays the cost of keeping them in sync. + +- **Zero new infrastructure.** No cron, scraper, CI job, JSON corpus, filter pipeline. Just `assistant.js` plus a small observation reader. +- **Zero new trust boundary.** Same-origin reads of the dashboard's own stores. +- **High signal by construction.** An observation referencing the user's own script by name clears "is this relevant?" before it's written. A trending-reddit link does not. +- **Dismissal is free.** Observations are ephemeral; ignoring one builds no unread debt. + +## Failure mode this avoids + +"Give Pip a feed so messages aren't boring" is the engagement reflex every newsletter SaaS tries: push content on a schedule, hope relevance averages out. Generic feeds get ignored because the user pays a translation cost from *"someone built X"* to *"does this matter for me right now?"* That cost kills engagement. + +## When would an external feed earn its way in? + +When the state-aware layer saturates — Pip has mined what the browser knows and the ceiling becomes *"Pip doesn't know about the new ESP32-S3 cam module that would unblock the perception loop."* Then: + +1. 
GitHub Action on the `pulse` pattern — public-API-only, no-auth, committing JSON to `docs/feed/`. Sources: Reddit `.json`, HN Algolia, GitHub trending by topic, Hackaday/Adafruit/Sparkfun RSS, ArXiv. **Not X**: free tier died. +2. Feed is a **secondary input to the same filter** reading project state. Filter stays in the browser; the Action is dumb by design. +3. Observations referencing external content still clear *"and here's why it matters for your current work."* + +State-aware layer first, let it saturate, then add the corpus. + +--- + +# Wired but unproven — pending real-world validation + +Loads at runtime but not confirmed end-to-end against hardware. Kept out of `README.md`, `DEV.md`, and the GitHub repo About until a real run confirms the path. + +## Overhead ArUco localization (`docs/perception/aruco.js`) + +**What's wired.** +- Headless detection service — no UI panel. Helper-card "Camera role" select on each paired phone offers `Operator / Overhead localization / Mount on `. Choosing Overhead sets `settings.arucoOverheadPhoneId` (persisted) and points the detection loop at that phone's existing preview tile in the helpers card. No second video element, no second decoder. +- SVG overlay paints detected markers directly on the helper's preview (`patchArucoOverlay`-style — same shape as the deleted phone-on-robot tracker, retargeted at the helpers tile). +- Detection via `js-aruco2` from jsDelivr (`cv.js` + `aruco.js` + `posit1.js`), dictionary `ARUCO_4X4_50`. Printable marker sheets in `docs/assets/aruco_markers_0.pdf` and `_1.pdf`. Pose via `POS.Posit` using `settings.arucoMarkerSizeMm` + focal-length heuristic (`max(w,h) * 0.85`) — no calibration file. +- Marker → robot binding: prefers explicit `entry.arucoMarkerId` (persisted in localStorage; set via `window.bindArucoMarker(robotId, markerId)`). Falls back to positional `entries[m.id]` only when NO entry has claimed that id. 
Hits write `entry.arucoPosition = { x, y, headingDeg, markerSizeMm, updatedAt }`. + +**What hasn't been confirmed.** +- Focal-length heuristic accuracy against a real ruler ("perfect" to "off by 30%" both plausible without ground truth). +- ARUCO_4X4_50 detection reliability on a phone-camera feed via WebRTC (compression, autofocus hunting, rolling-shutter under motion). +- Multi-robot orchestration end-to-end: two robots, two markers, two bindings, both `arucoPosition`s update on the same scan, motion planner consumes both without drift. This is the wedge demo for the primitive. +- Ultra-wide-by-default for "Back" sharing (`docs/mobile.js` `openCameraStream`) means a phone designated for overhead localization will feed an ultra-wide stream — barrel distortion + a much shorter focal length than the `max(w,h)*0.85` heuristic assumes. The ArUco detector itself will likely still find markers; pose estimation will be biased. If overhead ArUco gets promoted out of unproven, the right fix is to force a non-widening lens on phones designated as overhead, or take a per-deviceId intrinsic from a one-time calibration. + +**To validate.** Print sheet 0 + sheet 1, tape marker 0 on Pi-01 and marker 1 on Pi-02. Pair a phone, share its camera, set role to "Overhead localization." Bind explicitly: `window.bindArucoMarker("", 0)` and `window.bindArucoMarker("", 1)`. Confirm both robots' `arucoPosition` update simultaneously on each detection, metric XY within ~20 mm of tape-measured ground truth at ~50 cm camera height. If it holds, promote: line in `README.md` perception section, bullet in `DEV.md` "When to reach for what." + +**Why bother.** Sub-pixel deterministic pose for a tagged object is the only roadmap primitive that closes the visual-servo loop without a depth sensor — and the substrate for multi-robot orchestration. 
Drives `entry.arucoPosition` which the motion controller consumes as ground truth (subject to its staleness gate — `aruco.js` does not clear stale entries when a robot leaves frame; consumer's job). + +## Grounding DINO open-vocab detector — deleted (May 2026) + +Lived in `docs/grounding.js` as the open-vocab fallback when MediaPipe COCO's 80 classes couldn't cover a target. Disabled after real-world false positives (medium-confidence "stop sign.[SEP]" matches against a robot-vacuum dock — BERT separator token leaking through the post-processor). Deleted entirely once Claude vision via `view_robot_frame` was confirmed to fill the same role with scene reasoning the bbox-only detector couldn't do. + +**Why deleting rather than fixing.** The role this module filled — "give Pip a way to localize 'the yellow can' or 'the book on the bag'" — is now served by the planner itself. Pip sends a frame to Claude, Claude reads the scene, plans the next action. No bbox needed when the planner can reason. Re-arming the closed-vocab variant would duplicate the role with worse semantics (no scene context, false-positive history) AND keep a 151 MB model download in the asset graph. + +**What to revisit if it comes back.** A future need for sub-second open-vocab bboxes at the rate the LLM can't serve (Claude vision is ~1–2 s round-trip; bbox-rate use cases want ~100 ms). At that point: re-evaluate Grounding DINO 1.5, owlv2, or YOLO-World — but only after a use case earns it. Reactive open-vocab is not on the wedge today. + +## YOLO26n closed-vocab detector (`docs/perception/yolo26.js`) + +Faster sibling for reactive-tier use cases (visual servo, gamepad-overlay tracking). Wired behind `/detector yolo26` with the registry in `docs/perception/detectors.js`; MediaPipe stays the default. ONNX runtime via WebGPU EP with WASM fallback, ~10 MB COCO model fetched from HuggingFace on first use. 
+ +**What hasn't been confirmed.** End-to-end accuracy vs MediaPipe EfficientDet-Lite0 on the same scenes, WebGPU EP stability across the Chrome/Edge versions students will run, first-fetch UX on classroom WiFi (10 MB ONNX + onnxruntime-web bytes). Promote to default — or remove from the registry — only after a side-by-side run. Out of `README.md` and `DEV.md` until then. + +## Laptop camera → phone feed (helper card role "Send to phone") + +Local-cam helper card gains a third role alongside Overhead. Selecting "Send to phone" opens the camera via getUserMedia and `peer.addTrack`s the video track on every paired phone; the phone displays it in the existing `phone-cam-section` since it's "incoming forwarded video from desktop" — the same sink robot cameras already use. Runtime-only state (`_phoneFeedLocalId` in phone-helpers.js), not persisted across reloads. + +**Latent.** `phone-cam-section` displays one stream at a time (`v.srcObject = e.streams[0]`, last-wins). When both a robot camera and the laptop-cam are routed to the same phone simultaneously, whichever fires `peer.onTrack` last wins; there is no UI on the phone to switch back. The existing `available-sources` / `subscribe-source` picker handles this per-robot but is not yet generalized across owner types. Acceptable for the single-source case the prototype is built around; if multi-source coexistence becomes the steady-state demo, generalize the picker (own-id namespace = `"robot:" | "local:"`, single global active per phone) before adding more source kinds. + +--- + +# Forks in the road — alternatives evaluated, with revisit triggers + +Adjacent technical paths declined, with the specific change in project direction that would trigger a revisit. (Distinct from `field.md`, which audits adjacent work.) + +## Espressif KVS WebRTC SDK for ESP32 + +**Evaluated:** May 2026. 
Espressif's first-party WebRTC stack ([awslabs/amazon-kinesis-video-streams-webrtc-sdk-c@beta-reference-esp-port](https://github.com/awslabs/amazon-kinesis-video-streams-webrtc-sdk-c/tree/beta-reference-esp-port), HEAD 119617b7 at evaluation time). Ships an AppRTC-mode example targeting classic ESP32. Active development, monthly sync to upstream awslabs releases, 1.2k stars. + +**What it would buy us.** Eliminates three of our four libpeer/esp_peer patches at the chip level: chip is DTLS CLIENT by default (so the fragmented-ClientHello bug is sidestepped without patching), SDP answerer emits `setup:active` directly, MID copied from remote offer, ICE agent silently ignores TCP candidates. The four-patch shape in `firmware/esp32_robot_idf/WEBRTC.md` collapses to one (mbedTLS Kconfig). Cert flow returns to chip-side ECDSA generation (~9 KB flash cost we currently save). + +**Why not now.** Signaling is hardwired to HTTPS+WebSocket against AWS KVS or `webrtc.espressif.com`. Swapping in means writing a custom `signaling_client_if` implementation that takes offers/answers off our BLE `SIGNAL` characteristic and feeds the SDK's `kvs_peer_connection_if`. The plug point is documented (`CUSTOM_SIGNALING.md` in their tree), but the work doesn't buy us anything the libpeer patches don't already deliver — our wedge is precisely "BLE-signaled, no internet rendezvous." We'd also inherit a 3 MB factory partition expectation that's marginal on classic ESP32's 4 MB flash. + +**Revisit trigger.** If a hosted-mode / internet-rendezvoused operator surface lands on the roadmap (share-a-link demos, remote tele-op, third-party robots controlling our robots), KVS WebRTC SDK is the prebuilt path — switch outright rather than reinvent BLE-on-KVS. The libpeer + four-patch setup made sense for BLE-only; it does not earn its keep against a stack that handles cloud signaling for free. 
+ +**Bonus capability worth knowing.** KVS WebRTC Split Mode distributes signaling to ESP32-C6 (light sleep) and streaming to ESP32-P4 (deep sleep until wake-on-signal) — the only battery-powered WebRTC camera architecture in the ecosystem. Irrelevant to mains-powered robots today; remember it if low-power ever becomes a constraint. diff --git a/.claude/field.md b/.claude/field.md new file mode 100644 index 00000000..477da565 --- /dev/null +++ b/.claude/field.md @@ -0,0 +1,125 @@ +# Field + +Adjacent work that defines what positioning BetterRobotics can claim. Frame: "what's already claimed in the surrounding field," not "who do we beat." Filtered for what would change a decision. + +## schematik.io — not in this lane + +[schematik.io](https://schematik.io) bills itself as "Cursor for Hardware": AI code-generation emitting firmware/schematic-adjacent code from natural language for Arduino, ESP32, Raspberry Pi (~$4.6M pre-seed). Not a pairing UI, not a control plane, not a dashboard. A *potential input* for authoring firmware like ours, not a parallel to the runtime-control story. + +## The real candidates + +### LEGO SPIKE web app (spike.legoeducation.com) +- **Claims:** the classroom decision — "which kit lets students code from a Chromebook with no install." +- **Overlap:** Web Bluetooth + WebSerial in Chrome, no native app ([Chrome for Developers](https://developer.chrome.com/blog/lego-education-spike-web-bluetooth-web-serial)). Programs upload to hub, hub executes. +- **Divergence:** code runs *on the hub*, not the browser. Closed hardware, closed firmware, no user-owned OTA. +- **Ships today that we don't:** mature curriculum, institutional purchase channel. +- **Decision impact:** confirms BLE-first-via-browser as mainstream, not contrarian. Does not threaten browser-as-brain — they deploy to hub; we deliberately don't. + +### Sphero EDU web app +- **Claims:** same classroom decision as LEGO. 
+- **Overlap:** Web Bluetooth pairing of BOLT+/BOLT/Mini/RVR ([help.sphero.com](https://help.sphero.com/sphero-support/connecting-robots-in-the-sphero-edu-web-app)). +- **Divergence:** Sphero account required, their robots only. No user-owned firmware, no recovery plane, no LLM surface. +- **Ships today that we don't:** polished UI, K-12 marketplace presence, iOS native fallback. +- **Decision impact:** reinforces the "no account" moat — account-gating is exactly the friction this project refuses. + +### Makeblock (mBlock + mBot family) +- **Claims:** same K-12 classroom decision — with the largest claimed scale of any vendor in this list (200k+ schools). +- **Overlap:** mBlock 5 web at [ide.mblock.cc](https://ide.mblock.cc/) runs in Chrome/Edge, connects to mBot/CyberPi/Codey Rocky over Web Bluetooth + WebSerial without a helper app ([Makeblock support](https://support.makeblock.com/hc/en-us/articles/19412317319191-Introduction-to-Direct-Connection-of-mBlock-5-on-the-web)). Block + Python. +- **Divergence:** account-required walled garden. Programs run on closed proprietary firmware. Hardware lock-in to Makeblock kits. No LLM, no recovery plane. +- **Ships today that we don't:** scale (200k+ schools), educator curriculum, hardware breadth (CyberPi has its own screen + sensors), Chinese-market depth, multi-platform (PC/mobile/web). +- **Decision impact:** confirms Web-Bluetooth-from-browser is the dominant K-12 STEAM pattern, not contrarian. Reinforces the "no account, no proprietary kit" wedge: every major K-12 vendor (LEGO, Sphero, Makeblock) is account-gated and kit-locked. The combination "browser-paired AND user-owned hardware AND no account" remains unoccupied. + +### MicroBlocks (microblocks.fun) +- **Claims:** browser IDE to program a BLE/serial-connected microcontroller with blocks. 
+- **Overlap:** runs in Chrome/Edge via WebSerial + Web Bluetooth, no install; supports micro:bit, XRP, and others ([wiki.microblocks.fun](https://wiki.microblocks.fun/en/xrp_setup)). Live programming model. +- **Divergence:** pushes a VM to the device; programs run on-board. No LLM, no phone-human handoff. Single-device focus. +- **Ships today that we don't:** live autocomplete / block editing against running firmware; a real educational community. +- **Decision impact:** closest architectural cousin. Validates "browser-first, no-account, BLE-capable" as a shipped pattern. Has no opinion on browser-as-brain for runtime. + +### XRPCode / WPILib XRP (experientialrobotics.org) +- **Claims:** cheap classroom robot + browser IDE — the tightest hardware-class analog. +- **Overlap:** browser IDE for the XRP (RP2040), Python + Blockly, no install ([WPILib docs](https://docs.wpilib.org/en/stable/docs/xrp-robot/web-ui.html)). +- **Divergence:** WiFi/WebSocket, not BLE-first — robot must be on the same network, which is exactly the classroom pain our BLE-first bet was designed around. Code runs on-robot. No LLM, no phone handoff. +- **Ships today that we don't:** FRC-backed curriculum, ~$75 hardware, real classroom deployments. +- **Decision impact:** directly validates bet #1 — WiFi-first classroom stories *do* break. + +### Viam +- **Claims:** *closest framing rhyme.* Tagline "build robots like you build software" — same dev-environment-shape pitch, different audience and distribution model. +- **Overlap:** browser dashboard, camera streaming, live control ([viam.com](https://www.viam.com/product/platform-overview)). gRPC/WebRTC to a device-resident `viam-server`. Modular components, multi-language SDKs. +- **Divergence:** server-resident B2B cloud SaaS. `viam-server` fetches config from Viam cloud at startup ([docs.viam.com](https://docs.viam.com/operate/reference/viam-server/)). 
Different buyer (software engineer at an industrial outfit, fleet operator), different distribution shape (account-anchored cloud product vs. static-site, no-backend). +- **Ships today that we don't:** data capture/sync, fleet management, funding, UR partnership. +- **Decision impact:** **inspiration, not competition.** Same transport stack we ship; treats the same problem space at industrial scale. Watching their feature surface tells us what becomes table-stakes for "robotics dev environment." Our distribution shape (browser-only, no backend, MIT) is the moat — they can ship features in 18 months; restructuring their cloud-product distribution model to match would be a different company. + +### Freedom Robotics +- **Claims:** browser-based teleop and remote operation of fielded robots. +- **Overlap:** WebRTC video + control via browser; SDK/agent runs on the robot ([freedomrobotics.com](https://www.freedomrobotics.com/)). +- **Divergence:** server-resident B2B cloud SaaS, TURN-relay-anchored teleop, account + fleet model. No standalone deploy, no offline mode, no LLM/scripting surface. +- **Ships today that we don't:** production teleop UX for industrial deployments, observability tooling, customer base in delivery + service robotics. +- **Decision impact:** same audience-shape conflict as Viam — enterprise/industrial vs. consumer/education/hobbyist. Worth tracking for transport / observability conventions; not a wedge threat. + +### Improv Wi-Fi (open standard) +- **Claims:** the onboarding moment — "how does a fresh device join Wi-Fi." +- **Overlap:** open standard for BLE-based Wi-Fi onboarding from a browser, Chrome/Edge ([improv-wifi.com](https://www.improv-wifi.com/)). Shipped across WLED, Tasmota, ESPHome. +- **Divergence:** explicitly scoped to Wi-Fi onboarding only — *"not the goal to offer a way for devices to share data or control."* Hands off to a device-hosted URL after provisioning. 
+- **Ships today that we don't:** it's a *standard*, with network-effect adoption we don't have. +- **Decision impact:** **integration candidate, not a threat.** Our BLE onboarding characteristic could optionally speak Improv so any Improv-aware browser tool can provision our robots. See `@improv-wifi/sdk-js` on npm. + +### ESP RainMaker +- **Claims:** "ESP32-based product with BLE provisioning and a dashboard to control it." +- **Overlap:** BLE provisioning for ESP32/S3/C3/C6 ([docs.rainmaker.espressif.com](https://docs.rainmaker.espressif.com/docs/sdk/rainmaker-base-sdk/DeviceManagement/provisioning/)). +- **Divergence:** cloud-account-anchored by design — user↔node mapping during provisioning, AWS Cognito underneath. Mobile-app first. No browser-first story, no LLM. +- **Ships today that we don't:** Espressif-backed, production-scale cloud infra. +- **Decision impact:** confirms that in the ESP32 ecosystem, the dominant BLE-provisioning story still assumes cloud + account + phone app. The "browser tab, no account, no server" stance remains differentiated. + +### LeRobot (Hugging Face) +- **Claims:** open-source stack to put an LLM/VLA brain on a robot. +- **Overlap:** LLM/VLA orchestration for hobby+research robots; v0.5 added Pi0-FAST, Real-Time Chunking, EnvHub ([HF blog](https://huggingface.co/blog/lerobot-release-v050), March 2026). +- **Divergence:** Python stack, GPU-assumed, imitation/RL-focused. No BLE story, no browser runtime, no classroom onboarding. Arms + manipulation, not browser-paired hobby robots. +- **Ships today that we don't:** actual VLA models, datasets, research community. +- **Decision impact:** adjacent, not parallel — the "not real-time, not spatially aware, decision loop is seconds" scope line keeps us in a different lane. Potential future integration: `scripts.js` calling LeRobot policies client-side via transformers.js. + +## Out of scope (one-liners) + +- **Wokwi** — browser simulator, not a real-device pairing UI. 
+- **esptool-js / ESP Web Tools** — WebSerial flashers. Shared substrate, not parallel work; we already rely on the same Web Serial API for recovery. +- **MakeCode micro:bit** — mature web IDE for micro:bit; overlaps MicroBlocks, adds little new signal. +- **Particle Device OS** — BLE provisioning exists but mobile-SDK oriented, commercial product flow, account-anchored. Same shape as RainMaker. +- **ROS 2 MoveIt, Dora-rs, industrial / arm stacks** — different buyer, different latency bracket, no browser pairing story. "Not real-time, not spatially aware" rules the lane out. +- **VEX IQ/V5, ROBOTIS** — proprietary-kit + proprietary-app lane. Doubly unavailable to the "no accounts, no server" thesis. + +## Concluding read + +**Anyone claiming the same shape — *write code for a robot in a browser tab, no install, AI assist optional, no backend*?** No. The field divides cleanly: **MicroBlocks** and **XRPCode** claim browser-IDE-to-hardware but deploy code *to* the device and have no in-browser AI layer; **LEGO SPIKE**, **Sphero EDU**, **Makeblock mBlock** claim classroom-web-app experience but are walled gardens with accounts and proprietary kits; **Viam** and **Freedom Robotics** are framing rhymes (server-resident dev environments) anchored to industrial cloud, accounts, fleet ops; **ESP RainMaker** and **Improv Wi-Fi** claim BLE-provisioning but stop there; **LeRobot** claims VLA/LLM orchestration but has no browser runtime or BLE story. + +**Anything say change direction?** No. Nearest tactical move: implement **Improv Wi-Fi** BLE onboarding alongside ours so Improv-aware tools (ESPHome Dashboard, WLED config, Home Assistant) can provision our robots. Interop win, not a strategy shift. + +**Positioning, ranked by durability (slowest to erode first):** +- **Browser-native dev surface.** Every "robotics platform" worth naming requires *some* install — `viam-server`, ESP-IDF, gpiozero on Pi, Arduino IDE. 
Static-site, no-backend distribution is structurally hard to copy without restructuring a whole company's product surface. +- **Browser-resident model serving.** Open-vocab detector, ArUco fiducial pose — all client-side. Viam, Freedom Robotics, LeRobot all assume server-side or per-device GPU. +- **Layered safety.** Firmware-bounded motors the IDE-level planner can't bypass. Ask-human as terminal cascade rung. Standard in driving (openpilot-panda), rare in hobby/classroom. +- **No backend, no accounts.** Static-site deployable, MIT-licensed. Sphero, Viam, Particle, RainMaker, Freedom — all account-anchored. + +Scope lines stay loud in the README. Market reads "robotics platform" and expects Sphero or Viam. Naming what it *isn't* — *not a teleop dashboard, not a fleet manager, not "AI does everything autonomously," not real-time, not spatially aware* — does more positioning work than any feature comparison. + +## Sources + +- [Schematik.io homepage](https://schematik.io) +- [LEGO Education SPIKE — Web Bluetooth + Web Serial (Chrome for Developers)](https://developer.chrome.com/blog/lego-education-spike-web-bluetooth-web-serial) +- [Sphero EDU Web App — Connecting Robots](https://help.sphero.com/sphero-support/connecting-robots-in-the-sphero-edu-web-app) +- [mBlock 5 web IDE](https://ide.mblock.cc/) +- [Makeblock support — direct browser connection](https://support.makeblock.com/hc/en-us/articles/19412317319191-Introduction-to-Direct-Connection-of-mBlock-5-on-the-web) +- [MicroBlocks XRP setup (Web Bluetooth)](https://wiki.microblocks.fun/en/xrp_setup) +- [MicroBlocks in the browser](http://www.microblocks.fun/en/microblocks_in_browser) +- [WPILib XRP Web UI](https://docs.wpilib.org/en/stable/docs/xrp-robot/web-ui.html) +- [Experiential Robotics XRP Code](https://www.experiential.bot/code) +- [Viam Platform Overview](https://www.viam.com/product/platform-overview) +- [viam-server reference](https://docs.viam.com/operate/reference/viam-server/) +- [Freedom Robotics
homepage](https://www.freedomrobotics.com/) +- [Improv Wi-Fi homepage](https://www.improv-wifi.com/) +- [ESPHome 2025.10.0 changelog — Improv BLE improvements](https://esphome.io/changelog/2025.10.0/) +- [ESP RainMaker provisioning docs](https://docs.rainmaker.espressif.com/docs/sdk/rainmaker-base-sdk/DeviceManagement/provisioning/) +- [ESP RainMaker homepage](https://rainmaker.espressif.com/) +- [LeRobot v0.5.0 release notes (HF blog, Mar 2026)](https://huggingface.co/blog/lerobot-release-v050) +- [Particle BLE provisioning reference](https://docs.particle.io/reference/device-os/bluetooth-le/) +- [esptool-js (Espressif)](https://github.com/espressif/esptool-js) +- [LOFI Control (Web Bluetooth PWA for micro:bit)](https://cardboard.lofirobot.com/lofi-control-app-info/) diff --git a/.claude/notes.md b/.claude/notes.md deleted file mode 100644 index 784aa711..00000000 --- a/.claude/notes.md +++ /dev/null @@ -1,251 +0,0 @@ -# Notes - -Operator-private notes — decisions, competitive analysis, feature design rationale. Encrypted at rest via git-crypt. - ---- - -# Competitors - -Systems competing for the same user decision — *"how do I write code for a small robot from a browser tab without installing anything."* Filtered for what would change a decision. - -## schematik.io — not in this lane - -[schematik.io](https://schematik.io) bills itself as "Cursor for Hardware": AI code-generation emitting firmware/schematic-adjacent code from natural language for Arduino, ESP32, Raspberry Pi (~$4.6M pre-seed). Not a pairing UI, not a control plane, not a dashboard. A *potential input* for authoring firmware like ours, not a competitor to the runtime-control story. - -## The real candidates - -### LEGO SPIKE web app (spike.legoeducation.com) -- **Competes for:** the classroom decision — "which kit lets students code from a Chromebook with no install." 
-- **Overlap:** Web Bluetooth + WebSerial in Chrome, no native app ([Chrome for Developers](https://developer.chrome.com/blog/lego-education-spike-web-bluetooth-web-serial)). Programs upload to hub, hub executes. -- **Divergence:** code runs *on the hub*, not the browser. Closed hardware, closed firmware, no user-owned OTA. -- **Better than us today:** mature curriculum, institutional purchase channel. -- **Decision impact:** confirms BLE-first-via-browser as mainstream, not contrarian. Does not threaten browser-as-brain — they deploy to hub; we deliberately don't. - -### Sphero EDU web app -- **Competes for:** same classroom decision as LEGO. -- **Overlap:** Web Bluetooth pairing of BOLT+/BOLT/Mini/RVR ([help.sphero.com](https://help.sphero.com/sphero-support/connecting-robots-in-the-sphero-edu-web-app)). -- **Divergence:** Sphero account required, their robots only. No user-owned firmware, no recovery plane, no LLM surface. -- **Better than us today:** polished UI, k-12 marketplace presence, iOS native fallback. -- **Decision impact:** reinforces the "no account" moat — account-gating is exactly the friction this project refuses. - -### Makeblock (mBlock + mBot family) -- **Competes for:** same K-12 classroom decision — at the largest scale claim of any vendor in this list (200k+ schools). -- **Overlap:** mBlock 5 web at [ide.mblock.cc](https://ide.mblock.cc/) runs in Chrome/Edge, connects to mBot/CyberPi/Codey Rocky over Web Bluetooth + WebSerial without a helper app ([Makeblock support](https://support.makeblock.com/hc/en-us/articles/19412317319191-Introduction-to-Direct-Connection-of-mBlock-5-on-the-web)). Block + Python. -- **Divergence:** account-required walled garden. Programs run on closed proprietary firmware. Hardware lock-in to Makeblock kits. No LLM, no recovery plane. -- **Better than us today:** scale (200k schools), educator curriculum, hardware breadth (CyberPi has its own screen + sensors), Chinese-market depth, multi-platform (PC/mobile/web). 
-- **Decision impact:** confirms Web-Bluetooth-from-browser is the dominant K-12 STEAM pattern, not contrarian. Reinforces the "no account, no proprietary kit" wedge: every major K-12 vendor (LEGO, Sphero, Makeblock) is account-gated and kit-locked. The combination "browser-paired AND user-owned hardware AND no account" remains unoccupied. - -### MicroBlocks (microblocks.fun) -- **Competes for:** browser IDE to program a BLE/serial-connected microcontroller with blocks. -- **Overlap:** runs in Chrome/Edge via WebSerial + Web Bluetooth, no install; supports micro:bit, XRP, and others ([wiki.microblocks.fun](https://wiki.microblocks.fun/en/xrp_setup)). Live programming model. -- **Divergence:** pushes a VM to the device; programs run on-board. No LLM, no phone-human handoff. Single-device focus. -- **Better than us today:** live autocomplete / block editing against running firmware; a real educational community. -- **Decision impact:** closest architectural cousin. Validates "browser-first, no-account, BLE-capable" as a shipped pattern. Has no opinion on browser-as-brain for runtime. - -### XRPCode / WPILib XRP (experientialrobotics.org) -- **Competes for:** cheap classroom robot + browser IDE — the tightest hardware-class analog. -- **Overlap:** browser IDE for the XRP (RP2040), Python + Blockly, no install ([WPILib docs](https://docs.wpilib.org/en/stable/docs/xrp-robot/web-ui.html)). -- **Divergence:** WiFi/WebSocket, not BLE-first — robot must be on the same network, which is exactly the classroom pain our BLE-first bet was designed around. Code runs on-robot. No LLM, no phone handoff. -- **Better than us today:** FRC-backed curriculum, ~$75 hardware, real classroom deployments. -- **Decision impact:** directly validates bet #1 — WiFi-first classroom stories *do* break. - -### Viam -- **Competes for:** *closest framing rhyme.* Tagline "build robots like you build software" — same dev-environment-shape pitch, different audience and distribution model. 
-- **Overlap:** browser dashboard, camera streaming, live control ([viam.com](https://www.viam.com/product/platform-overview)). gRPC/WebRTC to a device-resident `viam-server`. Modular components, multi-language SDKs. -- **Divergence:** server-resident B2B cloud SaaS. `viam-server` fetches config from Viam cloud at startup ([docs.viam.com](https://docs.viam.com/operate/reference/viam-server/)). Different buyer (software engineer at an industrial outfit, fleet operator), different distribution shape (account-anchored cloud product vs. static-site, no-backend). -- **Better than us today:** data capture/sync, fleet management, funding, UR partnership. -- **Decision impact:** **inspiration, not competition.** Same transport stack we ship; treats the same problem space at industrial scale. Watching their feature surface tells us what becomes table-stakes for "robotics dev environment." Our distribution shape (browser-only, no backend, MIT) is the moat — they can ship features in 18 months; restructuring their cloud-product distribution model to match would be a different company. - -### Freedom Robotics -- **Competes for:** browser-based teleop and remote operation of fielded robots. -- **Overlap:** WebRTC video + control via browser; SDK/agent runs on the robot ([freedomrobotics.com](https://www.freedomrobotics.com/)). -- **Divergence:** server-resident B2B cloud SaaS, TURN-relay-anchored teleop, account + fleet model. No standalone deploy, no offline mode, no LLM/scripting surface. -- **Better than us today:** production teleop UX for industrial deployments, observability tooling, customer base in delivery + service robotics. -- **Decision impact:** same audience-shape conflict as Viam — enterprise/industrial vs. consumer/education/hobbyist. Worth tracking for transport / observability conventions; not a wedge threat. - -### Improv Wi-Fi (open standard) -- **Competes for:** the onboarding moment — "how does a fresh device join Wi-Fi." 
-- **Overlap:** open standard for BLE-based Wi-Fi onboarding from a browser, Chrome/Edge ([improv-wifi.com](https://www.improv-wifi.com/)). Shipped across WLED, Tasmota, ESPHome. -- **Divergence:** explicitly scoped to Wi-Fi onboarding only — *"not the goal to offer a way for devices to share data or control."* Hands off to a device-hosted URL after provisioning. -- **Better than us today:** it's a *standard*, with network-effect adoption we don't have. -- **Decision impact:** **integration candidate, not a threat.** Our BLE onboarding characteristic could optionally speak Improv so any Improv-aware browser tool can provision our robots. See `@improv-wifi/sdk-js` on npm. - -### ESP RainMaker -- **Competes for:** "ESP32-based product with BLE provisioning and a dashboard to control it." -- **Overlap:** BLE provisioning for ESP32/S3/C3/C6 ([docs.rainmaker.espressif.com](https://docs.rainmaker.espressif.com/docs/sdk/rainmaker-base-sdk/DeviceManagement/provisioning/)). -- **Divergence:** cloud-account-anchored by design — user↔node mapping during provisioning, AWS Cognito underneath. Mobile-app first. No browser-first story, no LLM. -- **Better than us today:** Espressif-backed, production-scale cloud infra. -- **Decision impact:** confirms that in the ESP32 ecosystem, the dominant BLE-provisioning story still assumes cloud + account + phone app. The "browser tab, no account, no server" stance remains differentiated. - -### LeRobot (Hugging Face) -- **Competes for:** open-source stack to put an LLM/VLA brain on a robot. -- **Overlap:** LLM/VLA orchestration for hobby+research robots; v0.5 added Pi0-FAST, Real-Time Chunking, EnvHub ([HF blog](https://huggingface.co/blog/lerobot-release-v050), March 2026). -- **Divergence:** Python stack, GPU-assumed, imitation/RL-focused. No BLE story, no browser runtime, no classroom onboarding. Arms + manipulation, not browser-paired hobby robots. -- **Better than us today:** actual VLA models, datasets, research community. 
-- **Decision impact:** adjacent, not competitive — the "not real-time, not spatially aware, decision loop is seconds" scope line keeps us in a different lane. Potential future integration: `scripts.js` calling LeRobot policies client-side via transformers.js. - -## Out of scope (one-liners) - -- **Wokwi** — browser simulator, not a real-device pairing UI. -- **esptool-js / ESP Web Tools** — WebSerial flashers. Dependencies of the neighborhood, not competitors; we already rely on the same Web Serial API for recovery. -- **MakeCode micro:bit** — mature web IDE for micro:bit; overlaps MicroBlocks, adds little new signal. -- **Particle Device OS** — BLE provisioning exists but mobile-SDK oriented, commercial product flow, account-anchored. Same shape as RainMaker. -- **ROS 2 MoveIt, Dora-rs, industrial / arm stacks** — different buyer, different latency bracket, no browser pairing story. "Not real-time, not spatially aware" rules the lane out. -- **VEX IQ/V5, ROBOTIS** — proprietary-kit + proprietary-app lane. Doubly unavailable to the "no accounts, no server" thesis. - -## Concluding read - -**Clean head-on competitor for the actual shape — *write code for a robot in a browser tab, no install, AI assist optional, no backend*?** No. Closest cousins split the problem: **MicroBlocks** and **XRPCode** own browser-IDE-to-hardware but deploy code *to* the device and have no in-browser AI layer; **LEGO SPIKE**, **Sphero EDU**, **Makeblock mBlock** own classroom-web-app experience but are walled gardens with accounts and proprietary kits; **Viam** and **Freedom Robotics** are framing rhymes (server-resident dev environments) anchored to industrial cloud, accounts, fleet ops; **ESP RainMaker** and **Improv Wi-Fi** own BLE-provisioning but stop there; **LeRobot** owns VLA/LLM orchestration but has no browser runtime or BLE story. - -**Anything say change direction?** No. 
Nearest tactical move: implement **Improv Wi-Fi** BLE onboarding alongside ours so Improv-aware tools (ESPHome Dashboard, WLED config, Home Assistant) can provision our robots. Interop win, not a strategy shift. - -**Moat, ranked by erosion runway (slowest first):** -- **Browser-native dev surface.** Every "robotics platform" worth naming requires *some* install — `viam-server`, ESP-IDF, gpiozero on Pi, Arduino IDE. Static-site, no-backend distribution is structurally hard to copy without restructuring a whole company's product surface. -- **Browser-resident model serving.** Open-vocab detector, ArUco fiducial pose — all client-side. Viam, Freedom Robotics, LeRobot all assume server-side or per-device GPU. -- **Layered safety.** Firmware-bounded motors the IDE-level planner can't bypass. Ask-human as terminal cascade rung. Standard in driving (openpilot-panda), rare in hobby/classroom. -- **No backend, no accounts.** Static-site deployable, MIT-licensed. Sphero, Viam, Particle, RainMaker, Freedom — all account-anchor. - -Keep the scope lines loud in the README. Market reads "robotics platform" and expects Sphero or Viam. Naming what it *isn't* — *not a teleop dashboard, not a fleet manager, not "AI does everything autonomously," not real-time, not spatially aware* — does more positioning work than any feature comparison. 
- -## Sources - -- [Schematik.io homepage](https://schematik.io) -- [LEGO Education SPIKE — Web Bluetooth + Web Serial (Chrome for Developers)](https://developer.chrome.com/blog/lego-education-spike-web-bluetooth-web-serial) -- [Sphero EDU Web App — Connecting Robots](https://help.sphero.com/sphero-support/connecting-robots-in-the-sphero-edu-web-app) -- [mBlock 5 web IDE](https://ide.mblock.cc/) -- [Makeblock support — direct browser connection](https://support.makeblock.com/hc/en-us/articles/19412317319191-Introduction-to-Direct-Connection-of-mBlock-5-on-the-web) -- [MicroBlocks XRP setup (Web Bluetooth)](https://wiki.microblocks.fun/en/xrp_setup) -- [MicroBlocks in the browser](http://www.microblocks.fun/en/microblocks_in_browser) -- [WPILib XRP Web UI](https://docs.wpilib.org/en/stable/docs/xrp-robot/web-ui.html) -- [Experiential Robotics XRP Code](https://www.experiential.bot/code) -- [Viam Platform Overview](https://www.viam.com/product/platform-overview) -- [viam-server reference](https://docs.viam.com/operate/reference/viam-server/) -- [Freedom Robotics homepage](https://www.freedomrobotics.com/) -- [Improv Wi-Fi homepage](https://www.improv-wifi.com/) -- [ESPHome 2025.10.0 changelog — Improv BLE improvements](https://esphome.io/changelog/2025.10.0/) -- [ESP RainMaker provisioning docs](https://docs.rainmaker.espressif.com/docs/sdk/rainmaker-base-sdk/DeviceManagement/provisioning/) -- [ESP RainMaker homepage](https://rainmaker.espressif.com/) -- [LeRobot v0.5.0 release notes (HF blog, Mar 2026)](https://huggingface.co/blog/lerobot-release-v050) -- [Particle BLE provisioning reference](https://docs.particle.io/reference/device-os/bluetooth-le/) -- [esptool-js (Espressif)](https://github.com/espressif/esptool-js) -- [LOFI Control (Web Bluetooth PWA for micro:bit)](https://cardboard.lofirobot.com/lofi-control-app-info/) - ---- - -# Pip's proactive messages come from project state, not external feeds - -No scheduled pipeline scraping external robotics sources 
(X, Reddit, HN, ArXiv, Hackaday RSS). No notification backend, no content channel. - -## What we do instead - -Situational observations from state the dashboard already has. A colleague leaning over your desk saying *"hey, I notice X,"* not a newsletter. - -Inputs, all same-origin: - -- Robot telemetry — firmware version drift, last-seen timestamps, which robots are `firmware-down` vs `connected`, which capabilities have never been exercised. -- User scripts (`scripts.js` + localStorage) — saved but never run, errored on last run, related to a stalled goal. -- Project intent — `.claude/CLAUDE.md` and `.claude/working.md` when present. - -One short observation, tied to a user-activity boundary (session start, session end, robot reconnect after > 24h), not a wall-clock cron. Dismissable without consequence. - -Shape examples: - -``` -Your "line-follow" script errored on BLE drop last Thursday. -Heartbeat shipped — worth retrying? - -You've paired Pi-03 twice but never opened the camera capability. -Want me to walk through it? - -Firmware on Pi-01 is 4 versions behind. New pulse caps landed in -between — OTA when convenient? -``` - -Each names a *specific* thing *this user* did or didn't do — signal a generic feed can't carry. - -## Why this shape - -Pip runs in the browser; every input that would change what Pip says is also in the browser, or one `fetch()` away in `.claude/*.md`. Putting signal source on a schedule outside the browser separates thinking from data and pays the cost of keeping them in sync. - -- **Zero new infrastructure.** No cron, scraper, CI job, JSON corpus, filter pipeline. Just `assistant.js` plus a small observation reader. -- **Zero new trust boundary.** Same-origin reads of the dashboard's own stores. -- **High signal by construction.** An observation referencing the user's own script by name clears "is this relevant?" before it's written. A trending-reddit link does not. 
-- **Dismissal is free.** Observations are ephemeral; ignoring one builds no unread debt. - -## Failure mode this avoids - -"Give Pip a feed so messages aren't boring" is the engagement reflex every newsletter SaaS tries: push content on a schedule, hope relevance averages out. Generic feeds get ignored because the user pays a translation cost from *"someone built X"* to *"does this matter for me right now?"* That cost kills engagement. - -Shipping a scheduled pipeline before the state-aware layer exists pays pipeline maintenance for output state-aware messaging would dominate on relevance anyway. - -## When would an external feed earn its way in? - -When the state-aware layer saturates — Pip has mined what the browser knows and the ceiling becomes *"Pip doesn't know about the new ESP32-S3 cam module that would unblock the perception loop."* Then: - -1. GitHub Action on the `pulse` pattern — public-API-only, no-auth, committing JSON to `docs/feed/`. Sources: Reddit `.json`, HN Algolia, GitHub trending by topic, Hackaday/Adafruit/Sparkfun RSS, ArXiv. **Not X**: free tier died. -2. Feed is a **secondary input to the same filter** reading project state. Filter stays in the browser; the Action is dumb by design. -3. Observations referencing external content still clear *"and here's why it matters for your current work."* - -State-aware layer first, let it saturate, then add the corpus. - ---- - -# Wired but unproven — pending real-world validation - -Loads at runtime but not confirmed end-to-end against hardware. Kept out of `README.md`, `DEV.md`, and the GitHub repo About. Promote into user docs only after a real run confirms the path. - -## Overhead ArUco localization (`docs/aruco.js`) - -**What's wired.** -- Headless detection service — no UI panel. Helper-card "Camera role" select on each paired phone offers `Operator / Overhead localization / Mount on `. 
Choosing Overhead sets `settings.arucoOverheadPhoneId` (persisted) and points the detection loop at that phone's existing preview tile in the helpers card. No second video element, no second decoder. -- SVG overlay paints detected markers directly on the helper's preview (`patchArucoOverlay`-style — same shape as the deleted phone-on-robot tracker, retargeted at the helpers tile). -- Detection via `js-aruco2` from jsDelivr (`cv.js` + `aruco.js` + `posit1.js`), dictionary `ARUCO_4X4_50`. Printable marker sheets in `docs/assets/aruco_markers_0.pdf` and `_1.pdf`. Pose via `POS.Posit` using `settings.arucoMarkerSizeMm` + focal-length heuristic (`max(w,h) * 0.85`) — no calibration file. -- Marker → robot binding: prefers explicit `entry.arucoMarkerId` (persisted in localStorage; set via `window.bindArucoMarker(robotId, markerId)`). Falls back to positional `entries[m.id]` only when NO entry has claimed that id. Hits write `entry.arucoPosition = { x, y, headingDeg, markerSizeMm, updatedAt }`. - -**What hasn't been confirmed.** -- Focal-length heuristic accuracy against a real ruler ("perfect" to "off by 30%" both plausible without ground truth). -- ARUCO_4X4_50 detection reliability on a phone-camera feed via WebRTC (compression, autofocus hunting, rolling-shutter under motion). -- Multi-robot orchestration end-to-end: two robots, two markers, two bindings, both `arucoPosition`s update on the same scan, motion planner consumes both without drift. Wedge demo for the primitive. -- Ultra-wide-by-default for "Back" sharing (`docs/mobile.js` `openCameraStream`) means a phone designated for overhead localization will feed an ultra-wide stream — barrel distortion + a much shorter focal length than the `max(w,h)*0.85` heuristic assumes. The aruco detector itself will likely still find markers; pose estimation will be biased. 
If overhead aruco gets promoted out of unproven, the right fix is to force a non-widening lens on phones designated as overhead, or take a per-deviceId intrinsic from a one-time calibration. - -**To validate.** Print sheet 0 + sheet 1, tape marker 0 on Pi-01 and marker 1 on Pi-02. Pair a phone, share its camera, set role to "Overhead localization." Bind explicitly: `window.bindArucoMarker("", 0)` and `window.bindArucoMarker("", 1)`. Confirm both robots' `arucoPosition` update simultaneously on each detection, metric XY within ~20 mm of tape-measured ground truth at ~50 cm camera height. If it holds, promote: line in `README.md` perception section, bullet in `DEV.md` "When to reach for what." - -**Why bother.** Sub-pixel deterministic pose for a tagged object is the only roadmap primitive that closes the visual-servo loop without a depth sensor — and the substrate for the multi-robot-orchestration direction in `.claude/CLAUDE.md`. Drives `entry.arucoPosition` which the motion controller consumes as ground truth (subject to its staleness gate — `aruco.js` does NOT clear stale entries when a robot leaves frame; consumer's job). - -## Grounding DINO open-vocab detector — deleted (May 2026) - -Lived in `docs/grounding.js` as the open-vocab fallback when MediaPipe COCO's 80 classes couldn't cover a target. Disabled after real-world false positives (medium-confidence "stop sign.[SEP]" matches against a robot-vacuum dock — BERT separator token leaking through the post-processor). Deleted entirely once Claude vision via `view_robot_frame` was confirmed to fill the same role with scene reasoning the bbox-only detector couldn't do. - -**Why deleting rather than fixing.** The role this module filled — "give Pip a way to localize 'the yellow can' or 'the book on the bag'" — is now served by the planner itself. Pip sends a frame to Claude, Claude reads the scene, plans the next action. No bbox needed when the planner can reason. 
Re-arming the closed-vocab variant would duplicate the role with worse semantics (no scene context, false-positive history) AND keep a 151 MB model download in the asset graph. - -**What to revisit if it comes back.** A future need for sub-second open-vocab bboxes at the rate the LLM can't serve (Claude vision is ~1–2 s round-trip; bbox-rate use cases want ~100 ms). At that point: re-evaluate Grounding DINO 1.5, owlv2, or YOLO-World — but only after a use case earns it. Reactive open-vocab is not on the wedge today. - -## YOLO26n closed-vocab detector (`docs/yolo26.js`) - -Faster sibling for reactive-tier use cases (visual servo, gamepad-overlay tracking). Wired behind `/detector yolo26` with the registry in `docs/detectors.js`; MediaPipe stays the default. ONNX runtime via WebGPU EP with WASM fallback, ~10 MB COCO model fetched from HuggingFace on first use. - -**What hasn't been confirmed.** End-to-end accuracy vs MediaPipe EfficientDet-Lite0 on the same scenes, WebGPU EP stability across the Chrome/Edge versions students will run, first-fetch UX on classroom WiFi (10 MB ONNX + onnxruntime-web bytes). Promote to default — or remove from the registry — only after a side-by-side run. Out of `README.md` and `DEV.md` until then. - -## Laptop camera → phone feed (helper card role "Send to phone") - -Local-cam helper card gains a third role alongside Overhead. Selecting "Send to phone" opens the camera via getUserMedia and `peer.addTrack`s the video track on every paired phone; the phone displays it in the existing `phone-cam-section` since it's "incoming forwarded video from desktop" — the same sink robot cameras already use. Runtime-only state (`_phoneFeedLocalId` in phone-helpers.js), not persisted across reloads. - -**Latent.** `phone-cam-section` displays one stream at a time (`v.srcObject = e.streams[0]`, last-wins). 
When both a robot camera and the laptop-cam are routed to the same phone simultaneously, whichever fires `peer.onTrack` last wins; there is no UI on the phone to switch back. The existing `available-sources` / `subscribe-source` picker handles this per-robot but is not yet generalized across owner types. Acceptable for the single-source case the prototype is built around; if multi-source coexistence becomes the steady-state demo, generalize the picker (own-id namespace = `"robot:" | "local:"`, single global active per phone) before adding more source kinds. Documented in the audit report that prompted this work. - ---- - -# Forks in the road — alternatives evaluated, with revisit triggers - -Paths we looked at and chose not to take, with the specific change in project direction that would make us revisit. Distinct from competitors (which compete for the same user decision) — these are *adjacent technical paths* we declined. - -## Espressif KVS WebRTC SDK for ESP32 - -**Evaluated:** May 2026. Espressif's first-party WebRTC stack ([awslabs/amazon-kinesis-video-streams-webrtc-sdk-c@beta-reference-esp-port](https://github.com/awslabs/amazon-kinesis-video-streams-webrtc-sdk-c/tree/beta-reference-esp-port), HEAD 119617b7 at evaluation time). Ships an AppRTC-mode example targeting classic ESP32. Active development, monthly sync to upstream awslabs releases, 1.2k stars. - -**What it would buy us.** Eliminates three of our four libpeer/esp_peer patches at the chip level: chip is DTLS CLIENT by default (so the fragmented-ClientHello bug is sidestepped without patching), SDP answerer emits `setup:active` directly, MID copied from remote offer, ICE agent silently ignores TCP candidates. The four-patch shape in `CLAUDE.md`'s WebRTC section collapses to one (mbedTLS Kconfig). Cert flow returns to chip-side ECDSA generation (~9 KB flash cost we currently save). - -**Why not now.** Signaling is hardwired to HTTPS+WebSocket against AWS KVS or `webrtc.espressif.com`. 
Swapping in means writing a custom `signaling_client_if` implementation that takes offers/answers off our BLE `SIGNAL` characteristic and feeds the SDK's `kvs_peer_connection_if`. The plug point is documented (`CUSTOM_SIGNALING.md` in their tree), but the work doesn't buy us anything the libpeer patches don't already deliver — our wedge is precisely "BLE-signaled, no internet rendezvous." We'd also inherit a 3 MB factory partition expectation that's marginal on classic ESP32's 4 MB flash. - -**Revisit trigger.** If a hosted-mode / internet-rendezvoused operator surface lands on the roadmap (share-a-link demos, remote tele-op, third-party robots controlling our robots), KVS WebRTC SDK is the prebuilt path. **Do not reinvent BLE-on-KVS at that point — switch outright.** The libpeer + four-patch setup made sense for BLE-only; it does not earn its keep against a stack that handles cloud signaling for free. - -**Bonus capability worth knowing.** KVS WebRTC Split Mode distributes signaling to ESP32-C6 (light sleep) and streaming to ESP32-P4 (deep sleep until wake-on-signal) — the only battery-powered WebRTC camera architecture in the ecosystem. Irrelevant to mains-powered robots today; remember it if low-power ever becomes a constraint. diff --git a/.gitattributes b/.gitattributes deleted file mode 100644 index 835cf6bc..00000000 --- a/.gitattributes +++ /dev/null @@ -1,9 +0,0 @@ -# Hidden-prefix folders are encrypted with git-crypt + the personal key -# stored in Apple Keychain. After clone, run `gcp-unlock` (defined in -# dotfiles' .zshrc) to install the filter and decrypt the working tree. -# .claude/ stays plain text — it's not secrets, it's project context -# (wedge, lenses, agent prompts) that benefits from being readable in -# code review and on github.com without a key dance. 
-.docs/** filter=git-crypt diff=git-crypt -.private/** filter=git-crypt diff=git-crypt -.secret/** filter=git-crypt diff=git-crypt diff --git a/DEV.md b/DEV.md index a9244834..9598bb56 100644 --- a/DEV.md +++ b/DEV.md @@ -67,4 +67,3 @@ State the page can't see: - **Dev flags → URL.** Per-session diagnostics that shouldn't persist. - **User preferences → Settings.** Build the panel once there are 3+ real persistent preferences. -- Keep this doc in sync when adding a URL flag, `window.*` handle, or IndexedDB store. diff --git a/HARDWARE.md b/HARDWARE.md index ad11b3eb..30bc7d12 100644 --- a/HARDWARE.md +++ b/HARDWARE.md @@ -27,7 +27,7 @@ Default firmware pins: left `forward=14, backward=15`, right `forward=13, backwa **Leave the L298N's ENA/ENB jumpers ON.** The 5V tie-up keeps the H-bridge enabled and lets PWM ride the direction pins. Forward = `forward-pin=PWM, backward-pin=LOW`; reverse = swap. Separate direction + enable control needs 6 GPIOs we don't have. -GPIO 15 is a strap pin (must be HIGH at boot for normal serial output). L298N's IN pins are high-impedance CMOS, but if your board has a weak pull-down on IN that fights the strap, add a 10k pull-up from GPIO 15 to 3.3V. Symptom: garbled serial during the first second of boot. Harmless if you don't need that bootloader log. +GPIO 15 is a strap pin — needs HIGH at boot for normal serial output. L298N's IN pins are high-impedance CMOS, but if your board has a weak pull-down on IN that fights the strap, add a 10k pull-up from GPIO 15 to 3.3V. Symptom: garbled serial during the first second of boot. Harmless if you don't need that bootloader log. ### Optional hardware mods (for stability under load) @@ -65,11 +65,11 @@ Requires a USB-C **data** cable (not charge-only). Power-only variants look iden ## Board-specific knobs -Two variables need to match your ESP32 board: +Two variables track the ESP32 board: - **target** for `idf.py set-target` — `esp32` for CAM-MB; `esp32s3` for S3. 
Per-target defaults in `sdkconfig.defaults.esp32` / `sdkconfig.defaults.esp32s3`. - **`LED_PIN`** in `firmware/esp32_robot_idf/main/pin_config.c` — GPIO 33 active-low on CAM-MB. S3 boards vary; many use a WS2812 neopixel (GPIO 48 on DevKitC-S3) which needs a different driver entirely. The dashboard's Pinout editor overrides at runtime via NVS, no rebuild. The IDF partition layout (1.9 MB OTA slots, otadata at 0xE000) matches arduino-esp32's `min_spiffs` so a fielded ESP32 originally flashed with the .ino can OTA into this firmware without bricking. -After changing either, push to `main` — CI rebuilds and publishes. Run `make publish-firmware` locally only to preview before pushing. +After changing either, push to `main` — CI rebuilds and publishes. `make publish-firmware` previews locally before pushing. diff --git a/README.md b/README.md index 0b0d0fc7..1febbe89 100644 --- a/README.md +++ b/README.md @@ -2,50 +2,30 @@ **Open a tab, pair a robot, ship code.** -[![Live](https://img.shields.io/badge/live-better--robotics.github.io-blue)](https://better-robotics.github.io/) -[![Build firmware](https://github.com/jonasneves/better-robotics/actions/workflows/build-firmware.yml/badge.svg)](https://github.com/jonasneves/better-robotics/actions/workflows/build-firmware.yml) -[![Web Bluetooth](https://img.shields.io/badge/Web%20Bluetooth-Chrome%20%7C%20Edge-orange)](#browser-support) +[![Build firmware](https://github.com/better-robotics/better-robotics.github.io/actions/workflows/build-firmware.yml/badge.svg)](https://github.com/better-robotics/better-robotics.github.io/actions/workflows/build-firmware.yml) ## What this is -Open a Chrome tab. Pair a robot over BLE. Write JavaScript that drives it. - -```js -// Multi-robot is a forEach. -for (const r of robots) { - await r.led(true); - await r.move({ left: 30, right: 30, durationMs: 400 }); - await r.led(false); -} -``` - -- **Browser is the IDE.** Scripts panel + capability cards. 
localStorage is the file system; BLE is the runtime link. -- **Models run in the browser too.** Open-vocab detector runs client-side. No GPU server, no inference bill. -- **Two authorable surfaces, co-equal:** user code (you write JS) and Pip (a tool-using LLM with ask-human, currently Claude). Both bound by the same firmware safety floor. -- **No backend, no accounts.** Static-site dashboard; no data leaves the browser. +Browser is the IDE. Coding panel + capability cards. localStorage is the file system; BLE is the runtime link. ## Architecture ``` -┌──────────────────┐ BLE GATT (always on) ┌──────────────────┐ -│ Chrome browser │ ◄────────────────────────────────────► │ Robot firmware │ -│ (Web Bluetooth) │ commands · state · ops · triggers │ (ESP32 or Pi) │ +┌──────────────────┐ ┌──────────────────┐ +│ Chrome browser │ ◄────── BLE GATT (control plane) ────► │ Robot firmware │ +│ (Web Bluetooth) │ commands · state │ (ESP32 or Pi) │ └──────────────────┘ └──────────────────┘ ▲ ▲ - ├─────────── WiFi (data plane, optional) ────────────────── ┤ - │ camera (WebRTC ↔ HTTP MJPEG, per-camera toggle) │ + ├──────────────────── WiFi (data plane) ────────────────────┤ + │ camera (WebRTC · HTTP) │ │ │ - └─────── USB-C (recovery plane, last-resort, Pi only) ───── ┘ - ECM ethernet · ACM serial console + └───────────────── USB-C (recovery plane) ──────────────────┘ + ECM ethernet · ACM serial console ``` - **Control plane — BLE.** Always on. Commands, telemetry, state changes, ops. ~1–3 Mbps, reliable, network-free. Pairing UI is the gatekeeper; no credentials cross the air. -- **Data plane — WiFi, optional.** Onboarded via BLE when needed. Carries video (per-camera toggle between WebRTC and HTTP MJPEG), large OTA, cloud LLM calls. Robots work fully without it. -- **Recovery plane — USB-C, last-resort (Pi).** Composite USB gadget (ECM + ACM serial) under its own systemd unit, independent of robot firmware. Dashboard exposes an xterm.js terminal over this. 
- -**Why BLE for control:** classroom and demo WiFi rarely cooperates (blocked multicast, captive portals, client isolation). BLE sidesteps all three. Robot advertises on boot; laptop scans and sees every robot in the room. - -**Safety on disconnect.** Actuator characteristics (motor, servo, pump, relay) ship with a firmware watchdog. Every write resets a timer; if no write lands in the window, firmware reverts to a safe default. Silence is the trigger, not a redundant radio. +- **Data plane — WiFi, optional.** Onboarded via BLE when needed. Carries video, large OTA, cloud LLM calls. Robots work fully without it. +- **Recovery plane — USB-C.** Composite USB gadget (ECM + ACM serial) under its own systemd unit, independent of robot firmware. Dashboard exposes an xterm.js terminal over this. ## Quickstart @@ -53,9 +33,9 @@ for (const r of robots) { 1. Open [better-robotics.github.io](https://better-robotics.github.io/) in Chrome or Edge. 2. Flash or prepare hardware: - - **ESP32 on USB:** click **Flash firmware** — bins come from GitHub Pages, no local toolchain. - - **Pi 4 with a flashed SD card:** click **Customize card** (or hit the URL with `?prepare`) and point it at the mounted boot partition. -3. Click **Scan**, pair a robot, toggle LED, onboard WiFi, drive motors. Future updates go over BLE via **Update firmware**. + - **ESP32 on USB:** click **Flash firmware** + - **Pi 4 with a flashed SD card:** click **Customize card** and point it at the mounted boot partition. +3. Click **Scan**, pair a robot, toggle LED, onboard WiFi, drive motors. ### Develop locally @@ -65,10 +45,6 @@ make flash # build ESP32 firmware, upload over USB make preview # serve the dashboard at http://localhost:8000 ``` -Pi firmware is Python; see [`firmware/pi_robot/README.md`](firmware/pi_robot/README.md) for the SD-card prep flow and BLE service spec. - -Commit and push. CI rebuilds firmware artifacts on `firmware/**` changes and commits them back; devices pick up new versions via OTA. 
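The "Safety on disconnect" paragraph in the README hunk above states the rule in prose: every actuator write resets a timer, and silence past the window reverts to a safe default. A minimal JavaScript model of that rule may help when reasoning about dashboard-side behavior. The real watchdog lives in firmware; `ActuatorWatchdog`, `windowMs`, and the 500 ms value are illustrative assumptions, not names or numbers taken from the repo.

```javascript
// Hypothetical model of a write-refreshed actuator watchdog (not firmware code).
// Time is passed in explicitly so the logic is deterministic to test.
class ActuatorWatchdog {
  constructor(windowMs, safeValue) {
    this.windowMs = windowMs;   // assumed silence window
    this.safeValue = safeValue; // value the actuator reverts to
    this.value = safeValue;
    this.lastWriteMs = null;    // null = no write since the last revert
  }

  // Every write applies the value and resets the timer.
  write(value, nowMs) {
    this.value = value;
    this.lastWriteMs = nowMs;
  }

  // Called periodically; silence past the window reverts to the safe default.
  tick(nowMs) {
    if (this.lastWriteMs !== null && nowMs - this.lastWriteMs >= this.windowMs) {
      this.value = this.safeValue;
      this.lastWriteMs = null;
    }
    return this.value;
  }
}

const motors = new ActuatorWatchdog(500, 0);
motors.write(40, 0);
const duringWindow = motors.tick(300); // 300 ms since the write: still driving
motors.write(40, 400);                 // a refresh resets the timer
const afterRefresh = motors.tick(800); // 400 ms since the refresh: still driving
const afterSilence = motors.tick(900); // 500 ms of silence: reverted
```

The point of the model is that no explicit stop command is needed: a dropped BLE link simply stops the refreshes, and the revert follows from silence alone.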
- ## Repo layout ``` @@ -79,20 +55,9 @@ tests/ Pure-function unit tests · make smoke .claude/ Agent + project context ``` -ESP32 and Pi expose the same service UUID and characteristic UUIDs, so the dashboard talks to either without conditional logic. `docs/` is the GitHub Pages publish root — the site is the directory, no build step. The dashboard is flat by convention — naming prefixes carry subsystem boundaries; see `.claude/CLAUDE.md` for the subsystem map. - -## Further reading - -- [**Hardware guide**](HARDWARE.md) — recommended boards, board-specific knobs, driver notes. -- [**Pi firmware**](firmware/pi_robot/README.md) — BLE service spec, SD-card prep details, Bookworm/Trixie troubleshooting. -- [**User code**](USER-CODE.md) — how to write scripts in the browser; the `robot` API surface. -- [**Developer reference**](DEV.md) — URL flags, console handles, Chrome `chrome://` diagnostic pages. +ESP32 and Pi expose the same service UUID and characteristic UUIDs, so the dashboard talks to either without conditional logic. `docs/` is the GitHub Pages publish root — the site is the directory, no build step. ## Browser support Web Bluetooth: Chrome, Edge, Opera on desktop and Android. Not Safari. Firefox only behind a flag. -## License - -[MIT](LICENSE). - diff --git a/SMOKE.md b/SMOKE.md index 4789ede4..6542e291 100644 --- a/SMOKE.md +++ b/SMOKE.md @@ -1,6 +1,6 @@ # Smoke checklist -Manual verification before merging structural changes (UI redesign, render-pattern shifts, capability refactors, BLE protocol tweaks). If a row breaks, user-visible value broke. Don't ship. +Manual verification before merging structural changes (UI redesign, render-pattern shifts, capability refactors, BLE protocol tweaks). A broken row means user-visible value broke. Pure-function tests live in `tests/`; run with `make smoke`. Below needs hardware. @@ -28,6 +28,8 @@ Pure-function tests live in `tests/`; run with `make smoke`. 
Below needs hardwar - [ ] **Motors — pulse-bounded LLM path:** Pip-issued motor command with `duration_ms` stops at end of window without a separate stop call (firmware auto-stop). Control-loop invariant; regression means planner-layer code can leave the robot moving between decisions. - [ ] **Phone Stop button:** from a paired phone, tapping Stop relays through the desktop's BLE session and halts a moving robot. With no robot connected, button surfaces "no robot connected" inline. Safety primitive must be legible, no silent no-op. - [ ] **Phone Share camera:** Front is selected by default; tapping Share opens the front camera and a helper card appears on the desktop. While sharing, tapping Back swaps to the rear camera within ~1 s — same helper card, no flash/disconnect (replaceTrack path). Tapping Stop sharing clears the helper card. +- [ ] **Phone attached-mode:** mount the phone on a robot via the robot card's "Mount camera" picker. Phone screen flips to full-screen Pip face (no operator chrome, no Stop button — operator is remote, audience in-room reaches for the robot itself if needed). Detaching restores normal phone UI. Disconnect mid-attach → phone shows normal reconnect surface; reconnecting re-flips to attached. +- [ ] **Pip face on attached phone (default):** mount the phone; default is `phoneAttachedMode: "pip-face"` so the screen shows Pip's robot icon (head, antennas, ears, spark) with two morphing eyes inside. Eyes blink at jittered 2–5s intervals when idle. `/demo dance` → eyes shift direction in sync with motor calls. `get_robot_detections` → eyes scan left/right oscillating. `ask_human` → eyes rotate asymmetrically (raised brows). `/demo stopsign` with a stop sign held up → eyes briefly widen (alert, gold), then halted-squint (gray + sleep Z's drift) after the halt. - [ ] **WiFi** Scan returns networks (or empty if none); Join succeeds → status shows "WiFi " in meta. - [ ] **Camera (ESP32)** renders when WiFi joined. 
Per-camera transport toggle (WebRTC ↔ HTTP MJPEG) switches the live view without page reload; both transports paint frames. - [ ] **Camera (Pi)** WebRTC stream comes up once `pi-robot-rtc.service` is healthy; ICE survives Pi reboot. diff --git a/USER-CODE.md b/USER-CODE.md index 689ccbdf..93e364bb 100644 --- a/USER-CODE.md +++ b/USER-CODE.md @@ -35,18 +35,18 @@ const move = await pip.ask("Scene: chair ahead. Reply: forward, left, right, sto }); ``` -In scope inside a script: `robot`, `robots`, `phones`, `pip`, `sleep(ms)`, `log(...)`, `speak(text)`. The Scripts dialog ships templates; pick one from the dropdown to load it. +In scope inside a script: `robot`, `robots`, `phones`, `pip`, `sleep(ms)`, `log(...)`, `speak(text)`. The Scripts dialog ships templates. -The `pip` namespace is deliberately thin: today just `pip.ask(prompt, opts?)`, returning the LLM's text response. It's the seam between "user wrote the orchestration" and "the LLM decided this step" — same shape Pip uses internally, exposed so the two surfaces aren't siloed. +The `pip` namespace is thin: `pip.ask(prompt, opts?)` returns the LLM's text response. It's the seam between "user wrote the orchestration" and "the LLM decided this step" — same shape Pip uses internally, exposed so the two surfaces aren't siloed. ## Safety floor Firmware enforces motor watchdog + pulse duration cap + ultrasonic dist_cm forward-clip regardless of who issued the writes. User code, Pip, joypad — all see the same limits. -`robot.move()` calls `pulseMotors`, carrying the same 50–2000 ms duration cap the LLM is bound by. Magnitude is the signed-byte range, no LLM-specific clamp. Dashboard-side clamps are advisory; firmware enforcement is binding. +`robot.move()` calls `pulseMotors`, carrying the same 50–2000 ms duration cap the LLM is bound by. Magnitude is the signed-byte range, no LLM-specific clamp. Dashboard clamps are advisory; firmware is binding. ## Deployment model -User code lives in the browser, not on the robot. 
No upload-to-Pi, no GH Actions push, no `scp`-from-the-dashboard. +User code lives in the browser, not on the robot. No upload-to-Pi, no GH Actions push, no `scp`. -If a robot ever needs to run behavior with the dashboard disconnected for minutes+ (outside the wedge today — see `.claude/CLAUDE.md → Anti-drift guards`), the path forward is the existing OTA pipeline: drop user code into a `/home/robot/user/` slot via BLE OTA, have `pi_robot.py` import it via a typed plugin API. No new sync server needed. +If a robot needs to run behavior with the dashboard disconnected for minutes+ (outside the wedge today — see `.claude/CLAUDE.md → Anti-drift guards`), the path forward is the existing OTA pipeline: drop user code into a `/home/robot/user/` slot via BLE OTA, have `pi_robot.py` import it via a typed plugin API. No new sync server needed. diff --git a/docs/app-menu.js b/docs/app-menu.js index 98f3bf9c..73a173fa 100644 --- a/docs/app-menu.js +++ b/docs/app-menu.js @@ -291,7 +291,7 @@ export function setReportIssueLink(anchor, version) { ].join("\n"); } function refresh() { - anchor.href = `https://github.com/jonasneves/better-robotics/issues/new?body=${encodeURIComponent(buildBody())}`; + anchor.href = `https://github.com/better-robotics/better-robotics.github.io/issues/new?body=${encodeURIComponent(buildBody())}`; } refresh(); // Refresh just-in-time so any errors that happened between page load diff --git a/docs/app.js b/docs/app.js index 7e2508d7..672046cb 100644 --- a/docs/app.js +++ b/docs/app.js @@ -6,37 +6,38 @@ import { ALL as CAPABILITIES, setCapabilityRenderer } from "./capabilities/index import { setOpen as capSetOpen } from "./capabilities/runtime/cap-section.js"; import { setBleRenderers, loadPaired, scanForNew, connect, disconnect, forgetDevice, -} from "./ble-lifecycle.js"; +} from "./ble/ble-lifecycle.js"; import { - formatUptime, formatWifi, formatWifiShort, formatResetReason, + formatUptime, formatWifiShort, formatResetReason, formatRssi, rssiSeverity, 
tempSeverity, } from "./format.js"; import { updateFirmware, updateFromFile } from "./capabilities/ota.js"; import { restartService, rebootRobot, enrollKey } from "./capabilities/runtime/command.js"; -import { initGamepad } from "./gamepad.js"; +import { initGamepad } from "./input/gamepad.js"; import { initMotorsKeyboard } from "./capabilities/runtime/signed-pair.js"; // prepare.js / pinout.js / recovery.js are lazy-loaded on first use (~750 LOC // combined, none of it needed for first paint). See the dynamic import() // calls in the DOMContentLoaded wiring below. import { initAuthUI, fingerprint as dashFingerprint, pubkeySsh, onKeyChange } from "./auth.js"; import { initPasswordsUI } from "./passwords.js"; -import { initAssistant } from "./assistant.js"; -import { initPhones, listPhones } from "./phones.js"; +import { initAssistant } from "./pip/assistant.js"; +import { initPhones, listPhones } from "./pair/phones.js"; +import { initPhoneScreenModePlugin } from "./pair/phone-screen-mode-plugin.js"; import { initHelpers, setHelpersRobotRenderer, attachPhoneCameraTo, getPhoneAttachment, -} from "./phone-helpers.js"; +} from "./pair/phone-helpers.js"; // aruco.js is wired through phone-helpers.js — phone helpers can be designated // as the overhead camera; detection runs against the helper's existing // preview tile. No init call here. 
-import "./aruco.js"; +import "./perception/aruco.js"; import { watcherCap } from "./watcher.js"; import { setupServiceWorker, wireInstallMenuItem, wireCheckUpdatesMenuItem, wireHardRefresh, wireDiagnosticsMenuItem, setReportIssueLink, readSwVersion, } from "./app-menu.js"; import { initRobotPresence } from "./wifi-presence.js"; -import { wireLogDialog } from "./log-dialog.js"; +import { wireLogDialog } from "./recovery/log-dialog.js"; setCapabilityRenderer((entry) => renderEntry(entry)); setHelpersRobotRenderer((entry) => renderEntry(entry)); @@ -754,10 +755,10 @@ async function _setConsoleMode(mode) { $("console-mode-pi")?.setAttribute("aria-pressed", String(mode === "pi")); $("console-mode-esp")?.setAttribute("aria-pressed", String(mode === "esp")); if (mode === "pi") { - const mod = await import("./recovery.js"); + const mod = await import("./recovery/recovery.js"); mod.init(); } else { - const mod = await import("./esp-serial.js"); + const mod = await import("./recovery/esp-serial.js"); mod.init(); } } @@ -918,7 +919,7 @@ document.addEventListener("DOMContentLoaded", () => { closeMenu(); const entry = state.devices.get(id); if (!entry || entry.status !== "connected" || !entry.fwInfo) return; - const mod = await import("./pinout.js"); + const mod = await import("./recovery/pinout.js"); mod.openPinoutDialog(id); }); // Shell — lazy-import so xterm.js + WebRTC plumbing only load when the @@ -928,7 +929,7 @@ document.addEventListener("DOMContentLoaded", () => { closeMenu(); const entry = state.devices.get(id); if (!entry || entry.fwType !== "pi" || entry.status !== "connected") return; - const mod = await import("./shell.js"); + const mod = await import("./recovery/shell.js"); mod.openShellDialog(id); }); $("menu-console").addEventListener("click", () => { @@ -1116,10 +1117,10 @@ document.addEventListener("DOMContentLoaded", () => { $("setup-dialog").close(); // Release any console-held port before installEsp32 picks a new one. 
await Promise.all([ - import("./recovery.js").then(m => m.releasePort?.()).catch(() => {}), - import("./esp-serial.js").then(m => m.releasePort?.()).catch(() => {}), + import("./recovery/recovery.js").then(m => m.releasePort?.()).catch(() => {}), + import("./recovery/esp-serial.js").then(m => m.releasePort?.()).catch(() => {}), ]); - const { installEsp32 } = await import("./esp-serial.js"); + const { installEsp32 } = await import("./recovery/esp-serial.js"); await installEsp32(); }); } @@ -1139,17 +1140,21 @@ document.addEventListener("DOMContentLoaded", () => { initPhones(); initHelpers(); initRobotPresence(); + // Phone-on-robot rendering: phone.attached/phone.detached resolve + // into a screen mode and the desktop sends it over WebRTC. Off- + // switch: delete this line. + initPhoneScreenModePlugin(); // Lazy-load prepare.js on first click — it's ~230 LOC and touches the File // System Access API; no reason to pull it into first-paint. prepare.js's // openDialog() runs its own initOnce() internally so one-time setup still // happens. ?prepare URL param keeps working via the same path. 
$("prepare-open-btn").addEventListener("click", async () => { - const mod = await import("./prepare.js"); + const mod = await import("./recovery/prepare.js"); await mod.openDialog(); }); if (new URLSearchParams(location.search).get("prepare") !== null) { - import("./prepare.js").then(m => m.openDialog()); + import("./recovery/prepare.js").then(m => m.openDialog()); } loadPaired().then(() => { highlightKnownRobotFromUrl(); diff --git a/docs/ble-lifecycle.js b/docs/ble/ble-lifecycle.js similarity index 96% rename from docs/ble-lifecycle.js rename to docs/ble/ble-lifecycle.js index fc9ac299..afe040d2 100644 --- a/docs/ble-lifecycle.js +++ b/docs/ble/ble-lifecycle.js @@ -2,17 +2,17 @@ import { SERVICE_UUID, HEARTBEAT_SVC_UUID, HEARTBEAT_CHAR_UUID, FW_INFO_CHAR_UUID, ROBOT_STATUS_CHAR_UUID, OPS_RESPONSE_CHAR_UUID, TELEMETRY_CHAR_UUID, SIGNAL_CHAR_UUID, decodeJson } from "./ble.js"; -import { log, logFor } from "./log.js"; +import { log, logFor } from "../log.js"; import { state, persist, loadKnown, makeEntry, entryFor, attachDevice, setDisconnectHandler, -} from "./state.js"; -import { ALL as CAPABILITIES } from "./capabilities/index.js"; -import { RUNTIMES } from "./capabilities/runtime/index.js"; -import { dispatchOpsResponse } from "./ops-response.js"; -import { broadcastTargetInfo } from "./phones.js"; -import { renderHelpers } from "./phone-helpers.js"; -import { stopWatcher } from "./watcher.js"; +} from "../state.js"; +import { ALL as CAPABILITIES } from "../capabilities/index.js"; +import { RUNTIMES } from "../capabilities/runtime/index.js"; +import { dispatchOpsResponse } from "../ops-response.js"; +import { broadcastTargetInfo } from "../pair/phones.js"; +import { renderHelpers } from "../pair/phone-helpers.js"; +import { stopWatcher } from "../watcher.js"; let renderers = { renderEntry: () => {}, @@ -246,7 +246,11 @@ export async function connect(id) { entry.robotStatus = decodeJson(await statusChar.readValue()) || null; await statusChar.startNotifications(); 
statusChar.addEventListener("characteristicvaluechanged", (e) => { - entry.robotStatus = decodeJson(e.target.value) || null; + const next = decodeJson(e.target.value) || null; + // Firmware sometimes re-publishes identical status; skip the + // DOM patch when the payload hasn't changed. + if (JSON.stringify(next) === JSON.stringify(entry.robotStatus)) return; + entry.robotStatus = next; renderers.patchRobotStateLine(entry); // surgical, no full-card flash }); } catch { diff --git a/docs/ble.js b/docs/ble/ble.js similarity index 100% rename from docs/ble.js rename to docs/ble/ble.js diff --git a/docs/uuids.js b/docs/ble/uuids.js similarity index 100% rename from docs/uuids.js rename to docs/ble/uuids.js diff --git a/docs/capabilities/index.js b/docs/capabilities/index.js index 730c8f79..d6e1bfe6 100644 --- a/docs/capabilities/index.js +++ b/docs/capabilities/index.js @@ -1,11 +1,11 @@ // OTA remains hand-written because it bridges Pi bundle-OTA and ESP32 -// single-file OTA. Every other capability lives under ./runtime/. -import { ota, setRender as setOtaRender } from "./ota.js"; +// single-file OTA. Every other capability lives under ./runtime/. Both +// share the runtime render-bus so one setter wires the whole graph. 
+import { ota } from "./ota.js"; import { setRuntimeRenderer } from "./runtime/index.js"; export const ALL = [ota]; export function setCapabilityRenderer(fn) { - setOtaRender(fn); setRuntimeRenderer(fn); } diff --git a/docs/capabilities/ota.js b/docs/capabilities/ota.js index dea77a2c..cdf97461 100644 --- a/docs/capabilities/ota.js +++ b/docs/capabilities/ota.js @@ -4,7 +4,7 @@ import { OTA_DATA_CHAR_UUID, OTA_STATUS_CHAR_UUID, decodeJson, encodeJson, -} from "../ble.js"; +} from "../ble/ble.js"; import { freshUrl, escapeHtml, fetchWithTimeout } from "../dom.js"; import { logFor, log } from "../log.js"; import { state } from "../state.js"; @@ -40,8 +40,7 @@ export async function uploadFile(id, filename, destPath, contentBytes, { restart } } -let renderEntry = () => {}; -export function setRender(fn) { renderEntry = fn; } +import { renderEntry } from "./runtime/render-bus.js"; // Patch existing OTA section in place; avoids full innerHTML rewrite on // every progress tick (which would destroy hovered elements and flicker). @@ -123,7 +122,7 @@ async function streamOtaViaWebRTC(entry, bytes) { if (entry.fwType === "pi" && !entry.opsChar) { throw new Error("no ops channel — can't trigger apply"); } - const { openChannel, closePeer } = await import("../webrtc-robot.js"); + const { openChannel, closePeer } = await import("../webrtc/webrtc-robot.js"); let channel; try { channel = await openChannel(entry.id, entry.name, "ota", { diff --git a/docs/capabilities/runtime/ble-snapshot.js b/docs/capabilities/runtime/ble-snapshot.js index fc1153c1..a29200fc 100644 --- a/docs/capabilities/runtime/ble-snapshot.js +++ b/docs/capabilities/runtime/ble-snapshot.js @@ -3,7 +3,7 @@ // Pairs a write-trigger char with a notify-out chunked stream. Same envelope // the OTA path uses (just outbound here): 0x01 begin+u32 len, 0x02 chunk, // 0x03 commit, 0xff err+text. ~10-30 KB JPEG over BLE → ~1-2s per shot. 
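The `ble-snapshot.js` header comment above names a four-opcode envelope (`0x01` begin + u32 length, `0x02` chunk, `0x03` commit, `0xff` error + text). A receiver for that shape could be sketched roughly as follows. This diff does not confirm the byte order of the length field or the exact frame layout, so the sketch assumes a little-endian u32 and one opcode byte followed by the payload; `makeReassembler` is a hypothetical name, not a function from the repo.

```javascript
// Hypothetical reassembler for the chunked envelope described in the comment.
// Assumptions (not confirmed by the diff): the u32 length is little-endian,
// and each frame is one opcode byte followed by its payload.
function makeReassembler() {
  let expected = 0;
  let received = 0;
  let parts = [];

  // Feed one notification payload (Uint8Array). Returns null while the
  // transfer is in progress, and a result object once it ends.
  return function feed(frame) {
    const op = frame[0];
    const body = frame.slice(1); // copy, in case the source buffer is reused
    switch (op) {
      case 0x01: { // begin + u32 total length
        const view = new DataView(body.buffer, body.byteOffset, body.byteLength);
        expected = view.getUint32(0, true);
        received = 0;
        parts = [];
        return null;
      }
      case 0x02: // chunk
        parts.push(body);
        received += body.length;
        return null;
      case 0x03: { // commit: only succeed if every byte arrived
        if (received !== expected) {
          return { ok: false, error: `length mismatch: got ${received}, expected ${expected}` };
        }
        const bytes = new Uint8Array(expected);
        let offset = 0;
        for (const p of parts) {
          bytes.set(p, offset);
          offset += p.length;
        }
        return { ok: true, bytes };
      }
      case 0xff: // error + text
        return { ok: false, error: new TextDecoder().decode(body) };
      default:
        return { ok: false, error: `unknown opcode 0x${op.toString(16)}` };
    }
  };
}

// A 5-byte transfer split across two chunks.
const feed = makeReassembler();
feed(Uint8Array.of(0x01, 5, 0, 0, 0));
feed(Uint8Array.of(0x02, 1, 2, 3));
feed(Uint8Array.of(0x02, 4, 5));
const done = feed(Uint8Array.of(0x03));

// An error frame carrying the text "no".
const failed = makeReassembler()(Uint8Array.of(0xff, 0x6e, 0x6f));
```

Checking the received byte count against the declared length at commit time is a design choice of this sketch; it turns a dropped notification into an explicit failure rather than a truncated JPEG.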
-import { UUIDS_BY_CAP } from "../../ble.js"; +import { UUIDS_BY_CAP } from "../../ble/ble.js"; import { escapeHtml } from "../../dom.js"; import { logFor } from "../../log.js"; import { capSection, setOpen } from "./cap-section.js"; diff --git a/docs/capabilities/runtime/coalesced-write.js b/docs/capabilities/runtime/coalesced-write.js new file mode 100644 index 00000000..2b23440e --- /dev/null +++ b/docs/capabilities/runtime/coalesced-write.js @@ -0,0 +1,34 @@ +// Drop-intermediate-values BLE writer shared by level / rgb / +// signed-pair caps. Web Bluetooth refuses overlapping GATT writes, +// so a stream of dashboard updates (slider drag, joypad poll) needs +// to coalesce: only the most-recently-set value survives, prior +// pending writes are discarded. Three caps had near-identical +// implementations; this is the one. +// +// State per cap lives on the entry: `${capName}Pending` (last value +// queued; null when idle) and `${capName}Sending` (re-entry guard). +// `encode(value)` returns the Uint8Array payload to ship. 
+ +import { logFor } from "../../log.js"; + +export async function coalescedWrite(entry, capName, value, encode) { + const ch = entry[`${capName}Char`]; + if (!ch) return; + entry[`${capName}Pending`] = value; + if (entry[`${capName}Sending`]) return; + entry[`${capName}Sending`] = true; + try { + while (entry[`${capName}Pending`] != null) { + const next = entry[`${capName}Pending`]; + entry[`${capName}Pending`] = null; + try { + await ch.writeValueWithResponse(encode(next)); + } catch (err) { + logFor(entry, `${capName} write failed: ${err.message}`); + break; + } + } + } finally { + entry[`${capName}Sending`] = false; + } +} diff --git a/docs/capabilities/runtime/command.js b/docs/capabilities/runtime/command.js index 692b9606..1c0d7408 100644 --- a/docs/capabilities/runtime/command.js +++ b/docs/capabilities/runtime/command.js @@ -1,6 +1,6 @@ // Schema: { name: "ops", char: "…d9c", type: "command" } // Op-name vocabulary must match Pi's `_ops_handle_write` dispatcher. -import { UUIDS_BY_CAP, encodeJson } from "../../ble.js"; +import { UUIDS_BY_CAP, encodeJson } from "../../ble/ble.js"; import { logFor } from "../../log.js"; import { state } from "../../state.js"; diff --git a/docs/capabilities/runtime/level.js b/docs/capabilities/runtime/level.js index 974143e5..af5aca9c 100644 --- a/docs/capabilities/runtime/level.js +++ b/docs/capabilities/runtime/level.js @@ -7,35 +7,16 @@ // each in-flight write resolves. Without it, dragging stalls with // "GATT operation already in progress". 
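The `coalesced-write.js` helper added above can be exercised without hardware to see which values actually reach the wire. The sketch below copies the loop from the diff, swaps `logFor` for `console.warn` so it is self-contained, and drives it with `fakeCharacteristic`, a hypothetical stand-in for a GATT characteristic that is not part of the repo.

```javascript
// Same drop-intermediate-values loop as coalesced-write.js in this diff,
// with logging replaced by console.warn so the sketch is self-contained.
async function coalescedWrite(entry, capName, value, encode) {
  const ch = entry[`${capName}Char`];
  if (!ch) return;
  entry[`${capName}Pending`] = value;
  if (entry[`${capName}Sending`]) return; // a flush loop is already running
  entry[`${capName}Sending`] = true;
  try {
    while (entry[`${capName}Pending`] != null) {
      const next = entry[`${capName}Pending`];
      entry[`${capName}Pending`] = null;
      try {
        await ch.writeValueWithResponse(encode(next));
      } catch (err) {
        console.warn(`${capName} write failed: ${err.message}`);
        break;
      }
    }
  } finally {
    entry[`${capName}Sending`] = false;
  }
}

// Hypothetical stand-in for a GATT characteristic: records the first byte of
// each payload and resolves on a later macrotask, like a write in flight.
function fakeCharacteristic(written) {
  return {
    writeValueWithResponse(bytes) {
      written.push(bytes[0]);
      return new Promise((resolve) => setTimeout(resolve, 5));
    },
  };
}

const written = [];
const entry = { levelChar: fakeCharacteristic(written) };
const encode = (n) => Uint8Array.of(n & 0xff);

// Simulate a slider drag: five updates fired back to back.
const demo = Promise.all(
  [10, 20, 30, 40, 50].map((v) => coalescedWrite(entry, "level", v, encode))
).then(() => written);
```

With five back-to-back updates, the first value is written immediately and only the most recent pending value follows once that write resolves, so `demo` settles with two writes instead of five. Because the loop tests `!= null` rather than truthiness, a pending value of `0` is still flushed.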
-import { UUIDS_BY_CAP } from "../../ble.js"; -import { logFor } from "../../log.js"; +import { UUIDS_BY_CAP } from "../../ble/ble.js"; import { capSection } from "./cap-section.js"; - +import { coalescedWrite } from "./coalesced-write.js"; import { renderEntry } from "./render-bus.js"; export async function setLevelValue(entry, capName, value) { - const ch = entry[`${capName}Char`]; - if (!ch) return; const range = entry.capSchema?.find(s => s.name === capName)?.range || [0, 100]; const [mn, mx] = range; const v = Math.max(mn, Math.min(mx, Math.round(Number(value) || 0))); - entry[`${capName}Pending`] = v; - if (entry[`${capName}Sending`]) return; - entry[`${capName}Sending`] = true; - try { - while (entry[`${capName}Pending`] != null) { - const next = entry[`${capName}Pending`]; - entry[`${capName}Pending`] = null; - try { - await ch.writeValueWithResponse(Uint8Array.of(next & 0xff)); - } catch (err) { - logFor(entry, `${capName} write failed: ${err.message}`); - break; - } - } - } finally { - entry[`${capName}Sending`] = false; - } + await coalescedWrite(entry, capName, v, (n) => Uint8Array.of(n & 0xff)); } export function makeLevelCap(schema) { diff --git a/docs/capabilities/runtime/mjpeg-restream.js b/docs/capabilities/runtime/mjpeg-restream.js index 4bc0cec3..0a3c757c 100644 --- a/docs/capabilities/runtime/mjpeg-restream.js +++ b/docs/capabilities/runtime/mjpeg-restream.js @@ -13,7 +13,7 @@ // firmware serves Access-Control-Allow-Origin: * so this works out of the // box; non-compliant MJPEG sources will fall through to a silent stream. 
-import { notifyRobotStreamChange } from "../../phones.js"; +import { notifyRobotStreamChange } from "../../pair/phones.js"; const FPS = 15; diff --git a/docs/capabilities/runtime/mjpeg-stream.js b/docs/capabilities/runtime/mjpeg-stream.js index 03aec235..7ab570d2 100644 --- a/docs/capabilities/runtime/mjpeg-stream.js +++ b/docs/capabilities/runtime/mjpeg-stream.js @@ -7,7 +7,7 @@ import { capSection } from "./cap-section.js"; import { startMjpegForward, stopMjpegForward } from "./mjpeg-restream.js"; import { persist } from "../../state.js"; import { startWatcher, stopWatcher } from "../../watcher.js"; -import { isDetectorFailed } from "../../detectors.js"; +import { isDetectorFailed } from "../../perception/detectors.js"; import { renderEntry } from "./render-bus.js"; diff --git a/docs/capabilities/runtime/rgb.js b/docs/capabilities/runtime/rgb.js index 18af61e0..1f3e3e85 100644 --- a/docs/capabilities/runtime/rgb.js +++ b/docs/capabilities/runtime/rgb.js @@ -7,10 +7,9 @@ // the latest pending color and flush after the in-flight BLE write // resolves so we don't pile up "GATT operation already in progress". 
-import { UUIDS_BY_CAP } from "../../ble.js"; -import { logFor } from "../../log.js"; +import { UUIDS_BY_CAP } from "../../ble/ble.js"; import { capSection } from "./cap-section.js"; - +import { coalescedWrite } from "./coalesced-write.js"; import { renderEntry } from "./render-bus.js"; function toHex(r, g, b) { @@ -26,26 +25,7 @@ function fromHex(hex) { } export async function setRgbValue(entry, hex) { - const ch = entry.rgbChar; - if (!ch) return; - const [r, g, b] = fromHex(hex); - entry.rgbPending = [r, g, b]; - if (entry.rgbSending) return; - entry.rgbSending = true; - try { - while (entry.rgbPending) { - const next = entry.rgbPending; - entry.rgbPending = null; - try { - await ch.writeValueWithResponse(Uint8Array.of(next[0], next[1], next[2])); - } catch (err) { - logFor(entry, `rgb write failed: ${err.message}`); - break; - } - } - } finally { - entry.rgbSending = false; - } + await coalescedWrite(entry, "rgb", fromHex(hex), ([r, g, b]) => Uint8Array.of(r, g, b)); } export function makeRgbCap(schema) { diff --git a/docs/capabilities/runtime/signed-pair.js b/docs/capabilities/runtime/signed-pair.js index fde23516..bfc440ac 100644 --- a/docs/capabilities/runtime/signed-pair.js +++ b/docs/capabilities/runtime/signed-pair.js @@ -6,39 +6,24 @@ // // Drop-intermediate-values write path: pointer moves and keyboard ticks // fire faster than BLE writes can complete. -import { UUIDS_BY_CAP } from "../../ble.js"; +import { UUIDS_BY_CAP } from "../../ble/ble.js"; import { escapeHtml } from "../../dom.js"; -import { log, logFor } from "../../log.js"; +import { log } from "../../log.js"; import { state } from "../../state.js"; -import { attachJoypad, mix } from "../../joypad.js"; +import { attachJoypad, mix } from "../../input/joypad.js"; import { capSection } from "./cap-section.js"; - +import { coalescedWrite } from "./coalesced-write.js"; import { renderEntry } from "./render-bus.js"; // Clamp-on-write — callers don't have to check declared range. 
export async function setPairValue(entry, capName, left, right) { - const ch = entry[`${capName}Char`]; - if (!ch) return; const range = entry.capSchema?.find(s => s.name === capName)?.range || [-100, 100]; const [mn, mx] = range; const clamp = (v) => Math.max(mn, Math.min(mx, Math.round(Number(v) || 0))); - entry[`${capName}Pending`] = [clamp(left), clamp(right)]; - if (entry[`${capName}Sending`]) return; - entry[`${capName}Sending`] = true; - try { - while (entry[`${capName}Pending`]) { - const [l, r] = entry[`${capName}Pending`]; - entry[`${capName}Pending`] = null; - try { - await ch.writeValueWithResponse(Uint8Array.of(l & 0xff, r & 0xff)); - } catch (err) { - logFor(entry, `${capName} write failed: ${err.message}`); - break; - } - } - } finally { - entry[`${capName}Sending`] = false; - } + await coalescedWrite( + entry, capName, [clamp(left), clamp(right)], + ([l, r]) => Uint8Array.of(l & 0xff, r & 0xff), + ); } export function makeSignedPairCap(schema) { @@ -101,6 +86,11 @@ export function makeSignedPairCap(schema) { cleanup(entry) { entry[charField] = null; entry[leftField] = entry[rightField] = 0; + // Match level/rgb cleanup — without this, a session that calls + // cleanup without re-running initEntry would leave Sending=true + // and block every future coalescedWrite forever. + entry[`${name}Sending`] = false; + entry[`${name}Pending`] = null; }, renderSection(entry) { diff --git a/docs/capabilities/runtime/toggle.js b/docs/capabilities/runtime/toggle.js index 06145c02..c031ce0c 100644 --- a/docs/capabilities/runtime/toggle.js +++ b/docs/capabilities/runtime/toggle.js @@ -1,7 +1,7 @@ // Schema: { name: "led", char: "…d92", type: "toggle" } // State on entry[Char] (BLE handle) + entry[On] (bool); back- // compat with the prior hand-written LED module's entry.ledOn. 
-import { UUIDS_BY_CAP } from "../../ble.js"; +import { UUIDS_BY_CAP } from "../../ble/ble.js"; import { logFor } from "../../log.js"; import { capSection } from "./cap-section.js"; diff --git a/docs/capabilities/runtime/webrtc-installable.js b/docs/capabilities/runtime/webrtc-installable.js index 05cec0ac..a5db067d 100644 --- a/docs/capabilities/runtime/webrtc-installable.js +++ b/docs/capabilities/runtime/webrtc-installable.js @@ -3,17 +3,17 @@ // install?: { pkg: "camera", confirm: "..." } } // Chunked opcode protocol both ways (browser→robot via signal, // robot→browser via status notify). Install via the `command` cap. -import { UUIDS_BY_CAP, CHUNK_BYTES, encodeJson, decodeJson } from "../../ble.js"; +import { UUIDS_BY_CAP, CHUNK_BYTES, encodeJson, decodeJson } from "../../ble/ble.js"; import { escapeHtml } from "../../dom.js"; import { logFor } from "../../log.js"; import { persist } from "../../state.js"; -import { fetchIceServers } from "../../pairing.js"; -import { registerExternalPc, unregisterExternalPc } from "../../webrtc-robot.js"; +import { fetchIceServers } from "../../pair/pairing.js"; +import { registerExternalPc, unregisterExternalPc } from "../../webrtc/webrtc-robot.js"; import { installPackage } from "./command.js"; import { capSection } from "./cap-section.js"; -import { notifyRobotStreamChange } from "../../phones.js"; +import { notifyRobotStreamChange } from "../../pair/phones.js"; import { startWatcher, stopWatcher } from "../../watcher.js"; -import { isDetectorFailed } from "../../detectors.js"; +import { isDetectorFailed } from "../../perception/detectors.js"; const OP_BEGIN = 0x01; const OP_CHUNK = 0x02; diff --git a/docs/capabilities/runtime/wifi-scan.js b/docs/capabilities/runtime/wifi-scan.js index 3afaa263..95da1196 100644 --- a/docs/capabilities/runtime/wifi-scan.js +++ b/docs/capabilities/runtime/wifi-scan.js @@ -3,7 +3,7 @@ // chars: { scan: "…d93", join: "…d94", status: "…d95" } } // Three-char protocol: scan (read + notify 
list), join (write {s,p}), // status (read + notify {st, ssid, err, ip?}). -import { UUIDS_BY_CAP, decodeJson, encodeJson } from "../../ble.js"; +import { UUIDS_BY_CAP, decodeJson, encodeJson } from "../../ble/ble.js"; import { capSection } from "./cap-section.js"; import { escapeHtml } from "../../dom.js"; import { logFor } from "../../log.js"; @@ -239,8 +239,12 @@ export function makeWifiScanCap(schema) { entry[statusState] = decodeJson(await entry[statusField].readValue()) || { st: "idle" }; await entry[statusField].startNotifications(); entry[statusField].addEventListener("characteristicvaluechanged", (e) => { - entry[statusState] = decodeJson(e.target.value) || { st: "idle" }; - const { st, ssid, err: errMsg } = entry[statusState]; + const next = decodeJson(e.target.value) || { st: "idle" }; + // Skip when payload is identical — ESP32 re-publishes status + // on flap/coex events without state change. + if (JSON.stringify(next) === JSON.stringify(entry[statusState])) return; + entry[statusState] = next; + const { st, ssid, err: errMsg } = next; logFor(entry, `${name} ${st}${ssid ? ` [${ssid}]` : ""}${errMsg ? ` — ${errMsg}` : ""}`); renderEntry(entry); }); diff --git a/docs/event-bus.js b/docs/event-bus.js new file mode 100644 index 00000000..53783344 --- /dev/null +++ b/docs/event-bus.js @@ -0,0 +1,28 @@ +// Typed pub/sub for cross-cutting events. A new "X-like but slightly +// different" topic is a smell — investigate before adding one. 
+ +export const TOPICS = Object.freeze({ + TOOL_CALL: "tool.call", // { tool, input } + TOOL_RESULT: "tool.result", // { tool, ok, error } + WATCHER_FIRE: "watcher.fire", // { entry, detection, kind } + PHONE_ATTACHED: "phone.attached", // { phoneId, robotId, robotLabel } + PHONE_DETACHED: "phone.detached", // { phoneId } +}); + +const _subs = new Map(); + +export function on(topic, fn) { + let set = _subs.get(topic); + if (!set) { set = new Set(); _subs.set(topic, set); } + set.add(fn); + return () => { set.delete(fn); }; +} + +export function emit(topic, payload) { + const set = _subs.get(topic); + if (!set || set.size === 0) return; + for (const fn of set) { + try { fn(payload); } + catch (err) { console.error(`[bus] ${topic} subscriber threw:`, err); } + } +} diff --git a/docs/firmware/bins/aithinker_cam/bootloader.bin b/docs/firmware/bins/aithinker_cam/bootloader.bin index d8d08563..f97fb347 100644 Binary files a/docs/firmware/bins/aithinker_cam/bootloader.bin and b/docs/firmware/bins/aithinker_cam/bootloader.bin differ diff --git a/docs/firmware/bins/aithinker_cam/firmware.bin b/docs/firmware/bins/aithinker_cam/firmware.bin index 4c09902d..45fad1db 100644 Binary files a/docs/firmware/bins/aithinker_cam/firmware.bin and b/docs/firmware/bins/aithinker_cam/firmware.bin differ diff --git a/docs/firmware/bins/aithinker_cam/manifest.json b/docs/firmware/bins/aithinker_cam/manifest.json index e5291fe8..90abb5c9 100644 --- a/docs/firmware/bins/aithinker_cam/manifest.json +++ b/docs/firmware/bins/aithinker_cam/manifest.json @@ -1,8 +1,8 @@ { "board": "aithinker_cam", "chip": "esp32", - "version": "1c84f7d", - "built_at": "2026-05-23T01:13:52Z", + "version": "bd44006", + "built_at": "2026-05-23T16:39:21Z", "files": [ { "path": "bootloader.bin", "offset": "0x1000" }, { "path": "partitions.bin", "offset": "0x8000" }, diff --git a/docs/firmware/bins/aithinker_cam_webrtc/bootloader.bin b/docs/firmware/bins/aithinker_cam_webrtc/bootloader.bin index 416a4339..dee8a0d5 100644 
Binary files a/docs/firmware/bins/aithinker_cam_webrtc/bootloader.bin and b/docs/firmware/bins/aithinker_cam_webrtc/bootloader.bin differ diff --git a/docs/firmware/bins/aithinker_cam_webrtc/firmware.bin b/docs/firmware/bins/aithinker_cam_webrtc/firmware.bin index 3d59f5ac..a7ea25e8 100644 Binary files a/docs/firmware/bins/aithinker_cam_webrtc/firmware.bin and b/docs/firmware/bins/aithinker_cam_webrtc/firmware.bin differ diff --git a/docs/firmware/bins/aithinker_cam_webrtc/manifest.json b/docs/firmware/bins/aithinker_cam_webrtc/manifest.json index 28c53ae4..b9ebd9e3 100644 --- a/docs/firmware/bins/aithinker_cam_webrtc/manifest.json +++ b/docs/firmware/bins/aithinker_cam_webrtc/manifest.json @@ -1,8 +1,8 @@ { "board": "aithinker_cam_webrtc", "chip": "esp32", - "version": "1c84f7d", - "built_at": "2026-05-23T01:13:52Z", + "version": "bd44006", + "built_at": "2026-05-23T16:39:19Z", "files": [ { "path": "bootloader.bin", "offset": "0x1000" }, { "path": "partitions.bin", "offset": "0x8000" }, diff --git a/docs/firmware/bins/c3_supermini/bootloader.bin b/docs/firmware/bins/c3_supermini/bootloader.bin index 69c6ddf6..e35bc132 100644 Binary files a/docs/firmware/bins/c3_supermini/bootloader.bin and b/docs/firmware/bins/c3_supermini/bootloader.bin differ diff --git a/docs/firmware/bins/c3_supermini/firmware.bin b/docs/firmware/bins/c3_supermini/firmware.bin index 54a1f235..2bc000af 100644 Binary files a/docs/firmware/bins/c3_supermini/firmware.bin and b/docs/firmware/bins/c3_supermini/firmware.bin differ diff --git a/docs/firmware/bins/c3_supermini/manifest.json b/docs/firmware/bins/c3_supermini/manifest.json index dff0fedd..c785a176 100644 --- a/docs/firmware/bins/c3_supermini/manifest.json +++ b/docs/firmware/bins/c3_supermini/manifest.json @@ -1,8 +1,8 @@ { "board": "c3_supermini", "chip": "esp32c3", - "version": "1c84f7d", - "built_at": "2026-05-23T01:13:56Z", + "version": "bd44006", + "built_at": "2026-05-23T16:39:21Z", "files": [ { "path": "bootloader.bin", "offset": 
"0x0" }, { "path": "partitions.bin", "offset": "0x8000" }, diff --git a/docs/firmware/bins/devkit/bootloader.bin b/docs/firmware/bins/devkit/bootloader.bin index 8a1edc5d..b28c2d0a 100644 Binary files a/docs/firmware/bins/devkit/bootloader.bin and b/docs/firmware/bins/devkit/bootloader.bin differ diff --git a/docs/firmware/bins/devkit/firmware.bin b/docs/firmware/bins/devkit/firmware.bin index 46488096..902fb7f6 100644 Binary files a/docs/firmware/bins/devkit/firmware.bin and b/docs/firmware/bins/devkit/firmware.bin differ diff --git a/docs/firmware/bins/devkit/manifest.json b/docs/firmware/bins/devkit/manifest.json index 946eb8c3..b35a073e 100644 --- a/docs/firmware/bins/devkit/manifest.json +++ b/docs/firmware/bins/devkit/manifest.json @@ -1,8 +1,8 @@ { "board": "devkit", "chip": "esp32", - "version": "1c84f7d", - "built_at": "2026-05-23T01:13:53Z", + "version": "bd44006", + "built_at": "2026-05-23T16:39:22Z", "files": [ { "path": "bootloader.bin", "offset": "0x1000" }, { "path": "partitions.bin", "offset": "0x8000" }, diff --git a/docs/firmware/pi_robot/SYSTEMD.md b/docs/firmware/pi_robot/SYSTEMD.md new file mode 100644 index 00000000..b9f5f886 --- /dev/null +++ b/docs/firmware/pi_robot/SYSTEMD.md @@ -0,0 +1,25 @@ +# Pi systemd patterns + +## Unit preconditions belong in the script, not in `Condition*` + +`ConditionPathExists=`, `ConditionFileNotEmpty=`, etc. evaluate **once** at +unit-start time and silently skip the unit when false — no retry, no log +noise the operator can search for, no recovery without manual `systemctl +start`. When the prerequisite is racy (asynchronous kernel-driver probes, +hotplug events, network reachability, anything not synchronously guaranteed +by an `After=` ordering), a missed check turns the unit invisibly inert +until the next reboot, and even that may race the same way. + +Pattern instead: drop the `Condition*` and wait inside the `ExecStart` +script with a bounded poll loop. 
The script makes the timeout legible (logs +a clear failure on exhaustion), the unit gets to use `Restart=on-failure` +for self-healing, and a future contributor can read the wait-condition next +to the work it gates. The `usb-gadget.service` → `usb-gadget-setup.sh` pair +is the reference shape: 10 s poll for `/sys/class/udc` to populate, clean +exit-1 with a message if dwc2 never publishes. + +If the precondition really is synchronous and unambiguous (a config file +the user wrote, the existence of a hardware feature already enumerated at +boot), `Condition*` is fine. The line is "does this become true +asynchronously after the unit's `After=` ordering?" — if yes, wait in the +script. diff --git a/docs/firmware/pi_robot/wheels/dbus_fast-5.0.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl b/docs/firmware/pi_robot/wheels/dbus_fast-5.0.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl deleted file mode 100644 index 403a9691..00000000 Binary files a/docs/firmware/pi_robot/wheels/dbus_fast-5.0.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl and /dev/null differ diff --git a/docs/firmware/pi_robot/wheels/dbus_fast-5.0.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl b/docs/firmware/pi_robot/wheels/dbus_fast-5.0.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl deleted file mode 100644 index eae5caf2..00000000 Binary files a/docs/firmware/pi_robot/wheels/dbus_fast-5.0.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl and /dev/null differ diff --git a/docs/firmware/pi_robot/wheels/dbus_fast-5.0.4-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl b/docs/firmware/pi_robot/wheels/dbus_fast-5.0.4-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl new file mode 100644 index 00000000..2a9c16c6 
Binary files /dev/null and b/docs/firmware/pi_robot/wheels/dbus_fast-5.0.4-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl differ diff --git a/docs/firmware/pi_robot/wheels/dbus_fast-5.0.4-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl b/docs/firmware/pi_robot/wheels/dbus_fast-5.0.4-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl new file mode 100644 index 00000000..98f3e4ce Binary files /dev/null and b/docs/firmware/pi_robot/wheels/dbus_fast-5.0.4-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl differ diff --git a/docs/format.js b/docs/format.js index 177255ce..91e2501d 100644 --- a/docs/format.js +++ b/docs/format.js @@ -66,22 +66,9 @@ export function formatUptime(telemetry) { return `up ${Math.floor(s / 86400)}d`; } -// "WiFi 192.168.1.42" / "WiFi joining…" / null when nothing useful to show. -// Status shape matches pi_robot.py's wifi-status JSON ({st, ssid, ip}). -export function formatWifi(wifiStatus) { - const w = wifiStatus; - if (!w) return null; - if (w.st === "joined") return `WiFi ${w.ip || w.ssid || "joined"}`; - if (w.st === "joining") return "WiFi joining…"; - if (w.st === "scanning") return "WiFi scanning"; - if (w.st === "failed") return "WiFi failed"; - return null; // idle / unknown — caller renders nothing -} - -// Terser WiFi for the primary row, where width is precious — drops the IP -// (which lives in the system line / WiFi section). "WiFi" / "WiFi joining…" / -// "WiFi failed". Stays null for idle so an offline robot's row doesn't carry -// an empty label. +// "WiFi" / "WiFi joining…" / "WiFi failed", null for idle so an +// offline robot's row doesn't carry an empty label. Status shape +// matches pi_robot.py's wifi-status JSON ({st, ssid, ip}). 
export function formatWifiShort(wifiStatus) { const w = wifiStatus; if (!w) return null; diff --git a/docs/index.html b/docs/index.html index 7a04d511..57d3f74d 100644 --- a/docs/index.html +++ b/docs/index.html @@ -139,7 +139,7 @@

- Report an issue + Report an issue

@@ -328,7 +328,7 @@

- JS that drives connected robots over BLE. Runs in this tab — nothing is uploaded. In scope: robot, robots, phones, pip, sleep, log, speak. Cmd/Ctrl-Enter to run. USER-CODE.md. + JS that drives connected robots over BLE. Runs in this tab — nothing is uploaded. In scope: robot, robots, phones, pip, sleep, log, speak. Cmd/Ctrl-Enter to run. USER-CODE.md.
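Review note: the write path this diff factors out into `coalescedWrite` (`docs/capabilities/runtime/coalesced-write.js`) matters most for fast callers such as user scripts, the joypad, and the gamepad poller. Below is a minimal, self-contained sketch of that latest-value-wins behaviour. It assumes a stand-in characteristic: `makeFakeChar` and `demo` are hypothetical helpers for illustration only, and `console.error` takes the place of the repo's `logFor`.

```javascript
// Sketch of the coalescing pattern: at most one write in flight per cap,
// and only the most recent pending value is sent once that write resolves.
async function coalescedWrite(entry, capName, value, encode) {
  const ch = entry[`${capName}Char`];
  if (!ch) return;                               // cap not connected: no-op
  entry[`${capName}Pending`] = value;            // newest value overwrites older ones
  if (entry[`${capName}Sending`]) return;        // an active loop will pick it up
  entry[`${capName}Sending`] = true;
  try {
    while (entry[`${capName}Pending`] != null) {
      const next = entry[`${capName}Pending`];
      entry[`${capName}Pending`] = null;
      try {
        await ch.writeValueWithResponse(encode(next));
      } catch (err) {
        console.error(`${capName} write failed: ${err.message}`); // stand-in for logFor
        break;
      }
    }
  } finally {
    entry[`${capName}Sending`] = false;          // always release the guard
  }
}

// Hypothetical fake characteristic: records each write after a short delay.
function makeFakeChar(written, delayMs = 5) {
  return {
    async writeValueWithResponse(bytes) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      written.push(Array.from(bytes));
    },
  };
}

// Fire four writes back-to-back, as a slider drag would.
async function demo() {
  const written = [];
  const entry = { levelChar: makeFakeChar(written) };
  const encode = (n) => Uint8Array.of(n & 0xff);
  await Promise.all([10, 20, 30, 40].map((v) => coalescedWrite(entry, "level", v, encode)));
  return written;
}

demo().then((written) => console.log(JSON.stringify(written))); // prints [[10],[40]]
```

The first value goes out immediately. The intermediate values 20 and 30 are dropped, and 40 is flushed once the first write resolves. This is also why the `signed-pair.js` cleanup in this diff resets `Sending` and `Pending`: a guard left stuck at `true` would make every later call return early.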
diff --git a/docs/gamepad.js b/docs/input/gamepad.js similarity index 93% rename from docs/gamepad.js rename to docs/input/gamepad.js index eb691a1a..bbf57c58 100644 --- a/docs/gamepad.js +++ b/docs/input/gamepad.js @@ -1,8 +1,8 @@ // Polling stops when the last pad disconnects so idle cost is zero. -import { $ } from "./dom.js"; -import { log } from "./log.js"; -import { state } from "./state.js"; -import { sendPairById } from "./capabilities/runtime/signed-pair.js"; +import { $ } from "../dom.js"; +import { log } from "../log.js"; +import { state } from "../state.js"; +import { sendPairById } from "../capabilities/runtime/signed-pair.js"; const GAMEPAD_DEADZONE = 0.10; let _gamepadTargetId = null; diff --git a/docs/joypad.js b/docs/input/joypad.js similarity index 100% rename from docs/joypad.js rename to docs/input/joypad.js diff --git a/docs/mobile-tilt-drive.js b/docs/input/mobile-tilt-drive.js similarity index 99% rename from docs/mobile-tilt-drive.js rename to docs/input/mobile-tilt-drive.js index 2216734a..1a8844d5 100644 --- a/docs/mobile-tilt-drive.js +++ b/docs/input/mobile-tilt-drive.js @@ -1,4 +1,4 @@ -import { $ } from "./dom.js"; +import { $ } from "../dom.js"; import { mix } from "./joypad.js"; // Phone-as-steering-wheel + on-screen throttle pedals. 
Rolling the phone diff --git a/docs/log.js b/docs/log.js index b4d369ed..7d6e6f85 100644 --- a/docs/log.js +++ b/docs/log.js @@ -1,6 +1,5 @@ import { $ } from "./dom.js"; -let _lastLogNode = null; let _lastLogMsgNode = null; let _lastLogNameNode = null; let _lastLogKey = null; @@ -90,7 +89,6 @@ export const log = (msg, name = "") => { if (name && name === _lastLogName && _lastLogNameNode) { _lastLogNameNode.classList.add("dup"); } - _lastLogNode = line; _lastLogMsgNode = msgSpan; _lastLogNameNode = nameSpan; _lastLogName = name; diff --git a/docs/mobile.js b/docs/mobile.js index e04449c9..d48f5307 100644 --- a/docs/mobile.js +++ b/docs/mobile.js @@ -1,17 +1,17 @@ import { $ } from "./dom.js"; -import { joinPairingRoom } from "./pairing.js"; -import { attachJoypad } from "./joypad.js"; +import { joinPairingRoom } from "./pair/pairing.js"; +import { attachJoypad } from "./input/joypad.js"; import { getMyPubkeyB64 } from "./signal-sdk/v1/peer-key.js"; import { makeTrustStore } from "./trust.js"; import { setupServiceWorker, wireInstallMenuItem, wireCheckUpdatesMenuItem, wireHardRefresh, wireDiagnosticsMenuItem, setReportIssueLink, readSwVersion, } from "./app-menu.js"; -import { wireTiltDrive, stopTilt } from "./mobile-tilt-drive.js"; +import { wireTiltDrive, stopTilt } from "./input/mobile-tilt-drive.js"; import { showReconnect, hideReconnect, wireReconnect, cameraUnavailableReason, -} from "./mobile-qr-scan.js"; -import { startNearbyDiscovery, deviceLabel } from "./mobile-nearby-discovery.js"; +} from "./pair/mobile-qr-scan.js"; +import { startNearbyDiscovery, deviceLabel } from "./pair/mobile-nearby-discovery.js"; const _trust = makeTrustStore("better-robotics:trust:v1"); let _peer = null; @@ -67,38 +67,54 @@ function wireStopButton() { } -// Wire: see askHuman() in phones.js. One ask on screen at a time; a second -// replaces the first, prior resolves as skipped when its server-side timer -// fires. 
-function showAsk(msg) { +// Shared phone-side dialog for ask-human and camera-share-request. +// One dialog on screen at a time; a second showPhoneAskDialog call +// replaces the first (the prior's pending response resolves through +// the server-side timeout). `options` is either: +// - array of strings → tappable answer buttons (each calls onRespond +// with its label, once) +// - array of {label, onClick} → custom click handler (e.g. for the +// camera-share Share button that needs to run async work) +// `freeText` enables the text input fallback when no options exist. +function showPhoneAskDialog({ question, imageDataUrl, options, freeText, skipValue, onRespond }) { const dialog = $("phone-ask-dialog"); const img = $("phone-ask-image"); const q = $("phone-ask-question"); const optsEl = $("phone-ask-options"); const free = $("phone-ask-free"); const freeInput = $("phone-ask-free-input"); - - if (msg.imageDataUrl) { img.src = msg.imageDataUrl; img.hidden = false; } - else { img.hidden = true; img.src = ""; } - q.textContent = msg.question || ""; - + let responded = false; + const close = () => { if (!responded) { responded = true; dialog.close(); } }; const respond = (answer) => { - _peer?.send({ type: "ask-reply", askId: msg.askId, answer }); + if (responded) return; + responded = true; + onRespond(answer); dialog.close(); }; + if (imageDataUrl) { img.src = imageDataUrl; img.hidden = false; } + else { img.hidden = true; img.src = ""; } + q.textContent = question || ""; + optsEl.innerHTML = ""; - if (Array.isArray(msg.options) && msg.options.length > 0) { - free.hidden = true; - for (const opt of msg.options) { + const hasOptions = Array.isArray(options) && options.length > 0; + if (hasOptions) { + for (const opt of options) { const b = document.createElement("button"); b.type = "button"; b.className = "ask-option sm"; - b.textContent = String(opt); - b.addEventListener("click", () => respond(String(opt)), { once: true }); + if (typeof opt === "string") { + 
b.textContent = opt; + b.addEventListener("click", () => respond(opt), { once: true }); + } else { + b.textContent = opt.label; + b.addEventListener("click", () => opt.onClick({ respond, close }), { once: true }); + } optsEl.appendChild(b); } - } else { + } + + if (freeText && !hasOptions) { free.hidden = false; freeInput.value = ""; free.onsubmit = (e) => { @@ -106,78 +122,58 @@ function showAsk(msg) { const v = freeInput.value.trim(); if (v) respond(v); }; + } else { + free.hidden = true; } - $("phone-ask-skip").onclick = () => respond(null); + $("phone-ask-skip").onclick = () => respond(skipValue); if (!dialog.open) dialog.showModal(); - // Autofocus the free input when there are no tappable options, so the - // keyboard pops up immediately on mobile. - if (free.hidden === false) setTimeout(() => freeInput.focus(), 50); + // Autofocus the free input so the soft keyboard pops up on mobile. + if (!free.hidden) setTimeout(() => freeInput.focus(), 50); } -// Desktop relayed a "please share your camera" prompt over the data -// channel. Browsers won't let getUserMedia() run without a user gesture -// in this tab; the Share button click below IS that gesture, so -// toggleShareCamera() called synchronously from the handler can call -// getUserMedia successfully. The handler awaits the share flow and -// reports back so the desktop's startHelperCamera tool can resolve +function showAsk(msg) { + showPhoneAskDialog({ + question: msg.question, + imageDataUrl: msg.imageDataUrl, + options: msg.options, + freeText: true, + skipValue: null, + onRespond: (answer) => _peer?.send({ type: "ask-reply", askId: msg.askId, answer }), + }); +} + +// Browsers won't let getUserMedia() run without a user gesture in +// this tab; the Share button click below IS that gesture. The handler +// reports back so the desktop's startHelperCamera tool resolves // instead of dead-ending on a string error. 
-// -// Reuses the phone-ask-dialog DOM rather than introducing a parallel -// modal — same Share / Not now affordance shape as askHuman. function showCameraShareRequest(msg) { - const dialog = $("phone-ask-dialog"); - const img = $("phone-ask-image"); - const q = $("phone-ask-question"); - const optsEl = $("phone-ask-options"); - const free = $("phone-ask-free"); - let responded = false; - - img.hidden = true; img.src = ""; - q.textContent = "Pip wants to use this phone's camera. Share it?"; - free.hidden = true; - optsEl.innerHTML = ""; - - const respond = (result, error) => { - if (responded) return; - responded = true; - _peer?.send({ type: "camera-share-result", requestId: msg.requestId, result, error }); - dialog.close(); - }; - - const shareBtn = document.createElement("button"); - shareBtn.type = "button"; - shareBtn.className = "ask-option sm"; - shareBtn.textContent = _shareStream ? "Already sharing" : "Share camera"; - shareBtn.addEventListener("click", async () => { - // Already-sharing fast path — desktop sometimes asks before its - // onTrack handler has registered the stream we already sent. - if (_shareStream) { respond("shared"); return; } - // toggleShareCamera() awaits getUserMedia internally; the user - // gesture from this click propagates through the first await per - // the user-activation spec, so the permission dialog (if any) is - // allowed to show. The {ok, error} return surfaces the real - // permission / device error to the desktop instead of collapsing - // every failure to "user dismissed". 
- try { - const res = await toggleShareCamera(); - if (res?.ok) respond("shared"); - else respond("error", res?.error || "getUserMedia returned no stream"); - } catch (err) { - respond("error", err.message || String(err)); - } - }, { once: true }); - optsEl.appendChild(shareBtn); - - const cancelBtn = document.createElement("button"); - cancelBtn.type = "button"; - cancelBtn.className = "ask-option sm"; - cancelBtn.textContent = "Not now"; - cancelBtn.addEventListener("click", () => respond("denied"), { once: true }); - optsEl.appendChild(cancelBtn); - - $("phone-ask-skip").onclick = () => respond("denied"); - if (!dialog.open) dialog.showModal(); + const send = (result, error) => _peer?.send({ + type: "camera-share-result", requestId: msg.requestId, result, error, + }); + showPhoneAskDialog({ + question: "Pip wants to use this phone's camera. Share it?", + skipValue: "denied", + onRespond: (answer) => send(answer ?? "denied"), + options: [ + { + label: _shareStream ? "Already sharing" : "Share camera", + onClick: async ({ respond, close }) => { + // Desktop sometimes asks before its onTrack handler has + // registered the stream we already sent — short-circuit. 
+ if (_shareStream) { respond("shared"); return; } + try { + const res = await toggleShareCamera(); + if (res?.ok) respond("shared"); + else { send("error", res?.error || "getUserMedia returned no stream"); close(); } + } catch (err) { + send("error", err.message || String(err)); close(); + } + }, + }, + { label: "Not now", onClick: ({ respond }) => respond("denied") }, + ], + }); } // Pairing layer fires onTrack per track; both video tracks of one stream @@ -185,18 +181,25 @@ function showCameraShareRequest(msg) { function onPeerTrack(e) { const v = $("phone-cam"); const section = $("phone-cam-section"); + const waiting = $("phone-cam-waiting"); const stream = e.streams?.[0]; if (!stream) return; if (v.srcObject !== stream) v.srcObject = stream; section.hidden = false; - // When the remote ends the track (laptop user clicked Stop), hide the - // section so the phone doesn't show a frozen last frame as if it were live. + if (waiting) waiting.hidden = true; + // When the remote ends the track (laptop user clicked Stop), surface + // the "no stream" state instead of a frozen last frame. In operator- + // cam mode that means resurrecting the waiting overlay; in default + // mode it means hiding the whole section. for (const t of stream.getTracks()) { t.addEventListener("ended", () => { - // If all tracks are ended, hide. Other tracks may still be live. 
if (stream.getTracks().every(t2 => t2.readyState === "ended")) { - section.hidden = true; v.srcObject = null; + if (_currentScreenMode === "operator-cam") { + if (waiting) waiting.hidden = false; + } else { + section.hidden = true; + } } }); } @@ -262,6 +265,7 @@ function renderCameraPicker() { function onPeerMessage(msg) { if (msg.type === "ask") { showAsk(msg); return; } if (msg.type === "request-camera-share") { showCameraShareRequest(msg); return; } + if (msg.type === "screen-mode") { applyScreenMode(msg.mode, msg.robotLabel); return; } if (msg.type === "available-sources") { _availableSources.set(msg.robotId, { sources: msg.sources || [], active: msg.active || null, @@ -295,6 +299,72 @@ function onPeerMessage(msg) { } } +// Phone-on-robot screen modes (set by the desktop via setPhoneScreenMode +// when the operator mounts a phone via attachPhoneCameraTo): +// "operator-cam" — fullscreen incoming video (the operator's face if +// a local cam's "Send to phone" role is on; black otherwise). +// "default" — normal operator companion UI. +// In attached mode the sticky Stop button stays visible (semi- +// transparent) so anyone in the room can still halt the robot. Desktop +// owns the choice; the phone has no local override. Reset on peer. +// onClose so a disconnect leaves the user with normal UI to reconnect. 
+let _currentScreenMode = "default"; +function applyScreenMode(mode, robotLabel) { + const body = document.body; + const section = $("phone-cam-section"); + const waiting = $("phone-cam-waiting"); + const waitingName = $("phone-cam-waiting-name"); + const v = $("phone-cam"); + if (mode === _currentScreenMode) { + body.dataset.attachedTo = robotLabel || ""; + if (waitingName) waitingName.textContent = robotLabel || "this robot"; + return; + } + body.classList.remove("phone-mounted", "phone-attached"); + delete body.dataset.attachedTo; + if (mode === "operator-cam") { + body.classList.add("phone-mounted", "phone-attached"); + body.dataset.attachedTo = robotLabel || ""; + // Surface the camera section even before a track lands so the screen + // isn't an unlabeled black void. Hide the section's normal "tap to + // switch source" affordance; the waiting overlay takes over. + if (section) section.hidden = false; + const hasStream = !!v?.srcObject; + if (waiting) waiting.hidden = hasStream; + if (waitingName) waitingName.textContent = robotLabel || "this robot"; + } else { + // Leaving attached mode: the section's visibility goes back to + // "shown only when a stream is present" (onPeerTrack toggles it). + if (section && !v?.srcObject) section.hidden = true; + if (waiting) waiting.hidden = true; + } + _currentScreenMode = mode === "operator-cam" ? mode : "default"; + // Keep the phone screen on while it's mounted on a robot — otherwise + // iOS dims and locks after ~30s of no tap, which breaks the operator- + // cam relay. Acquire unconditionally; iOS may ignore without a recent + // gesture, in which case the visibilitychange handler retries when + // the user returns. + if (_currentScreenMode === "default") releaseWakeLock(); + else acquireWakeLock(); +} + +// Screen Wake Lock — held while the phone is attached to a robot. 
+// Auto-released by the browser when the tab is backgrounded; we clear +// the ref on visibility=hidden and re-acquire on visibility=visible if +// still attached. iOS requires the request to land near a user gesture +// for first-time acquire; pairing tap chain usually satisfies this. +let _wakeLock = null; +async function acquireWakeLock() { + if (_wakeLock || !("wakeLock" in navigator)) return; + try { _wakeLock = await navigator.wakeLock.request("screen"); } + catch { _wakeLock = null; } +} +async function releaseWakeLock() { + if (!_wakeLock) return; + try { await _wakeLock.release(); } catch {} + _wakeLock = null; +} + function wireJoypad() { const pad = $("phone-joypad"); const knob = pad?.querySelector(".joypad-knob"); @@ -317,6 +387,11 @@ function wireBackgroundStop() { stopTilt(); _peer?.send({ type: "drive", l: 0, r: 0 }); _stopSharing(); + // Browser auto-releases the wake lock on background; drop the ref + // so a re-acquire on return doesn't short-circuit. + _wakeLock = null; + } else if (_currentScreenMode !== "default") { + acquireWakeLock(); } }); } @@ -680,6 +755,10 @@ async function init() { $("phone-cam-section").hidden = true; _stopSharing(); $("phone-share").hidden = true; + // Exit attached-mode on disconnect so the user lands on normal UI + // to reconnect from. Desktop will re-send "attached" on reconnect + // if this phone was mounted (see phones.js phone-connect path). + applyScreenMode("default"); showReconnect("Lost the desktop. 
Scan a fresh QR to reconnect."); startNearbyDiscovery(); }); diff --git a/docs/mobile-nearby-discovery.js b/docs/pair/mobile-nearby-discovery.js similarity index 96% rename from docs/mobile-nearby-discovery.js rename to docs/pair/mobile-nearby-discovery.js index f7742f59..998851aa 100644 --- a/docs/mobile-nearby-discovery.js +++ b/docs/pair/mobile-nearby-discovery.js @@ -1,7 +1,7 @@ -import { $ } from "./dom.js"; -import { discover } from "./signal-sdk/v1/discover.js"; -import { getMyPubkeyB64 } from "./signal-sdk/v1/peer-key.js"; -import { pairRequestClient } from "./signal-sdk/v1/pair-request.js"; +import { $ } from "../dom.js"; +import { discover } from "../signal-sdk/v1/discover.js"; +import { getMyPubkeyB64 } from "../signal-sdk/v1/peer-key.js"; +import { pairRequestClient } from "../signal-sdk/v1/pair-request.js"; // LAN discovery — request/accept flow. // diff --git a/docs/mobile-qr-scan.js b/docs/pair/mobile-qr-scan.js similarity index 99% rename from docs/mobile-qr-scan.js rename to docs/pair/mobile-qr-scan.js index 74b64ad6..da264dbd 100644 --- a/docs/mobile-qr-scan.js +++ b/docs/pair/mobile-qr-scan.js @@ -1,4 +1,4 @@ -import { $ } from "./dom.js"; +import { $ } from "../dom.js"; let _scanStream = null; let _scanRaf = 0; diff --git a/docs/pairing.js b/docs/pair/pairing.js similarity index 99% rename from docs/pairing.js rename to docs/pair/pairing.js index 00ae42bb..8a59e8ef 100644 --- a/docs/pairing.js +++ b/docs/pair/pairing.js @@ -2,8 +2,7 @@ // hard failure (channel closed and ICE restart didn't recover within the // grace window) counts as "disconnected, rescan QR". 
// -// Signal protocol (~/Github/jonasneves/signal/src/server/room.js): -// connect wss://signal.neevs.io/{room}/ws +// Signal protocol (wss://signal.neevs.io/{room}/ws): // send { type: "signal", peer: myPeerId, data: { offer|answer|ice } } // recv { type: "state", peers: { peerId: lastSignal } } // once, on connect // { type: "signal", peer: theirPeerId, data: {...} } @@ -13,7 +12,7 @@ // fixed role key. The server's `state` snapshot recovers signals sent // before late-joiners arrive; applied only when we're not already on a // healthy connection. -import { SIGNAL_WS, TURN_URL } from "./endpoints.js"; +import { SIGNAL_WS, TURN_URL } from "../endpoints.js"; // TURN proxy mints short-lived Cloudflare Realtime creds. STUN stays in // line as a zero-roundtrip fallback so a degraded proxy (offline, rate- // limited, mis-deployed) still gives us STUN-only pairing instead of nothing. @@ -51,7 +50,7 @@ const QUEUE_MAX = 1000; // as long as the dialog is — cleanup happens on dialog close. const ICE_TIMEOUT_MS = 30000; -import { parseCandidate, probeNetwork } from "./net-probe.js"; +import { parseCandidate, probeNetwork } from "../net-probe.js"; // Per-attempt diagnostic capture: every local + remote ICE candidate this // side has seen during the most recent pair attempt. 
The Diagnostics diff --git a/docs/phone-helpers.js b/docs/pair/phone-helpers.js similarity index 94% rename from docs/phone-helpers.js rename to docs/pair/phone-helpers.js index 820981f0..bd1c5131 100644 --- a/docs/phone-helpers.js +++ b/docs/pair/phone-helpers.js @@ -1,8 +1,9 @@ -import { $, escapeHtml } from "./dom.js"; +import { $, escapeHtml } from "../dom.js"; import { listPhones, setPhonesChangeHandler, notifyRobotStreamChange, requestPhoneCameraShare, setPhoneFeedStream } from "./phones.js"; -import { state } from "./state.js"; -import { settings, saveSettings } from "./settings.js"; -import { setOverheadSource, clearOverheadSource } from "./aruco.js"; +import { emit as busEmit, TOPICS } from "../event-bus.js"; +import { state } from "../state.js"; +import { settings, saveSettings } from "../settings.js"; +import { setOverheadSource, clearOverheadSource } from "../perception/aruco.js"; // Permanent print-marker affordance, rendered whenever a helper is the // active overhead source. 
Single source of truth (no duplication into @@ -51,6 +52,7 @@ export function initHelpers() { if (navigator.mediaDevices?.addEventListener) { navigator.mediaDevices.addEventListener("devicechange", () => enumerateLocalCameras()); } + wireDelegation(); render(); } @@ -222,6 +224,7 @@ export function attachPhoneCameraTo(phoneId, robotId) { } if (!robotId) { _phoneAttachments.delete(phoneId); + busEmit(TOPICS.PHONE_DETACHED, { phoneId }); } else { if (settings.arucoOverheadPhoneId === phoneId) { settings.arucoOverheadPhoneId = null; @@ -230,6 +233,8 @@ export function attachPhoneCameraTo(phoneId, robotId) { _phoneAttachments.set(phoneId, robotId); const ps = _phoneStreams.get(phoneId); if (ps?.stream) routeAttachedStream(phoneId, ps.stream); + const robot = state.devices.get(robotId); + busEmit(TOPICS.PHONE_ATTACHED, { phoneId, robotId, robotLabel: robot?.name || null }); } render(); } @@ -359,7 +364,7 @@ function renderPhoneCard(p) { // robot card (see attachPhoneCameraTo callers in app.js). The mounted // status surfaces in the meta line above. const currentRole = isOverhead ? "overhead" : "operator"; - const picker = (live && !attachedRobot) ? ` + const rolePicker = (live && !attachedRobot) ? `