Motion planning: diff drive pose controller + dashboard UI #1
Open
JaredBaileyDuke wants to merge 7 commits into
Adds packages/control, a C++ library with two PoseController implementations (continuous and spin-move-spin), diff drive kinematics, and pybind11 bindings for future compiled use.

Adds firmware/pi_robot/motion_control.py, a pure Python mirror of the C++ interface that runs on the Pi today.

Wires into pi_robot.py: three new BLE characteristics (motion-goal, motion-pose, motion-status), wheel config from pi-robot.conf, 20 Hz tick loop, and automatic cancellation on manual joystick input.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
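For readers unfamiliar with the spin-move-spin strategy named above, here is a minimal sketch of how such a controller might break a pose goal into three segments. It is not the code in packages/control or motion_control.py; the function names `wrap_angle` and `spin_move_spin_plan` and the tuple layout `(x, y, theta)` are assumptions made for illustration.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def spin_move_spin_plan(pose, goal, eps=1e-9):
    """Split a pose-to-pose move into (first_spin, distance, second_spin).

    pose and goal are hypothetical (x, y, theta) tuples in metres and radians.
    """
    x, y, theta = pose
    gx, gy, gtheta = goal
    dx, dy = gx - x, gy - y
    distance = math.hypot(dx, dy)
    # If the goal position coincides with the current one, keep the current
    # heading so the first spin is zero and only the final spin remains.
    heading = math.atan2(dy, dx) if distance > eps else theta
    first_spin = wrap_angle(heading - theta)    # turn to face the goal
    second_spin = wrap_angle(gtheta - heading)  # turn to the goal heading
    return first_spin, distance, second_spin
```

A continuous controller would instead blend turning and driving in every tick; this sketch covers only the discrete three-phase decomposition.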
Redefines the 2-byte motor protocol as (forward%, turn%) instead of (left%, right%). The Pi converts to physical (v, ω) using wheel_separation and max_wheel_speed from pi-robot.conf, then applies diff drive kinematics to get correct wheel PWM values. Handles backward turns correctly without a sign flip.

Dashboard senders updated to match:
- joypad: sends raw (forward, turn) instead of pre-mixed (left, right)
- keyboard: removes mix() call, sends (throttle, turn) directly
- gamepad: switches from dual-stick tank to left-Y forward + right-X turn

The 4-byte LLM pulse path is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
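The conversion described above can be sketched with the standard differential-drive inverse kinematics. This is an illustration under stated assumptions, not the code in pi_robot.py: the function name, the scaling of 100% turn to the spin-in-place rate, the sign convention (positive turn is counter-clockwise), and the saturation step are all guesses.

```python
def forward_turn_to_wheels(forward_pct, turn_pct, wheel_separation, max_wheel_speed):
    """Map (forward%, turn%) in [-100, 100] to (left%, right%) wheel commands."""

    def clamp(pct):
        return max(-100.0, min(100.0, float(pct)))

    # Assumed scaling: 100% forward is max_wheel_speed (m/s); 100% turn is the
    # spin-in-place rate at which each wheel runs at max_wheel_speed.
    v = clamp(forward_pct) / 100.0 * max_wheel_speed
    omega_max = 2.0 * max_wheel_speed / wheel_separation
    omega = clamp(turn_pct) / 100.0 * omega_max  # rad/s, positive = CCW (assumed)

    # Differential-drive inverse kinematics: each wheel is offset by half the
    # wheel separation from the robot centre.
    half = wheel_separation / 2.0
    v_left = v - omega * half
    v_right = v + omega * half

    # If either wheel would exceed the limit, scale both by the same factor so
    # the ratio between them (and so the path curvature) is preserved.
    peak = max(abs(v_left), abs(v_right), max_wheel_speed)
    scale = max_wheel_speed / peak
    return (100.0 * v_left * scale / max_wheel_speed,
            100.0 * v_right * scale / max_wheel_speed)
```

Because ω is defined in the robot frame, its sign means the same rotation direction whether v is positive or negative, which is one way to read the "no sign flip" remark for backward turns.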
Adds a Motion cap card (pose-control type) that appears on Pi robots with motors enabled. Supports two modes selectable per robot:
- Manual: virtual joypad wired to motorsChar (same diff-drive kinematics as the Motors card)
- Pose Control: X/Y (m) + theta (°) inputs, controller toggle (Spin→Move→Spin / Continuous), Go/Cancel buttons, live status pill, collapsible wheel-config overrides (separation, radius, max speed)

Dashboard changes:
- public/capabilities/runtime/motion.js: new makeMotionCap
- public/ble.js: UUIDS_BY_CAP.motion + import 3 motion UUIDs
- public/capabilities/runtime/index.js: register "pose-control" type
- public/capabilities/runtime/cap-section.js: motion defaults open
- public/styles.css: motion card layout styles

Firmware changes:
- pi_robot.py _motion_start + _motion_goal_handle_write now accept optional wheel_sep / wheel_r / max_spd overrides from the goal message; fall back to pi-robot.conf values when absent

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Before sending a pose goal, write entry.cvPosition to motionPoseChar so the robot's odometry starts from the camera-measured position rather than its dead-reckoned estimate. Fix is skipped if cvPosition is absent or stale (> 10s old). The pose tab shows the last CV fix and dims it when stale. No-ops until the cv branch is merged and entry.cvPosition is populated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the Go-button-triggered CV fix with a 500ms polling loop (startCvWatch/stopCvWatch) that automatically writes odometry resets to motionPoseChar whenever a fresh, non-stale cvPosition arrives. No user interaction required; no CV state visible in the motion card.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
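The decision the polling loop makes on each tick can be sketched as a small predicate. The dashboard code is JavaScript; this Python version only mirrors the logic described in the commit messages (skip when absent, skip when older than 10 s, write only when the fix is new). The dict layout with a `ts` field and the function name `fresh_cv_fix` are hypothetical.

```python
import time

STALE_AFTER_S = 10.0  # staleness threshold stated in the earlier commit message


def fresh_cv_fix(cv_position, last_written_ts, now=None):
    """Return the fix to write, or None if absent, stale, or already written.

    cv_position is a hypothetical dict such as
    {"x": 1.0, "y": 2.0, "theta": 0.0, "ts": 1700000000.0} (ts in epoch seconds).
    """
    if cv_position is None:
        return None  # no camera fix yet
    now = time.time() if now is None else now
    if now - cv_position["ts"] > STALE_AFTER_S:
        return None  # too old to trust as ground truth
    if last_written_ts is not None and cv_position["ts"] <= last_written_ts:
        return None  # same fix as last time; avoid redundant odometry resets
    return cv_position
```

A watcher would call this every 500 ms, write the returned fix to the pose characteristic when it is not None, and remember its `ts` as `last_written_ts`.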
Member
Oops, just seeing your PRs right now as I was disconnecting. I will take a look at them tomorrow, but ping me on Teams if you have anything blocking you.
Collaborator
Author
There is no rush. Just feeling inspired today and stayed up late working. I'm bringing in my balance bot today so that we can test the code whenever you are free next.
8 tasks
Merge picks up motor calibration orientation, ESP32 camera lifecycle, overhead ArUco helper relocation, and other main-line changes since the branch's base.

Resolved conflict in firmware/pi_robot/pi_robot.py: kept both the new motors_orientation flips from main (ORIENT_SWAP, ORIENT_INVERT_A/B) and the wheel-geometry config additions from motion_planning (WHEEL_SEPARATION, WHEEL_RADIUS, MAX_WHEEL_SPEED). Motion controller writes go through _apply_motors so the orientation flips apply to pose-driven motion as well.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts:
#	firmware/pi_robot/pi_robot.py
jonasneves added a commit that referenced this pull request May 20, 2026
Four latency optimizations layered onto the OpenAI TTS path. None touch the Web Speech fallback or change behavior for users without an API key configured.
1. MediaSource streaming. Old path: await res.blob(), which waits for
the FULL mp3 before any audio plays (~500-1000ms TTFA). New: pipe the
response body chunk-by-chunk into a MediaSource via a backpressure
queue; audio starts as soon as the decoder has enough header data
(~50-100ms after first chunk lands). Saves 200-400ms on every single
speak(). Falls back to buffer-then-play if MSE isn't supported (some
Safari versions) or the codec mismatches.
2. gpt-4o-mini-tts. Replaces tts-1. Lower TTFB (~250ms vs ~400ms by
OpenAI's published benchmarks) AND supports a free-form `instructions`
parameter for voice character, so no more picking from 6 fixed voices.
Default instructions: "peppy, youthful, slightly excited energy. Sound
like a young hero — quick, enthusiastic, friendly. American accent,
slightly higher pitched." Direct response to the "voice too adult /
slightly sensual" feedback on onyx. tts-1 ignores the field; we only
send it on gpt-4o-* models.
3. Connection pre-warm. First speak() fires a parallel HEAD on
/v1/models to pay the TLS handshake (~150-250ms) once instead of inside
the first user-visible speak. Subsequent calls reuse the kept-alive
HTTP/2 connection from the browser pool. Once per page load,
fire-and-forget.
4. (Already done last commit.) speak() returns a Promise that resolves
on actual audio.onended; tool dispatch awaits it. Together with #1, the
next motion fires the moment audio truly ends: no more truncation, no
over-padding.
currentVoice() debug helper now reports the active engine + model +
voice + whether streaming is supported, so DevTools `currentVoice()`
gives a complete picture without introspecting the module.
Backpressure queue inside playStreamingMSE: reader can outpace the
decoder (chunks arrive faster than MediaSource can append them); queue
absorbs the burst and drains on each updateend event. Cleanup is
idempotent (resolved flag) so error / cancel paths don't
double-resolve.
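The backpressure queue described above can be modelled in a few lines. The real code runs in the browser against a MediaSource SourceBuffer, which accepts one append at a time and fires `updateend` when it is ready for the next; the Python class below is only a toy model of that pattern, and its name and methods (`AppendQueue`, `push`, `on_update_end`, `finish`) are invented for illustration.

```python
from collections import deque


class AppendQueue:
    """Toy model of a backpressure queue in front of a one-append-at-a-time sink."""

    def __init__(self, sink_append):
        self._append = sink_append  # starts one append on the sink
        self._pending = deque()     # chunks that arrived while the sink was busy
        self._busy = False
        self._done = False

    def push(self, chunk):
        """Called by the reader for every chunk that arrives."""
        if self._done:
            return
        self._pending.append(chunk)
        self._drain()

    def on_update_end(self):
        """Called when the sink reports that the previous append finished."""
        self._busy = False
        self._drain()

    def _drain(self):
        # Hand over at most one chunk; the rest wait for the next update-end.
        if not self._busy and self._pending:
            self._busy = True
            self._append(self._pending.popleft())

    def finish(self):
        """Idempotent cleanup: returns True only on the first call."""
        if self._done:
            return False
        self._done = True
        self._pending.clear()
        return True
```

The `finish` guard plays the role of the "resolved flag" mentioned in the commit message: whichever of the end, error, or cancel paths runs first wins, and later calls are no-ops.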
jonasneves added a commit that referenced this pull request May 20, 2026
Three fixes for the introduce demo:
1. TTS reading punctuation aloud. Expressive instructions ("peppy,
excited") can make gpt-4o-mini-tts vocalize trailing periods as
"dot" — and once the Cache API stores a bad render, it replays
forever. Removed every trailing period and ellipsis from the
introduce strings; the words carry their own prosody. Bumped
CACHE_NAME from tts-v1 → tts-v2 to invalidate the accumulated bad
renders (any phrase that got cached with a "dot" pronunciation gets
re-rendered fresh on next play). Pattern: never end a TTS string
with a period unless you specifically want the model to render
a pause + close.
2. "Follow you around" ending. Previous version did a settle-spin
after the line, which (a) repeated the spin gesture from two lines
earlier and (b) didn't illustrate "following" at all. Replaced
with a smooth S-curve forward (1.4s gentle right arc + 1.4s gentle
left arc), both moving forward — reads as "I'd come this way to
find you" instead of "I'd spin in place."
3. "Spin" line that wasn't audible. Same root cause as #1 — the
"spin..." render (with ellipsis) got cached badly and replayed
silent or glitched. Clean text "spin" + cache bump fixes it on
next play.
General principle worth keeping: tts-v2 cache namespace is the kill
switch for any future bad-render accumulation; bump it whenever model,
voice, instructions, or expressive style changes (or as a hammer
when you see weird audio).
9b7decb to 23b8179
jonasneves added a commit that referenced this pull request May 24, 2026
jonasneves added a commit that referenced this pull request May 24, 2026
Summary
- motion_control.py: diff drive pose controller with spin_move_spin and continuous modes; BLE service exposing goal, pose, and status characteristics
- packages/control/: DiffDrive, Pose2D, SpinMoveSpinController, ContinuousController with Python bindings via pybind11
- … motionPoseChar whenever a fresh entry.cvPosition arrives from the overhead camera panel; keeps odometry aligned with camera ground truth with no user action required

Test plan
- … Pose · Move · Pose and Continuous modes, confirm robot reaches goal and status updates

🤖 Generated with Claude Code