Skip to content

Stabilize Android OpenClaw progress, streaming, and audio routing#60

Open
ryosuzuki wants to merge 69 commits into
Intent-Lab:mainfrom
ryosuzuki:ryosuzuki/beta
Open

Stabilize Android OpenClaw progress, streaming, and audio routing#60
ryosuzuki wants to merge 69 commits into
Intent-Lab:mainfrom
ryosuzuki:ryosuzuki/beta

Conversation

@ryosuzuki
Copy link
Copy Markdown

@ryosuzuki ryosuzuki commented May 26, 2026

Summary

  • Stabilizes Android glasses streaming restart and foreground-service cleanup.
  • Adds OpenClaw progress updates that Gemini can speak during long-running tool calls.
  • Improves OpenClaw tool result handling, duplicate execute coalescing, image attachment flow, and developer session controls.
  • Fixes Android camera/video settings so Start on Phone no longer forces video streaming back on.
  • Stabilizes Bluetooth audio routing by using the phone mic with AirPods/A2DP media output, avoiding silent SCO mic capture and preserving Gemini playback to Bluetooth media output.
  • Merges the latest main changes and resolves OpenClaw WebSocket routing conflicts.

Validation

  • Ran ./gradlew :app:compileDebugKotlin with Android Studio JBR.
  • Installed and tested on Pixel 8 Pro.
  • Verified video streaming can stay disabled when starting on phone.
  • Verified AirPods Pro route logs: input routed to Pixel built-in mic, output routed to AirPods A2DP/media, Gemini audio received, and AudioTrack.write writes PCM to AirPods.
  • Tested with Meta Ray-Ban glasses; glasses stream reached STARTING, STARTED, and STREAMING.
  • Checked Android/iOS app source for conflict markers, debug BGTEST logs, Japanese-specific strings, and personal IP additions.

Notes

  • No OpenClaw server change is required for the current progress-speech behavior; the Android app consumes existing OpenClaw progress/event data and passes concise updates through Gemini.
  • Local Secrets.kt remains gitignored and is not included.
  • Local .idea/ files are untracked and intentionally not included.
  • iOS build was not completed locally because the installed Xcode is missing the required iOS 26.2 platform destination.

Lee-daeho and others added 30 commits March 16, 2026 14:53
Accumulate voice transcripts as a scrollable chat dialog with user
messages as blue bubbles (right) and AI responses (left). Tool calls
shown as centered status pills. Tab switcher at top lets users toggle
between camera feed and chat view during active Gemini sessions.
Auto-switches to chat tab in audio-only mode.
- extract_glass_sessions.py: extracts all glass session data from OpenClaw store
- analyze_glass_sessions.py: computes basic stats, tool latency, category breakdown
- classify_with_llm.py: refined keyword classification with system msg filtering
- ANALYSIS_REPORT.md: P1 (Xiaoan) usage report with all metrics
- RemoteLogger.swift: fire-and-forget event logger matching IntentOS pattern
  Logs: voice:user, voice:ai, voice:tool_call, voice:tool_result,
  session:start, session:end
- GeminiSessionViewModel: wired logging at session start/end, turn complete,
  and tool call initiation/result
- server/index.js: added /api/logs POST/GET endpoint, stores JSONL per day
  in server/logs/ directory
- .gitignore: exclude server/logs/ (sensitive session data)
- server/index.js: write logs to ~/.openclaw/visionclaw-logs/ instead of
  server/logs/ so OpenClaw can discover and read them via file tools
- extract_voice_logs.py: script to extract and analyze voice interaction
  logs (complements extract_glass_sessions.py for OpenClaw tool-call logs)
… (iOS)

- New capture_photo Gemini tool declaration alongside execute tool
- ToolCallRouter intercepts capture_photo locally (not sent to OpenClaw)
- PhotoCaptureStore persists captured frames as JPEG with JSON manifest
- GalleryView (3-column grid) and GalleryDetailView (full screen + share + delete)
- Gallery button in StreamView top bar when Gemini is active
- Capture toast notification on successful photo save
- execute tool gains include_image boolean param (default false)
- Gemini sets include_image=true only when task needs visual context
- ToolCallRouter passes latest camera frame when flag is set
- OpenClawBridge sends image as base64 in OpenAI vision format
- Conversation history changed to [String: Any] for multimodal content
- No image sent on text-only tasks (no latency impact)
- RemoteLogger.kt: fire-and-forget event logger matching iOS implementation
  Logs: voice:user, voice:ai, voice:tool_call, voice:tool_result,
  session:start, session:end
- GeminiSessionViewModel: wired logging at session start/end, turn complete,
  and tool call initiation/result
… (Android)

- capture_photo tool declaration + local interception in ToolCallRouter
- PhotoCaptureStore: persistent JPEG gallery with JSON manifest
- GalleryScreen (3-column grid) + GalleryDetailScreen (share + delete)
- Gallery button always visible in StreamScreen top bar
- Capture toast on successful photo save
- FileProvider paths updated for captures directory sharing
… INTERRUPT scheduling

- execute tool changed from BLOCKING to NON_BLOCKING (Gemini keeps talking)
- Tool responses include scheduling=INTERRUPT (Gemini interrupts to speak result)
- Image tasks routed through WebSocket chat.send with attachments (not HTTP)
- OpenClawEventClient gains sendChatMessage + chat event handling
- OpenClawBridge routes image tasks through eventClient WebSocket
- System prompt updated with stronger include_image guidance
- Messages persist across session stop/start (not cleared)
- Session divider with date/time inserted between sessions
- Timestamps shown on messages when 2+ minutes apart or after divider
- First message in each session always shows timestamp
sseanliu and others added 27 commits March 27, 2026 11:51
…s (iOS + Android)

- upload_server.py: tiny HTTP server saves JPEGs to ~/.openclaw/media/visionclaw/
- OpenClawBridge uploads JPEG to upload server (port = gateway port + 3)
- File path appended to task text as [image_file_path] so agent can read/copy/save
- Agent can both SEE the image (via chat.send attachment) AND access the file on disk
@ryosuzuki ryosuzuki changed the title [codex] Stabilize Android OpenClaw progress and glasses streaming Stabilize Android OpenClaw progress and glasses streaming May 26, 2026
@ryosuzuki ryosuzuki changed the title Stabilize Android OpenClaw progress and glasses streaming Stabilize Android OpenClaw progress, streaming, and audio routing Jun 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants