Stabilize Android OpenClaw progress, streaming, and audio routing#60
Open
ryosuzuki wants to merge 69 commits into
Open
Stabilize Android OpenClaw progress, streaming, and audio routing#60ryosuzuki wants to merge 69 commits into
ryosuzuki wants to merge 69 commits into
Conversation
Accumulate voice transcripts as a scrollable chat dialog with user messages as blue bubbles (right) and AI responses (left). Tool calls shown as centered status pills. Tab switcher at top lets users toggle between camera feed and chat view during active Gemini sessions. Auto-switches to chat tab in audio-only mode.
- extract_glass_sessions.py: extracts all glass session data from OpenClaw store - analyze_glass_sessions.py: computes basic stats, tool latency, category breakdown - classify_with_llm.py: refined keyword classification with system msg filtering - ANALYSIS_REPORT.md: P1 (Xiaoan) usage report with all metrics
- RemoteLogger.swift: fire-and-forget event logger matching IntentOS pattern Logs: voice:user, voice:ai, voice:tool_call, voice:tool_result, session:start, session:end - GeminiSessionViewModel: wired logging at session start/end, turn complete, and tool call initiation/result - server/index.js: added /api/logs POST/GET endpoint, stores JSONL per day in server/logs/ directory - .gitignore: exclude server/logs/ (sensitive session data)
- server/index.js: write logs to ~/.openclaw/visionclaw-logs/ instead of server/logs/ so OpenClaw can discover and read them via file tools - extract_voice_logs.py: script to extract and analyze voice interaction logs (complements extract_glass_sessions.py for OpenClaw tool-call logs)
… (iOS) - New capture_photo Gemini tool declaration alongside execute tool - ToolCallRouter intercepts capture_photo locally (not sent to OpenClaw) - PhotoCaptureStore persists captured frames as JPEG with JSON manifest - GalleryView (3-column grid) and GalleryDetailView (full screen + share + delete) - Gallery button in StreamView top bar when Gemini is active - Capture toast notification on successful photo save
- execute tool gains include_image boolean param (default false) - Gemini sets include_image=true only when task needs visual context - ToolCallRouter passes latest camera frame when flag is set - OpenClawBridge sends image as base64 in OpenAI vision format - Conversation history changed to [String: Any] for multimodal content - No image sent on text-only tasks (no latency impact)
- RemoteLogger.kt: fire-and-forget event logger matching iOS implementation Logs: voice:user, voice:ai, voice:tool_call, voice:tool_result, session:start, session:end - GeminiSessionViewModel: wired logging at session start/end, turn complete, and tool call initiation/result
… (Android) - capture_photo tool declaration + local interception in ToolCallRouter - PhotoCaptureStore: persistent JPEG gallery with JSON manifest - GalleryScreen (3-column grid) + GalleryDetailScreen (share + delete) - Gallery button always visible in StreamScreen top bar - Capture toast on successful photo save - FileProvider paths updated for captures directory sharing
… INTERRUPT scheduling - execute tool changed from BLOCKING to NON_BLOCKING (Gemini keeps talking) - Tool responses include scheduling=INTERRUPT (Gemini interrupts to speak result) - Image tasks routed through WebSocket chat.send with attachments (not HTTP) - OpenClawEventClient gains sendChatMessage + chat event handling - OpenClawBridge routes image tasks through eventClient WebSocket - System prompt updated with stronger include_image guidance
… calls (iOS + Android)
…session disconnect
- Messages persist across session stop/start (not cleared) - Session divider with date/time inserted between sessions - Timestamps shown on messages when 2+ minutes apart or after divider - First message in each session always shows timestamp
… notifications enabled (iOS + Android)
…end permission (iOS + Android)
…ough SSH tunnel (iOS + Android)
…s (iOS + Android) - upload_server.py: tiny HTTP server saves JPEGs to ~/.openclaw/media/visionclaw/ - OpenClawBridge uploads JPEG to upload server (port = gateway port + 3) - File path appended to task text as [image_file_path] so agent can read/copy/save - Agent can both SEE the image (via chat.send attachment) AND access the file on disk
…w bubble (iOS + Android)
…ent responses (iOS + Android)
…ve/save tasks (iOS + Android)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
executecoalescing, image attachment flow, and developer session controls.mainchanges and resolves OpenClaw WebSocket routing conflicts.Validation
./gradlew :app:compileDebugKotlinwith Android Studio JBR.AudioTrack.writewrites PCM to AirPods.STARTING,STARTED, andSTREAMING.BGTESTlogs, Japanese-specific strings, and personal IP additions.Notes
Secrets.ktremains gitignored and is not included..idea/files are untracked and intentionally not included.