
feat: unified image analysis pipeline, bot image support, and mobile-web enhancements #90

Merged
GCWing merged 5 commits into GCWing:main from bobleer:pr/unified-image-pipeline-mobile-enhancements-20250309
Mar 9, 2026

Contributor

@bobleer bobleer commented Mar 8, 2026

Summary

  • Unified image analysis pipeline: Centralize vision pre-analysis in backend ConversationCoordinator for all platforms (desktop, mobile, bot), with automatic fallback to multimodal when no vision model is configured. Add ImageAnalysisStarted/Completed events propagated through Tauri & WebSocket transports.
  • Feishu bot image support: Handle post (rich-text) and image message types; download and compress user-sent images via Feishu API (≤1MB); pass image contexts through the full execution pipeline.
  • Mobile-web enhancements: Overhaul ChatPage with theme-aware syntax highlighting, copy buttons, line numbers; add TodoCard and TaskToolCard components; iOS-style push/pop navigation animations; light/dark theme polish.
  • Desktop frontend improvements: Image thumbnails with lightbox in user messages, image-analyzing indicator, NavBar menu auto-alignment, QR click-to-copy, settings panel section descriptions (i18n).
  • Turn lifecycle fixes: Properly persist cancelled/failed turns, emit DialogTurnCancelled unconditionally, sequential event routing for correct text chunk ordering.

Changes

Backend (Rust)

  • coordinator.rs: Add pre_analyze_images_if_needed(), finalize_turn_in_workspace(); carry original_user_input and user_message_metadata in DialogTurnStarted
  • session_manager.rs: Add fail_dialog_turn() method
  • agentic.rs (events): Add ImageAnalysisStarted, ImageAnalysisCompleted events
  • tauri.rs, websocket.rs: Emit new image analysis events and extended turn metadata
  • feishu.rs: Support post/image message types, download images, compression
  • command_router.rs: Pass image_contexts through forward pipeline
  • remote_server.rs: Image attachment handling improvements
  • lib.rs (desktop): Sequential event routing for ordering guarantees
  • lib.rs (relay-server): 10MB body limit for command endpoint

Frontend (TypeScript/React)

  • UserMessageItem: Image thumbnails + lightbox
  • VirtualItemRenderer: image-analyzing virtual item type
  • EventHandlerModule: Handle ImageAnalysisStarted/Completed, temp turn management
  • useMessageSender: Simplified (vision strategy delegated to backend)
  • FlowChatStore, PersistenceModule: Image metadata persistence
  • AgentAPI, AgenticEventListener: New event listener APIs
  • NavBar: Auto left/right menu alignment
  • RemoteConnectDialog: QR click-to-copy
  • Settings configs + locales: Section descriptions (en-US, zh-CN)

Mobile Web

  • ChatPage: Code blocks with copy/line numbers, TodoCard, TaskToolCard, theme-aware markdown
  • App.tsx: iOS-style navigation transitions
  • SCSS: Comprehensive styling updates across all components
  • Theme: Light/dark consistency improvements

Test Plan

  • Desktop: Send message with pasted image → verify image thumbnail appears, lightbox works, vision pre-analysis runs
  • Desktop: Send text-only message → verify no regression in normal chat flow
  • Desktop: Cancel a running turn → verify turn is properly marked and persisted
  • Feishu bot: Send image in chat → verify image is downloaded, analyzed, and response is correct
  • Feishu bot: Send rich-text post with images → verify text and images are both processed
  • Mobile web: Navigate between pages → verify push/pop animations
  • Mobile web: View code blocks → verify syntax highlighting, copy button, line numbers
  • Mobile web: View TodoWrite tool results → verify TodoCard renders correctly
  • Remote connect: Click QR code → verify URL copied to clipboard
  • Settings panels: Verify section descriptions appear in both en-US and zh-CN

Made with Cursor

bowen628 added 5 commits March 9, 2026 00:19
…web enhancements

## Backend
- Centralize vision pre-analysis in ConversationCoordinator for all platforms
  (desktop, mobile, bot) with automatic fallback to multimodal
- Add ImageAnalysisStarted/Completed events and propagate through Tauri & WebSocket transports
- Carry original_user_input and image metadata in DialogTurnStarted for correct UI display
- Add SessionManager::fail_dialog_turn() for proper failed-turn persistence
- Fix cancelled-turn handling: emit DialogTurnCancelled unconditionally, persist partial content
- Route internal events sequentially to preserve text chunk ordering
- Simplify image context references (remove AnalyzeImage tool hints)
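
One way to picture the sequential routing described above is a single consumer draining a FIFO channel, so that a later text chunk can never overtake an earlier one. This is only a minimal sketch under assumptions: the `Event` enum and `route_sequentially` function are hypothetical names for illustration and are not taken from `lib.rs`.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical event type for illustration; the real events are defined in the backend.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    TextChunk(String),
    TurnCompleted,
}

/// Routes events through one channel with one consumer thread, so they are
/// delivered in exactly the order they were sent.
fn route_sequentially(events: Vec<Event>) -> Vec<Event> {
    let (tx, rx) = mpsc::channel::<Event>();

    // A single consumer handles events one at a time, in arrival order.
    let consumer = thread::spawn(move || {
        let mut delivered = Vec::new();
        for event in rx {
            delivered.push(event);
        }
        delivered
    });

    for event in events {
        tx.send(event).expect("consumer thread is alive");
    }
    // Dropping the sender closes the channel and ends the consumer loop.
    drop(tx);

    consumer.join().expect("consumer thread panicked")
}
```

Spawning one task per event would allow chunks to finish out of order; funnelling everything through one consumer trades some parallelism for a strict ordering guarantee.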

## Feishu Bot
- Support post (rich-text) and image message types, not just text
- Download and compress user-sent images via Feishu message resources API (≤1MB)
- Pass image contexts through command router to execution pipeline
- Fix Unicode truncation boundary in bot response messages
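
The Unicode truncation fix is not shown in this description. A common approach, sketched here as an assumption rather than the actual patch, is to back off to the nearest UTF-8 character boundary before slicing. The function name and the byte limit are hypothetical.

```rust
/// Truncates `text` to at most `max_bytes` bytes without splitting a
/// multi-byte UTF-8 character. `max_bytes` is a hypothetical limit.
fn truncate_on_char_boundary(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    // Step back until `end` lands on a character boundary.
    // Index 0 is always a boundary, so the loop terminates.
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```

Slicing a `&str` at an arbitrary byte index panics when the index falls inside a multi-byte character, which is typical for Chinese text in bot replies; backing off to a boundary avoids that.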

## Relay Server
- Increase command endpoint body limit to 10MB for image payloads
- Simplify deploy script (remove --build-mobile option)

## Desktop Frontend
- Display image thumbnails in UserMessageItem with lightbox preview
- Add image-analyzing indicator in VirtualItemRenderer
- Handle ImageAnalysisStarted/Completed events with temp turn management
- Simplify useMessageSender by delegating vision strategy to backend
- Persist and restore image metadata in dialog turns
- NavBar menu auto-aligns left/right based on viewport position
- QR code click-to-copy URL in RemoteConnectDialog
- Fix Markdown code block styling for inline-code within pre tags
- Add section descriptions to all settings panels (i18n en-US & zh-CN)

## Mobile Web
- Overhaul ChatPage: theme-aware syntax highlighting, copy buttons, line numbers
- Add TodoCard and TaskToolCard components for rich tool display
- Add iOS-style push/pop navigation animations
- Improve light/dark theme consistency and SCSS styling across all components

Made-with: Cursor
…web UI

- Coordinator: return user-friendly error (zh/en) when no vision model
  is configured instead of silently falling back to multimodal
- Remote connect: downgrade stale-session decrypt warning to debug level
- Mobile ChatPage: add error toast with auto-dismiss, truncate long git
  branch names in header
- Mobile SessionListPage: truncate workspace/branch names, replace Switch
  text with icon, simplify session item layout and descriptions
- RemoteConnectDialog: show ngrok usage link when connected via ngrok
- Add ngrokUsageLink i18n keys (en-US, zh-CN)
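
The coordinator change in this commit can be sketched as a small decision function: with images present and no vision model configured, return a localized error instead of falling back. All names below (`Locale`, `VisionPlan`, `plan_vision`) and both error strings are hypothetical and only illustrate the behaviour described in the first bullet.

```rust
/// Hypothetical UI locale, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Locale {
    En,
    Zh,
}

/// Hypothetical outcome of the pre-analysis decision.
#[derive(Debug, PartialEq)]
enum VisionPlan {
    /// Text-only turn: nothing to analyze.
    NoImages,
    /// Run pre-analysis with the configured vision model.
    PreAnalyze { model: String },
}

/// Decides how to handle a turn. The error strings are placeholders, not the
/// messages used by the project.
fn plan_vision(
    image_count: usize,
    vision_model: Option<&str>,
    locale: Locale,
) -> Result<VisionPlan, String> {
    if image_count == 0 {
        return Ok(VisionPlan::NoImages);
    }
    match vision_model {
        Some(model) => Ok(VisionPlan::PreAnalyze { model: model.to_string() }),
        // No silent fallback: surface a user-facing error instead.
        None => Err(match locale {
            Locale::En => "No image understanding model is configured.".to_string(),
            Locale::Zh => "未配置图像理解模型。".to_string(),
        }),
    }
}
```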

Made-with: Cursor
- coordinator.rs: keep both scheduler_notify_tx (upstream DialogScheduler)
  and workspace_turn_status tracking (our turn persistence)
- NavBar.scss/tsx: accept upstream UI overhaul that removed logo menu from
  NavBar (moved elsewhere in the new scene-based layout)

Made-with: Cursor
- AppState: set global workspace_path on startup for correct config resolution
- Mobile ThemeProvider: rewrite theme application using injected <style>
  element + inline fallbacks for reliable mobile WebKit rendering;
  replace CSS transition with opacity crossfade on theme switch
- Mobile index.html: use single theme-color meta, set color-scheme on html
- Remove unused theme-transitioning CSS class from global.scss
- Rename "vision model" to "image understanding model" in analyzing
  indicators (ChatPage + VirtualItemRenderer)

Made-with: Cursor
…increase WS size

- ChatInput: enforce max image count across all input paths (file picker,
  paste, drag-drop, MCP app message); show warning when limit reached
- Remote server: compress data-URL images to ≤100KB JPEG thumbnails before
  sending to mobile clients for faster loading
- Feishu bot: cap images at 5 per message, notify user when excess discarded
- Relay server: increase WebSocket max message size to 64MB for image payloads
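
The per-message image cap for the Feishu bot could look roughly like the following. This is a sketch under assumptions: the constant, function names, and notice wording are hypothetical; only the limit of 5 and the "notify when excess is discarded" behaviour come from the commit message.

```rust
/// Limit taken from the commit message; the constant name is hypothetical.
const MAX_IMAGES_PER_MESSAGE: usize = 5;

/// Keeps the first `MAX_IMAGES_PER_MESSAGE` images and reports how many
/// were discarded.
fn cap_images<T>(mut images: Vec<T>) -> (Vec<T>, usize) {
    let discarded = images.len().saturating_sub(MAX_IMAGES_PER_MESSAGE);
    images.truncate(MAX_IMAGES_PER_MESSAGE);
    (images, discarded)
}

/// Builds a user-facing notice only when something was dropped.
/// The wording is a placeholder.
fn discard_notice(discarded: usize) -> Option<String> {
    if discarded == 0 {
        None
    } else {
        Some(format!(
            "Only the first {} images were processed; {} discarded.",
            MAX_IMAGES_PER_MESSAGE, discarded
        ))
    }
}
```

Returning the discarded count separately keeps the capping logic independent of how the bot phrases or sends the notification.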

Made-with: Cursor
@GCWing GCWing merged commit cc8981c into GCWing:main Mar 9, 2026
2 checks passed


2 participants