This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Android Accessibility Inspector Service - exposes Android accessibility tree data through a WebSocket server for external inspection and automation tools. The service captures accessibility node information, screenshots, and enables remote UI automation through gestures and actions.
Security Warning: This service exposes all screen content through WebSocket. Disable when not in use.
# Build the app
./gradlew build
# Clean build
./gradlew clean
# Build debug APK
./gradlew assembleDebug
# Build release APK
./gradlew assembleRelease
# Install debug build to connected device
./gradlew installDebug
# Run tests
./gradlew test
# Run instrumented tests (requires connected device/emulator)
./gradlew connectedAndroidTest

- AccessibilityInspector (`AccessibilityInspector.java`): Main accessibility service that captures UI tree data, handles screenshot capture, and processes automation commands
- SocketService (`SocketService.java`): WebSocket server running on port 38301 that handles client connections and message routing
- ServiceActivity (`ServiceActivity.java`): Simple launcher activity

- Capture Process: Client sends `{"message":"capture"}` → SocketService broadcasts intent → AccessibilityInspector captures tree using TreeDebug → JSON tree data sent to all connected WebSocket clients
- Action Process: Client sends action command → SocketService directly calls AccessibilityInspector methods → Result sent back through WebSocket
- Tree Capture: Captures accessibility node tree with/without non-important views
- Screenshot Integration: Base64 encoded screenshots bundled with tree data
- Accessibility Event Forwarding: Real-time forwarding of user interactions with "before" tree context
- UI Automation:
  - Element actions (click, focus, text input) via resourceId or hashCode
  - Gesture automation (tap, swipe, scroll) via coordinates
  - Activity launching with multiple launch types
- Real-time Communication: WebSocket server with JSON message format
- AndroidAsync: WebSocket server implementation
- Google Accessibility Utils: Tree traversal and node manipulation utilities
- Auto-Value: Value class generation
- Guava: Utility collections and functions
Connection: ws://localhost:38301/ (use adb forward tcp:38301 tcp:38301)
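Once the port is forwarded, every command is a single JSON text frame with a `message` field plus command-specific fields. The sketch below only shows how such payloads could be assembled on the client side. `build_command` is a hypothetical helper, not part of this repository, and the field names are taken from the command list in this file.

```python
import json


def build_command(message, **fields):
    """Serialize one command as a JSON string (hypothetical client helper)."""
    payload = {"message": message}
    # Skip optional fields that were left unset so the payload stays minimal.
    payload.update({key: value for key, value in fields.items() if value is not None})
    return json.dumps(payload)


capture = build_command("capture", visibleOnly=True)
tap = build_command("performGesture", gestureType="TAP", x=100, y=200)
find = build_command("findByViewId", viewId="com.example.app:id/button", verbose=None)

print(capture)
print(tap)
print(find)
```

The resulting strings can be sent with any WebSocket client library; which library to use is left open here.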
The service uses two different capture methods optimized for different use cases:
Commands: capture, captureNotImportant
- Method: Full metadata extraction (`TreeDebug.logNodeTrees()`)
- Speed: 5-15 seconds per capture
- Data: Complete debugging information
  - Screen coordinates (`getBoundsInScreen()`)
  - Clickable text analysis (URLs, emails, phone numbers)
  - Locale/language detection
  - Action lists and complex metadata
  - DP scaling calculations
- Use Case: Manual debugging, detailed analysis, development tools
Messages: stableTree type
- Method: Optimized structural capture (`TreeDebug.logNodeTreesFast()`)
- Speed: 200-300ms per capture (faster)
- Data: Essential structure only
  - Node hierarchy and relationships
  - Text content and descriptions
  - Visibility state and basic properties
  - Node identification (hashCode, className, resourceId)
- Use Case: Real-time monitoring, automation, change detection
Trade-off: Manual captures provide complete data but are slow. Stable trees provide fast updates but with minimal metadata.
The service supports two distinct message flows:
Flow: Client sends command → Service responds
{"message":"capture"} // Request tree (important views only) - SLOW/FULL method
{"message":"capture", "visibleOnly":true} // Request tree with invisible leaf filtering (reduces size)
{"message":"captureNotImportant"} // Request tree (all views) - SLOW/FULL method
{"message":"captureNotImportant", "visibleOnly":true} // Request tree (all views) with filtering
{"message":"ping"} // Connection test
{"message":"performAction", "resourceId":"...", "action":"CLICK"}
{"message":"performGesture", "gestureType":"TAP", "x":100, "y":200}
{"message":"launchActivity", "launchType":"PACKAGE", "packageName":"com.example.app"}
{"message":"findByViewId", "viewId":"com.example.app:id/button"}
{"message":"findByText", "text":"Submit"} // DEPRECATED - use customFindByText
{"message":"customFindByText", "text":"Submit"} // Exact text match (case-sensitive)
{"message":"findByRegex", "pattern":".*Submit.*"} // Regex pattern matching
{"message":"customFindByViewId", "viewId":"..."} // Alternative viewId search
{"message":"findByProps", "properties":{"text":"Submit","isClickable":true}} // Property-based search
// All find methods support optional verbose flag for additional node properties:
{"message":"findByViewId", "viewId":"com.example:id/button", "verbose":true}
{"message":"customFindByText", "text":"Submit", "verbose":true}
{"message":"findByRegex", "pattern":"[0-9]+", "verbose":true}
{"message":"findByProps", "properties":{"isClickable":true}, "verbose":true}
Tree Data Response:
{
"type": "tree",
"children": [...] // TreeDebug format
}
Note: This format change has been tested for backward compatibility with the accompanying Inspector App.
Action/Gesture/Launch Results:
{
"type": "actionResult|gestureResult|launchResult",
"success": true,
"message": "Description"
}
Ping Response:
{"message": "pong"}
Find Response (All Methods):
{
"type": "findResult",
"success": true,
"viewId": "com.example.app:id/button", // Only present for findByViewId
"text": "Submit", // Only present for findByText/customFindByText
"method": "customFindByText", // Only present for custom methods
"stats": "Window: Slack - Total nodes: 190, ...", // Only present for customFindByText
"count": 2,
"nodes": [
{
"hashCode": 123456,
"className": "android.widget.Button",
"text": "Submit",
"contentDescription": "",
"viewIdResourceName": "com.example.app:id/button",
"isClickable": true,
"isEnabled": true,
"isFocusable": true,
"isFocused": false,
"isScrollable": false,
"isCheckable": false,
"isChecked": false,
"isSelected": false,
"boundsInScreen": {"left": 100, "top": 200, "right": 300, "bottom": 250}
}
]
}
Response Field Guide:
- `method`: Present only for custom methods (identifies which implementation was used)
- `stats`: Present only for `customFindByText` (provides tree analysis for debugging)
- `viewId`: Present for `findByViewId` and `customFindByViewId` commands
- `text`: Present for `findByText` and `customFindByText` commands
- `pattern`: Present for `findByRegex` commands
- `properties`: Present for `findByProps` commands
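A common follow-up to a find command is tapping the element that was found. The sketch below turns a `findResult` message into `performGesture` TAP commands aimed at the centre of each node's `boundsInScreen`. `tap_commands` is a hypothetical client helper, and using the centre point is an illustrative choice, not something the service requires.

```python
import json


def tap_commands(find_result_text):
    """Build one TAP command per found node, aimed at the centre of its bounds."""
    result = json.loads(find_result_text)
    if result.get("type") != "findResult" or not result.get("success"):
        return []
    commands = []
    for node in result.get("nodes", []):
        bounds = node["boundsInScreen"]
        commands.append({
            "message": "performGesture",
            "gestureType": "TAP",
            "x": (bounds["left"] + bounds["right"]) // 2,
            "y": (bounds["top"] + bounds["bottom"]) // 2,
        })
    return commands


sample = json.dumps({
    "type": "findResult",
    "success": True,
    "count": 1,
    "nodes": [{
        "hashCode": 123456,
        "boundsInScreen": {"left": 100, "top": 200, "right": 300, "bottom": 250},
    }],
})
print(tap_commands(sample))
```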
customFindByText: Exact, case-sensitive matching in BOTH text and contentDescription fields
{"message": "customFindByText", "text": "Activity"}
// Searches both node.getText() and node.getContentDescription()
// Finds: "Activity" in either field ✅
// Misses: "activity", "ACTIVITY", "My Activity" ❌
findByRegex: Flexible pattern matching in BOTH text and contentDescription fields
{"message": "findByRegex", "pattern": "(?i)activity"} // Case-insensitive in either field
{"message": "findByRegex", "pattern": ".*[0-9]+.*"} // Contains numbers in either field
{"message": "findByRegex", "pattern": "^[A-Z][a-z]+$"} // Capitalized words in either field
{"message": "findByRegex", "pattern": "(btn|button)"} // Multiple options in either field
// Searches both node.getText() and node.getContentDescription()
findByProps: Multi-property matching
{"message": "findByProps", "properties": {"text": "Submit", "isClickable": true}}
{"message": "findByProps", "properties": {"viewIdResourceName": "com.Slack:id/button", "isEnabled": true}}
{"message": "findByProps", "properties": {"className": "Button", "text": "OK", "isClickable": true}}
Supported Properties:
- String properties: `text`, `contentDescription`, `className`, `viewIdResourceName`/`resourceId`/`viewId`
- Boolean properties: `isClickable`, `isEnabled`, `isFocusable`, `isFocused`, `isScrollable`, `isCheckable`, `isChecked`, `isSelected`
- Integer properties: `childCount`
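The three custom find methods differ mainly in the predicate they apply to each node. The sketch below mimics those predicates on find-result node dictionaries so they can be tried offline. It is not the service's Java implementation, and two details are assumptions: the regex is applied with search semantics (a match anywhere in the field), and properties are compared by plain equality.

```python
import re

# Alternate property names listed above for the resource id.
ALIASES = {"resourceId": "viewIdResourceName", "viewId": "viewIdResourceName"}


def text_fields(node):
    """Return the two fields the text-based searches inspect."""
    return [node.get("text") or "", node.get("contentDescription") or ""]


def exact_match(node, text):
    """customFindByText-style predicate: exact and case-sensitive."""
    return any(value == text for value in text_fields(node))


def regex_match(node, pattern):
    """findByRegex-style predicate (assumption: match anywhere in the field)."""
    compiled = re.compile(pattern)
    return any(compiled.search(value) for value in text_fields(node))


def props_match(node, properties):
    """findByProps-style predicate (assumption: every property must be equal)."""
    return all(node.get(ALIASES.get(key, key)) == expected
               for key, expected in properties.items())


node = {"text": "My Activity", "contentDescription": "", "isClickable": True,
        "viewIdResourceName": "com.example.app:id/tab"}
print(exact_match(node, "Activity"), regex_match(node, "(?i)activity"))
```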
Note: Found nodes use a different format than tree nodes. Find results include:
- Direct boolean properties (`isClickable`, `isEnabled`, etc.)
- `hashCode` for element identification
- Simplified structure for easier processing
Tree nodes use the TreeDebug format with nested metadata objects.
All find methods support an optional verbose flag (defaults to false). When verbose: true, additional properties are included:
Basic properties (always included):
- `hashCode`, `parentHashCode`, `className`, `text`, `contentDescription`, `viewIdResourceName`
- State: `isClickable`, `isEnabled`, `isFocusable`, `isFocused`, `isScrollable`, `isCheckable`, `isChecked`, `isSelected`
- `boundsInScreen` (object with left, top, right, bottom)
Verbose properties (only with verbose: true):
- Text properties: `hintText`, `errorText`, `tooltipText`, `paneTitle`
- Reference properties: `labeledByHashCode` (hashCode of the node that labels this one)
- Additional states: `isLongClickable`, `isVisibleToUser`, `isImportantForAccessibility`, `isContentInvalid`, `isScreenReaderFocusable`
- Collection properties:
  - `collectionInfo`: {rowCount, columnCount} for grids/lists
  - `collectionItemInfo`: {rowIndex, columnIndex, rowSpan, columnSpan} for items in collections
- Action list: Array of available actions with format: [{id: number, label: string}, ...]
- Other properties: `windowId`, `childCount`
Note: Properties requiring API > 28 (stateDescription, roleDescription) are not currently available due to build configuration constraints.
Example verbose response:
{
"hashCode": 123456,
"parentHashCode": 789012,
"className": "android.widget.Button",
"text": "Submit",
// ... basic properties ...
"hintText": "Tap to submit form",
"isVisibleToUser": true,
"isLongClickable": false,
"actionList": [
{"id": 16, "label": null}, // ACTION_CLICK
{"id": 1, "label": null} // ACTION_FOCUS
],
"childCount": 0
}
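The `id` values in `actionList` are the standard Android `AccessibilityNodeInfo` action constants (16 is `ACTION_CLICK`, 1 is `ACTION_FOCUS`, as in the comments above). A small lookup table makes verbose responses easier to read. The sketch covers only a handful of common ids, and `describe_actions` is a hypothetical client helper.

```python
# Subset of android.view.accessibility.AccessibilityNodeInfo action constants.
ACTION_NAMES = {
    1: "ACTION_FOCUS",
    2: "ACTION_CLEAR_FOCUS",
    4: "ACTION_SELECT",
    8: "ACTION_CLEAR_SELECTION",
    16: "ACTION_CLICK",
    32: "ACTION_LONG_CLICK",
    4096: "ACTION_SCROLL_FORWARD",
    8192: "ACTION_SCROLL_BACKWARD",
    2097152: "ACTION_SET_TEXT",
}


def describe_actions(node):
    """Return a readable name for each entry in a verbose node's actionList."""
    names = []
    for action in node.get("actionList", []):
        # Prefer the custom label when one is set, else fall back to the table.
        fallback = ACTION_NAMES.get(action["id"], "UNKNOWN({})".format(action["id"]))
        names.append(action.get("label") or fallback)
    return names


verbose_node = {"actionList": [{"id": 16, "label": None}, {"id": 1, "label": None}]}
print(describe_actions(verbose_node))
```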
## **Tree Capture Size Optimization**
### **visibleOnly Parameter**
Both `capture` and `captureNotImportant` commands support an optional `visibleOnly` parameter to reduce tree size:
```json
{"message":"capture", "visibleOnly":true}
{"message":"captureNotImportant", "visibleOnly":true}
```
How it works:
- Default behavior (`visibleOnly: false` or omitted): Full tree with all nodes
- Filtered behavior (`visibleOnly: true`): Removes invisible leaf nodes while preserving tree structure
Benefits:
- ✅ Reduces tree size by removing invisible UI elements that serve no functional purpose
- ✅ Prevents Android Intent size limit (~973KB) failures on complex apps
- ✅ Faster WebSocket transmission
- ✅ Preserves structural containers (invisible nodes with children)
Use cases:
- Large apps (news apps, social media): Use `visibleOnly:true` to avoid size limit failures
- Debugging invisible elements: Use `visibleOnly:false` (default) to see all nodes
- Production automation: Use `visibleOnly:true` for faster, more reliable captures
Note: The same filtering is automatically applied to stable trees to ensure reliable delivery.
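The filtering rule (drop invisible leaves, keep invisible containers) can be illustrated with a short recursive function. This is a sketch, not the service's Java code. The `visible` key is a hypothetical stand-in for however the TreeDebug format records visibility, and the sketch assumes a container is kept even when all of its children end up removed.

```python
def prune_invisible_leaves(node):
    """Return a copy of node without invisible leaf nodes, or None if node is one."""
    children = node.get("children", [])
    if not children and not node.get("visible", True):
        return None  # invisible leaf: dropped
    pruned = dict(node)
    if "children" in node:
        # Containers are kept for structure, even if they are invisible themselves.
        kept = (prune_invisible_leaves(child) for child in children)
        pruned["children"] = [child for child in kept if child is not None]
    return pruned


tree = {
    "name": "root", "visible": True, "children": [
        {"name": "hidden_leaf", "visible": False},
        {"name": "hidden_container", "visible": False, "children": [
            {"name": "shown_leaf", "visible": True},
        ]},
    ],
}
print(prune_invisible_leaves(tree))
```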
| Method | Type | Status | Use Case |
|---|---|---|---|
| `findByViewId` | Native | ✅ Recommended | Finding elements by resource ID |
| `findByText` | Native | ❌ Deprecated | Use `customFindByText` instead |
| `customFindByText` | Custom | ✅ Recommended | Exact text match (case-sensitive) |
| `findByRegex` | Custom | ✅ Recommended | Pattern matching with regex |
| `findByProps` | Custom | ✅ Recommended | Multi-property search with JSON criteria |
| `customFindByViewId` | Custom | ✅ Alternative | Debugging viewId issues |
Through extensive testing with comparison scripts, we discovered that Android's native findAccessibilityNodeInfosByText() method has significant limitations:
- Semantic Filtering: Filters out navigation/UI labels while keeping "content" text
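- Case Handling: Android documents the native match as case-insensitive containment, whereas `customFindByText` is an exact, case-sensitive match, so the two can return different result sets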
- Missing Visible Elements: Skips prominent UI elements like tab names ("Activity", "Later") while finding app names ("Slack")
Test Results Example (from test_native_vs_custom.py):
Searching for: 'Activity' (visible tab in Slack UI)
Native method: 0 nodes found ❌ Misses visible UI
Custom method: 1 nodes found ✅ Finds visible UI
The deprecation decision was based on systematic testing using several diagnostic scripts:
- `test_native_vs_custom.py`: Direct comparison showing native method missing visible UI elements
- `test_case_sensitivity.py`: Ruled out case sensitivity as the sole issue
- `test_find_differences.py`: Identified specific nodes missed by native method
- `test_viewid_consistency.py`: Confirmed native `findByViewId` works correctly
Key Finding: Native findByText consistently missed 5-17 visible UI elements per search, while findByViewId showed perfect consistency between native and custom implementations.
Flow: User interacts with device → Android generates AccessibilityEvent → Service automatically broadcasts to all clients
{
"type": "accessibilityEvent",
"eventType": "VIEW_CLICKED|VIEW_SELECTED|VIEW_FOCUSED|SCROLL_SEQUENCE_END|TEXT_SEQUENCE_END|WINDOW_STATE_CHANGED",
"timestamp": 1751544001234,
"packageName": "com.example.app",
"className": "android.widget.Button",
"source": {...}
}
The service automatically sends stable UI trees when the interface becomes stable:
{
"type": "stableTree",
"timestamp": 1751544000734,
"children": [...] // Current stable UI state
}
How Stable Trees Work:
The service continuously monitors for UI changes via WINDOW_CONTENT_CHANGED events. After 1 second of UI stability (no content changes), it captures the current tree. To prevent duplicate messages, trees are compared semantically (ignoring volatile node IDs) and only sent when actual content changes occur. This provides clients with up-to-date UI snapshots without spam.
{
"type": "error",
"message": "Error description"
}

- Service flags configured in `onServiceConnected()`; modify the `allFlags` constant to adjust capture behavior
- Two capture modes: important views only vs all views (controlled by `hideNotImportant()`/`showNotImportant()`)

- All message parsing in `SocketRequestCallback.onConnected()`
- Direct method calls to AccessibilityInspector instance (stored statically)
- Error responses follow `{"type":"[messageType]Result", "success":false, "message":"..."}`
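Because responses and server-pushed messages share one connection, a client typically routes each incoming frame by its `type` field and checks `success` on result messages. The sketch below shows one way to do that with the message shapes documented in this file. `classify` is a hypothetical client helper.

```python
import json


def classify(text):
    """Return (kind, ok) for one incoming JSON frame."""
    msg = json.loads(text)
    # The ping reply is the only documented message without a "type" field.
    if msg.get("message") == "pong" and "type" not in msg:
        return ("pong", True)
    kind = msg.get("type", "unknown")
    # Result messages carry success=false on failure; "error" is always a failure.
    ok = kind != "error" and msg.get("success", True) is not False
    return (kind, ok)


print(classify('{"message": "pong"}'))
print(classify('{"type": "actionResult", "success": false, "message": "Node not found"}'))
print(classify('{"type": "stableTree", "timestamp": 1751544000734, "children": []}'))
```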

- Uses `TreeDebug.logNodeTrees()` from Google accessibility utils
- JSON format for tree data transmission
- Screenshot capture integrated with tree data
Implementation: The service uses a simplified approach for automatic tree broadcasting:
- UI Monitoring: `WINDOW_CONTENT_CHANGED` events reset a 1-second stability timer
- Tree Capture: After 1 second of stability, the current UI tree is captured
- Deduplication: Trees are compared semantically (excluding volatile node IDs) to prevent duplicate sends
- Automatic Broadcast: Only when content actually changes, a `stableTree` message is sent to all clients
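The stability timer is a debounce: every content change restarts the wait, and a capture fires only once the UI has been quiet for the full interval. The sketch below models that logic with an explicit clock so it can be run without a device. `StabilityTimer` and its method names are hypothetical and do not correspond to the Java implementation.

```python
class StabilityTimer:
    """Debounce model: capture once after quiet_ms without content changes."""

    def __init__(self, quiet_ms=1000):
        self.quiet_ms = quiet_ms
        self.last_change_ms = None  # None means no capture is pending

    def on_content_changed(self, now_ms):
        # Each WINDOW_CONTENT_CHANGED-style event restarts the quiet period.
        self.last_change_ms = now_ms

    def should_capture(self, now_ms):
        if self.last_change_ms is None:
            return False
        if now_ms - self.last_change_ms >= self.quiet_ms:
            self.last_change_ms = None  # fire once per quiet period
            return True
        return False


timer = StabilityTimer()
timer.on_content_changed(0)
early = timer.should_capture(500)     # still changing recently
timer.on_content_changed(600)         # another change restarts the wait
mid = timer.should_capture(1200)      # only 600 ms of quiet so far
stable = timer.should_capture(1600)   # 1000 ms of quiet reached
repeat = timer.should_capture(2000)   # nothing pending any more
print(early, mid, stable, repeat)
```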
Key Files:
- `AccessibilityInspector.java`: Contains `handleUIContentChange()`, `hasTreeChanged()`, and `removeNodeIds()` methods
- `TreeDebug.java`: Modified to include node IDs in tree data but exclude them from comparison
Node ID Handling:
- Node IDs are included in JSON tree data sent to clients (for reference purposes)
- Node IDs are stripped during tree comparison to avoid false positives from Android's volatile object references
- Comparison uses `removeNodeIds()` to recursively clean trees before string comparison
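The comparison described above can be sketched in a few lines: strip the volatile ID field recursively, serialize both trees, and compare the strings. This is a Python analogue for illustration, not the Java code in `AccessibilityInspector.java`. The `nodeId` key name is an assumption, since the actual field name is not stated here.

```python
import json


def remove_node_ids(value, id_key="nodeId"):
    """Recursively copy a tree, leaving out the volatile ID field."""
    if isinstance(value, dict):
        return {key: remove_node_ids(child, id_key)
                for key, child in value.items() if key != id_key}
    if isinstance(value, list):
        return [remove_node_ids(child, id_key) for child in value]
    return value


def has_tree_changed(previous, current):
    """Compare two trees by content only, ignoring node IDs."""
    if previous is None:
        return True  # nothing sent yet, so the first tree always counts as new
    before = json.dumps(remove_node_ids(previous), sort_keys=True)
    after = json.dumps(remove_node_ids(current), sort_keys=True)
    return before != after


old = {"children": [{"nodeId": 11, "text": "Submit", "children": []}]}
same_content = {"children": [{"nodeId": 97, "text": "Submit", "children": []}]}
new_content = {"children": [{"nodeId": 97, "text": "Cancel", "children": []}]}
print(has_tree_changed(old, same_content), has_tree_changed(old, new_content))
```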
- Add JSON parsing in `SocketService.SocketRequestCallback`
- Add method implementation in `AccessibilityInspector`
- Follow result pattern: `send[ActionType]Result(boolean, String)`
- Min SDK: 28 (Android 9)
- Target SDK: 31 (Android 12)
- Gesture Support: Requires API 24+ (checked at runtime)
- Java Version: 11
Problem: Stable tree capture taking 10-15 seconds, making the system unresponsive.
Solution: Created optimized fast capture path specifically for stable trees:
- Fast Tree Capture Method: Added `TreeDebug.logNodeTreesFast()` and `nodeDebugDescriptionJsonFast()`
- Eliminated Expensive Operations: Removed system calls and complex processing:
  - ❌ `getBoundsInScreen()` calls (2-5ms per node - was #1 bottleneck)
  - ❌ `getNodeClickableStrings()` (regex analysis: 1-3ms per node)
  - ❌ `getNodeLocaleStrings()` (linguistic analysis: 1-2ms per node)
  - ❌ Complex metadata extraction (action lists, DP scaling, object creation)
- Preserved Essential Data: Tree structure, node identification, basic text, visibility, core properties
- Dual Architecture: Fast path for stable trees, unchanged regular path for manual captures
Performance Results:
- Before: 10-15 seconds for stable tree capture
- After: 200-300ms for stable tree capture
- Compatibility: Regular tree capture unchanged for debugging tools
Debug Enhancements:
- Enhanced debug_client with timing measurements and detailed event properties
- Added optional WINDOW_CONTENT_CHANGED event forwarding (controlled by `SEND_WINDOW_CONTENT_CHANGED_EVENTS`)
- Improved event type mapping for better debugging visibility
Problem: The original complex event-tree association system sent repeated identical trees and complicated the codebase.
Solution: Simplified to automatic stable tree broadcasting with smart deduplication:
- Removed Complex Event Association: Eliminated `treeBeforeEvent` messages and event-specific tree handlers
- Simplified Message Flow: Events and trees are now sent independently
- Added Smart Deduplication: Trees are compared semantically, ignoring volatile Android node IDs
- Improved Performance: Clients only receive trees when UI content actually changes
Code Changes:
- Removed `sendClickEventWithBeforeTree()`, `sendFocusEventWithBeforeTree()`, `sendSelectionEventWithBeforeTree()` methods
- Added `hasTreeChanged()` with semantic comparison (strips node IDs)
- Added `removeNodeIds()` and `removeNodeIdsFromArray()` for recursive ID removal
- Modified `handleUIContentChange()` to use content-based comparison
- Kept node IDs in client data for reference while excluding them from comparison
Benefits:
- Dramatically reduced duplicate tree messages
- Simplified client implementations (no need to correlate events with trees)
- Maintained backward compatibility with existing tree structure
- Added comprehensive diff logging for debugging
- WebSocket server can become unresponsive; may require service restart or device reboot
- Null pointer exceptions possible during active screen updates
- Service process may not terminate properly when accessibility service is disabled
- Samsung Phone app doesn't generate scroll events (app-specific limitation)
- Modern apps typically don't provide scroll delta data (`totalScrollX/Y` are often 0)
- VIEW_SELECTED events are rare in modern apps (most use clicks instead)
System UI Filtering: The TreeDebug.logNodeTrees() method intentionally filters out system UI elements from captured trees, including:
- Status bar (contains clock, battery, signal indicators, notifications)
- Navigation bar
- Windows that are not `isActive()`
- Windows with pane titles "Status bar" or "Notification shade."
Root Window Selection: TreeDebug uses getRootInActiveWindow() instead of each window's actual root, which may miss content in non-active windows.
Workaround: Use findByViewId and findByText commands to access system UI elements that don't appear in tree captures:
// These work for system UI elements:
{"message": "findByViewId", "viewId": "com.android.systemui:id/clock"}
{"message": "findByText", "text": "7:45"}
// These miss system UI elements:
{"message": "capture"}
{"message": "captureNotImportant"}
Future Solution: To capture system UI elements in trees, `TreeDebug.logNodeTrees()` would need modification to:
- Remove status bar/navigation bar filtering (lines 104-105, 113 in TreeDebug.java)
- Use each window's actual root instead of `getRootInActiveWindow()` (lines 76-77)
- Include non-active windows if desired (lines 58-60)
User-initiated vs App-initiated Scrolling:
- User scrolls: Multiple timestamps in `scrollTimestamps` array (continuous gesture)
- App scrolls: Single timestamp (programmatic animation, like ViewPager transitions)
Scroll Direction Detection:
- Horizontal scrolls: May have `totalScrollX ≠ 0` (ViewPager page changes)
- Vertical scrolls: Usually `totalScrollX = 0, totalScrollY = 0` regardless of source
- Note: Scroll delta reliability varies by app implementation
Examples:
- Clicking ViewPager tab → Single timestamp + horizontal scroll values
- User swiping list → Multiple timestamps + zero scroll values
- App page transitions → Single timestamp + variable scroll values
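These heuristics can be applied mechanically on the client. The sketch below classifies a scroll event by timestamp count and scroll totals. It assumes `scrollTimestamps`, `totalScrollX`, and `totalScrollY` are available as keys on the dictionary passed in, since their exact position inside the event message is not stated here. `classify_scroll` is a hypothetical helper.

```python
def classify_scroll(scroll):
    """Return (initiator, direction) guessed from scroll event data."""
    timestamps = scroll.get("scrollTimestamps", [])
    # Multiple timestamps suggest a continuous user gesture; one suggests an app animation.
    initiator = "user" if len(timestamps) > 1 else "app"
    dx = scroll.get("totalScrollX", 0)
    dy = scroll.get("totalScrollY", 0)
    if dx != 0:
        direction = "horizontal"
    elif dy != 0:
        direction = "vertical"
    else:
        direction = "unknown"  # many apps report zero deltas
    return (initiator, direction)


tab_click = {"scrollTimestamps": [1751544001234], "totalScrollX": 1080, "totalScrollY": 0}
list_swipe = {"scrollTimestamps": [1751544001234, 1751544001300, 1751544001366],
              "totalScrollX": 0, "totalScrollY": 0}
print(classify_scroll(tab_click), classify_scroll(list_swipe))
```

Because deltas are often zero, the direction result is best treated as a hint and not as a reliable signal.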
`tests/debug_client.py`: Primary debugging tool that shows detailed message analysis
cd tests && python3 debug_client.py
- Displays message type in header (🎯 ACCESSIBILITY EVENT, 🌳 TREE MESSAGE, etc.)
- Shows JSON preview (truncated at 1000 characters)
- Handles both string and byte messages

`tests/quick_test.py`: Simple connectivity test for basic functionality
cd tests && python3 quick_test.py
- Minimal output showing message types and sizes
- Good for quick connection verification
- Lighter output for basic testing

`tests/test_capture_commands.py`: Tests tree capture functionality
cd tests && python3 test_capture_commands.py
- Tests `capture` (important nodes only) and `captureNotImportant` (all nodes)
- Shows JSON size, node counts, and response times
- Verifies WebSocket connection health with ping/pong

`tests/test_find_interactive.py`: Interactive element search tool
cd tests && python3 test_find_interactive.py
- Menu-driven interface for searching elements
- Supports findByViewId and findByText searches
- Shows all node properties including bounds, states, and IDs
- Filters out accessibility events automatically

`tests/test_find_and_click.py`: Demonstrates find + action workflow
cd tests && python3 test_find_and_click.py
- Searches for elements by text
- Allows selection from multiple results
- Performs click actions on selected elements
- Shows detailed properties for all found elements

`tests/test_actions.py`: Tests performAction commands
cd tests && python3 test_actions.py
- Tests various action types (CLICK, FOCUS, LONG_CLICK, SET_TEXT)
- Uses found elements from findByText/findByViewId
- Tests both hashCode and resourceId targeting
- Tests error cases (missing parameters)

`tests/test_gestures.py`: Tests performGesture commands
cd tests && python3 test_gestures.py
- Tests all gesture types (TAP, SWIPE, SCROLL, LONG_PRESS, DOUBLE_TAP)
- Tests predefined scroll directions (UP, DOWN, LEFT, RIGHT)
- Tests optional parameters (duration, end coordinates)
- Tests error cases (missing gestureType, invalid coordinates)

`tests/test_launch.py`: Tests launchActivity commands
cd tests && python3 test_launch.py
- Tests various launch types (PACKAGE, ACTIVITY, INTENT)
- Tests common apps (Settings, Calculator, Browser)
- Tests different intent actions and data formats
- Tests error cases (missing parameters, invalid packages)

`tests/test_findByProps.py`: Tests property-based searching
cd tests && python3 test_findByProps.py
- Tests JSON criteria matching across multiple properties
- Supports string, boolean, and integer property matching
- Tests error cases and complex property combinations

`tests/test_findByRegex.py`: Tests regex pattern matching
cd tests && python3 test_findByRegex.py
- Tests regex patterns in both text and contentDescription fields
- Supports case-insensitive matching and complex patterns
- Tests error cases (invalid regex patterns)

`tests/test_verbose_find.py`: Tests verbose mode for find methods
cd tests && python3 test_verbose_find.py
- Tests additional properties returned with verbose flag
- Compares basic vs verbose response formats
- Tests across all find method types

`tests/test_visible_only.py`: Tests tree size optimization
cd tests && python3 test_visible_only.py
- Tests visibleOnly parameter for capture commands
- Compares tree sizes with/without filtering
- Demonstrates size reduction benefits

`tests/test_native_vs_custom.py`: Compare native vs custom find methods
cd tests && python3 test_native_vs_custom.py
- Direct performance and accuracy comparison
- Reveals native method limitations
- Used to identify findByText deprecation need

`tests/test_viewid_consistency.py`: Verify viewId method reliability
cd tests && python3 test_viewid_consistency.py
- Confirms native findByViewId works correctly
- Shows why only findByText needs deprecation
- Tests various viewId formats and edge cases

`tests/test_bounds_comparison.py`: Tests coordinate bounds accuracy
cd tests && python3 test_bounds_comparison.py
- Compares bounds data across different methods
- Verifies coordinate consistency
- Useful for gesture targeting validation

Connection Setup:
# Forward port from Android device
adb forward tcp:38301 tcp:38301
# Then run any test script
cd tests && python3 test_script_name.py