Status: v0.3.0 shipped to Play Store internal testing + sideload track. Phase 0, Phase 1, Phase 2 (terminal preview), Phase 3 (bridge channel), Phase 4 (security hardening per ADR 15), and Phase 5 (polish + CI/CD) are partially-or-fully shipped. Phase V (voice mode) shipped 2026-04-12. v0.4 bridge feature expansion in progress on feature/bridge-feature-expansion — see docs/plans/2026-04-13-bridge-feature-expansion.md.
Repo: Codename-11/hermes-relay
Updated: 2026-04-18
A native Android app for the Hermes agent platform. Not just remote phone control — a full bidirectional interface between you and your Hermes server from anywhere.
Three capabilities in one app:
| Channel | Direction | What |
|---|---|---|
| Chat | Phone ↔ Agent | Talk to any Hermes agent profile (Victor, Mizu, etc.) with full streaming |
| Terminal | Phone ↔ Server | Secure remote shell access to the Hermes server via tmux |
| Bridge | Agent → Phone | Agent controls the phone (taps, types, screenshots — upstream functionality) |
One persistent WSS connection. One pairing flow. Three multiplexed channels.
What it is not:
- Not a web wrapper — native Kotlin + Jetpack Compose
- Not phone-only — the bridge channel gives the agent hands on your device
- Not a replacement for Discord/Telegram — it's a first-party Hermes client with capabilities those platforms can't offer (terminal, bridge)
Design principles:
- Secure by default — WSS only (no plaintext WebSocket option). TLS certificate pinning for production.
- Realtime everything — streaming chat responses, live terminal output, instant bridge feedback. No polling.
- Clean UX — Material 3, minimal setup, one pairing flow for all channels.
- Offline-aware — graceful degradation when connection drops. Auto-reconnect with exponential backoff.
- Single connection — one WSS pipe multiplexes all three channels. Efficient, simple to reason about.
- Server-side state — the app is a thin client. Sessions, history, memory all live on the Hermes server.
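The "auto-reconnect with exponential backoff" principle can be sketched as a delay schedule. This is an illustrative Python sketch only (the app itself is Kotlin); the base delay, cap, attempt count, and jitter factor are assumptions, not values taken from this spec.

```python
import random


def backoff_delays(base=1.0, cap=60.0, attempts=6, jitter=0.0, rng=random.random):
    """Yield one reconnect delay (in seconds) per attempt."""
    for attempt in range(attempts):
        # Double the delay on every failed attempt, but never exceed the cap.
        delay = min(cap, base * (2 ** attempt))
        # Optional jitter adds up to `jitter * delay` so clients do not retry in lockstep.
        yield delay + delay * jitter * rng()


if __name__ == "__main__":
    print(list(backoff_delays(attempts=4)))
```

A caller would sleep for each yielded delay between connection attempts and restart the generator after a successful connect.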
┌─────────────────────────────────────────────────────┐
│ Android App (Compose) │
│ │
│ ┌─────────┐ ┌──────────┐ ┌────────┐ ┌────────┐ │
│ │ Chat │ │ Terminal │ │ Bridge │ │Settings│ │
│ │ Tab │ │ Tab │ │ Tab │ │ Tab │ │
│ └────┬────┘ └────┬─────┘ └───┬────┘ └────────┘ │
│ │ │ │ │
│ ┌────┴────────────┴─────────────┴────┐ │
│ │ Connection Manager (WSS) │ │
│ │ Channel Multiplexer │ │
│ │ Auth + Session Management │ │
│ └────────────────┬───────────────────┘ │
└───────────────────┼──────────────────────────────────┘
│ WSS (TLS 1.3)
│
┌───────────────────┼──────────────────────────────────┐
│ Hermes Server (Docker-Server) │
│ │ │
│ ┌────────────────┴───────────────────┐ │
│ │ Relay Server (Python) │ │
│ │ Port 8767 (WSS) │ │
│ │ │ │
│ │ ┌─────────┐ ┌────────┐ ┌───────┐ │ │
│ │ │ Chat │ │Terminal│ │Bridge │ │ │
│ │ │ Router │ │ PTY │ │Router │ │ │
│ │ └────┬────┘ └───┬───┘ └───┬───┘ │ │
│ └───────┼──────────┼─────────┼─────┘ │
│ │ │ │ │
│ ┌───────┴───┐ ┌───┴──┐ ┌──┴──────────────┐ │
│ │ WebAPI │ │ tmux │ │AccessibilityServ.│ │
│ │ /api/... │ │ PTY │ │(on phone) │ │
│ │ (aiohttp) │ │ │ │ │ │
│ └───────────┘ └──────┘ └──────────────────┘ │
└──────────────────────────────────────────────────────┘
All communication flows over a single WebSocket connection. Messages use a typed envelope:
{
"channel": "chat" | "terminal" | "bridge" | "system",
"type": "<event_type>",
"id": "<message_uuid>",
"payload": { ... }
}
Connection lifecycle, auth, keepalive.
| Type | Direction | Payload |
|---|---|---|
| `auth` (pairing mode) | App → Server | `{ pairing_code, ttl_seconds?, grants?, device_name, device_id }` — `ttl_seconds` / `grants` come from the phone's TTL picker dialog; host metadata wins over phone metadata when both are present |
| `auth` (session mode) | App → Server | `{ session_token, device_name, device_id }` — ttl/grants are not re-sent; server keeps the grant table keyed on the original pair |
| `auth.ok` | Server → App | `{ session_token, server_version, profiles[], expires_at, grants, transport_hint }` — see below |
| `auth.fail` | Server → App | `{ reason }` |
| `ping` | Both | `{ ts }` |
| `pong` | Both | `{ ts }` |
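A minimal sketch of the typed envelope and the system-channel keepalive, assuming JSON text frames. The helper names are hypothetical, and echoing the ping's `ts` back in the pong is an assumption; the table only states that both messages carry `{ ts }`.

```python
import json
import uuid

CHANNELS = {"chat", "terminal", "bridge", "system"}
REQUIRED_KEYS = {"channel", "type", "id", "payload"}


def make_envelope(channel, type_, payload, msg_id=None):
    """Build an envelope dict; a random UUID is used when no id is given."""
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    return {"channel": channel, "type": type_,
            "id": msg_id or str(uuid.uuid4()), "payload": payload}


def parse_envelope(raw):
    """Decode a JSON frame and check the four envelope keys are present."""
    msg = json.loads(raw)
    missing = REQUIRED_KEYS - msg.keys()
    if missing:
        raise ValueError(f"missing envelope keys: {sorted(missing)}")
    if msg["channel"] not in CHANNELS:
        raise ValueError(f"unknown channel: {msg['channel']}")
    return msg


def reply_to_ping(msg):
    """Answer a system ping with a pong (assumption: the pong echoes ts)."""
    return make_envelope("system", "pong", {"ts": msg["payload"]["ts"]})
```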
auth.ok extended fields (added in ADR 15 — see docs/decisions.md):
| Field | Type | Meaning |
|---|---|---|
| `expires_at` | epoch seconds or `null` | Session lifetime. `null` means never-expire (user explicitly picked "Never" in the TTL picker). Server-side `math.inf` serializes as `null`. |
| `grants` | `{ channel: epoch \| null }` | Per-channel expiries. Keys today: `chat`, `terminal`, `bridge`. Each grant is clamped to the session lifetime — a grant cannot outlive its session. `null` means the grant shares the session's never-expire. |
| `transport_hint` | `"wss"` / `"ws"` / `"unknown"` | What the server believes the phone is actually connected over. Drives the transport security badge and the TTL picker's default option on re-pair. |
| `profiles` | `[{name, model, description, system_message}]` | Added v0.6.0. Relay-advertised list of upstream Hermes profiles discovered at `~/.hermes/profiles/*/`, plus a synthetic "default" entry for the root config. `system_message` carries the profile's SOUL.md content and may be null. Empty list when `RELAY_PROFILE_DISCOVERY_ENABLED=0`. See `docs/decisions.md` §21. |
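The clamping and `null` rules for `expires_at` and `grants` can be illustrated with a short sketch. The function names are hypothetical and the real relay code may be organized differently; only the two rules themselves (clamp each grant to the session, serialize `math.inf` as `null`) come from the table above.

```python
import json
import math


def clamp_grants(session_expires_at, grant_expiries):
    """Clamp each per-channel expiry so it never outlives the session."""
    return {channel: min(expiry, session_expires_at)
            for channel, expiry in grant_expiries.items()}


def to_wire(expiry):
    """math.inf (never-expire) becomes None, which JSON encodes as null."""
    return None if math.isinf(expiry) else int(expiry)


def auth_ok_expiry_fields(session_expires_at, grant_expiries):
    """Build only the expires_at / grants part of an auth.ok payload."""
    clamped = clamp_grants(session_expires_at, grant_expiries)
    return {
        "expires_at": to_wire(session_expires_at),
        "grants": {channel: to_wire(expiry) for channel, expiry in clamped.items()},
    }


if __name__ == "__main__":
    print(json.dumps(auth_ok_expiry_fields(math.inf, {"chat": math.inf, "bridge": 1800000000})))
```

With a never-expiring session, a grant stays finite if it was given a finite expiry, and becomes `null` only when it was itself unbounded.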
Note: Chat connects directly to the Hermes API Server via HTTP/SSE (see Section 6.2) — it does not traverse the relay. Voice, bridge, terminal, notifications, and inbound media DO go through the relay. The chat SSE event types are:
| Event | Direction | Payload |
|---|---|---|
| `session.created` | Server → App | `{ session_id, run_id, title? }` |
| `run.started` | Server → App | `{ session_id, run_id, user_message: { id, role, content } }` |
| `message.started` | Server → App | `{ session_id, run_id, message: { id, role } }` |
| `assistant.delta` | Server → App | `{ session_id, run_id, message_id, delta }` |
| `tool.progress` | Server → App | `{ session_id, run_id, message_id, delta }` |
| `tool.pending` | Server → App | `{ session_id, run_id, tool_name, call_id }` |
| `tool.started` | Server → App | `{ session_id, run_id, tool_name, call_id, preview?, args }` |
| `tool.completed` | Server → App | `{ session_id, run_id, tool_call_id, tool_name, args, result_preview }` |
| `tool.failed` | Server → App | `{ session_id, run_id, call_id, tool_name, error }` |
| `assistant.completed` | Server → App | `{ session_id, run_id, message_id, content, completed, partial, interrupted }` |
| `run.completed` | Server → App | `{ session_id, run_id, message_id, completed, partial, interrupted, api_calls? }` |
| `error` | Server → App | `{ message, error }` |
| `done` | Server → App | `{ session_id, run_id, state: "final" }` |
Session management uses the REST API (GET/POST /api/sessions, PATCH/DELETE /api/sessions/{id}).
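As an illustration of how a client might consume these events, the sketch below folds already-decoded `(event_name, payload)` pairs into per-message text. SSE framing and HTTP are out of scope. Two assumptions are made loudly: `delta` is a plain string, and the `content` of `assistant.completed` replaces whatever was accumulated from deltas.

```python
def fold_chat_events(events):
    """Fold (event_name, payload) pairs into {message_id: text} plus run state."""
    messages = {}
    state = {"done": False, "error": None}
    for name, payload in events:
        if name == "assistant.delta":
            # Append each streamed fragment to its message (assumes delta is a str).
            message_id = payload["message_id"]
            messages[message_id] = messages.get(message_id, "") + payload["delta"]
        elif name == "assistant.completed":
            # Assumption: the completed event carries the authoritative full text.
            messages[payload["message_id"]] = payload["content"]
        elif name == "error":
            state["error"] = payload.get("message")
        elif name == "done":
            # done marks the end of the stream; later events are ignored.
            state["done"] = True
            break
    return messages, state
```

Tool events (`tool.pending`, `tool.started`, and so on) are ignored here; a real client would route them to its tool-card UI.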
PTY streaming — raw terminal I/O.
| Type | Direction | Payload |
|---|---|---|
| `terminal.attach` | App → Server | `{ session_name?, cols, rows }` |
| `terminal.attached` | Server → App | `{ session_name, pid }` |
| `terminal.input` | App → Server | `{ data }` (raw keystrokes) |
| `terminal.output` | Server → App | `{ data }` (raw ANSI output) |
| `terminal.resize` | App → Server | `{ cols, rows }` |
| `terminal.detach` | App → Server | `{ session_name? }` — preserves tmux session |
| `terminal.kill` | App → Server | `{ session_name? }` — destroys tmux session and kills the shell |
Phone control — mirrors upstream relay protocol.
| Type | Direction | Payload |
|---|---|---|
| `bridge.command` | Server → App | `{ request_id, method, path, params?, body? }` |
| `bridge.response` | App → Server | `{ request_id, status, result }` |
| `bridge.status` | App → Server | `{ accessibility_enabled, overlay_enabled, battery }` |
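Because `bridge.command` and `bridge.response` are matched by `request_id`, the server side needs some way to pair a response with the command that caused it. One possible approach, sketched here with `asyncio` futures, is a pending-request table. The class name and the `/tap` path used in the usage note are hypothetical.

```python
import asyncio
import uuid


class BridgePending:
    """Track in-flight bridge commands until the phone answers them."""

    def __init__(self):
        self._pending = {}

    def new_command(self, method, path, params=None, body=None):
        """Return (bridge.command payload, future that resolves on response)."""
        request_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        payload = {"request_id": request_id, "method": method, "path": path}
        if params is not None:
            payload["params"] = params
        if body is not None:
            payload["body"] = body
        return payload, future

    def resolve(self, response):
        """Complete the matching future; return False for unknown request ids."""
        future = self._pending.pop(response["request_id"], None)
        if future is None or future.done():
            return False
        future.set_result((response["status"], response["result"]))
        return True
```

A caller would send the payload inside a `bridge` envelope and then `await asyncio.wait_for(future, timeout)` so that a silent phone does not block the agent forever, for example after `pending.new_command("POST", "/tap", body={"x": 1, "y": 2})`.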
Pairing is QR-driven. The operator runs the pair command on the host — either /hermes-relay-pair from any Hermes chat surface (backed by the devops/hermes-relay-pair skill) or the hermes-pair shell shim (a thin wrapper around python -m plugin.pair). Both share the same implementation in plugin/pair.py. The command probes for a running relay, generates a fresh 6-char code, pre-registers it with the relay via the loopback-only POST /pairing/register endpoint, then embeds the relay URL + code + chosen TTL + per-channel grants + HMAC signature (and the API server credentials) in a single QR payload. The phone scans once, confirms the TTL and grants via a picker dialog, and is configured for both chat AND terminal/bridge.
As of v3 (ADR 24), the QR can also carry an ordered list of endpoint candidates (lan / tailscale / public / operator-defined roles). A single pairing covers every network the phone might be on — the phone picks the highest-priority reachable candidate at connect time and re-probes on network change. The single-URL top-level fields still appear in v3 QRs for backward compatibility; old phones ignore endpoints via ignoreUnknownKeys = true, new phones prefer endpoints and fall back to the top-level URL when the array is absent. See docs/remote-access.md for the operator-facing setup per mode.
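Strict-priority selection over the candidate list can be sketched as follows. This is an illustrative Python version with a hypothetical function name and an injected reachability probe; the phone's actual logic lives in Kotlin and may differ.

```python
def pick_endpoint(candidates, is_reachable):
    """Return the reachable candidate with the lowest priority number, or None.

    Priority 0 is the highest. sorted() is stable, so candidates that share a
    priority keep their order from the QR payload.
    """
    for candidate in sorted(candidates, key=lambda c: c["priority"]):
        if is_reachable(candidate):
            return candidate
    return None


if __name__ == "__main__":
    endpoints = [
        {"role": "lan", "priority": 0},
        {"role": "tailscale", "priority": 1},
        {"role": "public", "priority": 2},
    ]
    # Pretend the phone is off the home network: only non-LAN roles respond.
    print(pick_endpoint(endpoints, lambda c: c["role"] != "lan"))
```

Re-running the same function after a network change gives the re-probe behavior described above.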
1. Operator runs /hermes-relay-pair (or hermes-pair) on the Hermes host,
optionally with --ttl <duration>, --grants terminal=7d,bridge=1d,
--mode {auto,lan,tailscale,public} (default auto), --public-url <url>.
2. The pair command reads the API server config (host/port/key) from
~/.hermes/config.yaml or ~/.hermes/.env, and auto-detects candidate
endpoints: LAN IP via routing lookup; Tailscale hostname via
tailscale.status() when the CLI is present; public URL from
--public-url when provided. Strict-priority ordering (lan → tailscale
→ public) with 0 = highest. --mode lan/tailscale/public emits only
that candidate.
3. If a relay is reachable at localhost:RELAY_PORT (default 8767):
a. Mint a fresh 6-char code from A-Z / 0-9
b. Compute the transport hint (wss / ws) from the relay's TLS config
c. POST /pairing/register { code, ttl_seconds, grants, transport_hint,
endpoints? } (loopback only — the relay clears all rate-limit
blocks on success so stale blocks don't prevent legitimate re-pair)
d. Build the payload dict (`hermes: 3` when endpoints present, else
`hermes: 2`), HMAC-SHA256-sign it with the host-local secret at
~/.hermes/hermes-relay-qr-secret (auto-created, 32 bytes, mode
0o600), attach as `sig` field. Canonicalization preserves array
order — priority is meaningful, not alphabetic.
4. Render QR + plain-text block (includes "Pair: for 30 days" or
"Pair: indefinitely" + per-channel grant labels + per-endpoint role
chips when endpoints are present).
5. Phone scans the QR → parses HermesPairingPayload (see §3.3.1).
6. Phone stores the API server URL + key. When endpoints are present,
stores the ordered candidate list in PairingPreferences; otherwise
synthesizes a single priority-0 `role: lan` (or `role: tailscale`
when the top-level host matches `100.64.0.0/10` / `.ts.net`) entry
from the top-level fields for forward-compat.
7. SessionTtlPickerDialog opens with the QR's operator-chosen TTL
preselected (or default 30d on wss/Tailscale, 7d on plain ws). User
picks: 1d / 7d / 30d / 90d / 1y / Never. Never-expire warns inline
but is always selectable — user intent is the trust model.
8. Phone opens WSS to the relay with the pairing code + confirmed
ttl_seconds + grants in the first system/auth envelope.
9. Relay consumes the code (host-registered metadata wins over phone-sent
metadata — operator policy is authoritative), creates a Session with
the resolved TTL + grants + transport_hint, returns session token +
expires_at + grants + transport_hint in auth.ok.
10. Phone stores the session token in the Android Keystore (StrongBox-
preferred) with fallback to EncryptedSharedPreferences on older /
unsupported devices. On the first wss handshake, records the cert
SHA-256 fingerprint in CertPinStore (TOFU). Subsequent connects
verify against the stored pin via OkHttp's CertificatePinner.
11. Future connections use the session token directly. Rate limiter,
session expiry, and per-channel grants all enforced at the relay.
12. Session expires on ttl_seconds (or never); individual grants may
expire sooner. Paired Devices screen lists all devices with per-row
revoke.
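Steps 3a and 3d can be sketched in a few lines. The function names are hypothetical, and the alphabet is assumed to be the full `A-Z` plus `0-9` set (the real `PAIRING_ALPHABET` in `plugin/relay/config.py` may exclude look-alike characters). The version rule follows the schema notes: 3 when `endpoints` is present, 2 when any v2-only relay field is present, otherwise 1.

```python
import secrets
import string

# Assumption: every uppercase letter and digit is allowed.
PAIRING_ALPHABET = string.ascii_uppercase + string.digits
V2_ONLY_FIELDS = ("ttl_seconds", "grants", "transport_hint")


def mint_pairing_code(length=6):
    """Draw each character with a CSPRNG so codes are not guessable."""
    return "".join(secrets.choice(PAIRING_ALPHABET) for _ in range(length))


def payload_version(relay_block, endpoints):
    """Pick the `hermes` version number for a QR payload."""
    if endpoints:
        return 3
    if relay_block and any(field in relay_block for field in V2_ONLY_FIELDS):
        return 2
    return 1


if __name__ == "__main__":
    print(mint_pairing_code(), payload_version({"ttl_seconds": 2592000}, None))
```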
Old API-only QRs (no relay block, no hermes field, or hermes: 1) still parse — the phone just skips the relay setup step and can be paired against a relay later via Settings. v1 QRs with a relay block (no TTL / grants / sig fields) still parse via ignoreUnknownKeys; the phone treats missing TTL as "prompt the user with defaults". v3 QRs with an endpoints array (ADR 24) also parse on v0.6.x and earlier clients — they ignore the array and keep using the top-level fields. New clients prefer endpoints and fall back to the top-level fields when absent.
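The fallback from `endpoints` to the top-level fields, including the role inference from step 6, might look like the sketch below. Function names are hypothetical; the `100.64.0.0/10` range and the `.ts.net` suffix are the two checks named in the pairing flow.

```python
import ipaddress

TAILSCALE_CGNAT = ipaddress.ip_network("100.64.0.0/10")


def infer_role(host):
    """Guess `tailscale` vs `lan` for a legacy single-URL payload."""
    if host.lower().endswith(".ts.net"):
        return "tailscale"
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return "lan"  # a non-Tailscale hostname
    return "tailscale" if address in TAILSCALE_CGNAT else "lan"


def candidates_for(payload):
    """Prefer `endpoints`; otherwise synthesize one priority-0 candidate."""
    if payload.get("endpoints"):
        return payload["endpoints"]
    candidate = {
        "role": infer_role(payload["host"]),
        "priority": 0,
        "api": {"host": payload["host"], "port": payload["port"],
                "tls": payload.get("tls", False)},
    }
    relay = payload.get("relay") or {}
    if relay.get("url"):
        candidate["relay"] = {"url": relay["url"],
                              "transport_hint": relay.get("transport_hint")}
    return [candidate]
```

API-only payloads (no `relay` block) produce a candidate without a `relay` entry, which matches the "skip relay setup" behavior described above.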
Re-pair explicitly resets the TOFU pin for the target host (applyServerIssuedCodeAndReset(code, relayUrl) wipes CertPinStore[host:port]) — a QR rescan is taken as consent to possibly-new certificate material. This is the documented recovery path when a relay restarts with a new self-signed cert.
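The trust-on-first-use cycle (record, verify, reset on re-pair) can be modeled with an in-memory store. This is a conceptual Python sketch with hypothetical names, not the Kotlin `CertPinStore`; it hashes whatever DER bytes the caller supplies and does not decide between full-certificate and SPKI fingerprints.

```python
import hashlib


class PinMismatch(Exception):
    """Raised when a host presents key material that differs from its pin."""


class TofuPinStore:
    def __init__(self):
        self._pins = {}  # "host:port" -> hex SHA-256 fingerprint

    @staticmethod
    def fingerprint(der_bytes):
        return hashlib.sha256(der_bytes).hexdigest()

    def check(self, host_port, der_bytes):
        """Record the fingerprint on first use, verify it on later connects."""
        seen = self.fingerprint(der_bytes)
        pinned = self._pins.setdefault(host_port, seen)
        if pinned != seen:
            raise PinMismatch(f"{host_port}: expected {pinned}, got {seen}")
        return pinned

    def reset(self, host_port):
        """Forget the pin, as a QR re-pair does for the target host."""
        self._pins.pop(host_port, None)
```

After `reset`, the next `check` for that host records the new fingerprint, which is the recovery path when the relay restarts with a new self-signed certificate.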
Phase 3 (bridge) will introduce a symmetric phone-generates-code, host-approves flow. The POST /pairing/approve route is stubbed in this cycle — same wire shape as /pairing/register, same loopback gate — with a # TODO(Phase 3) pointing at the pending-codes store + operator approval UI that still needs to be built.
Biometric gate on the app side for terminal access (fingerprint/face) remains planned.
{
"hermes": 3,
"host": "172.16.24.250",
"port": 8642,
"key": "api-bearer-token",
"tls": false,
"relay": {
"url": "ws://172.16.24.250:8767",
"code": "ABCD12",
"ttl_seconds": 2592000,
"grants": { "terminal": 2592000, "bridge": 604800 },
"transport_hint": "ws"
},
"endpoints": [
{ "role": "lan", "priority": 0,
"api": { "host": "192.168.1.100", "port": 8642, "tls": false },
"relay": { "url": "ws://192.168.1.100:8767", "transport_hint": "ws" } },
{ "role": "tailscale", "priority": 1,
"api": { "host": "hermes.tail-scale.ts.net", "port": 8642, "tls": true },
"relay": { "url": "wss://hermes.tail-scale.ts.net:8767", "transport_hint": "wss" } },
{ "role": "public", "priority": 2,
"api": { "host": "hermes.example.com", "port": 443, "tls": true },
"relay": { "url": "wss://hermes.example.com/relay", "transport_hint": "wss" } }
],
"sig": "base64url-hmac-sha256"
}
- `hermes` — payload version. `1` is the legacy shape (no new fields); `2` is set when any v2-only field (`ttl_seconds`, `grants`, `transport_hint`) is present in the `relay` block; `3` is set when `endpoints` is present (ADR 24). All three versions parse on the current Android client.
- `endpoints` — optional ordered list of endpoint candidates. When present, the phone uses these in strict-priority order (0 = highest) and re-probes reachability on network change. When absent, the phone synthesizes a single priority-0 candidate from the top-level `host` / `port` / `tls` + `relay.url` / `transport_hint` fields. `role` is an open string (known values `lan` / `tailscale` / `public` get styled UI; anything else renders as "Custom VPN ()"). Per-endpoint entries intentionally carry only `api` + `relay` — the pairing code, TTL, and grants stay at the top level because they're per-pair artifacts, not per-endpoint. Full schema in ADR 24.
- Top-level fields (`host` / `port` / `key` / `tls`) configure the direct-chat Hermes API Server. Unchanged since v1.
- `relay` — optional and nullable. Present only when the pair command found a running relay and successfully pre-registered a pairing code with it.
- `relay.url` — full WebSocket URL (`ws://` for dev, `wss://` for production).
- `relay.code` — 6-char one-shot pairing code from `A-Z` / `0-9`. Expires 10 minutes after registration.
- `relay.ttl_seconds` — optional. Operator-chosen session lifetime in seconds. `0` means never expire. When present, the phone's TTL picker preselects this value; when missing, the phone picks a default based on transport hint (wss → 30d, ws → 7d). The user always confirms via the picker dialog.
- `relay.grants` — optional. Per-channel expiries in seconds-from-now. Map keys: `"terminal"`, `"bridge"`. Each grant is clamped server-side to the overall session TTL — a grant cannot outlive its session. Default caps if unspecified: terminal 30 days, bridge 7 days.
- `relay.transport_hint` — optional. `"wss"` or `"ws"`. Used by the phone as the default for the transport security badge and to compute the TTL picker's default option.
- `sig` — optional. Base64 HMAC-SHA256 of the canonicalized payload (`sort_keys=True`, `separators=(",", ":")`, `sig` field excluded from canonical form). Computed with a host-local secret at `~/.hermes/hermes-relay-qr-secret`. Phones parse and store `sig` but do not verify it yet — full verification requires a secret distribution mechanism the protocol doesn't yet define.
- The Android parser uses `kotlinx.serialization` with `ignoreUnknownKeys = true`, so future fields can be added without breaking older app builds. `RelayPairing.ttlSeconds` / `grants` / `transportHint` are all nullable with defaults.
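The signing rule for `sig` can be sketched with the standard library. The function names here are hypothetical stand-ins, and two details are assumptions: the URL-safe Base64 alphabet (the example payload says `base64url`, the field description says only Base64) and `json.dumps` defaults for everything not named in the spec.

```python
import base64
import hashlib
import hmac
import json


def canonical_bytes(payload):
    """Serialize without `sig`, with sorted keys and compact separators.

    sort_keys only reorders object keys; list order (such as `endpoints`)
    is preserved, so endpoint priority order is covered by the signature.
    """
    body = {key: value for key, value in payload.items() if key != "sig"}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload, secret):
    digest = hmac.new(secret, canonical_bytes(payload), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def verify(payload, secret):
    """Constant-time comparison of the attached sig with a fresh one."""
    sig = payload.get("sig")
    if not isinstance(sig, str):
        return False
    return hmac.compare_digest(sig.encode("utf-8"), sign(payload, secret).encode("ascii"))
```

Verification as sketched needs the same host-local secret, which is why the phone can only store the signature until a distribution mechanism exists.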
Implementation references:
- Server-side payload builder + CLI flags: `plugin/pair.py` → `build_payload(sign=True, endpoints=...)` / `pair_command()` / `parse_duration()` / `parse_grants()`; `--mode {auto,lan,tailscale,public}` + `--public-url <url>`
- Server-side HMAC: `plugin/relay/qr_sign.py` → `canonicalize` / `sign_payload` / `verify_payload` / `load_or_create_secret` — canonical form preserves `endpoints` array order and role strings verbatim
- Phone-side endpoint model: `app/src/main/kotlin/.../data/Endpoint.kt` → `EndpointCandidate` / `ApiEndpoint` / `RelayEndpoint` / `displayLabel()`
- Phone-side parser: `app/src/main/kotlin/.../ui/components/QrPairingScanner.kt` → `HermesPairingPayload.endpoints` + v1/v2 synthesizer
- Phone-side endpoint store: `app/src/main/kotlin/.../data/PairingPreferences.kt` — per-device endpoint list
- Phone-side network-aware switching: `app/src/main/kotlin/.../network/ConnectionManager.kt` → `resolveBestEndpoint()` + `NetworkCallback`
- Phone-side TTL picker: `app/src/main/kotlin/.../ui/components/SessionTtlPickerDialog.kt`
- Relay registration endpoint: `plugin/relay/server.py` → `handle_pairing_register` (see §6 for details), accepts optional `endpoints` in body
- Dashboard pairing endpoint: `plugin/relay/server.py` → `handle_pairing_mint` mints a fresh code and returns a signed payload in this exact shape; regression-tested against the Android parser in `plugin/tests/test_pairing_mint_schema.py`. The endpoint is loopback-only and surfaced to the dashboard via `plugin/dashboard/plugin_api.py` at `POST /api/plugins/hermes-relay/pairing`.
| Layer | Implementation |
|---|---|
| Transport (default) | WSS / TLS 1.3 (preferred) |
| Transport (opt-in) | Plain ws:// — gated on InsecureConnectionAckDialog consent + reason picker (LAN-only / Tailscale or VPN / Local dev). Reason is displayed, not enforced — operator intent is the trust model. |
| Transport indicator | TransportSecurityBadge in Settings + Session sheet + Paired Devices card. Three states: 🔒 secure / 🔓 insecure with reason / 🔓 insecure unknown. |
| Pairing (host → phone) | hermes-pair / /hermes-relay-pair → POST /pairing/register (loopback-only) → QR embedded in operator's terminal or chat. |
| Pairing (phone → host, Phase 3) | Stubbed at POST /pairing/approve — same wire shape, same loopback gate. Real UX pending bridge work. |
| Session lifetime | User-selected at pair: 1d / 7d / 30d / 90d / 1y / never. Never is always selectable; operator intent is the trust model. |
| Per-channel grants | One session token carries {chat, terminal, bridge} per-channel expiries. Terminal default cap 30d, bridge default cap 7d, both clamped to session lifetime. |
| Auth envelope | {pairing_code, ttl_seconds, grants, device_name, device_id} for pairing mode; {session_token, device_name, device_id} for session-mode re-auth. Host metadata wins over phone metadata when both are present. |
| `auth.ok` response | `{session_token, expires_at, grants, transport_hint, profiles, server_version}`. `math.inf` expiries serialize as `null`. |
| Rate limiting | 5 auth attempts / 60s → 5-min block. /pairing/register clears all blocks on success so legitimate re-pair after a relay restart works immediately. |
| Token storage | SessionTokenStore — KeystoreTokenStore (StrongBox-preferred via setRequestStrongBoxBacked) with fallback to LegacyEncryptedPrefsTokenStore (TEE-backed EncryptedSharedPreferences). One-shot lossless migration on first launch post-upgrade. hasHardwareBackedStorage flag surfaced in UI. |
| Cert pinning | TOFU via CertPinStore — SHA-256 SPKI fingerprint recorded per host:port on first successful wss connect. Subsequent connects verify via OkHttp CertificatePinner. Pin wiped explicitly on QR re-pair (applyServerIssuedCodeAndReset). Plain ws:// short-circuits pinning entirely. |
| QR integrity | HMAC-SHA256 over canonicalized payload. Host-local secret at ~/.hermes/hermes-relay-qr-secret. Phone parses + stores the signature but does NOT verify yet (secret distribution TBD). |
| Tailscale detection | Informational only — tailscale0 interface + 100.64.0.0/10 CGNAT + .ts.net hostname checks. Displayed as a Connection-section chip. Does NOT auto-change TTL defaults. |
| Tailscale helper (first-class) | plugin/relay/tailscale.py + hermes-relay-tailscale CLI (ADR 25). Publishes the loopback relay over the tailnet via tailscale serve --bg --https=<port>; managed TLS + tailnet ACL identity. Optional, graceful-absent when the binary isn't installed. Auto-retires when upstream PR #9295 lands. See docs/remote-access.md. |
| Multi-endpoint pairing | Single QR carries an ordered list of role: lan/tailscale/public/... candidates with strict-priority selection (ADR 24). Phone re-probes reachability on every network change. Per-candidate transport_hint drives the plaintext-ws:// consent dialog. |
| Device revocation | Paired Devices screen → GET /sessions (tokens masked to 8-char prefix) / DELETE /sessions/{token_prefix} (self-revoke allowed, wipes local state + redirects to pair flow). Any paired device can revoke any other — trade-off documented in ADR 15. |
| Terminal gate | Biometric/PIN required before terminal access (planned). |
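The rate-limiting row ("5 auth attempts / 60s → 5-min block") leaves some semantics open, so the sketch below picks one reading and labels it: every attempt counts (not only failures), and the sixth attempt inside the window triggers the block. The class name and the per-key bookkeeping are assumptions; time is passed in explicitly to keep the example deterministic.

```python
from collections import defaultdict, deque


class AuthRateLimiter:
    def __init__(self, max_attempts=5, window=60.0, block=300.0):
        self.max_attempts = max_attempts
        self.window = window
        self.block = block
        self._attempts = defaultdict(deque)  # key -> recent attempt timestamps
        self._blocked_until = {}             # key -> time the block lifts

    def allow(self, key, now):
        """Record an attempt at time `now`; return False if it is rejected."""
        if self._blocked_until.get(key, 0.0) > now:
            return False
        recent = self._attempts[key]
        # Drop attempts that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        recent.append(now)
        if len(recent) > self.max_attempts:
            self._blocked_until[key] = now + self.block
            recent.clear()
            return False
        return True

    def clear_all(self):
        """Mirror of the documented reset on a successful /pairing/register."""
        self._attempts.clear()
        self._blocked_until.clear()
```

Keying by client IP is one plausible choice for `key`; the spec does not say which identifier the relay uses.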
- Language: Kotlin 2.0+
- UI: Jetpack Compose + Material 3 (Material You dynamic theming)
- Navigation: Compose Navigation (type-safe)
- WebSocket: OkHttp 4.x (already in upstream, supports `wss://`)
- Terminal: WebView + xterm.js (v1), consider native Compose terminal later
- Serialization: kotlinx.serialization (replace Gson — faster, type-safe)
- Storage: Android Keystore (StrongBox-preferred via `KeystoreTokenStore`) + `EncryptedSharedPreferences` legacy fallback via `LegacyEncryptedPrefsTokenStore`; DataStore (preferences + TOFU cert pins)
- DI: Manual dependency injection (no Hilt). Constructor-wired ViewModels, process-singletons where needed. Decided lean because the graph is small and dependencies are explicit.
- Biometric: AndroidX Biometric
- Min SDK: 26 (Android 8.0)
- Target SDK: 35
- Language: Python 3.11+
- Framework: aiohttp (matches existing relay)
- Terminal: `asyncio` + `pty` module for PTY, `libtmux` for session management
- Chat proxy: HTTP client to Hermes WebAPI (localhost:8642 or direct `run_agent`)
- Port: 8767 (WSS). The legacy standalone bridge relay on 8766 was retired in Phase 3 Wave 1 (2026-04-12) — the bridge channel is now multiplexed alongside chat, terminal, voice, and media on the unified relay.
- TLS: Let's Encrypt via certbot, or reverse proxy through Caddy/nginx
- CI: Lint (ktlint) → Build → Test → Upload APK artifact
- Release: Tag-triggered → version validation → signed APK → GitHub Release
- Patterns from ARC: Concurrency groups, matrix builds, version sync check
Bottom navigation bar with 4 tabs:
┌───────────────────────────────────────────────┐
│ │
│ [Active Tab Content] │
│ │
│ │
│ │
├───────┬───────────┬──────────┬────────────────┤
│ 💬 │ >_ │ 📱 │ ⚙️ │
│ Chat │ Terminal │ Bridge │ Settings │
└───────┴───────────┴──────────┴────────────────┘
- Top bar (three-layer agent model, v0.6.0). Layout from left to right:
  - Connection chip — tap to open `ConnectionSwitcherSheet` (all paired servers + health indicator). Auto-hidden when you only have one Connection. See `docs/decisions.md` §19.
  - Agent name + tappable region — tap to open the consolidated agent sheet (bottom sheet) holding Profile + Personality selection and per-session info/analytics (message count, tokens in/out, avg TTFT). Sheet is scrollable. Toast confirmations fire on Profile/Personality switch. Replaces the separate `ProfilePicker` and `PersonalityPicker` top-bar chips that shipped in intermediate v0.5.x builds.
  - Remaining top-bar actions (session drawer hamburger, ambient toggle, etc.).
- Session drawer (swipe from left or hamburger icon) — session list with title, timestamp, message count. Create, switch, rename, delete.
- Chat view — message bubbles with markdown rendering, streaming text, tool call cards (Off/Compact/Detailed display modes)
- Input bar — text field with 4096 char limit, `/` palette button, send button, stop button during streaming. Inline autocomplete on `/` keystroke + full searchable command palette (bottom sheet). Commands sourced from: 29 gateway built-ins, dynamic personalities from `config.agent.personalities`, and server skills from `GET /api/skills`.
- Empty state — Logo + "Start a conversation" + suggestion chips that populate input
- Agent sheet — Profile section (v0.6.0) — upstream Hermes profiles auto-discovered by the relay at `~/.hermes/profiles/*/`. Selecting one overlays `model` + `SOUL.md` (as `system_message`) on subsequent chat requests. Ephemeral — clears on Connection switch and app restart. Hidden when the server advertises no profiles. See `docs/decisions.md` §21.
- Agent sheet — Personality section — personalities fetched from `GET /api/config` (`config.agent.personalities`). Shows server default (from `config.display.personality`) + all configured. Active personality name shown on assistant chat bubbles.
- Streaming dots — animated pulsing 3-dot indicator replaces static "streaming..." text
- Displays: streaming delta text, tool progress cards (auto-expand while running, auto-collapse on complete), thinking/reasoning blocks (collapsible), per-message token counts + cost
- Full-screen terminal emulator (xterm.js in WebView)
- Session picker — attach to existing tmux sessions or create new
- Toolbar — Ctrl, Tab, Esc, Arrow keys (soft keys for mobile)
- Biometric gate — fingerprint/face required before showing terminal
- Supports: full ANSI color, scrollback, text selection, copy/paste
Shipped in v0.3.0; card hierarchy rewritten in v0.4.1. Rendered by BridgeScreen.kt + BridgeViewModel in this order:
- Master toggle card (`BridgeMasterToggle`) — headline "Allow Agent Control" switch with a `MASTER` pill and leading "Master switch —" subtitle copy so the parent-gate role is legible at a glance. Gated on accessibility permission being granted; tapping the Switch when Accessibility is not granted surfaces a snackbar ("Accessibility Service must be enabled first.") with an "Open Settings" action that deep-links to `ACTION_ACCESSIBILITY_SETTINGS` rather than silent-dropping the tap. Inline device / battery / screen / current-app rows live in-card (the old standalone `BridgeStatusCard` was dropped from the layout in v0.4.1). Info icon opens a Play-review explanation dialog that also names the "Hermes has device control" persistent notification owned by the master switch.
- Permission checklist (`BridgePermissionChecklist`) — tiered four-section layout shipped in v0.4.1 (Core bridge / Notification companion / Voice & camera / Sideload features). Tap-to-open Android Settings via `ACTION_ACCESSIBILITY_SETTINGS`, `ACTION_MANAGE_OVERLAY_PERMISSION`, `enabled_notification_listeners`, and per-row `RequestPermission` launchers for dangerous runtime perms. Rows fall back to `ACTION_APPLICATION_DETAILS_SETTINGS` when a runtime permission has been permanently denied. Optional rows render an "Optional" Material 3 pill in a `FlowRow` with `softWrap=false` so the pill never wraps internally on narrow titles. Re-probes on `Lifecycle.Event.ON_RESUME` so returning from Android Settings flips rows green without navigation churn.
- Advanced divider — visual separator between "operate the bridge" and "expand what the bridge can do".
- Unattended Access card (`UnattendedAccessRow`, sideload-only) — opt-in toggle gated on the master toggle (`enabled = masterEnabled`; subtitle reads "Requires Agent Control — enable the master switch above first." when master is off). First-enable shows the scary one-time dialog covering the security model + credential-lock limitation + how to disable. Credential-lock warning renders as an inline `KeyguardDetectedAlert` Surface band inside this card (was a standalone chip pre-v0.4.1, inlined so the warning lives next to the toggle that triggers it).
- Safety summary card (`BridgeSafetySummaryCard`) — blocklist count / destructive-verb count / countdown timer (in `MM:SS` during an active idle window, else `N min idle`). Tap-through to `BridgeSafetySettingsScreen` for editing the blocklist / destructive verbs / auto-disable timer / status overlay / confirmation timeout.
- Activity log (`BridgeActivityLog`) — scrollable `LazyColumn` capped at 320dp + `MAX_LOG_ENTRIES=100`. Tap-to-expand rows showing timestamp, status (Pending / Success / Failed / Blocked), result text, and optional screenshot token. DataStore-backed via `BridgePreferences`.
The bridge UI drives — and is driven by — Tier 5 safety-rails (BridgeSafetyManager, BridgeForegroundService, BridgeStatusOverlay, AutoDisableWorker). See docs/decisions.md and CLAUDE.md's file table for the full wiring.
Global unattended-access affordance (v0.4.1). When master + unattended are both on (sideload only), UnattendedGlobalBanner renders as a 28dp amber strip at the top of RelayApp's scaffold on every tab — pulsing dot + "Unattended access ON — agent can wake and drive this device" + chevron → tap navigates to Bridge. Theme-aware colours (amber-on-dark in dark mode, dark-amber-on-pale-amber in light). The banner handles visibility while the user is INSIDE Hermes-Relay; the existing WindowManager BridgeStatusOverlayChip handles visibility when the app is BACKGROUNDED. See docs/decisions.md §18 for the split rationale.
- Active agent card (v0.6.0) — top-of-screen summary card showing the current Connection / Profile / Personality. Tap navigates to Chat and auto-opens the agent sheet via the `openAgentSheet` nav arg, giving Settings-originating users a one-tap path to change agent context without leaving the flow.
- Connections (v0.6.0) — lists every paired Hermes server with a per-card status chip. Actions: rename (inline), re-pair (reuses `ConnectionWizard` with `connectionId` nav arg), revoke, remove. Add-connection button launches the standard QR flow. Settings treats a paired + briefly-disconnected connection as Connecting (amber) instead of Disconnected (red) to avoid scare-red during relay restarts. See `docs/decisions.md` §19.
- Connection (single-server settings) — unified "Pair with your server" card (primary action: Scan QR) with a single status summary covering API server, relay, and the active paired session. Collapsible "Manual configuration" card exposes API URL / API key / Relay URL / insecure-transport toggle + "Save & Test" (calls `RelayHttpClient.probeHealth`). Pair wizard cross-validates URL schemes in v0.6.0 (e.g. an API field with `wss://` surfaces an inline hint), and stamps the active Connection's pairing metadata on successful auth. Collapsible "Manual pairing code (fallback)" card for camera-less / SSH-only setups. Transport security badge (🔒 secure / 🔓 insecure-with-reason / 🔓 insecure-unknown) rendered inline. Paired Devices screen linked from here for the full device list + per-channel grant revoke.
- Chat — Show reasoning toggle, smooth auto-scroll toggle (live-follow streaming, default on), show token usage toggle, app context prompt toggle, tool call display (Off/Compact/Detailed), streaming endpoint selector (`auto` / `sessions` / `runs`), Stats for Nerds (analytics charts)
- Voice — interaction mode (tap / hold / continuous), silence threshold slider, Auto-TTS toggle, provider info read from `/voice/config`, language picker, Test Voice button
- Notification companion — opt-in status, "Open Android Settings" action, test notification dump
- Appearance — theme (auto/light/dark), dynamic colors toggle
- Data — Backup, restore, reset with confirmation dialogs
- About — logo on dark background, dynamic version from BuildConfig, Source + Docs link buttons, credits. What's New dialog.
The relay is a new Python service that runs alongside the Hermes gateway. It owns the WSS connection to the phone and routes messages to the appropriate backend.
The canonical relay implementation lives at plugin/relay/ (consolidated into the plugin as of Phase 2). A thin compat shim at the top-level relay_server/ package delegates to it so legacy entrypoints (python -m relay_server) still work.
hermes-android/
├── plugin/relay/ # canonical implementation
│ ├── server.py # main aiohttp WSS server + HTTP routes
│ ├── auth.py # PairingManager, SessionManager, RateLimiter
│ ├── config.py # RelayConfig, PAIRING_ALPHABET
│ ├── channels/
│ │ ├── chat.py # proxies to Hermes WebAPI
│ │ ├── terminal.py # PTY-backed shell handler (Phase 2)
│ │ └── bridge.py # existing bridge protocol (stub)
│ └── __main__.py # `python -m plugin.relay`
└── relay_server/ # thin shim → plugin.relay (legacy entrypoint)
HTTP routes registered by create_app() in plugin/relay/server.py:
| Route | Method | Purpose |
|---|---|---|
| `/ws`, `/` | GET (upgrade) | WebSocket handler — main multiplexed channel |
| `/health` | GET | Health check — returns {status, version, clients, sessions} |
| `/pairing` | POST | Generate a new relay-side pairing code |
| `/pairing/register` | POST | Loopback only. Pre-register an externally-provided pairing code. Used by the pair command (/hermes-relay-pair skill or hermes-pair shim) to inject codes that will appear in QR payloads. Request: {"code": "ABCD12"}. Rejects non-loopback peers with HTTP 403. |
| `/api/profiles/{name}/config` | GET | Profile-scoped read-only config. Returns {profile, path, config, readonly: true} — config is the parsed config.yaml for ~/.hermes/ (when name == "default") or ~/.hermes/profiles/<name>/. Loopback callers skip bearer; remote callers require the relay session bearer. 404 on missing profile / missing config.yaml; 500 on yaml parse error. See §22 in decisions.md. |
| `/api/profiles/{name}/skills` | GET | Profile-scoped skill enumeration. Walks <profile>/skills/<category>/<skill>/SKILL.md recursively; returns {profile, skills: [{name, category, description, path, enabled: true}], total}. Same auth model as /config. name/description come from YAML frontmatter when present, else directory basename. All skills report enabled: true today — see §22 for the toggle stub. |
| `/api/profiles/{name}/soul` | GET | Profile-scoped raw SOUL.md read. Returns {profile, path, content, exists, size_bytes} with optional truncated: true when content exceeds the 200KB inline cap. Absent SOUL.md returns 200 with exists: false and an empty content string so the Inspector can distinguish "no soul" from transport failure. Same auth model as /config. 404 on unknown profile; 500 {error: "soul_read_failed"} on decode error. See §22 in decisions.md. |
| `/api/profiles/{name}/memory` | GET | Profile-scoped memory listing. Returns {profile, memories_dir, entries: [{name, filename, path, content, size_bytes, truncated}], total} for *.md files directly under <profile>/memories/ (non-recursive). Ordering: MEMORY.md first, USER.md second, remainder alphabetical. Each entry capped at 50KB inline with truncated: true when larger. Absent memories dir → 200 with empty entries array. Same auth model as /config. 404 on unknown profile. See §22 in decisions.md. |
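
The ordering and truncation rules for the memory route can be sketched in a few lines. This is only an illustration of the documented behaviour, not the relay's code: `list_memory_entries`, `INLINE_CAP_BYTES`, and `PINNED` are hypothetical names, the cap is assumed to mean 50 * 1024 bytes, and treating `name` as the file stem is a guess.

```python
from pathlib import Path

INLINE_CAP_BYTES = 50 * 1024          # assumed reading of the "50KB inline" cap
PINNED = ["MEMORY.md", "USER.md"]     # listed first, in this order


def list_memory_entries(profile: str, memories_dir: Path) -> dict:
    """Hypothetical helper: build a payload shaped like the documented response."""
    base = {"profile": profile, "memories_dir": str(memories_dir)}
    if not memories_dir.is_dir():
        # An absent directory is not an error: empty entries array.
        return {**base, "entries": [], "total": 0}

    # Non-recursive: only *.md files directly under the directory.
    files = [p for p in memories_dir.glob("*.md") if p.is_file()]

    def sort_key(p: Path):
        rank = PINNED.index(p.name) if p.name in PINNED else len(PINNED)
        return (rank, p.name.lower())

    entries = []
    for p in sorted(files, key=sort_key):
        raw = p.read_bytes()
        entries.append({
            "name": p.stem,  # assumption: display name is the file stem
            "filename": p.name,
            "path": str(p),
            "content": raw[:INLINE_CAP_BYTES].decode("utf-8", errors="replace"),
            "size_bytes": len(raw),
            "truncated": len(raw) > INLINE_CAP_BYTES,
        })
    return {**base, "entries": entries, "total": len(entries)}
```

Sorting on a `(rank, name)` tuple keeps the two pinned files ahead of everything else without special-casing them in the loop.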
Chat connects directly from the Android app to the Hermes API Server, bypassing the relay server entirely. This uses the Hermes Sessions API:
1. POST /api/sessions → create session → get session_id
2. POST /api/sessions/{session_id}/chat/stream → send message, get SSE stream
Authorization: Bearer <API_SERVER_KEY> (optional)
Accept: text/event-stream
Content-Type: application/json
{ "message": "Hello", "system_message": "..." }
Response: SSE stream with typed events:
event: session.created
data: {"session_id":"...","run_id":"...","title":"..."}
event: run.started
data: {"session_id":"...","run_id":"...","user_message":{"id":"...","role":"user","content":"Hello"}}
event: message.started
data: {"session_id":"...","run_id":"...","message":{"id":"...","role":"assistant"}}
event: assistant.delta
data: {"session_id":"...","run_id":"...","message_id":"...","delta":"Hello"}
event: tool.progress
data: {"session_id":"...","run_id":"...","message_id":"...","delta":"thinking..."}
event: tool.pending
data: {"session_id":"...","run_id":"...","tool_name":"terminal","call_id":"..."}
event: tool.started
data: {"session_id":"...","run_id":"...","tool_name":"terminal","call_id":"...","preview":"...","args":{...}}
event: tool.completed
data: {"session_id":"...","run_id":"...","tool_call_id":"...","tool_name":"terminal","args":{...},"result_preview":"..."}
event: tool.failed
data: {"session_id":"...","run_id":"...","call_id":"...","tool_name":"terminal","error":"..."}
event: assistant.completed
data: {"session_id":"...","run_id":"...","message_id":"...","content":"...","completed":true,"partial":false,"interrupted":false}
event: run.completed
data: {"session_id":"...","run_id":"...","message_id":"...","completed":true,"partial":false,"interrupted":false,"api_calls":3}
event: error
data: {"message":"error description","error":"..."}
event: done
data: {"session_id":"...","run_id":"...","state":"final"}
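A client consuming this stream only needs a small line-oriented parser. The sketch below assumes standard `text/event-stream` framing (each event terminated by a blank line, which the listing above does not show) and uses hypothetical helper names `iter_sse_events` and `collect_reply`; it is written in Python for illustration, whereas the app itself uses an OkHttp-based client.

```python
import json
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, parsed_data) pairs from text/event-stream lines."""
    event, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data_parts:
                yield event, json.loads("\n".join(data_parts))
            event, data_parts = "message", []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops one optional space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    if data_parts:
        # Tolerate a stream that ends without a trailing blank line.
        yield event, json.loads("\n".join(data_parts))


def collect_reply(lines: Iterable[str]) -> str:
    """Concatenate assistant.delta chunks until the stream reports done."""
    text = []
    for event, data in iter_sse_events(lines):
        if event == "assistant.delta":
            text.append(data["delta"])
        elif event == "error":
            raise RuntimeError(data.get("message", "stream error"))
        elif event == "done":
            break
    return "".join(text)
```

Tool events (`tool.started`, `tool.completed`, and so on) would be handled with further branches in the same loop; they are ignored here to keep the sketch short.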
Additional API endpoints used:
3. GET /api/sessions → list all sessions
4. PATCH /api/sessions/{session_id} → rename session
5. DELETE /api/sessions/{session_id} → delete session
6. GET /api/sessions/{session_id}/messages → fetch message history
7. GET /api/config → personalities (for personality picker, `config.agent.personalities`)
8. GET /api/skills → available skills (for command palette + autocomplete)
Key classes:
- HermesApiClient — OkHttp-based HTTP/SSE client for direct API communication (chat, sessions, skills, config)
- ChatHandler — processes streaming deltas and tool call events into ChatMessage state
- ChatViewModel — orchestrates send/stream/cancel lifecycle, slash command handling
- AppAnalytics — singleton tracking TTFT, completion times, token usage, health latency, stream success rates
The relay server is not involved in chat streaming itself. It remains the home for bridge, terminal, and — as of 2026-04-11 — inbound media delivery (see 6.2a).
Tool-produced files (screenshots today, video/audio/PDF/other in the future) reach the phone via a plugin-owned file-serving surface on the relay, decoupled from the chat SSE stream itself. Only a short opaque token rides the chat stream; the bytes flow out-of-band over authenticated HTTPS.
Why this lives in the plugin, not upstream hermes-agent: APIServerAdapter.send() (in upstream gateway/platforms/api_server.py) is an explicit no-op — the HTTP API adapter does not implement send_document. Upstream's extract_media() / send_document() pipeline only fires for push platforms (Telegram, Feishu, WeChat) and non-streaming paths. On our streaming HTTP surface, MEDIA: tags in tool output have always passed through as literal text. Rather than patch upstream, we added our own endpoints and marker format. See docs/decisions.md §14 for the full trust and resource model.
Wire format:
Screenshot captured (1280x720)
MEDIA:hermes-relay://<url-safe-16-byte-token>
Server: three routes on plugin/relay/server.py:
- `POST /media/register` — loopback-only. Body `{"path", "content_type", "file_name"}`. Validates path is absolute, resolves (`os.path.realpath`) under an allowed root, exists, is a regular file, fits under `RELAY_MEDIA_MAX_SIZE_MB`. Generates `secrets.token_urlsafe(16)` (128 bits entropy), stores the token → entry mapping in an in-memory `OrderedDict` LRU (capped at `RELAY_MEDIA_LRU_CAP`, TTL `RELAY_MEDIA_TTL_SECONDS`). Returns `{ok, token, expires_at}`. Used when a host-local tool explicitly wants to publish a file.
- `GET /media/{token}` — requires `Authorization: Bearer <session_token>` against the existing `SessionManager` (same token WSS uses). Streams the file via `web.FileResponse` with the registered content type plus `Content-Disposition: inline; filename="..."` if the entry has a file name. 401 on missing/invalid bearer, 404 on unknown/expired token.
- `GET /media/by-path?path=<abs>&content_type=<optional>` — requires bearer auth. Shares the same sandbox validation as `/media/register` via a common `validate_media_path()` helper: absolute path, `realpath`-resolves under an allowed root, exists, is a regular file, fits under the size cap. Content-Type is the phone's hint if provided, otherwise guessed via `mimetypes.guess_type()`. This route exists specifically for LLM-emitted bare-path markers — upstream `agent/prompt_builder.py` instructs the model to include `MEDIA:/absolute/path/to/file` in its response text, so the bare-path form is the agent's native output, not just a fallback. 401 auth, 403 sandbox, 404 missing file.
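The sandbox check and the token store described above might look roughly like the following. This is a sketch under stated assumptions, not the code in `plugin/relay/media.py`: `MediaPathError` and `SketchMediaRegistry` are hypothetical names, the status code for an oversize file is a guess, and the `asyncio.Lock` used by the real registry is left out.

```python
import os
import secrets
import time
from collections import OrderedDict


class MediaPathError(Exception):
    """Hypothetical error type; `status` mirrors the HTTP codes in the text."""

    def __init__(self, status, reason):
        super().__init__(reason)
        self.status = status


def validate_media_path(path, allowed_roots, max_size_bytes):
    if not os.path.isabs(path):
        raise MediaPathError(403, "path must be absolute")
    real = os.path.realpath(path)  # resolves symlinks and ".." segments
    roots = [os.path.realpath(r) for r in allowed_roots]
    if not any(os.path.commonpath([real, r]) == r for r in roots):
        raise MediaPathError(403, "path escapes the allowed roots")
    if not os.path.isfile(real):
        raise MediaPathError(404, "missing or not a regular file")
    if os.path.getsize(real) > max_size_bytes:
        raise MediaPathError(403, "file exceeds the size cap")  # status is an assumption
    return real


class SketchMediaRegistry:
    """In-memory token -> entry map with an LRU cap and a TTL (no locking)."""

    def __init__(self, cap, ttl_seconds, clock=time.time):
        self._cap, self._ttl, self._clock = cap, ttl_seconds, clock
        self._entries = OrderedDict()

    def register(self, path, content_type, file_name=None):
        token = secrets.token_urlsafe(16)  # 16 random bytes = 128 bits
        self._entries[token] = {
            "path": path,
            "content_type": content_type,
            "file_name": file_name,
            "expires_at": self._clock() + self._ttl,
        }
        while len(self._entries) > self._cap:
            self._entries.popitem(last=False)  # evict least recently used
        return token

    def get(self, token):
        entry = self._entries.get(token)
        if entry is None:
            return None
        if entry["expires_at"] <= self._clock():
            del self._entries[token]  # expired: treat as unknown
            return None
        self._entries.move_to_end(token)  # mark as recently used
        return entry
```

Comparing `os.path.commonpath` against the resolved root, rather than using a string prefix test, avoids accepting a sibling directory such as `/tmp/media-evil` when the allowed root is `/tmp/media`.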
Phone: parse → fetch → cache → render:
- `ChatHandler.scanForMediaMarkers()` runs on every `onTextDelta`, unconditionally (not gated on `parseToolAnnotations`). Matches `MEDIA:hermes-relay://([A-Za-z0-9_-]+)` and fires `onMediaAttachmentRequested(messageId, token)`. A second regex matches the bare-path form `MEDIA:(/\S+)` and fires `onMediaBarePathRequested(messageId, path)` — the ViewModel then calls `RelayHttpClient.fetchMediaByPath()` to pull bytes via `GET /media/by-path`. A per-session `dispatchedMediaMarkers` set dedupes between real-time streaming scans and the post-stream `finalizeMediaMarkers` reconciliation pass. `loadMessageHistory` (invoked by the `session_end reload` pattern at every stream complete) re-runs the same parser on server-stored content so client-injected attachments survive the wholesale state replace. Both marker forms are stripped from the rendered message text.
- `ChatViewModel` inserts a LOADING `Attachment` with `relayToken` set immediately (message updates via `ChatHandler.mutateMessage`).
- On Wi-Fi, or on cellular when `autoFetchOnCellular` is true: `RelayHttpClient.fetchMedia(token)` issues `GET /media/{token}` with the bearer header. URL is derived by swapping `ws://` → `http://`, `wss://` → `https://` on the stored relay URL.
- Bytes are checked against `maxInboundSizeMb`. If oversize → FAILED placeholder. Otherwise `MediaCacheWriter` writes them to `context.cacheDir/hermes-media/<sha1>.<ext>` with LRU eviction by mtime (capped at `cachedMediaCapMb`) and returns a `content://` URI via `FileProvider.getUriForFile(context, "${applicationId}.fileprovider", file)`.
- The Attachment is flipped to LOADED with `cachedUri` set. `InboundAttachmentCard` dispatches by `(state × renderMode)`: `IMAGE` renders inline via `BitmapFactory.decodeByteArray` + `asImageBitmap`; `VIDEO`/`AUDIO`/`PDF`/`TEXT`/`GENERIC` render as tap-to-open file cards firing `ACTION_VIEW` with `FLAG_GRANT_READ_URI_PERMISSION` on the cached URI.
- On cellular with `autoFetchOnCellular` off: the attachment stays in LOADING state with `errorMessage = "Tap to download"`, and `manualFetchAttachment()` re-runs the fetch ignoring the cellular gate.
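The marker scan and the URL rewrite are language-neutral, so they can be illustrated in Python even though the phone implements them in Kotlin. The two regexes below are the ones quoted above; `scan_for_media_markers` and `relay_http_base` are hypothetical names, and the sketch ignores the case where a marker is split across two streaming deltas.

```python
import re

TOKEN_MARKER = re.compile(r"MEDIA:hermes-relay://([A-Za-z0-9_-]+)")
BARE_PATH_MARKER = re.compile(r"MEDIA:(/\S+)")


def scan_for_media_markers(text, dispatched):
    """Return (clean_text, new_tokens, new_paths).

    `dispatched` is a set shared across scans so that a marker seen during
    streaming is not dispatched again by a later pass over the same text.
    """
    tokens, paths = [], []
    for match in TOKEN_MARKER.finditer(text):
        key = ("token", match.group(1))
        if key not in dispatched:
            dispatched.add(key)
            tokens.append(match.group(1))
    for match in BARE_PATH_MARKER.finditer(text):
        key = ("path", match.group(1))
        if key not in dispatched:
            dispatched.add(key)
            paths.append(match.group(1))
    # Both marker forms are stripped from the text that gets rendered.
    clean = BARE_PATH_MARKER.sub("", TOKEN_MARKER.sub("", text))
    return clean, tokens, paths


def relay_http_base(relay_url):
    """Derive the HTTP(S) URL from the stored WebSocket relay URL."""
    if relay_url.startswith("wss://"):
        return "https://" + relay_url[len("wss://"):]
    if relay_url.startswith("ws://"):
        return "http://" + relay_url[len("ws://"):]
    return relay_url
```

Keying the dedupe set on `(kind, value)` lets the same set cover both marker forms without a token ever colliding with a path.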
Fallback when relay isn't running: the tool's register_media() call fails (connection refused / timeout / non-200) → tool logs a warning and returns the legacy bare-path form (MEDIA:/tmp/...). The phone's onUnavailableMediaMarker handler inserts a FAILED Attachment with errorMessage = "Image unavailable — relay offline". Matches current behavior; placeholder is tidier than raw marker text.
Known gap — session replay across relay restarts: the MediaRegistry is in-memory. Restarting the relay invalidates all tokens. A user scrolling back into a session from yesterday sees FAILED placeholders for any now-stale token. Phone-side persistent cache (indexed by token or content hash) is the planned fix; filed as a DEVLOG follow-up.
Known gap — auto-fetch threshold slider isn't enforced today. The Settings → Inbound media → auto-fetch threshold knob is persisted but the fetch path currently only checks the cellular toggle + the hard max cap. Forward-compatibility placeholder; real enforcement needs a HEAD preflight or post-hoc byte rejection.
Key classes:
- `MediaRegistry` (`plugin/relay/media.py`) — in-memory token store, coroutine-safe via `asyncio.Lock`
- `register_media()` (`plugin/relay/client.py`) — stdlib `urllib.request` helper for in-process tool callers
- `RelayHttpClient` (Android) — OkHttp GET with Bearer auth + URL rewriting
- `MediaCacheWriter` (Android) — FileProvider-backed LRU cache in `cacheDir/hermes-media/`
- `InboundAttachmentCard` (Android) — single Compose component dispatched on `(state × renderMode)`, handles both inbound and outbound attachments
# App sends: { channel: "terminal", type: "terminal.attach", payload: { cols: 80, rows: 24 } }
# Relay:
# 1. Find or create tmux session
# 2. Open PTY attached to tmux
# 3. Stream PTY output → WebSocket
# 4. WebSocket input → PTY stdin

Uses asyncio.create_subprocess_exec with PTY for non-blocking I/O. tmux gives us named sessions, detach/reattach, and persistence across disconnects.
Wraps the existing relay protocol. When the agent calls android_* tools, the tool handler routes through the relay server's bridge channel to the phone.
Change from upstream: The bridge channel is part of the multiplexed WSS connection instead of a separate ws:// relay on port 8766. The legacy standalone plugin/tools/android_relay.py was retired in Phase 3 Wave 1 (2026-04-12) and its functionality migrated to two files in the unified relay: plugin/tools/android_tool.py (Hermes tools pointing at http://localhost:8767 — baseline 14 plus v0.4 expansion) and plugin/relay/channels/bridge.py (the BridgeHandler.handle_command(...) dispatcher that mints request IDs, sends bridge.command envelopes over the shared WSS pipe, and awaits matching bridge.response envelopes with a 30s timeout). HTTP routes are registered on plugin/relay/server.py between # === PHASE3-bridge-server === markers and delegate through the same handler. Wire protocol is frozen — envelopes match the legacy relay byte-for-byte.
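The request/response correlation described above (mint an ID, send `bridge.command`, await the matching `bridge.response` with a timeout) can be sketched with `asyncio` futures. This is not the real `BridgeHandler`: the class name `BridgeCorrelator` and the payload field names `request_id`, `path`, and `params` are assumptions made for illustration, and the frozen wire protocol may use different keys.

```python
import asyncio
import uuid


class BridgeCorrelator:
    """Hypothetical stand-in for the dispatcher described above."""

    def __init__(self, send, timeout=30.0):
        self._send = send        # async callable that takes one envelope dict
        self._timeout = timeout  # seconds to wait for the phone's reply
        self._pending = {}       # request_id -> Future

    async def handle_command(self, path, params):
        request_id = uuid.uuid4().hex
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        try:
            await self._send({
                "channel": "bridge",
                "type": "bridge.command",
                # Field names below are assumed, not taken from the protocol.
                "payload": {"request_id": request_id, "path": path, "params": params},
            })
            return await asyncio.wait_for(future, self._timeout)
        finally:
            # Always drop the entry so timeouts do not leak futures.
            self._pending.pop(request_id, None)

    def on_response(self, envelope):
        """Call this for every inbound bridge.response envelope."""
        payload = envelope["payload"]
        future = self._pending.get(payload["request_id"])
        if future is not None and not future.done():
            future.set_result(payload)
```

Responses that arrive after the timeout find no pending entry and are dropped, which keeps a slow phone from resolving a request the caller has already abandoned.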
Tools register against the Hermes plugin API in plugin/tools/android_tool.py (plus plugin/tools/android_notifications.py, plugin/tools/android_navigate.py). The Python-side tool issues an HTTP request to the relay on loopback; the relay forwards it to the phone over WSS; the phone executes it via the accessibility service and returns a structured response. Flavor gating: tools marked sideload-only are gated on BuildFlavor.current == SIDELOAD via FeatureFlags.BuildFlavor and/or require manifest permissions only declared in app/src/sideload/AndroidManifest.xml.
Baseline (pre-v0.4 — shipped in Phase 3 Wave 1):
| Tool | HTTP route | Purpose | Flavor |
|---|---|---|---|
| `android_ping` | `GET /ping` | Liveness check — does not require master enable | both |
| `android_screen` | `GET /screen` | Serialize the accessibility tree → `ScreenContent` | both |
| `android_screenshot` | `GET /screenshot` | MediaProjection PNG → `MEDIA:hermes-relay://<token>` | both |
| `android_current_app` | `GET /current_app` | Foregrounded package name | both |
| `android_get_apps` (`/apps` legacy) | `GET /get_apps` | Installed launcher apps | both |
| `android_tap` | `POST /tap` | Tap at `(x, y)` or on resolved `node_id` | both |
| `android_tap_text` | `POST /tap_text` | Find text via accessibility tree, tap it (see A9 cascade below) | both |
| `android_type` | `POST /type` | `ACTION_SET_TEXT` on focused input field | both |
| `android_swipe` | `POST /swipe` | Gesture swipe with direction + distance | both |
| `android_scroll` | `POST /scroll` | Scroll a specific container (resolves `node_id`) | both |
| `android_open_app` | `POST /open_app` | Launch an app by package name | both |
| `android_press_key` | `POST /press_key` | Curated global-action vocab (home/back/recents/notifications/quick_settings) — no raw `KeyEvent` injection | both |
| `android_wait` | `POST /wait` | Clamped idle — max 15s | both |
| `android_setup` | `POST /setup` | Permission bootstrap helper | both |
| `android_navigate` | (dispatches `/screenshot` + `/tap_text` / `/tap` / `/type` / `/swipe` / `/press_key`) | Tier 4 vision-driven close-the-loop navigation | both |
| `android_notifications_recent` | `GET /notifications/recent` | Poll the notif-listener ring buffer (loopback-only for Python tool callers) | both |
v0.4 additions — Tier A (both flavors):
| Tool | HTTP route | Purpose |
|---|---|---|
| `android_long_press(x, y, node_id, duration=500)` | `POST /long_press` | Long-press gesture at coords or on resolved node. Gesture path wrapped in `WakeLockManager.wakeForAction` (see §6.4.2). |
| `android_drag(start_x, start_y, end_x, end_y, duration)` | `POST /drag` | Single-stroke drag via `GestureDescription`. Wrapped in wake-lock. |
| `android_find_nodes(text?, class_name?, clickable?, limit)` | `POST /find_nodes` | Filtered accessibility-node search across all windows (see P1 in §6.4.2). Returns a list of `{node_id, text, bounds, class, clickable}` records. |
| `android_describe_node(node_id)` | `POST /describe_node` | Full property bag for a single node resolved by stable `node_id`. Round-trips the same ID scheme emitted by `android_screen` / `android_find_nodes`. A4 also completes the `node_id` resolution path in the existing `/tap` and `/scroll` routes — the IDs were previously emitted but not accepted as input. |
| `android_screen_hash()` | `GET /screen_hash` | Returns `{hash, node_count}`. SHA-256 over a canonical per-node fingerprint (`className` + `text` + `bounds` + `viewId`) across the full accessibility tree. See `ScreenHasher` in §6.4.2. |
| `android_diff_screen(previous_hash)` | `POST /diff_screen` | Returns `{changed, hash, node_count}` in a single call. Used as a cheap "did anything change?" check to skip full screen re-reads inside agent loops. |
| `android_clipboard_read()` | `GET /clipboard` | Read primary clip via `ClipboardManager.primaryClip`. |
| `android_clipboard_write(text)` | `POST /clipboard` | Set primary clip. |
| `android_media(action)` | `POST /media` | System-wide media control via `AudioManager.dispatchMediaKeyEvent` + `ACTION_MEDIA_BUTTON` broadcast. Actions: play / pause / toggle / next / previous. |
| `android_macro(steps, name, pace_ms)` | (Python-side only) | Pure-Python batched workflow dispatcher. Iterates `steps` (each `{tool, args}`), stops on first failure, returns the full trace. No new HTTP route — dispatches to the existing tool handlers in-process. |
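
The macro behaviour in the last row (iterate steps, stop on the first failure, return the full trace) is simple enough to sketch directly. The function name `run_macro`, the `handlers` mapping, and the shape of each result dict (`{"ok": ...}`) are assumptions made for this example, not the plugin's actual interfaces.

```python
import time


def run_macro(steps, handlers, name="macro", pace_ms=0):
    """Run each {tool, args} step in order; stop at the first failing step."""
    trace = []
    for index, step in enumerate(steps):
        tool = step["tool"]
        args = step.get("args", {})
        handler = handlers.get(tool)
        if handler is None:
            result = {"ok": False, "error": f"unknown tool {tool}"}
        else:
            try:
                result = handler(**args)
            except Exception as exc:  # a raising handler counts as a failure
                result = {"ok": False, "error": str(exc)}
        trace.append({"index": index, "tool": tool, "args": args, "result": result})
        if not result.get("ok"):
            return {"name": name, "ok": False, "failed_step": index, "trace": trace}
        if pace_ms and index < len(steps) - 1:
            time.sleep(pace_ms / 1000)  # optional pacing between steps
    return {"name": name, "ok": True, "failed_step": None, "trace": trace}
```

Returning the trace even on failure lets the agent see which steps already took effect before deciding how to recover.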
v0.4 additions — Tier B (both flavors):
| Tool | HTTP route | Purpose |
|---|---|---|
| `android_events(limit, since)` | `GET /events` | Poll the real-time `AccessibilityEvent` ring buffer. Off by default — a session must enable forwarding via `android_event_stream(enabled=true)` before events are recorded. Privacy-sensitive; keep off unless an agent flow needs it. |
| `android_event_stream(enabled)` | `POST /events/stream` | Opt in / out of event capture for the current session. |
| `android_send_intent(action, data, package, component, extras, category)` | `POST /send_intent` | Raw Intent escape hatch — `startActivity`. Safety-gated on the target package blocklist via `BridgeSafetyManager.checkPackageAllowed`. |
| `android_broadcast(action, data, package, extras)` | `POST /broadcast` | Raw `sendBroadcast`. Same blocklist gate as `/send_intent`. |
v0.4 additions — Tier C (sideload-only):
Tier C tools add runtime permissions that trigger Google Play policy review and are intentionally scoped to the sideload flavor only. The permissions are declared in app/src/sideload/AndroidManifest.xml; the googlePlay manifest does not declare them and the tools no-op via the BuildFlavor.current == SIDELOAD guard.
| Tool | HTTP route | Purpose | Permission |
|---|---|---|---|
| `android_location()` | `GET /location` | Last-known GPS fix via `LocationManager.getLastKnownLocation` | `ACCESS_FINE_LOCATION` |
| `android_search_contacts(query, limit)` | `POST /search_contacts` | `ContactsContract` name → phone number lookup, cap on result count | `READ_CONTACTS` |
| `android_call(number)` | `POST /call` | Auto-dial via `ACTION_CALL` on sideload; googlePlay stub falls back to `ACTION_DIAL` (user must confirm in the dialer). Every call is gated on the destructive-verb confirmation modal; see §6.4.2 safety notes. | `CALL_PHONE` |
| `android_send_sms(to, body)` | `POST /send_sms` | Direct `SmsManager.sendTextMessage` (or `sendMultipartTextMessage` for long bodies) with a `PendingIntent` result callback. Every send is gated on the destructive-verb confirmation modal. | `SEND_SMS` |
Safety integration. All HTTP routes except /ping and /current_app are gated in BridgeCommandHandler on the Bridge master toggle (bridge_master_enabled DataStore flag) and the Tier 5 three-stage safety check:
- Blocklist gate — `BridgeSafetyManager.checkPackageAllowed(currentApp)` returns 403 `{"error": "blocked package <name>"}` when the foreground package is in the blocklist (~30 banking/payments/password-manager/2FA defaults seeded via `DEFAULT_BLOCKLIST`).
- Destructive-verb confirmation — `/tap_text` and `/type` commands whose text matches the user's destructive-verb regex list (send/pay/delete/transfer/confirm/submit/ ...) suspend on a `CompletableDeferred<Boolean>` under a `withTimeout`, waiting for the user to Allow / Deny via the `BridgeStatusOverlay` modal. Tier C `android_call` and `android_send_sms` always go through this gate regardless of body content — a phone call or SMS is definitionally destructive. Denied or timed-out commands return 403 `{"error": "user denied destructive action", "reason": "confirmation_denied_or_timeout"}`.
- Auto-disable reschedule — every successful command resets the idle countdown on `BridgeSafetyManager.rescheduleAutoDisable`, which flips master off after the configured idle window (default 30 min, clamped 5..120).
The newly added Tier A/B tools all flow through the same BridgeCommandHandler dispatch and are covered by the existing gates without additional wiring. Tier A tools that only read (e.g. android_screen_hash, android_clipboard_read, android_describe_node) skip the destructive-verb check but still hit the blocklist and master-enable gates. android_send_intent and android_broadcast hit the blocklist gate keyed on the target package (not just the foreground app) so an agent can't bypass the blocklist by firing an Intent at a blocked target from an allowed foreground.
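The gate ordering above can be summarised as a single decision function. The sketch below is illustrative Python, not the Kotlin in `BridgeCommandHandler`: `check_command` and its parameters are hypothetical, the verb regex contains only the six verbs named in the text, and the response for a disabled master toggle is an assumption because the document does not specify it.

```python
import re

DESTRUCTIVE_VERBS = re.compile(
    r"\b(send|pay|delete|transfer|confirm|submit)\b", re.IGNORECASE)
UNGATED = {"/ping", "/current_app"}       # exempt from every gate
ALWAYS_CONFIRM = {"/call", "/send_sms"}   # Tier C: confirmed regardless of content
TEXT_ROUTES = {"/tap_text", "/type"}


def check_command(route, *, text=None, foreground_pkg=None, target_pkg=None,
                  master_enabled, blocklist, confirm):
    """Return (status, error_body); `confirm` is a callable returning True/False."""
    if route in UNGATED:
        return 200, None
    if not master_enabled:
        return 403, {"error": "bridge disabled"}  # assumed response shape
    # Blocklist applies to the foreground app and, for intents, the target too.
    for pkg in (foreground_pkg, target_pkg):
        if pkg and pkg in blocklist:
            return 403, {"error": f"blocked package {pkg}"}
    needs_confirm = route in ALWAYS_CONFIRM or (
        route in TEXT_ROUTES and bool(text) and DESTRUCTIVE_VERBS.search(text) is not None)
    if needs_confirm and not confirm():
        return 403, {"error": "user denied destructive action",
                     "reason": "confirmation_denied_or_timeout"}
    return 200, None
```

Passing the confirmation step in as a callable keeps the timeout and UI concerns out of the decision logic, so the ordering of the gates is easy to test in isolation.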
The v0.4 wave includes three reliability patterns applied to existing code and one new primitive. They're listed here because they cut across every tool added above and anchor the tool surface to a more predictable baseline.
WakeLockManager — wake-scope wrapping for gesture dispatch. New object WakeLockManager at app/src/main/kotlin/com/hermesandroid/relay/power/WakeLockManager.kt exposes suspend fun <T> wakeForAction(block: suspend () -> T): T. Uses PowerManager.PARTIAL_WAKE_LOCK, ref-counted so nested calls don't release each other prematurely, with a hard 10-second timeout as a battery safety rail. ActionExecutor wraps every gesture-dispatching function (tap, tapText, typeText, swipe, scroll, longPress, drag) in wakeForAction { ... }. Read-only accessibility calls (readScreen, findNodes, describeNode, screenHash, diffScreen, currentApp, clipboardRead/Write, mediaControl) are not wrapped — they don't need the screen on. Closes the "gesture fires into the void when the screen is off" failure mode that silently broke android_tap / android_swipe whenever Bailey's phone hit idle between commands. Requires android.permission.WAKE_LOCK in the main manifest.
Multi-window ScreenReader (P1). ScreenReader.readCurrentScreen now iterates service.windows.mapNotNull { it.root } instead of the single rootInActiveWindow. Returns a merged tree where each AccessibilityNodeInfo is walked per-window and recycled in the per-iteration try/finally. Catches system overlays, popup menus, notification shade, and split-screen secondary windows — the previous single-root path silently ignored them. Node-ID scheme update: stable IDs are now prefixed w<windowIndex>:<sequentialIndex> (e.g. w0:42, w1:7) so IDs are disambiguated across windows. A single-window fallback kicks in when service.windows is empty, which happens on the googlePlay flavor without flagRetrieveInteractiveWindows (the conservative a11y config that survives Play Store policy review). Node IDs are end-to-end resolvable after A4 wired parsing into /tap and /scroll — android_find_nodes and android_describe_node emit them, and android_tap / android_scroll accept them as input, so an agent can search → describe → act without re-reading the tree.
A9 three-tier tapText cascade. ActionExecutor.tapText replaces the single-shot findNodeBoundsByText → performAction(ACTION_CLICK) path with a 3-tier fallback:
- Find node by text across all windows. If `node.isClickable` → `performAction(ACTION_CLICK)`.
- Otherwise walk up the parent chain (capped at 8 levels) looking for a clickable ancestor. If found → `performAction(ACTION_CLICK)` on it.
- Otherwise capture the node's `getBoundsInScreen()` center and fall back to a coordinate `tap(cx, cy)`.
The ActionResult.data field indicates which tier succeeded ("direct" / "parent" / "coords") so the activity log and agent trace show how the click was resolved. Fixes a whole class of failures in real-world apps (Uber, Spotify, Instagram, Tinder) that wrap clickable content in non-clickable text or image views. Parent-chain traversal is bounded to avoid leaks — every AccessibilityNodeInfo returned by .parent is explicitly recycled before the loop reassigns.
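The three tiers can be modelled on a toy node tree. The sketch below is Python over a hypothetical `Node` dataclass rather than `AccessibilityNodeInfo`, so it shows only the decision logic (direct, then bounded parent walk, then coordinates), not node recycling or gesture dispatch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MAX_PARENT_HOPS = 8  # the parent walk is capped at 8 levels


@dataclass
class Node:
    """Toy stand-in for an accessibility node (hypothetical model)."""
    text: str = ""
    clickable: bool = False
    bounds: Tuple[int, int, int, int] = (0, 0, 0, 0)  # left, top, right, bottom
    parent: Optional["Node"] = None


def resolve_tap(node: Node):
    """Return (tier, target): target is a Node to click or an (x, y) pair."""
    # Tier 1: the matched node is itself clickable.
    if node.clickable:
        return "direct", node
    # Tier 2: look for a clickable ancestor within the hop limit.
    ancestor = node.parent
    for _ in range(MAX_PARENT_HOPS):
        if ancestor is None:
            break
        if ancestor.clickable:
            return "parent", ancestor
        ancestor = ancestor.parent
    # Tier 3: fall back to tapping the center of the node's bounds.
    left, top, right, bottom = node.bounds
    return "coords", ((left + right) // 2, (top + bottom) // 2)
```

The returned tier string matches the `"direct"` / `"parent"` / `"coords"` values the text says are reported back in the result.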
ScreenHasher — content fingerprint for change detection. New primitive backing A5 android_screen_hash / android_diff_screen. Walks the full (multi-window) accessibility tree and computes SHA-256 over a canonical joined fingerprint of per-node tuples (className + text + bounds + viewId). Returns {hash, node_count}. The hash is deliberately not stable across animation frames or live-updating text — documented limitation. Rationale: android_navigate previously re-read the full tree on every loop iteration to decide whether the last action did anything; a hash comparison is ~100× cheaper in both compute and token cost, and an agent polling for "has the page loaded yet?" can do so without dragging a full ScreenContent JSON back across the WSS each time. Phone-side: new ScreenHasher.kt alongside ScreenReader.kt. Exposed via a computeHash() extension on the serialized node model so the server can also hash a prior ScreenContent snapshot for free.
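A minimal version of the fingerprint can be written against a serialized node list. The field separator, the newline between nodes, and the dict-based node shape below are assumptions chosen for the example; the canonical format used by `ScreenHasher.kt` is not given in this document.

```python
import hashlib


def screen_hash(nodes):
    """Hash (className, text, bounds, viewId) for every node, in tree order."""
    digest = hashlib.sha256()
    count = 0
    for node in nodes:
        fingerprint = "|".join([
            node.get("className") or "",
            node.get("text") or "",
            ",".join(str(v) for v in node.get("bounds", ())),
            node.get("viewId") or "",
        ])
        digest.update(fingerprint.encode("utf-8"))
        digest.update(b"\n")  # assumed per-node separator
        count += 1
    return {"hash": digest.hexdigest(), "node_count": count}


def diff_screen(previous_hash, nodes):
    """Single-call change check shaped like the /diff_screen response."""
    current = screen_hash(nodes)
    return {"changed": current["hash"] != previous_hash, **current}
```

An agent loop would keep the last hash and call `diff_screen` after each action, fetching the full tree only when `changed` is true.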
Priority: P0 — do first
- Create private GitHub repo `Codename-11/hermes-relay` (or rename fork)
- Set up Kotlin + Jetpack Compose project (replace upstream XML layout)
- Gradle config: Kotlin 2.0+, Compose BOM, Material 3, OkHttp, kotlinx.serialization
- Basic Compose scaffold: bottom nav, 4 tabs, placeholder screens
- GitHub Actions: build APK on push
- WSS connection manager (OkHttp WebSocket with `wss://`)
- Channel multiplexer (envelope format, routing)
- Basic auth flow (pairing code → token)
Priority: P0
- Server: Relay with chat channel router
- Server: Proxy to Hermes WebAPI `/api/sessions/{id}/chat/stream`
- Server: SSE → WebSocket bridge
- App: Chat UI (message list, input bar, streaming text)
- App: Tool progress cards (collapsible)
- App: Profile selector (list available agent profiles)
- App: Session management (create, list, switch)
- App: Auto-reconnect with exponential backoff
Status: preview shipped in v0.2.0 (2026-04-12). Biometric gate is the one open item.
- Server: PTY/tmux integration (`plugin/relay/channels/terminal.py`)
- Server: Terminal channel handler (attach, input, output, resize)
- App: WebView + xterm.js terminal emulator (`TerminalWebView.kt`)
- App: Soft keyboard toolbar — Ctrl / Tab / Esc / arrows (`ExtraKeysToolbar.kt`)
- App: tmux session picker with tabs (`TerminalTabBar.kt`, `TerminalSessionInfoSheet.kt`), scrollback search (`TerminalSearchBar.kt`)
- App: Biometric gate before terminal access (planned — see Phase 4)
- App: Terminal resize on orientation change
Status: shipped in v0.3.0 (2026-04-13). v0.4 bridge feature expansion is in progress on feature/bridge-feature-expansion — adds long-press / drag / macro / clipboard / intent-send / location / contacts / call / SMS and multi-window screen reading.
- Migrate upstream bridge protocol into multiplexed WSS — Phase 3 Wave 1, 2026-04-12 (routes registered in `plugin/relay/server.py` delegating to `plugin/relay/channels/bridge.py`)
- Update `plugin/tools/android_tool.py` to route through the unified relay on port 8767 (was the standalone `android_relay.py` on 8766)
- App: Bridge status UI — see §5 Bridge Tab
- App: Permission management (`BridgePermissionChecklist` — accessibility, screen capture, overlay, notification listener)
- App: Activity log (`BridgeActivityLog` + `BridgePreferences`, capped at 100 entries)
- App: Accessibility service (`HermesAccessibilityService` + `ScreenReader` + `ActionExecutor` + `BridgeCommandHandler`)
- App: Tier 5 safety rails — `BridgeSafetyManager` (blocklist + destructive-verb confirmation + auto-disable timer), `BridgeForegroundService` (persistent "Hermes has device control" notification), `BridgeStatusOverlay` (confirmation modal + optional floating chip)
- App: Flavor split — googlePlay (conservative a11y config) and sideload (full capabilities)
- Plugin: notification-listener companion channel (`android_notifications_recent`) + `android_navigate` vision loop
- v0.4 bridge feature expansion — 10 Tier A tools (long_press, drag, find_nodes, describe_node, screen_hash + diff_screen, clipboard r/w, media, macro) + 4 Tier B tools (events + event_stream, send_intent + broadcast) + 4 Tier C sideload-only tools (location, search_contacts, call, send_sms); architectural patterns — `WakeLockManager` wake-scope wrapping, multi-window `ScreenReader`, A9 three-tier `tapText` cascade, `ScreenHasher` content fingerprinting. See §6.4.1 for the tool surface table and §6.4.2 for the patterns.
Status: ADR 15 landed in v0.2.0 (2026-04-11/12). Biometric gate is the one remaining item.
- TLS support + TOFU certificate pinning (`CertPinStore` — SHA-256 SPKI fingerprints per `host:port`, wiped explicitly on re-pair via `applyServerIssuedCodeAndReset`; plain `ws://` short-circuits pinning)
- Android Keystore session token storage (`SessionTokenStore` — `KeystoreTokenStore` with StrongBox-preferred via `setRequestStrongBoxBacked`, `LegacyEncryptedPrefsTokenStore` TEE-backed fallback, one-shot lossless migration on first launch)
- User-chosen session TTL at pair time (`SessionTtlPickerDialog` — 1d / 7d / 30d / 90d / 1y / Never)
- Per-channel grants on one session token (`Session.grants` — chat / terminal / bridge, clamped to session lifetime)
- Paired Devices screen (`PairedDevicesScreen` + `GET /sessions` + `DELETE /sessions/{prefix}` + `PATCH /sessions/{prefix}` for extend)
- Transport security badge (`TransportSecurityBadge` — three states: secure / insecure-with-reason / insecure-unknown)
- First-time insecure-mode ack dialog with reason picker (`InsecureConnectionAckDialog`)
- Tailscale detection (`TailscaleDetector` — informational only)
- HMAC-SHA256 QR signing (`plugin/relay/qr_sign.py` with host-local secret at `~/.hermes/hermes-relay-qr-secret`; phone parses + stores `sig` but does not verify yet — secret distribution is a follow-up)
- Rate limiting on auth endpoint (`RateLimiter` — 5 attempts / 60s → 5-min block; `/pairing/register` clears all blocks on success so legitimate re-pair after relay restart works immediately)
- Session expiry + rotation (`expires_at` in `auth.ok`, server-side `SessionManager` enforcement)
- Biometric gate for terminal access (AndroidX Biometric — not wired yet)
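The rate-limiting rule in the list above (5 attempts within 60 s leads to a 5-minute block, with a way to clear all blocks) can be sketched as a small sliding-window counter. `SketchRateLimiter` is a hypothetical class, not the `RateLimiter` in `plugin/relay/auth.py`, and blocking on the fifth failure (rather than the sixth) is an assumption about the exact threshold.

```python
import time


class SketchRateLimiter:
    """Sliding-window failure counter with a temporary block per key."""

    def __init__(self, max_attempts=5, window_s=60.0, block_s=300.0,
                 clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_s = window_s
        self.block_s = block_s
        self._clock = clock
        self._failures = {}       # key -> timestamps of recent failures
        self._blocked_until = {}  # key -> time at which the block lifts

    def is_blocked(self, key):
        return self._blocked_until.get(key, 0.0) > self._clock()

    def record_failure(self, key):
        now = self._clock()
        # Keep only failures that still fall inside the window.
        recent = [t for t in self._failures.get(key, []) if now - t < self.window_s]
        recent.append(now)
        if len(recent) >= self.max_attempts:
            self._blocked_until[key] = now + self.block_s
            recent = []  # start counting afresh once the block lifts
        self._failures[key] = recent

    def clear_all_blocks(self):
        """Mirrors the documented reset after a successful /pairing/register."""
        self._failures.clear()
        self._blocked_until.clear()
```

A caller would check `is_blocked(peer_ip)` before evaluating a pairing attempt and call `record_failure(peer_ip)` only when the attempt is rejected.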
Status: largely shipped. v0.1.0 shipped to the Play Store under Axiom-Labs, LLC. Notification-channel-for-agent-messages is the one open item.
- GitHub Actions: lint + build + test on every push (`.github/workflows/ci.yml`)
- GitHub Actions: release workflow — tag-triggered signed APK + AAB upload to GitHub Release (`.github/workflows/release.yml`)
- Material You dynamic theming (Material 3 + dynamic color, user toggle in Appearance settings)
- Proper error states and empty states (`RelayErrorClassifier` → `HumanError` → global `LocalSnackbarHost`; MorphingSphere-backed empty chat state)
- App icon and branding (`ic_launcher*`, animated splash via `splash_icon_animated.xml`, MorphingSphere)
- Two build flavors: `googlePlay` (Play Store track, conservative Accessibility use case) and `sideload` (`.sideload` applicationId suffix, full feature set)
- Notification channel for agent messages (not wired; Phase 6 territory)
Status: shipped 2026-04-12
Real-time voice conversation via relay-hosted TTS/STT endpoints that wrap the hermes-agent venv's configured providers. Chat still goes directly to the API server — voice adds a modality on top, not a separate channel.
Server-side (plugin/relay):
- `POST /voice/transcribe`: multipart audio → `{text, provider}`. Wraps `tools.transcription_tools.transcribe_audio` in `asyncio.to_thread`.
- `POST /voice/synthesize`: JSON `{text}` → `audio/mpeg` file. Wraps `tools.tts_tool.text_to_speech_tool`.
- `GET /voice/config`: provider availability + current settings from `tts:` / `stt:` in `~/.hermes/config.yaml`.
- All three gated on the same bearer auth as `/media/*`.
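
Wrapping a blocking provider call in `asyncio.to_thread`, as the transcribe endpoint does, keeps the relay's event loop responsive while audio is being processed. The sketch below shows only that pattern: `transcribe_audio_stub` and `handle_transcribe` are hypothetical stand-ins, not the real `transcribe_audio` signature or the relay's handler.

```python
import asyncio
import time


def transcribe_audio_stub(audio_bytes: bytes) -> dict:
    """Hypothetical stand-in for the blocking upstream transcription call."""
    time.sleep(0.05)  # simulate blocking provider I/O
    return {"text": f"<{len(audio_bytes)} bytes transcribed>", "provider": "stub"}


async def handle_transcribe(audio_bytes: bytes) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free
    # to serve other requests while transcription is in flight.
    return await asyncio.to_thread(transcribe_audio_stub, audio_bytes)


async def main() -> list:
    # Two concurrent requests overlap instead of running back to back.
    return await asyncio.gather(
        handle_transcribe(b"\x00" * 1024),
        handle_transcribe(b"\x00" * 2048),
    )
```

Calling `asyncio.run(main())` returns both result dicts in request order; without `to_thread`, the second request would wait for the first to finish.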
App-side:
- `VoiceRecorder` (MediaRecorder / MPEG-4 AAC / m4a / 16 kHz mono) + `VoicePlayer` (Media3 ExoPlayer + Visualizer for amplitude) with `StateFlow<Float>` amplitude for the orb. A single persistent `ExoPlayer` queues `MediaItem`s for gapless concatenation, with no codec re-init between sentences.
- `VoiceViewModel` state machine (Idle / Listening / Transcribing / Thinking / Speaking / Error). Assistant text is sanitized (markdown / tool-annotations / URLs / emoji-set stripped) on each delta before a coalescing chunker (`MIN_COALESCE_LEN=40`, `MAX_BUFFER_LEN=400` secondary-break escape, 800 ms timer flush) emits sentence-scale chunks onto a `Channel<String>` TTS queue. Two supervisor-scoped workers (a synth worker calling `/voice/synthesize` and a play worker appending to the ExoPlayer queue) run in parallel across a bounded `Channel<File>(capacity=2)` so sentence N+1 synthesizes while sentence N plays. Cancellation paths delete any unplayed cache files.
- Server-side, `/voice/synthesize` runs a matching sanitizer (`plugin/relay/tts_sanitizer.py`) before handing text to the upstream `text_to_speech_tool`, as defense-in-depth for any client that doesn't pre-sanitize.
- Barge-in (opt-in, default off). While in `Speaking`, a `BargeInListener` runs a duplex `AudioRecord` (16 kHz mono PCM, `VOICE_COMMUNICATION` source) feeding 32 ms frames through a Silero VAD (`com.github.gkonovalov:android-vad:silero`). `AcousticEchoCanceler` + `NoiseSuppressor` attach to the ExoPlayer audio session so TTS output doesn't retrigger VAD. A single raw speech frame → `VoicePlayer.duck()` (volume 0.3f) with a 500 ms un-duck watchdog. `N` consecutive frames (2–3, sensitivity-tuned) → `interruptSpeaking()` (same cancellation path V4 wired for user taps). A 600 ms watchdog on `VoiceRecorder.amplitude` then decides: if the user keeps talking, the new turn proceeds normally; if silence wins AND `resumeAfterInterruption=true`, `VoiceViewModel` re-enqueues the unplayed chunks from `spokenChunks[lastInterruptedAtChunkIndex+1..]` and flips back to `Speaking`. Settings UI exposes `BargeInPreferences` (enabled / sensitivity ∈ `Off`/`Low`/`Default`/`High` / resume) with an `AcousticEchoCanceler.isAvailable()`-driven compatibility badge.
- Integrates with `ChatViewModel` by observing `messages: StateFlow`; no changes to chat code. Transcribed text goes through normal `chatVm.sendMessage(text)` so voice utterances appear as regular user messages in chat history.
- `VoiceModeOverlay`: full-screen UI with the MorphingSphere at 60% height in `voiceMode=true`, transcribed + response text, mic button supporting Tap / Hold / Continuous interaction modes.
- `MorphingSphere` gains `SphereState.Listening` (soft blue/purple, subtle wobble with user amplitude) and `SphereState.Speaking` (vivid green/teal, dramatic core-warmth pulse with agent amplitude). Additive changes: existing call sites are unchanged via defaulted `voiceAmplitude` / `voiceMode` params.
- Voice Settings screen off the main Settings: interaction mode, silence threshold, TTS/STT provider labels, Test Voice button.
See docs/decisions.md → Voice Mode — Architecture for the four key decisions (relay-hosted endpoints, buffer-not-stream client chunking, m4a-not-webm recorder, ChatViewModel observation pattern).
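
The barge-in frame handling described above (duck on the first speech frame, interrupt after `N` consecutive frames, un-duck after 500 ms of silence) can be pictured as a small state machine. This is a language-neutral sketch in Python, not the Kotlin `BargeInListener`: the class name, the string action values, and counting silence in whole 32 ms frames are assumptions.

```python
FRAME_MS = 32            # VAD frame length from the description
UNDUCK_AFTER_MS = 500    # restore volume after this much silence


class BargeInDetector:
    def __init__(self, frames_to_interrupt=3):
        self.frames_to_interrupt = frames_to_interrupt  # N, sensitivity-tuned
        self._consecutive = 0    # speech frames seen in a row
        self._silence_ms = 0     # silence accumulated while ducked
        self._ducked = False

    def on_frame(self, is_speech: bool) -> str:
        """Return the action for one VAD frame: none, duck, unduck, or interrupt."""
        if is_speech:
            self._consecutive += 1
            self._silence_ms = 0
            if self._consecutive >= self.frames_to_interrupt:
                self._consecutive = 0
                self._ducked = False
                return "interrupt"
            if not self._ducked:
                self._ducked = True
                return "duck"
            return "none"
        # Silence breaks the run of consecutive speech frames.
        self._consecutive = 0
        if self._ducked:
            self._silence_ms += FRAME_MS
            if self._silence_ms >= UNDUCK_AFTER_MS:
                self._ducked = False
                self._silence_ms = 0
                return "unduck"
        return "none"
```

Ducking on a single frame gives immediate feedback at low cost, while requiring several consecutive frames before interrupting keeps a cough or a door slam from cutting the agent off.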
Priority: P3 — not for MVP
- Notification listener: shipped v0.3.0 via `HermesNotificationCompanion` (opt-in `NotificationListenerService`), exposed to the agent via `android_notifications_recent(limit=20)` over a bounded relay-side deque in `plugin/relay/channels/notifications.py`.
- Clipboard bridge: shipped on the v0.4 bridge-expansion branch (`feature/A6-clipboard`): `android_clipboard_read` / `android_clipboard_write`.
- Reverse file transfer (phone → server direct upload; inbound agent → phone already shipped in v0.2.0)
- Multi-device session routing (per-device tool-call routing with an explicit "add another device" flow)
- On-device model fallback (Gemma / Qwen via MediaPipe or llama.cpp, for offline + hybrid routing)
- iOS client (evaluate Shortcuts + accessibility + App Intents feasibility first)
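
The bounded relay-side deque behind the notification listener can be illustrated with `collections.deque(maxlen=...)`, which discards the oldest entry automatically once full. This is a sketch only: the class name, the capacity of 100, and the newest-first ordering are assumptions, not details taken from `plugin/relay/channels/notifications.py`.

```python
from collections import deque


class NotificationBuffer:
    """Keeps only the newest `capacity` notifications; older ones fall off."""

    def __init__(self, capacity=100):  # capacity is an assumed value
        self._items = deque(maxlen=capacity)

    def push(self, notification: dict) -> None:
        self._items.append(notification)

    def recent(self, limit=20) -> list:
        """Newest-first slice, in the spirit of android_notifications_recent(limit=20)."""
        if limit <= 0:
            return []  # guard: a slice of [-0:] would return everything
        return list(self._items)[-limit:][::-1]
```

A fixed-size buffer keeps relay memory flat regardless of how chatty the phone's notifications are, at the cost of silently dropping history beyond the capacity.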
As of v0.3.0, Phases 0–5 plus Phase V (voice) have shipped in some form. The current release cadence focuses on v0.4 bridge feature expansion — see docs/plans/2026-04-13-bridge-feature-expansion.md.
Still non-goals for the current cadence:
- Biometric session lock (fingerprint/face gate on terminal and/or chat resume). Tracked under Phase 4.
- Push notifications for agent messages (requires FCM + a notification channel on the relay side). Tracked under Phase 5.
- iOS client. Not on the roadmap.
- Reverse file transfer (phone → server direct upload). Inbound media (agent → phone) shipped in v0.2.0; outbound is attachments via the chat stream only.
- On-device model fallback (Phase 6).
See Appendix A — Original Phase 0 Scope at the end of this document for the historical "what we needed to build the first night" list, preserved for reference.
Current versions as of v0.3.0. Source of truth is gradle/libs.versions.toml — this table is a human-readable snapshot, not authoritative.
| Dependency | Version | Purpose |
|---|---|---|
| Android Gradle Plugin | 8.13.2 | Build toolchain |
| Kotlin | 2.3.20 | Language + Compose compiler plugin |
| Jetpack Compose BOM | 2026.03.01 | UI framework |
| Material 3 (via BOM) | — | Design system |
| Navigation Compose | 2.9.7 | Type-safe navigation |
| Lifecycle | 2.10.0 | ViewModel + state |
| OkHttp | 5.3.2 | WebSocket + SSE + HTTP |
| kotlinx.serialization | 1.11.0 | JSON handling |
| kotlinx.coroutines | 1.10.2 | Structured concurrency |
| DataStore Preferences | 1.1.1 | Key-value settings |
| Security Crypto | 1.1.0 | EncryptedSharedPreferences legacy token fallback |
| markdown-renderer (mikepenz) | 0.30.0 | Chat message rendering |
| Haze | 1.7.2 | Glassmorphism blur |
| ML Kit Barcode Scanning | 17.3.0 | QR pairing scan |
| CameraX | 1.6.0 | QR camera preview |
| xterm.js | 5.x | Terminal emulator (WebView) |
| aiohttp | 3.9+ | Server relay |
| libtmux | 0.37+ | tmux session management |
| gradle-play-publisher | 4.0.0 | Automated Play Console upload (optional) |
| Surface | How We Connect |
|---|---|
| WebAPI chat | HTTP to localhost:8642/api/sessions/*/chat/stream (SSE) |
| WebAPI sessions | GET/POST/PATCH/DELETE /api/sessions for CRUD |
| Personalities | GET /api/config → config.agent.personalities for picker + command palette |
| Server skills | GET /api/skills — dynamic skill discovery for command palette + autocomplete |
| Plugin system | register_tool() via ctx for android_* tools |
| Gateway | Chat channel goes through WebAPI, not directly to gateway |
| Memory/Skills | Accessible through agent chat (no direct API needed for MVP) |
| Dashboard plugin | Lives at plugin/dashboard/; see §10.1 below |
Hermes-Relay ships a hermes-agent Dashboard Plugin (upstream axiom branch, commit 01214a7f) that surfaces relay-specific state in the gateway's web UI. The plugin subtree at plugin/dashboard/ is discovered automatically: the canonical install.sh symlinks ~/.hermes/plugins/hermes-relay → <repo>/plugin, and the gateway scans ~/.hermes/plugins/<name>/dashboard/manifest.json at startup. Manifest fields (name: "hermes-relay", label: "Relay", icon: "Activity" from the 20-name Lucide whitelist, tab.path: "/relay", tab.position: "after:skills") place the tab after Skills in the dashboard nav.
Four internal tabs render inside the single /relay route via a shadcn Tabs component:
| Tab | Data source | What it shows |
|---|---|---|
| Relay Management | `/api/plugins/hermes-relay/overview` + `/sessions` | Relay version + uptime + health, paired-device list (token prefix, device name, last-seen, expires-at, per-channel grants), per-row Revoke button (placeholder pending proxy route). |
| Bridge Activity | `/api/plugins/hermes-relay/bridge-activity` | Ring buffer of the most recent 100 bridge commands (method, path, redacted params, decision, sent_at, response_status, error). Filter chips: All / Executed / Blocked / Confirmed / Timeout / Error. Polls every 5s; pausable via header Auto-refresh toggle (persisted to localStorage). |
| Push Console | `/api/plugins/hermes-relay/push` | Stub: returns `{configured: false, reason: "FCM not yet wired; …"}`. Renders an FCM-not-configured banner + link to the deferred-items doc. Real data ships when FCM is wired. |
| Media Inspector | `/api/plugins/hermes-relay/media` | Active MediaRegistry tokens (basename-only file name, so absolute paths never leave the server, plus content_type, size, created_at, expires_at, last_accessed). TTL countdown decrements in real time (`setInterval(1000)`, cleaned up on unmount). Polls every 15s. |
Three new loopback-gated relay routes feed the plugin backend (plus a loopback-exempt branch on the existing GET /sessions). All are gated by a tiny _require_loopback() helper that rejects any request.remote other than 127.0.0.1 / ::1 with HTTP 403. Full wire-shape details in docs/relay-server.md.
| Route | Method | Purpose |
|---|---|---|
| `/bridge/activity` | GET | Ring buffer of recent bridge commands; `?limit=N` (max 500, default 100). |
| `/media/inspect` | GET | Active media tokens; `?include_expired=true` to include evicted entries (default false). |
| `/relay/info` | GET | Aggregate status for the management tab: `{version, uptime_seconds, session_count, paired_device_count, pending_commands, media_entry_count, health}`. |
| `/sessions` | GET | Loopback branch now returns the full session list without a bearer (for the dashboard proxy). Non-loopback callers still require the bearer and retain the `is_current` flag. |
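
The loopback gate on these routes amounts to an exact match on the peer address. The sketch below is framework-free and hypothetical: it takes the remote address as a plain string instead of an aiohttp request, and the `guarded` wrapper and error body are invented for illustration; only the 127.0.0.1 / ::1 allow-list and the 403 come from the description above.

```python
from typing import Callable, Optional, Tuple

LOOPBACK_ADDRS = frozenset({"127.0.0.1", "::1"})


def require_loopback(remote: Optional[str]) -> Optional[int]:
    """Return None when the peer is loopback, else the HTTP status to reject with."""
    return None if remote in LOOPBACK_ADDRS else 403


def guarded(remote: Optional[str], handler: Callable[[], dict]) -> Tuple[int, dict]:
    """Run `handler` only for loopback callers; everyone else gets a 403."""
    status = require_loopback(remote)
    if status is not None:
        return status, {"error": "loopback only"}
    return 200, handler()
```

For example, `guarded("127.0.0.1", lambda: {"version": "0.3.0"})` runs the handler, while the same call with a LAN address or a missing peer address is rejected before the handler is invoked.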
Auth model. The dashboard plugin's FastAPI router mounts under /api/plugins/hermes-relay/* inside the gateway process (itself bound to localhost). It forwards to the relay at http://127.0.0.1:{HERMES_RELAY_PORT} (default 8767). Both hops are loopback-only — no bearer is minted and no new credentials are introduced. Media paths are sanitized to basename-only in MediaRegistry.list_all() so even a future decision to expose these routes externally wouldn't leak filesystem layout.
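
Basename-only sanitization is simple to picture. The following is a minimal sketch of what a `list_all()`-style listing might do, assuming a hypothetical `MediaEntry` shape and a free function rather than the real `MediaRegistry` method; the `ttl_seconds` field is also an illustrative addition.

```python
import os
from dataclasses import dataclass


@dataclass
class MediaEntry:       # hypothetical shape for illustration
    path: str           # absolute path, kept server-side only
    content_type: str
    size: int
    expires_at: float


def list_all(entries, now, include_expired=False):
    """Return dashboard-safe dicts: file name only, never the full path."""
    out = []
    for e in entries:
        if not include_expired and e.expires_at <= now:
            continue  # skip expired entries unless explicitly requested
        out.append({
            "file_name": os.path.basename(e.path),
            "content_type": e.content_type,
            "size": e.size,
            "expires_at": e.expires_at,
            "ttl_seconds": max(0, int(e.expires_at - now)),
        })
    return out
```

Stripping the directory at the registry boundary, rather than in each route handler, means any later consumer of the listing inherits the same guarantee.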
Frontend. Source under plugin/dashboard/src/ (JSX + esbuild), committed pre-built IIFE at plugin/dashboard/dist/index.js (~16 KB minified). Uses the dashboard's window.__HERMES_PLUGIN_SDK__ global for React + shadcn primitives + fetchJSON() — no external HTTP library, no bundled React. See ADR 19 in docs/decisions.md for the architectural rationale.
- ARC — CI/CD patterns, project structure conventions
- Hermes Agent — Gateway, WebAPI, plugin system, SSE streaming
Preserved verbatim from the original scoping session. This is a historical snapshot, not a current MVP definition. See §8 for the current scope.
MVP Scope (Tonight)
Focus: Phase 0 + start of Phase 1
Deliverables:
- Compose project with bottom nav scaffold
- WSS connection manager with channel multiplexing
- Basic pairing/auth flow
- Chat tab: send message → get streaming response
- Server: relay with chat channel routing
- GitHub Actions: build APK
Non-goals for tonight:
- Terminal (Phase 2)
- Bridge (Phase 3)
- Biometrics (Phase 4)
- Release workflow (Phase 5)
All six deliverables shipped in v0.1.0. Three of the four "non-goals for tonight" have since shipped in v0.2.0 / v0.3.0; biometrics is the one remaining open item.
- ClawPort — Web dashboard (parallel effort, different interface surface)