Base URL: http://<host>:8765
The JARVIS server serves a FastAPI application on port 8765 by default. It provides REST endpoints for session management, settings, tool configuration, and system control, plus a WebSocket endpoint for real-time chat streaming. The WebUI static build is served automatically from interface/jarvis/web/dist/ when available.
The server uses a simple token-based auth flow. The first call any client should make is to the bootstrap endpoint, which generates a short-lived token used for subsequent requests (in practice the token is stored in memory — no external auth provider is required):
GET /jarvis/bootstrap
Response:
{
"token": "abc123...",
"ws_path": "/jarvis/ws",
"model_name": "gpt-4o",
"expires_in": 3600
}

| Field | Type | Description |
|---|---|---|
| `token` | string | Opaque bearer token, 43 chars |
| `ws_path` | string | WebSocket path to connect to |
| `model_name` | string | Currently configured model |
| `expires_in` | int | Token TTL in seconds (currently 3600) |
Note: The token store is in-memory (a `_tokens` dict). In production, replace it with Redis or a similar store. Rate limiting and token revocation are not currently enforced on the REST side; the token is a convenience for the WebUI bootstrap flow.
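For illustration, the bootstrap call from Python might look like this (a sketch using `requests`, assuming the default host and port):

```python
import requests

BASE_URL = "http://localhost:8765"  # adjust host/port as needed

# Fetch a short-lived token from the bootstrap endpoint.
resp = requests.get(f"{BASE_URL}/jarvis/bootstrap")
resp.raise_for_status()
bootstrap = resp.json()

token = bootstrap["token"]      # opaque bearer token for subsequent requests
ws_path = bootstrap["ws_path"]  # e.g. "/jarvis/ws"
print(f"Model: {bootstrap['model_name']}, expires in {bootstrap['expires_in']}s")
```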
The WebSocket endpoint (/jarvis/ws) authenticates implicitly via the connection lifecycle. On connect the server sends a ready event. The client then sends either:
{"type": "new_chat"}— creates a new session and auto-authenticates{"type": "attach", "chat_id": "..."}— resumes an existing session, auto-authenticates
There is no explicit auth message type. Both new_chat and attach generate a server-side token on first use.
GET /jarvis/health
Response:
{
"status": "ok",
"timestamp": "2026-05-14T12:00:00"
}

Sessions are persisted as JSON files in ~/.jarvis/sessions/.
GET /api/sessions
Returns all local sessions sorted by updated_at descending.
[
{
"key": "websocket:chat_a1b2c3d4",
"channel": "websocket",
"chatId": "chat_a1b2c3d4",
"createdAt": "2026-05-14T12:00:00",
"updatedAt": "2026-05-14T12:30:00",
"preview": "How do I refactor this function?"
}
]

POST /api/sessions

Creates a new session.
Response:
{
"id": "a1b2c3d4",
"created": "2026-05-14T12:00:00"
}

GET /api/sessions/{session_id}
Returns the full session data (messages included):
{
"id": "a1b2c3d4",
"created_at": "2026-05-14T12:00:00",
"updated_at": "2026-05-14T12:30:00",
"preview": "How do I refactor this function?",
"messages": [],
"session_id": "uuid..."
}

DELETE /api/sessions/{session_id}
POST /api/sessions/{session_id}/delete
Both methods are supported (FastAPI route aliases).
{
"deleted": true
}

GET /api/sessions/{session_id}/messages
{
"key": "websocket:a1b2c3d4",
"created_at": "2026-05-14T12:00:00",
"updated_at": "2026-05-14T12:30:00",
"messages": []
}

GET /api/history/{chat_id}
Reads the underlying ConversationHistory from the session's session_id field. Returns structured messages with role/content metadata.
{
"session_id": "uuid...",
"chat_id": "chat_a1b2c3d4",
"messages": [
{
"role": "user",
"content": "Hello",
"timestamp": "..."
}
]
}

GET /api/sessions/{session_id}/checkpoints
Returns all user messages as rewindable checkpoints:
{
"session_id": "a1b2c3d4",
"checkpoints": [
{
"index": 0,
"content": "How do I refactor this function?",
"timestamp": "2026-05-14T12:00:00",
"has_file_changes": false
}
]
}

POST /api/sessions/{session_id}/rewind
Request body:
{
"message_index": 0,
"restore_files": false
}

Truncates the session's message list at the given index and persists the change:
{
"success": true,
"rewound_to": 0,
"message_content": "How do I refactor this function?"
}

GET /api/sessions/remote
Proxies through to a remote JARVIS instance (configured via JARVIS_REMOTE_URL env var).
{
"sessions": [],
"error": "No remote URL configured. Set JARVIS_REMOTE_URL to enable remote sessions."
}

All /api/sessions/* endpoints have /jarvis/api/sessions/* aliases for backward compatibility with older WebUI builds.
| REST endpoint | Legacy alias |
|---|---|
| `GET /api/sessions` | `GET /jarvis/api/sessions` |
| `POST /api/sessions` | `POST /jarvis/api/sessions` |
| `DELETE /api/sessions/{id}` | `DELETE /jarvis/api/sessions/{id}` |
| `POST /api/sessions/{id}/delete` | `POST /jarvis/api/sessions/{id}/delete` |
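Putting the session endpoints together, a client could list sessions and rewind the newest one. This is a sketch; whether the `{session_id}` path parameter takes the `chatId` value or the bare id is an assumption here:

```python
import requests

BASE_URL = "http://localhost:8765"

# Newest session first (sorted by updated_at descending).
sessions = requests.get(f"{BASE_URL}/api/sessions").json()
if sessions:
    sid = sessions[0]["chatId"]  # assumed: chatId maps to {session_id}
    checkpoints = requests.get(f"{BASE_URL}/api/sessions/{sid}/checkpoints").json()
    if checkpoints["checkpoints"]:
        # Rewind to the first user message without restoring file changes.
        result = requests.post(
            f"{BASE_URL}/api/sessions/{sid}/rewind",
            json={"message_index": 0, "restore_files": False},
        ).json()
        print(result)
```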
ws://<host>:8765/jarvis/ws
The server accepts immediately and sends a ready event. All messages are JSON-encoded.
| `type` | Required Fields | Description |
|---|---|---|
| `new_chat` | (none) | Create a new conversation, receive `attached` |
| `attach` | `chat_id` | Resume existing conversation, receive `attached` |
| `message` | `content`, `chat_id` | Send a user message; triggers streaming response |
| `approval_response` | `tool_call_id`, `approved` | Approve/deny a pending tool execution |
| Field | Type | Description |
|---|---|---|
| `tool_call_id` | string | Matches `tool_call_id` from `approval_request` |
| `approved` | boolean | `true` to allow, `false` to deny |
| `always_allow` | boolean | If `true`, persists the approval (not yet implemented) |
| Event | Direction | Description |
|---|---|---|
| `ready` | init | Connection established, ready for auth |
| `attached` | init | Chat session is active and ready for messages |
| `delta` | streaming | Token-by-token text delta |
| `reasoning` | streaming | Model reasoning/thinking tokens |
| `reasoning_end` | streaming | Reasoning phase complete |
| `tool_call` | streaming | Agent invoked a tool |
| `tool_result` | streaming | Tool execution result |
| `user_input` | streaming | Agent asks the user a question |
| `approval_request` | streaming | Agent requests permission for a tool |
| `message` | terminal | Final complete assistant message |
| `stream_end` | terminal | Token stream finished, awaiting final message |
| `turn_end` | terminal | Full turn complete, ready for next message |
| `error` | terminal | An error occurred |
Client Server
│ │
│── new_chat ──────────────────>│
│<── attached ─────────────────│
│── message ──────────────────>│
│<── delta (N times) ──────────│ token-by-token
│<── reasoning ────────────────│ (if model supports it)
│<── reasoning_end ────────────│
│<── tool_call ────────────────│ (zero or more)
│<── tool_result ──────────────│
│<── delta (more tokens) ──────│
│<── approval_request ─────────│ (if tool needs approval)
│── approval_response ────────>│ user responds
│<── stream_end ───────────────│
│<── message ──────────────────│ final text
│<── turn_end ─────────────────│ ready for next input
The server uses an agent_lock (asyncio.Lock) to ensure only one agent turn processes at a time per WebSocket connection. Multiple message events sent before a turn_end will queue and execute sequentially.
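A minimal client sketch in Python using the `websockets` package (an illustration, not part of the server; assumes the default port and ignores reasoning, tool, and approval events):

```python
import asyncio
import json

import websockets  # pip install websockets

async def chat(prompt: str) -> str:
    async with websockets.connect("ws://localhost:8765/jarvis/ws") as ws:
        await ws.recv()                                   # consume the "ready" event
        await ws.send(json.dumps({"type": "new_chat"}))
        attached = json.loads(await ws.recv())            # "attached" event
        chat_id = attached["chat_id"]

        await ws.send(json.dumps(
            {"type": "message", "content": prompt, "chat_id": chat_id}
        ))

        final_text = ""
        async for raw in ws:
            event = json.loads(raw)
            kind = event.get("event")
            if kind == "delta":
                print(event["text"], end="", flush=True)  # live token stream
            elif kind == "message":
                final_text = event["text"]                # final complete text
            elif kind == "turn_end":
                return final_text
            elif kind == "error":
                raise RuntimeError(event.get("detail"))
        return final_text

print(asyncio.run(chat("Hello")))
```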
GET /api/settings
Returns the current agent configuration:
{
"agent": {
"model": "gpt-4o",
"provider": "openai",
"resolved_provider": "openai",
"has_api_key": true,
"thinking_level": "medium"
},
"providers": [
{"name": "auto", "label": "Auto"},
{"name": "openai", "label": "OpenAI"},
{"name": "anthropic", "label": "Anthropic"}
],
"thinking_levels": [
{"name": "low", "label": "Low", "description": "Minimal reasoning"},
{"name": "medium", "label": "Medium", "description": "Balanced reasoning"},
{"name": "high", "label": "High", "description": "Detailed reasoning"}
],
"runtime": {
"config_path": "/home/user/.jarvis/config.json"
},
"requires_restart": false
}

GET /api/settings/update?model=gpt-4o&provider=openai&thinking_level=high
Query parameters:
| Parameter | Values |
|---|---|
| `model` | Any model ID (see below) |
| `provider` | `auto`, `openai`, `anthropic` |
| `thinking_level` | `low`, `medium`, `high` |
Note: This is a GET endpoint for historical reasons; `POST /api/settings` is canonical for write operations. `model` and `thinking_level` changes take effect immediately in environment variables but require a server restart to change the actual LLM provider instance.
Returns the same shape as GET /api/settings.
GET /api/models
{
"models": [
{"id": "gpt-4o", "name": "GPT-4o", "provider": "openai", "family": "openai", "capabilities": {"reasoning": true, "vision": true, "tool_call": true}},
{"id": "gpt-4o-mini", "name": "GPT-4o Mini", "provider": "openai", "family": "openai", "capabilities": {"reasoning": true, "vision": true, "tool_call": true}},
{"id": "gpt-4-turbo", "name": "GPT-4 Turbo", "provider": "openai", "family": "openai", "capabilities": {"reasoning": false, "vision": true, "tool_call": true}},
{"id": "gpt-4", "name": "GPT-4", "provider": "openai", "family": "openai", "capabilities": {"reasoning": false, "vision": false, "tool_call": true}},
{"id": "gpt-3.5-turbo", "name": "GPT-3.5 Turbo", "provider": "openai", "family": "openai", "capabilities": {"reasoning": false, "vision": false, "tool_call": true}},
{"id": "claude-sonnet-4-20250514", "name": "Claude Sonnet 4", "provider": "anthropic", "family": "anthropic", "capabilities": {"reasoning": true, "vision": true, "tool_call": true}},
{"id": "claude-3-5-sonnet-20241022", "name": "Claude 3.5 Sonnet", "provider": "anthropic", "family": "anthropic", "capabilities": {"reasoning": true, "vision": true, "tool_call": true}},
{"id": "claude-3-5-haiku-20241022", "name": "Claude 3.5 Haiku", "provider": "anthropic", "family": "anthropic", "capabilities": {"reasoning": false, "vision": true, "tool_call": true}}
],
"current_model": "gpt-4o"
}

GET /api/providers
{
"providers": [
{
"provider_id": "openai",
"sdk_mode": "openai",
"default_model": "gpt-4o",
"enabled": true,
"models": [],
"has_api_key": true,
"base_url": ""
}
]
}

Uses ProviderManager from the provider system. Falls back to hardcoded defaults if no providers are registered.
POST /api/settings/model
Request body:
{
"model": "claude-sonnet-4-20250514",
"provider": "anthropic"
}

Response:
{
"success": true,
"model": "claude-sonnet-4-20250514",
"provider": "anthropic"
}

Only modifies environment variables (JARVIS_MODEL, JARVIS_SDK). The agent instance itself is not hot-reloaded; a restart is required.
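For example, switching to Claude might look like this (a sketch assuming the default port; remember the agent is not hot-reloaded):

```python
import requests

BASE_URL = "http://localhost:8765"

resp = requests.post(
    f"{BASE_URL}/api/settings/model",
    json={"model": "claude-sonnet-4-20250514", "provider": "anthropic"},
)
print(resp.json())  # {"success": true, "model": ..., "provider": ...}
# Only env vars change; restart the server to rebuild the agent instance.
```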
GET /api/mcp/servers
{
"servers": [
{
"name": "filesystem",
"command": "npx",
"transport": "stdio",
"disabled": false,
"lifecycle": "lazy",
"connected": false,
"tool_count": 0
}
]
}

POST /api/mcp/servers
Request body:
{
"name": "filesystem",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem"],
"env": {},
"transport": "stdio",
"url": "",
"lifecycle": "eager"
}

Response:
{
"success": true,
"server": {
"name": "filesystem",
"command": "npx",
"transport": "stdio",
"disabled": false,
"lifecycle": "eager"
}
}

| Field | Type | Default | Description |
|---|---|---|---|
| `name` | string | `mcp-server` | Server identifier |
| `command` | string | `""` | Executable to launch |
| `args` | array | `[]` | CLI arguments |
| `env` | object | `{}` | Environment variables |
| `transport` | string | `stdio` | `stdio` or `sse` |
| `url` | string | `""` | URL for SSE transport |
| `lifecycle` | string | `eager` | `eager` (connect on add) or `lazy` (connect on use) |
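Registering a lazy stdio server might look like this (a sketch assuming the default port):

```python
import requests

BASE_URL = "http://localhost:8765"

server = {
    "name": "filesystem",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem"],
    "transport": "stdio",
    "lifecycle": "lazy",  # connect on first use rather than on add
}
resp = requests.post(f"{BASE_URL}/api/mcp/servers", json=server)
print(resp.json())
```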
DELETE /api/mcp/servers/{name}
{
"success": true,
"server": "filesystem"
}

GET /api/mcp/tools
{
"tools": []
}

Currently returns an empty array; actual MCP tool discovery is pending.
The heartbeat system is a periodic self-check mechanism that runs the agent against a HEARTBEAT.md prompt file.
GET /api/heartbeat
{
"enabled": true,
"interval": "30m",
"is_running": true,
"last_run": null,
"last_result": "...",
"heartbeat_file": "...",
"has_heartbeat_file": true
}

| Field | Type | Description |
|---|---|---|
| `enabled` | boolean | Whether heartbeat is active |
| `interval` | string | Human-readable interval (e.g. `30m`) |
| `is_running` | boolean | Whether the scheduler is currently running |
| `last_run` | string? | ISO timestamp of last heartbeat execution |
| `last_result` | string? | First 500 chars of HEARTBEAT_RESULTS.md |
| `heartbeat_file` | string | First 1000 chars of HEARTBEAT.md |
| `has_heartbeat_file` | boolean | Whether HEARTBEAT.md exists |
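For example, a watchdog could check this status and restart the scheduler via the start endpoint documented below (a sketch assuming the default port):

```python
import requests

BASE_URL = "http://localhost:8765"

status = requests.get(f"{BASE_URL}/api/heartbeat").json()
if status["enabled"] and not status["is_running"]:
    # Kick the scheduler back on (see POST /api/heartbeat/start below).
    requests.post(f"{BASE_URL}/api/heartbeat/start")
```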
POST /api/heartbeat/start
{
"success": true,
"status": "running"
}

POST /api/heartbeat/stop
{
"success": true,
"status": "stopped"
}

Connectors are external service integrations (GitHub, Weather, HTTP, etc.).
GET /api/connectors
{
"connectors": [
{
"id": "github",
"display_name": "github",
"auth_type": "token",
"connected": false,
"auth_configured": true,
"sync_state": "idle"
}
]
}

| Field | Type | Description |
|---|---|---|
| `id` | string | Connector identifier |
| `display_name` | string | Human-readable label |
| `auth_type` | string | `token`, `api_key`, `headers`, or `none` |
| `connected` | boolean | Whether the connector can reach its service |
| `auth_configured` | boolean | Whether credentials have been set |
| `sync_state` | string | `idle`, `syncing`, or `error` |
POST /api/connectors/{name}/auth
Request body varies by connector:
GitHub (name: github):
{
"token": "ghp_...",
"username": "octocat",
"repos": ["owner/repo"]
}

Weather (name: weather):
{
"api_key": "...",
"city": "San Francisco"
}

HTTP (name: http):
{
"headers": {
"Authorization": "Bearer ..."
}
}

Response:
{
"success": true,
"connector": "github",
"connected": true
}

GET /api/context/usage
Returns current token consumption and context window limits:
{
"usage": {
"input_tokens": 1234,
"output_tokens": 567,
"total_tokens": 1801
},
"limits": {
"context": 128000,
"output": 16000
},
"model": "gpt-4o",
"message_count": 0
}

Usage data is fetched via provider.get_and_clear_usage() (a destructive read that resets the counters). Limits come from context_length_manager.get_token_limits(model).
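Because the read is destructive, poll this endpoint from a single place. A sketch (assuming the default port):

```python
import requests

BASE_URL = "http://localhost:8765"

data = requests.get(f"{BASE_URL}/api/context/usage").json()
used = data["usage"]["total_tokens"]
limit = data["limits"]["context"]
print(f"{used}/{limit} tokens ({used / limit:.1%} of the context window)")
# Each call resets the provider-side counters, so avoid polling concurrently.
```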
GET /api/debug/logs
Returns the last 200 in-memory debug log entries:
{
"logs": [
"DEBUG: WebSocket accepted successfully",
"DEBUG: Created new chat_id: chat_a1b2c3d4"
],
"note": "Debug console showing last 200 log entries."
}

POST /api/debug/command
Request body:
{
"command": "agent_status",
"args": {}
}Built-in commands:
| Command | Description |
|---|---|
| `ping` | Returns "pong" |
| `agent_status` | Agent class name and memory/tool status |
| `clear_logs` | Clears the debug log buffer |
| `health` | Returns JSON health status |
| (any other) | Returns "Unknown command: {cmd}" |
Response:
{
"output": "Agent: JarvisV2, Memory: active, Tools: available",
"success": true
}

POST /api/feedback
Request body:
{
"rating": 5,
"message": "Really helpful response!",
"page": "chat"
}

Response:
{
"success": true
}

Feedback is persisted to ~/.jarvis/feedback.jsonl (one JSON object per line).
POST /api/voice/transcribe
Request body (JSON):
{
"audio": "..."
}

Response:
{
"success": true,
"text": "",
"note": "Voice transcription requires a speech-to-text backend (Whisper/Whisper.cpp) to be installed."
}

Note: This is a placeholder endpoint. Integration with Whisper or Whisper.cpp is required for actual transcription.
The server defines five built-in safety profiles:
| ID | Name | Code | Files | Dangerous | Bypass |
|---|---|---|---|---|---|
| 1 | Lockdown | never | ask | ask | false |
| 2 | Restricted | ask | ask | ask | false |
| 3 | Balanced | ask | always | ask | false |
| 4 | Permissive | always | always | ask | false |
| 5 | Unrestricted | always | always | always | true |
GET /api/safety/profile
{
"profiles": [
{"id": 1, "name": "Lockdown", "desc": "Everything asks. No code execution.", ...},
...
],
"current": {
"id": 3,
"name": "Balanced",
"desc": "Default. File ops allowed, code asks.",
"bypass": false,
"code": "ask",
"files": "always",
"dangerous": "ask"
}
}

POST /api/safety/profile
Request body:
{
"profile_id": 4
}

Response:
{
"success": true,
"profile": {
"id": 4,
"name": "Permissive",
...
}
}

Modifies environment variables JARVIS_BYPASS_PERMISSIONS and JARVIS_CODE_PERMISSION.
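Switching profiles from a client is a single POST (a sketch assuming the default port):

```python
import requests

BASE_URL = "http://localhost:8765"

# Switch to the Permissive profile (id 4): code and file ops run
# without prompting, dangerous operations still ask.
resp = requests.post(f"{BASE_URL}/api/safety/profile", json={"profile_id": 4})
print(resp.json()["profile"]["name"])  # "Permissive"
```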
When bypass_tool_permissions is false (default), every tool invocation requires user approval. The flow is:
- Agent invokes a tool
- Server sends `{"event": "approval_request", "tool_call_id": "...", "tool_name": "...", "tool_args": {...}}`
- Client sends `{"type": "approval_response", "tool_call_id": "...", "approved": true}`
- Agent resumes execution
When JARVIS_BYPASS_PERMISSIONS=1, all tools auto-approve silently.
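Within a WebSocket event loop (such as the client sketch earlier), the handshake might be handled like this; the string-matching policy below is a made-up placeholder, not a recommendation:

```python
# Inside the client's async event loop (see the WebSocket example above).
if event.get("event") == "approval_request":
    args_text = str(event.get("tool_args", {}))
    approved = "rm " not in args_text  # placeholder policy: deny anything destructive-looking
    await ws.send(json.dumps({
        "type": "approval_response",
        "tool_call_id": event["tool_call_id"],
        "approved": approved,
    }))
```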
When the agent needs to ask the user a question (via AskUserQuestionTool), it sends a user_input event:
{
"event": "user_input",
"chat_id": "chat_...",
"question": "Which file should I edit?",
"options": ["src/main.py", "src/utils.py"]
}

Currently the response is handled server-side with an empty answer; interactive Q&A requires additional client support.
Sent immediately after WebSocket accept.
{
"event": "ready",
"chat_id": "temp_abcd1234",
"client_id": "ef567890"
}

Sent after new_chat or attach succeeds. Signals the session is ready for messages.
{
"event": "attached",
"chat_id": "chat_a1b2c3d4",
"session_id": "uuid..."
}

Token-by-token text output from the model.
{
"event": "delta",
"chat_id": "chat_a1b2c3d4",
"text": "To refactor"
}

Model's chain-of-thought / reasoning tokens (when supported by the model).
{
"event": "reasoning",
"chat_id": "chat_a1b2c3d4",
"text": "The user wants to extract a function..."
}

Signals the reasoning phase is complete.
{
"event": "reasoning_end",
"chat_id": "chat_a1b2c3d4"
}

Emitted when the agent invokes a tool.
{
"event": "tool_call",
"chat_id": "chat_a1b2c3d4",
"tool_name": "FileReadTool",
"tool_args": {"path": "src/main.py"}
}

Emitted after a tool finishes execution.
{
"event": "tool_result",
"chat_id": "chat_a1b2c3d4",
"tool_name": "FileReadTool",
"result": "file contents...",
"success": true
}

The result field is always stringified via str() to avoid serialization errors with complex objects like ToolOutput.
Emitted when the agent needs to ask the user a question.
{
"event": "user_input",
"chat_id": "chat_a1b2c3d4",
"question": "Which approach should I use?",
"options": ["Option A", "Option B"]
}

Emitted when a tool requires user permission before executing.
{
"event": "approval_request",
"chat_id": "chat_a1b2c3d4",
"tool_name": "BashTool",
"tool_args": {"command": "rm -rf /tmp/test"},
"required_permissions": ["dangerous"],
"tool_call_id": "call_abc123"
}

The client must respond with {"type": "approval_response", "tool_call_id": "call_abc123", "approved": true|false}.
All token deltas have been sent. The final complete message follows.
{
"event": "stream_end",
"chat_id": "chat_a1b2c3d4"
}

The complete assistant response text.
{
"event": "message",
"chat_id": "chat_a1b2c3d4",
"text": "To refactor that function, you can extract the loop body into..."
}

The agent has finished its turn. The client may now send another message.
{
"event": "turn_end",
"chat_id": "chat_a1b2c3d4"
}

An error occurred during message processing.
{
"event": "error",
"chat_id": "chat_a1b2c3d4",
"detail": "Something went wrong"
}

| Code | Meaning | Common Causes |
|---|---|---|
| 200 | OK | Request succeeded |
| 400 | Bad Request | Invalid message_index in rewind, malformed JSON |
| 404 | Not Found | Session or resource does not exist |
| 503 | Service Unavailable | WebUI not built, npm run build required |
Error responses follow this shape:
{
"error": "not found"
}

| Scenario | Behavior |
|---|---|
| Invalid JSON message | Logged and silently skipped |
| Unknown `type` field | Logged and silently skipped |
| Agent processing error | Sends {"event": "error", "detail": "..."}, callbacks restored |
| WebSocket disconnect (client) | Clean shutdown, approval task cancelled |
| WebSocket disconnect (server) | Client receives close frame with no code |
There are no enforced rate limits in the current implementation. The token store and approval queues are in-memory. For production use, you should add (see the sketch after this list):

- Connection limits: one WebSocket connection per user (the server tracks `active_connections` on `app.state` but does not enforce a cap)
- Request throttling: add `slowapi` or similar middleware to REST endpoints
- Token rate limiting: limit bootstrap calls to prevent token exhaustion
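As a starting point, `slowapi` could cap bootstrap calls per client IP. This is a sketch, not part of the current server, and the handler body is elided:

```python
from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

app = FastAPI()
limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/jarvis/bootstrap")
@limiter.limit("10/minute")  # cap bootstrap token generation per client IP
async def bootstrap(request: Request):
    ...
```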
| Variable | Default | Description |
|---|---|---|
| `JARVIS_MODEL` | `gpt-4o` | Active model ID |
| `JARVIS_BASE_URL` | `""` | Custom API base URL for LLM provider |
| `JARVIS_API_KEY` | `""` | API key for LLM provider |
| `JARVIS_SDK` | `openai` | SDK mode: `openai` or `anthropic` |
| `JARVIS_BYPASS_PERMISSIONS` | `""` | Set to `1` to auto-approve all tools |
| `JARVIS_THINKING_LEVEL` | `medium` | `low`, `medium`, or `high` reasoning depth |
| `JARVIS_CODE_PERMISSION` | `ask` | `always`, `ask`, or `never` for code tools |
| `JARVIS_HEARTBEAT_ENABLED` | `true` | Enable/disable heartbeat scheduler |
| `JARVIS_HEARTBEAT_EVERY` | `30m` | Heartbeat interval |
| `JARVIS_REMOTE_URL` | `""` | Remote JARVIS instance URL for session sync |
Any unmatched GET route (/{path}) serves index.html from the WebUI dist directory. This enables client-side routing in the React/Vue frontend without server-side URL rewriting. If the dist directory does not exist:
{
"error": "Web UI not built. Run 'npm run build' in interface/webui/"
}

Returns status 503.