Jroid is a browser-integrated automation framework that allows you to operate any web application through natural-language commands.
It combines a Tampermonkey client script (for DOM control and UI overlays) with a local Node.js proxy server (for planning, reasoning, and logging) powered by OpenAI’s API.
The system observes the browser DOM, plans actions using ReAct-style reasoning, executes them safely, learns from previous runs through a Reflexion loop, and logs everything locally — ensuring complete privacy and transparency.
| Layer | Component | Description |
|---|---|---|
| 1. Agent Runtime | OpenAI API | Performs reasoning, planning, and reflection using ReAct + Plan-Execute + Reflexion frameworks. |
| 2. Communication Layer | WebSocket (primary), HTTP (fallback) | Bi-directional channel between Tampermonkey and the local proxy. |
| 3. Data Format | JSON | Standardized schemas for context, plan, and feedback. |
| 4. State Management | JSONL files | Append-only local logs for steps, memory, and reflections. |
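A minimal sketch of what these standardized JSON payloads could look like. The field names below are illustrative assumptions, not the project's canonical schema:

```javascript
// Illustrative shapes for the three payload kinds (context, plan, feedback).
// All field names are assumptions, not the project's actual schema.
const context = {
  type: "context",
  url: "https://example.com/inbox",
  dom: [{ selector: "#reply", tag: "button", text: "Reply" }],
};

const plan = {
  type: "plan",
  steps: [
    { action: "click", selector: "#reply" },
    { action: "type", selector: "textarea", text: "{{DRAFT_EMAIL_BODY}}" },
  ],
};

const feedback = { type: "feedback", stepIndex: 0, status: "ok" };

// A tiny structural check the proxy could run before accepting a plan.
function isPlan(msg) {
  return msg.type === "plan" && Array.isArray(msg.steps) && msg.steps.length > 0;
}
```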
| MCP Server | Tool Names | Responsibility |
|---|---|---|
| `messagingServer` | `composeMessage`, `sendEmail`, `replyEmail`, `writeLinkedInPost` | Drafts outreach, validates email content, and prepares social posts. |
| `analysisServer` | `summarize`, `aggregateCompose`, `searchWeb`, `formAutofill`, `extractTable` | Research and data-extraction tasks; converts DOM text into structured summaries or table instructions. |
| `notesServer` | `saveToNotes`, `describeControls`, `describeForm` | Persists notes and provides quick descriptions of visible controls/forms. |
| `browserServer` | `navigate` | Validates navigation requests and emits safe navigation plans. |
| `storageServer` | `logEvent`, `readLog`, `recordMemory`, `queryMemory` | Centralized logging and memory persistence. All proxy-side logging (plans, execution steps, reflections) flows through this MCP server so every record shares a single schema. |
Each server is described in `config/mcp.config.json` and auto-routed by `server/mcp/serverManager.js`, making it easy to add future domains (CRM, research APIs, etc.) without touching the planner.
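One plausible shape for an entry in `config/mcp.config.json`, inferred from the tool lists above; the key names and module paths are assumptions, not the file's actual contents:

```json
{
  "servers": {
    "messagingServer": {
      "module": "./server/mcp/messagingServer.js",
      "tools": ["composeMessage", "sendEmail", "replyEmail", "writeLinkedInPost"]
    },
    "browserServer": {
      "module": "./server/mcp/browserServer.js",
      "tools": ["navigate"]
    }
  }
}
```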
- Set up the local Node.js proxy server with WebSocket + HTTP endpoints.
- Install Tampermonkey and load the client script into your browser.
- Confirm that the browser connects to the proxy on startup (`Agent ready / WS connected`).
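The primary/fallback transport choice can be sketched as follows; the endpoint URLs and port are assumptions, not the proxy's actual values:

```javascript
// Pick WebSocket when available, otherwise fall back to HTTP.
// Port 8787 is an assumed placeholder for the local proxy.
function pickTransport(wsAvailable) {
  return wsAvailable
    ? { kind: "ws", url: "ws://localhost:8787" }
    : { kind: "http", url: "http://localhost:8787/command" };
}
```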
- User opens any page and presses `Ctrl + .` or clicks the 🦾 icon.
- A floating chat-style input bar appears (top-right corner).
- User types a natural-language command (e.g. "Reply to the most recent email I received" or "Search LinkedIn for AI jobs in San Francisco").
- Tampermonkey collects the DOM context and sends a JSON payload to the proxy.
- Proxy formats the input for GPT using the ReAct + Plan-Execute + Reflexion prompt.
- LLM performs step-by-step reasoning and returns a structured JSON action plan.
- Proxy validates schema and sends the plan back to Tampermonkey.
- Tampermonkey displays a popup overlay showing:
- Planned steps
- Preview of any generated text (e.g. email drafts)
- User chooses:
- ✅ Confirm → Execute actions live.
- ❌ Decline → Abort task and log status.
- Tampermonkey runs each action sequentially:
- Deterministic selector → Fuzzy fallback if needed.
- Clicks, typing, waiting, extracting.
- Random delays (300–800 ms) simulate human pacing.
- Step results stream via WebSocket to the proxy and console.
- Proxy writes JSONL entries for every action in `/data/logs/`.
- When a task finishes, the proxy sends the full log back to GPT for reflection.
- The model summarizes successes/failures and outputs a short “lesson.”
- Lesson appended to `/data/memory/reflections.jsonl`.
- On next run, the last few lessons are injected into the system prompt to guide improved planning.
| Category | Rule |
|---|---|
| Action Approval | All write/send/delete actions require popup confirmation. |
| Rate Limiting | 300–800 ms random delay per action, 2–5 s between task groups. |
| Privacy | No data leaves local environment unless explicitly authorized. |
| Recovery | Automatic retry → fallback to fuzzy selectors → replan → safe abort on repeated failures. |
| Manual Stop | Global “Stop Task” button immediately halts execution. |
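The rate-limiting rule can be sketched as a uniform random delay; the ranges come from the table above, the helper name is illustrative:

```javascript
// Uniform random delay within [minMs, maxMs], inclusive.
function randomDelay(minMs, maxMs) {
  return minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
}

const perAction = randomDelay(300, 800);       // per-action pacing
const betweenGroups = randomDelay(2000, 5000); // between task groups
```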
- Proxy compiles run summary (start/end time, success %, error count).
- Toast message shown: `✅ Task Complete – 17 steps executed, 0 errors`
- Logs and reflections remain locally stored for review or future learning.
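A sketch of the run-summary computation behind that toast message; the field names and step shape are assumptions:

```javascript
// Compile error count and success percentage from per-step results.
function summarize(steps) {
  const errors = steps.filter((s) => s.status === "error").length;
  const total = steps.length;
  return {
    total,
    errors,
    successPct: total === 0 ? 100 : Math.round(((total - errors) / total) * 100),
  };
}
```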
| Subsystem | Responsibility |
|---|---|
| Tampermonkey Client | DOM observation, UI overlays, confirmation popup, and live execution. |
| Node.js Proxy Server | OpenAI API communication, plan validation, memory & log storage (via MCP), error recovery. |
| LLM (GPT-4.1) | Natural-language understanding, reasoning, planning, and reflexive learning. |
- Browser snapshots stream to the proxy → `recordPageSnapshot` → `storageServer.recordMemory`, ensuring every snapshot and graph update is persisted with the same schema.
- All plan + execution logs are appended through `storageServer.logEvent`, so JSONL files under `data/logs/` always contain a `timestamp`, `sessionId`, `type`, and tool-specific metadata.
- Reflections, notes, and cached values are exposed back to Tampermonkey so placeholder resolution (`{{DRAFT_EMAIL_BODY}}`, etc.) never stalls execution.
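Placeholder resolution can be sketched as a simple substitution over cached values; `{{DRAFT_EMAIL_BODY}}` comes from the README, while the cache shape and function name are assumptions:

```javascript
// Replace {{KEY}} tokens with cached values; unknown keys are left intact
// so the executor can detect them explicitly instead of failing silently.
function resolvePlaceholders(text, cache) {
  return text.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    key in cache ? cache[key] : match
  );
}
```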
- Human-in-the-loop: All impactful actions require explicit approval.
- Privacy-first: No credentials or sensitive data ever leave the user’s machine.
- Rate-limiting: Mimics natural behavior to avoid detection or system strain.
- Auditability: Every run is fully logged in JSONL for transparency and debugging.
| Milestone | Goal |
|---|---|
| MVP (v1) | Single-tab automation loop with ReAct reasoning and confirmation popup. |
| v2 | Multi-tab orchestration using BroadcastChannel Tab Manager. |
| v3 | Local dashboard for viewing logs, lessons, and task metrics. |
| v4 | Optional local LLM runtime (Ollama or Hugging Face) for full offline mode. |
This README provides the full system specification for GitHub Copilot and AI assistants.
Use it as context for generating functions, data models, prompts, and integration logic while maintaining:
- Layered modularity: keep Tampermonkey, proxy, and memory independent.
- Schema integrity: preserve JSON formats for `context`, `plan`, and `feedback`.
- Safety compliance: always include confirmation, rate limits, and error recovery hooks.
MIT License — open for modification and learning use.
- Install deps – `npm install`
- Run the proxy – `npm run dev`
- Execute the automated suite – `npm test`
  - `tests/planValidation.test.js` ensures the planner schema validator rejects malformed steps.
  - `tests/mcpIntegration.test.js` spins up a sample JSON-RPC server to verify the shared `JSONRPCClient` contract.
- Manual loop – load the Tampermonkey script, trigger a command (`Ctrl + .`), approve the generated plan, and watch the HUD stream tool results + cached placeholders.
Add new tests under `tests/*.test.js` and keep them self-contained; the suite runs with `node --test` so no extra harness is required.