7 changes: 7 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,7 @@
## Summary

-

## Checklist

- [ ] No fabricated effort estimates in this change. See [house-rules/no-fabricated-estimates.md](https://github.com/arc-web/claude-skills/blob/main/house-rules/no-fabricated-estimates.md).
6 changes: 6 additions & 0 deletions .github/workflows/check-fabricated-estimates.yml
@@ -0,0 +1,6 @@
name: check-fabricated-estimates
on:
  pull_request:
jobs:
  check:
    uses: arc-web/claude-skills/.github/workflows/check-fabricated-estimates.yml@main
16 changes: 16 additions & 0 deletions AGENTS.md
@@ -0,0 +1,16 @@
# arc-browser - Agent Instructions

Repo: /Users/home/ai/tools/browser/arc-browser


---

## House rules (non-negotiable, version-controlled)

### No fabricated effort estimates

Agents must not invent durations ("30 minutes", "half an hour", "quick task"). Real deadlines from humans/stakeholders ("by Friday", "before campaign launch") and human-set Plane time fields are fine.

Canonical rule: <https://github.com/arc-web/claude-skills/blob/main/house-rules/no-fabricated-estimates.md>

Enforced via pre-commit hook (global + per-repo) and CI workflow `check-fabricated-estimates`. Hard fail on violation. PR template carries the reminder.
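
As a rough illustration of what a check like this might look for, the sketch below flags duration phrases with a regular expression. It is not the actual hook or workflow, which live in arc-web/claude-skills and are not shown here. The pattern list and the `find_estimates` name are assumptions made for illustration only. A regex cannot tell an invented estimate from a human-set deadline, so a real check would likely need an allowlist or a review step.

```python
import re
from typing import List, Tuple

# Hypothetical pattern: numeric or vague quantities followed by a time unit,
# plus a few "quick ..." phrasings. Not taken from the canonical rule.
DURATION = re.compile(
    r"\b(?:\d+(?:\.\d+)?|half an?|an?|a few|a couple of)\s+"
    r"(?:minutes?|mins?|hours?|hrs?|days?|weeks?)\b"
    r"|\bquick (?:task|fix|win)\b",
    re.IGNORECASE,
)


def find_estimates(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, matched_phrase) for every suspicious phrase."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in DURATION.finditer(line):
            hits.append((line_no, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = "Should take 30 minutes.\nShip by Friday.\nA quick task, half an hour."
    for line_no, phrase in find_estimates(sample):
        print(f"line {line_no}: {phrase}")
```

Deadlines such as "by Friday" pass through untouched because they contain no quantity-plus-unit phrase.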
79 changes: 79 additions & 0 deletions MANUAL_FALLBACK.md
@@ -0,0 +1,79 @@
# arc-browser Manual Fallback Protocol

When automation fails, arc-browser does NOT silently fall back to manual click-through. The agent always:

1. Posts a clear message to Discord `#agentic-browser` via `agentic_browser_prompt(...)`
2. Pauses execution
3. Offers the human the option to drive the failed step manually
4. Resumes only on explicit human reply (Discord message or `touch /tmp/<session>_resume`)
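
The file-based resume signal in step 4 can be sketched as a simple polling loop. This is a minimal illustration, assuming the marker lives at `/tmp/<session>_resume` as described above. The function name `wait_for_resume_file`, its parameters, and the choice to delete the marker after reading it are assumptions, not part of arc-browser.

```python
import time
from pathlib import Path


def wait_for_resume_file(session: str, timeout_s: float = 600.0,
                         poll_s: float = 1.0, base_dir: str = "/tmp") -> bool:
    """Wait until <base_dir>/<session>_resume exists or the timeout elapses.

    Returns True if the marker appeared, False on timeout.
    """
    marker = Path(base_dir) / f"{session}_resume"
    deadline = time.monotonic() + timeout_s
    while True:
        if marker.exists():
            # Assumption: consume the marker so a stale file cannot
            # resume a later, unrelated pause.
            marker.unlink()
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

Inside an async macro, a blocking loop like this would stall the event loop, so it would need to run in a worker thread (for example via `asyncio.to_thread`) or be rewritten with `asyncio.sleep`.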

## Decision tree

```
Tool fails (selector miss, modal not detected, 2FA challenge, captcha, scope picker confused)
        |
        v
Post to #agentic-browser:
  "automation hit [X] on [URL].
   - reply 'manual' to take over click-by-click (window stays open)
   - reply 'retry' to attempt the macro again
   - reply 'skip' to give up on this step (session stays open)
   - reply 'abort' to close the session"
        |
        +-- reply 'manual' -> agent surfaces step-by-step guidance, polls
        |                     for completion signal, then resumes downstream macros
        +-- reply 'retry'  -> re-run the same macro once with relaxed timing
        +-- reply 'skip'   -> return failure to caller, do not close session
        +-- reply 'abort'  -> teardown session + return failure
```
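
The four-way branch in the tree can be sketched as a small parser that maps the human's reply onto one of the documented actions. The `FallbackAction` enum and the `parse_reply` and `closes_session` helpers are hypothetical names introduced for illustration. Returning `None` for an unrecognised reply is an assumption, so that the caller can re-prompt instead of guessing.

```python
from enum import Enum
from typing import Optional


class FallbackAction(Enum):
    MANUAL = "manual"  # human drives the step; agent waits, then resumes
    RETRY = "retry"    # re-run the same macro once with relaxed timing
    SKIP = "skip"      # return failure to the caller, keep the session open
    ABORT = "abort"    # tear down the session and return failure


def parse_reply(reply: Optional[str]) -> Optional[FallbackAction]:
    """Normalise a Discord reply; return None if it is not in the vocabulary."""
    if not reply:
        return None
    word = reply.strip().lower().strip("'\"")
    try:
        return FallbackAction(word)
    except ValueError:
        return None


def closes_session(action: FallbackAction) -> bool:
    """Only 'abort' tears the session down; 'skip' leaves it open."""
    return action is FallbackAction.ABORT
```

Keeping the vocabulary in one enum means the prompt text and the branching logic can be generated from the same source, which makes it harder for them to drift apart.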

## Implementation rule

Manual fallback is a **first-class branch** in every macro tool, not a hidden default. Macros must:

- Wrap risky steps in try/except
- On failure, call `agentic_browser_prompt(message=..., session=...)`
- Branch on `result['reply']` content
- Document the expected reply vocabulary in the message itself

## Anti-pattern

```python
# WRONG - silent fallback
try:
    await click_create_button()
except Exception:
    pass  # hope the user clicks it
```

```python
# RIGHT - explicit human pause
try:
    if not await click_by_text(page, "Create Integration"):
        raise RuntimeError("button not found")
except Exception:
    rep = agentic_browser_prompt(
        message="Could not find 'Create Integration' button. Click it manually, then reply 'done', or reply 'abort' to cancel.",
        session=GHL_SESSION,
        timeout=600,
    )
    if rep["reply"].lower().strip() == "abort":
        return {"ok": False, "error": "user aborted"}
    # otherwise assume done, continue
```

## Default behavior settings

Per `~/.cache/arc-browser/config.json` (created on first run):

```json
{
  "manual_fallback_default": "ask",
  "ask_timeout_s": 600,
  "auto_fallback_after_failures": 3,
  "audit_log_retention_days": 30
}
```

- `manual_fallback_default`: `"ask"` (default, recommended), `"never"`, or `"always"`
- `auto_fallback_after_failures`: after this many consecutive macro failures, escalate to manual prompt automatically
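
A minimal sketch of how these settings might be read and applied, assuming missing keys fall back to the defaults shown above. `load_config` and `should_escalate` are hypothetical helpers, not arc-browser functions, and the sketch does not create the file on first run. Reading "after this many consecutive failures" as "at or above the threshold" is also an assumption.

```python
import json
from pathlib import Path

DEFAULTS = {
    "manual_fallback_default": "ask",
    "ask_timeout_s": 600,
    "auto_fallback_after_failures": 3,
    "audit_log_retention_days": 30,
}
VALID_MODES = {"ask", "never", "always"}


def load_config(path: str = "~/.cache/arc-browser/config.json") -> dict:
    """Merge the on-disk config over the defaults; a missing file yields defaults."""
    cfg = dict(DEFAULTS)
    p = Path(path).expanduser()
    if p.is_file():
        cfg.update(json.loads(p.read_text()))
    if cfg["manual_fallback_default"] not in VALID_MODES:
        raise ValueError(
            f"manual_fallback_default must be one of {sorted(VALID_MODES)}"
        )
    return cfg


def should_escalate(consecutive_failures: int, cfg: dict) -> bool:
    """True once the consecutive-failure count reaches the configured threshold."""
    return consecutive_failures >= cfg["auto_fallback_after_failures"]
```

Validating `manual_fallback_default` at load time surfaces a typo in the config immediately instead of at the first failed macro.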
229 changes: 228 additions & 1 deletion arc_browser/browser.py
@@ -6,6 +6,7 @@
cdp - Connects to user's real running Chrome (most stealth, disruptive)
"""
import asyncio
import json
import os
import signal
import subprocess
@@ -190,7 +191,12 @@ async def navigate_ready(page, url: str, wait_for_selector: str = None,
async def auto_login(site_id: str, session: str = None, force: bool = False) -> dict:
    """Auto-login for a site using the registry recipe + 1Password.

    Returns {"status": "logged_in"|"already"|"failed", "reason": str}.
    Supports two flow types via `auth.flow`:
    - "form" (default) - email/password form submit
    - "google_sso" - click "Continue with Google" -> Google account picker
      -> 2FA pause via #agentic-browser if challenge appears

    Returns {"status": "logged_in"|"already"|"failed"|"pending_2fa", "reason": str}.
    Idempotent: returns "already" if the verify_url check passes without
    needing to re-authenticate (unless force=True).
    """
@@ -202,6 +208,7 @@ async def auto_login(site_id: str, session: str = None, force: bool = False) ->
    if not auth:
        return {"status": "failed", "reason": f"No auth recipe for '{site_id}'"}

    flow = auth.get("flow", "form")
    required = ["credential_item", "login_url", "verify_url"]
    missing = [k for k in required if k not in auth]
    if missing:
@@ -222,6 +229,10 @@ async def auto_login(site_id: str, session: str = None, force: bool = False) ->
except Exception:
pass

    # Branch on flow type
    if flow == "google_sso":
        return await _login_google_sso(page, auth, sess)

    # Fetch credentials
    try:
        creds = get_credentials(auth["credential_item"])
@@ -288,3 +299,219 @@ async def verify_auth(site_id: str, session: str = None) -> dict:
        return {"authenticated": True, "reason": f"At {page.url}"}
    except Exception as e:
        return {"authenticated": False, "reason": f"Probe failed: {e}"}


# ---------------------------------------------------------------------------
# Google SSO flow + helpers
# ---------------------------------------------------------------------------

async def _detect_any_selector(page, selectors: list) -> bool:
    """Return True if any selector resolves to a visible element."""
    for sel in selectors or []:
        try:
            el = await page.query_selector(sel)
            if el:
                visible = await el.is_visible()
                if visible:
                    return True
        except Exception:
            continue
    return False


async def _login_google_sso(page, auth: dict, sess: str) -> dict:
    """Drive a Google SSO login. Pauses for 2FA / captcha via #agentic-browser."""
    from .utils.credentials import get_credentials
    from .utils.prompt import agentic_browser_prompt

    # Fetch credentials (used for typing email if Google prompt shows)
    try:
        creds = get_credentials(auth["credential_item"])
    except Exception as e:
        return {"status": "failed", "reason": f"Credential lookup failed: {e}"}

    try:
        await page.goto(auth["login_url"], wait_until="load", timeout=60000)
        await asyncio.sleep(2.5)
    except Exception as e:
        return {"status": "failed", "reason": f"Initial nav failed: {e}"}

    # Try the Google button
    btn_sels = auth.get("google_button_selectors", [])
    clicked = False
    for sel in btn_sels:
        try:
            el = await page.query_selector(sel)
            if el:
                await el.click()
                clicked = True
                await asyncio.sleep(2)
                break
        except Exception:
            continue
    if not clicked:
        # Site may already be at Google account picker if cookies partially valid
        if "accounts.google.com" not in page.url:
            return {"status": "failed", "reason": "Could not find Continue with Google button"}

    # If Google email entry visible, type it
    try:
        email_input = await page.query_selector("input[type='email']")
        if email_input:
            await email_input.fill(creds.get("username", ""))
            await asyncio.sleep(0.8)
            try:
                next_btn = await page.query_selector("button:has-text('Next'), #identifierNext")
                if next_btn:
                    await next_btn.click()
            except Exception:
                await page.keyboard.press("Enter")
            await asyncio.sleep(2.5)
    except Exception:
        pass

    # Password if shown
    try:
        pw_input = await page.query_selector("input[type='password']")
        if pw_input:
            await pw_input.fill(creds.get("password", ""))
            await asyncio.sleep(0.8)
            try:
                next_btn = await page.query_selector("button:has-text('Next'), #passwordNext")
                if next_btn:
                    await next_btn.click()
            except Exception:
                await page.keyboard.press("Enter")
            await asyncio.sleep(3)
    except Exception:
        pass

    # Detect 2FA + captcha
    two_fa_sels = (auth.get("two_fa") or {}).get("detect_selectors", [])
    captcha_sels = (auth.get("captcha") or {}).get("detect_selectors", [])
    if await _detect_any_selector(page, two_fa_sels):
        reply = agentic_browser_prompt(
            message=f"2FA challenge detected on {auth['login_url']}.\nComplete it in the browser window, then reply 'done' or run: touch /tmp/{sess}_resume",
            session=sess,
            timeout=600,
        )
        if reply["status"] == "timeout":
            return {"status": "pending_2fa", "reason": "Human did not resolve 2FA in time"}
        await asyncio.sleep(2)
    elif await _detect_any_selector(page, captcha_sels):
        reply = agentic_browser_prompt(
            message=f"reCAPTCHA detected on {auth['login_url']}.\nComplete it in the browser window, then reply 'done' or run: touch /tmp/{sess}_resume",
            session=sess,
            timeout=600,
        )
        if reply["status"] == "timeout":
            return {"status": "pending_2fa", "reason": "Captcha unresolved in time"}
        await asyncio.sleep(2)

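    # Poll for redirect to verify_url.
    # An empty string is a substring of every URL, so a missing
    # verify_not_contains must be skipped rather than defaulted to "".
    marker = auth.get("verify_not_contains")
    for _ in range(30):
        await asyncio.sleep(2)
        still_on_auth = bool(marker) and marker in page.url
        if not still_on_auth and "accounts.google.com" not in page.url:
            return {"status": "logged_in", "reason": f"Redirected to {page.url}"}

    return {"status": "failed", "reason": f"Did not redirect away from auth (current: {page.url})"}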


async def wait_for_hydration(page, max_ms: int = 8000, custom_selector: str = None) -> bool:
    """Poll for SPA-ready signals: body has visible text, optional selector visible.

    Returns True if hydrated within max_ms, False on timeout.
    """
    deadline = asyncio.get_event_loop().time() + max_ms / 1000.0
    while asyncio.get_event_loop().time() < deadline:
        try:
            ready = await page.evaluate(
                """() => {
                    const hasText = (document.body.innerText || '').trim().length > 50;
                    const ready = document.readyState === 'complete';
                    return ready && hasText;
                }"""
            )
            if ready:
                if custom_selector:
                    try:
                        el = await page.query_selector(custom_selector)
                        if el and await el.is_visible():
                            return True
                    except Exception:
                        pass
                else:
                    return True
        except Exception:
            pass
        await asyncio.sleep(0.4)
    return False


async def extract_modal_text(page, modal_selector: str = "[role='dialog'], .modal, [class*='modal'], [class*='Modal']", max_chars: int = 3000) -> str:
    """Find any visible modal and return its inner text. Empty string if none."""
    try:
        el = await page.query_selector(modal_selector)
        if not el:
            return ""
        visible = await el.is_visible()
        if not visible:
            return ""
        txt = await el.inner_text()
        return (txt or "")[:max_chars]
    except Exception:
        return ""


async def tick_all_checkboxes(page, container_selector: str, exclude_labels: list = None, delay_range: tuple = (0.05, 0.2)) -> int:
    """Click every unchecked checkbox inside container. Returns count of ticks."""
    import random
    exclude_labels = [s.lower() for s in (exclude_labels or [])]
    try:
        container = await page.query_selector(container_selector)
        if not container:
            return 0
        checkboxes = await container.query_selector_all("input[type='checkbox'], [role='checkbox']")
    except Exception:
        return 0
    ticked = 0
    for cb in checkboxes:
        try:
            checked = await cb.is_checked() if hasattr(cb, "is_checked") else False
            if checked:
                continue
            # Check exclude
            label_text = ""
            try:
                label_text = (await cb.evaluate("el => (el.closest('label')||el.parentElement||el).innerText || ''")).lower()
            except Exception:
                pass
            if any(ex in label_text for ex in exclude_labels):
                continue
            await cb.click()
            ticked += 1
            await asyncio.sleep(random.uniform(*delay_range))
        except Exception:
            continue
    return ticked


async def click_by_text(page, text: str, role: str = "button", timeout: int = 5000) -> bool:
    """Click an element matching visible text + role. Resilient to className churn."""
    role_map = {"button": "button, [role='button']", "link": "a, [role='link']", "menuitem": "[role='menuitem']"}
    sel = role_map.get(role, role)
    js = f"""(() => {{
        const cand = Array.from(document.querySelectorAll({json.dumps(sel)}));
        const target = cand.find(el => (el.innerText||'').trim().toLowerCase().includes({json.dumps(text.lower())}));
        if (target) {{ target.click(); return true; }}
        return false;
    }})()"""
    try:
        for _ in range(int(timeout / 250)):
            ok = await page.evaluate(js)
            if ok:
                return True
            await asyncio.sleep(0.25)
    except Exception:
        return False
    return False