13 changes: 13 additions & 0 deletions python/extend-browserbase/.env.example
@@ -0,0 +1,13 @@
# Browserbase credentials (required)
# Get these from https://www.browserbase.com/settings
BROWSERBASE_PROJECT_ID=your_browserbase_project_id
BROWSERBASE_API_KEY=your_browserbase_api_key

# Google API key for Gemini model (required for Stagehand)
GOOGLE_API_KEY=your_google_api_key

# Extend AI (optional – enables receipt parsing; omit to only download receipts)
EXTEND_API_KEY=your_extend_api_key

# Optional: set after first run to reuse the created processor
# EXTEND_PROCESSOR_ID=your_processor_id
81 changes: 81 additions & 0 deletions python/extend-browserbase/README.md
@@ -0,0 +1,81 @@
# Stagehand + Browserbase + Extend: Download Expense Receipts and Parse with Extend AI

## AT A GLANCE

- **Goal**: Automate downloading receipts from an expense portal and extract structured receipt data using AI-powered document parsing.
- **Pattern Template**: Demonstrates the integration pattern of Browserbase (browser automation + download capture) + Extend AI (schema-based document extraction).
- **Workflow**: Stagehand navigates the expense portal and clicks each receipt's download link; Browserbase captures downloads. The script polls for the session's download ZIP, extracts files, then optionally sends them to Extend for structured extraction (vendor, date, totals, line items, etc.).
- **Download Handling**: Implements retry/polling around Browserbase's Session Downloads API until the ZIP is available.
- **Structured Extraction**: Extend EXTRACT processor with a receipt JSON schema; results written to `output/results/receipts.json` and `receipts.csv`.
- Docs → [Browserbase Downloads](https://docs.browserbase.com/features/downloads) | [Extend AI](https://docs.extend.app)
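
To make the schema-based extraction idea concrete, here is a minimal sketch of a receipt schema written as plain JSON Schema in a Python dict. The field names (`vendor`, `date`, `total`, `line_items`, `amount`) and the `missing_fields` helper are assumptions for illustration; the template's actual `RECEIPT_EXTRACTION_CONFIG` and the format Extend expects may differ.

```python
# Hypothetical receipt schema (plain JSON Schema, not necessarily Extend's config format).
RECEIPT_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "date": {"type": "string", "description": "Purchase date, ISO 8601"},
        "total": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
                "required": ["description", "amount"],
            },
        },
    },
    "required": ["vendor", "date", "total"],
}


def missing_fields(record: dict, schema: dict = RECEIPT_SCHEMA) -> list[str]:
    """Return the required top-level fields that are absent or None in a record."""
    return [name for name in schema["required"] if record.get(name) is None]
```

A record that comes back without a date, for example, would be reported as `["date"]`, which is a cheap way to spot incomplete extractions before writing results.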

## GLOSSARY

- **act**: perform UI actions from natural language prompts (click, scroll, navigate)
Docs → https://docs.stagehand.dev/basics/act
- **observe**: find and return interactive elements on the page matching a description, without performing actions. Used here to locate all individual download buttons before clicking them.
Docs → https://docs.stagehand.dev/basics/observe
- **Browserbase Downloads**: When files are downloaded during a browser session, Browserbase captures and stores them. Files are retrieved via the Session Downloads API as a ZIP archive.
Docs → https://docs.browserbase.com/features/downloads
- **Extend EXTRACT processor**: A configurable document extraction pipeline that parses files against a JSON schema and returns structured data. Processors are reusable and persist across runs.
Docs → https://docs.extend.app
- **Download polling**: Browserbase syncs downloads in real time, so large or numerous files may not be available the moment the session ends; the script retries every 2 seconds until the ZIP is available or a timeout is reached.
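
The polling behaviour described in the last entry can be sketched roughly as follows. This is not the template's `save_downloads_with_retry`; `poll_for_zip` and its `fetch` callable are hypothetical, and the callable stands in for whatever request retrieves the session's download ZIP. Treating a 22-byte response as "not ready" is an assumption based on the fact that an empty ZIP archive is exactly 22 bytes long.

```python
import time
from typing import Callable, Optional

EMPTY_ZIP_SIZE = 22  # an empty ZIP archive consists of a 22-byte end-of-central-directory record


def poll_for_zip(
    fetch: Callable[[], Optional[bytes]],
    retry_for_seconds: float = 60.0,
    interval_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bytes:
    """Call fetch() until it returns a non-empty ZIP or the deadline passes."""
    deadline = clock() + retry_for_seconds
    while True:
        data = fetch()
        # Assumption: None, b"" or an empty archive all mean "not synced yet".
        if data and len(data) > EMPTY_ZIP_SIZE:
            return data
        if clock() >= deadline:
            raise TimeoutError(f"No downloads available after {retry_for_seconds} seconds")
        sleep(interval_seconds)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; in normal use the defaults apply and only `fetch` needs to be supplied.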

## QUICKSTART

1. `cd python/extend-browserbase`
2. `cp .env.example .env`
3. Add the required API keys to `.env`:
   - `BROWSERBASE_PROJECT_ID`
   - `BROWSERBASE_API_KEY`
   - `GOOGLE_API_KEY`
   - `EXTEND_API_KEY` (optional; enables receipt parsing)
4. Run the script:

```bash
uv run python main.py
```
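
Before running, it can help to confirm the credentials are actually set. The sketch below checks the three required variables and whether the optional Extend key is present. The helper names are hypothetical, and it assumes the `.env` file has already been loaded into the process environment by whatever mechanism the script uses.

```python
import os

REQUIRED_VARS = ("BROWSERBASE_PROJECT_ID", "BROWSERBASE_API_KEY", "GOOGLE_API_KEY")


def missing_required(env=None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def parsing_enabled(env=None) -> bool:
    """Receipt parsing is optional and only runs when EXTEND_API_KEY is set."""
    env = os.environ if env is None else env
    return bool(env.get("EXTEND_API_KEY"))
```

Calling `missing_required()` at startup and exiting with the returned names gives a clearer error than a failed API call later in the run.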

## EXPECTED OUTPUT

- Initializes Stagehand session with Browserbase and opens the live view link
- Navigates to the expense portal and finds all per-receipt download links via observe
- Clicks each download button; Browserbase captures files
- After closing the session, polls for the session's download ZIP and extracts to `output/documents/`
- If `EXTEND_API_KEY` is set: creates/uses an Extend "Receipt Extractor" processor, uploads each file, runs extraction, writes `output/results/receipts.json` and `receipts.csv`
- Opens the Extend dashboard runs page in your browser for review
- Closes session cleanly
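
The extraction and output steps above can be sketched with the standard library alone. The function names are hypothetical and the rows are assumed to be flat dictionaries; nested values such as line items would need flattening before they are useful in CSV.

```python
import csv
import io
import json
import zipfile
from pathlib import Path


def extract_zip(zip_bytes: bytes, dest: Path) -> list[Path]:
    """Unpack an in-memory ZIP into dest and return the extracted file paths."""
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Skip directory entries, which end with a slash.
        names = [n for n in archive.namelist() if not n.endswith("/")]
        archive.extractall(dest)
    return [dest / n for n in names]


def write_results(rows: list[dict], results_dir: Path) -> None:
    """Write the same rows as receipts.json and receipts.csv."""
    results_dir.mkdir(parents=True, exist_ok=True)
    (results_dir / "receipts.json").write_text(json.dumps(rows, indent=2), encoding="utf-8")
    # Use the union of all keys so rows with missing fields still fit the header.
    fieldnames = sorted({key for row in rows for key in row})
    with open(results_dir / "receipts.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

With the paths from this README, usage would look like `extract_zip(zip_bytes, Path("output/documents"))` followed by `write_results(rows, Path("output/results"))`.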

## COMMON PITFALLS

- `ModuleNotFoundError`: run the script with `uv run python main.py` so dependencies are installed automatically from `pyproject.toml`
- Missing credentials: verify `.env` contains `BROWSERBASE_PROJECT_ID`, `BROWSERBASE_API_KEY`, and `GOOGLE_API_KEY`
- Download timeout: increase `retry_for_seconds` parameter in `save_downloads_with_retry` if downloads take longer than 60 seconds
- Empty ZIP file: ensure downloads were actually triggered (check live view link to debug)
- Rate limiting on Extend: the script retries with exponential backoff on 429 errors, but very large batches may need the batch size reduced from 9
- Find more information on your Browserbase dashboard → https://www.browserbase.com/sign-in
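
For the rate-limiting pitfall, the general shape of exponential backoff combined with batching looks like the sketch below. It is not the script's own retry code: `RateLimited` is a hypothetical stand-in for an HTTP 429 response, and `with_backoff` and `batches` are illustrative names.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response from the API."""


def with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(), doubling the wait after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


def batches(items, size=9):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Lowering `size` reduces how many requests are in flight per batch, which is the adjustment the pitfall above suggests for very large runs.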

## USE CASES

• Expense automation: Download receipts from an expense portal and extract vendor, date, totals, and line items for accounting systems.
• Document batch processing: Collect files from web portals and run structured extraction across all of them with a single script.
• Receipt digitization: Convert paper/PDF receipts into structured JSON and CSV for import into ERP, bookkeeping, or reimbursement tools.

## NEXT STEPS

• Parameterize the portal URL: Accept the expense portal URL from env or CLI to support different receipt sources.
• Custom schemas: Modify `RECEIPT_EXTRACTION_CONFIG` to extract different document types (invoices, W-2s, contracts) by changing the JSON schema.
• Add validation: Compare extracted totals against line item sums to flag discrepancies or incomplete extractions.
• Scheduled runs: Deploy on cron/Lambda to periodically check for new receipts and process them automatically.
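
As a starting point for the validation step, here is a minimal sketch that compares a receipt's total with the sum of its line items. The field names `total`, `line_items`, and `amount` are assumptions and should be matched to the actual extraction schema. A mismatch is only a flag for review, since line items may legitimately exclude tax or tips.

```python
from decimal import Decimal


def check_totals(receipt: dict, tolerance: Decimal = Decimal("0.01")) -> str:
    """Return "ok", "mismatch", or "incomplete" for one extracted receipt."""
    total = receipt.get("total")
    items = receipt.get("line_items") or []
    if total is None or not items:
        return "incomplete"  # nothing to compare against
    if any(item.get("amount") is None for item in items):
        return "incomplete"
    # Go through str() so floats such as 0.1 do not carry binary rounding noise.
    line_sum = sum((Decimal(str(item["amount"])) for item in items), Decimal("0"))
    difference = abs(Decimal(str(total)) - line_sum)
    return "mismatch" if difference > tolerance else "ok"
```

Running this over every row before writing results makes it easy to add a status column to the output for later review.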

## HELPFUL RESOURCES

📚 Stagehand Docs: https://docs.stagehand.dev/v3/first-steps/introduction
📚 Python SDK: https://docs.stagehand.dev/v3/sdk/python
📚 Browserbase Downloads: https://docs.browserbase.com/features/downloads
📚 Extend AI: https://docs.extend.app
🎮 Browserbase: https://www.browserbase.com
💡 Try it out: https://www.browserbase.com/playground
🔧 Templates: https://www.browserbase.com/templates
📧 Need help? support@browserbase.com
💬 Discord: http://stagehand.dev/discord