Add browse get markdown command#1907

Open
derekmeegan wants to merge 1 commit into main from derek/add_get_markdown

Conversation

@derekmeegan
Contributor

Summary

  • Adds browse get markdown [selector] to convert page HTML to clean markdown
  • Defaults to body content when no selector given, accepts optional CSS/XPath/ref selector
  • Uses node-html-markdown for quality conversion (links, tables, code blocks preserved)
  • Useful for agents that need readable page content without HTML noise

Usage

browse get markdown                # full page body as markdown
browse get markdown .article       # specific element
browse get markdown @0-5           # ref from snapshot
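
The three usage forms above imply that the command has to tell apart a missing selector, a CSS or XPath selector, and a snapshot ref such as @0-5. The PR does not show how that is done, so the TypeScript below is only a sketch under stated assumptions: `classifySelector`, `SelectorKind`, and the matching rules (a ref is "@" followed by digits, a dash, and digits; an XPath starts with "/" or "xpath=") are hypothetical and not taken from the browse codebase.

```typescript
// Hypothetical helper: NOT the PR's resolveSelector, just an illustration
// of how the three documented selector forms could be distinguished.
type SelectorKind = "default" | "ref" | "xpath" | "css";

interface ClassifiedSelector {
  kind: SelectorKind;
  target: string;
}

function classifySelector(selector?: string): ClassifiedSelector {
  const trimmed = selector?.trim() ?? "";

  // No selector given: fall back to the page body, as the PR describes.
  if (trimmed === "") {
    return { kind: "default", target: "body" };
  }

  // Assumed ref syntax, modeled on the "@0-5" example in the usage section.
  if (/^@\d+-\d+$/.test(trimmed)) {
    return { kind: "ref", target: trimmed };
  }

  // Assumed XPath heuristic: leading slash or an explicit "xpath=" prefix.
  if (trimmed.startsWith("/") || trimmed.startsWith("xpath=")) {
    return { kind: "xpath", target: trimmed };
  }

  // Everything else is treated as a CSS selector.
  return { kind: "css", target: trimmed };
}
```

Under these assumptions, `classifySelector()` yields the `body` default, `classifySelector(".article")` is treated as CSS, and `classifySelector("@0-5")` is treated as a snapshot ref.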

Test results

| Test | Local | Remote (Browserbase) |
| --- | --- | --- |
| get markdown (body default) | HN full page markdown | HN full page markdown |
| get markdown .titleline (selector) | Clean link with title | Clean link with title |

Test plan

  • Test locally with no selector (full body)
  • Test locally with CSS selector
  • Test on remote Browserbase session (no selector)
  • Test on remote Browserbase session (with selector)

🤖 Generated with Claude Code

Adds markdown output to the `get` subcommand. Converts page HTML to
markdown using node-html-markdown, preserving links, tables, and code
blocks. Defaults to body content, accepts optional selector for
specific elements.

Usage:
  browse get markdown              # full page body as markdown
  browse get markdown .article     # specific element

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@changeset-bot

changeset-bot bot commented Mar 29, 2026

🦋 Changeset detected

Latest commit: f3d5b5d

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 0 packages

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 4 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.
Architecture diagram
sequenceDiagram
    participant User
    participant CLI as CLI (Commander)
    participant Exec as executeCommand()
    participant SH as Stagehand/Page
    participant Browser as Browser (Local/Remote)
    participant MD as node-html-markdown

    Note over User,MD: NEW: Markdown Extraction Flow

    User->>CLI: browse get markdown [selector]
    CLI->>Exec: action(what="markdown", selector)
    
    alt Selector provided
        Exec->>Exec: resolveSelector(selector)
    else No selector
        Exec->>Exec: CHANGED: Default to "body"
    end

    Exec->>SH: deepLocator(target).innerHtml()
    
    Note over SH,Browser: Context: local playwright or remote Browserbase
    SH->>Browser: CDP/Playwright: Get innerHTML
    Browser-->>SH: HTML string
    SH-->>Exec: HTML string

    Exec->>MD: NEW: NodeHtmlMarkdown.translate(html)
    MD-->>Exec: Markdown string
    
    Exec-->>CLI: { markdown: value }
    CLI-->>User: Print Markdown output
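
The flow in the diagram (pick a target, read its inner HTML, convert to markdown, return `{ markdown }`) can be sketched in TypeScript as follows. This is a minimal illustration and not the PR's code: `getMarkdown` and its parameter names are hypothetical, and the browser read and the HTML-to-markdown conversion are passed in as functions so the sketch runs without Stagehand or node-html-markdown installed. In the PR, per the diagram, those two roles are filled by `deepLocator(target).innerHtml()` and `NodeHtmlMarkdown.translate(html)`.

```typescript
// Reads the inner HTML of a target (the diagram's deepLocator(...).innerHtml()).
type ReadInnerHtml = (target: string) => Promise<string>;
// Converts HTML to markdown (the diagram's NodeHtmlMarkdown.translate).
type HtmlToMarkdown = (html: string) => string;
// Maps a user-supplied selector to a locator target (the diagram's resolveSelector).
type ResolveSelector = (selector: string) => string;

async function getMarkdown(
  selector: string | undefined,
  readInnerHtml: ReadInnerHtml,
  toMarkdown: HtmlToMarkdown,
  resolveSelector: ResolveSelector = (s) => s, // identity is an assumption
): Promise<{ markdown: string }> {
  // Selector provided: resolve it. No selector: default to "body".
  const target = selector ? resolveSelector(selector) : "body";

  // Fetch the HTML string from the local or remote browser.
  const html = await readInnerHtml(target);

  // Convert and wrap in the result shape shown in the diagram.
  return { markdown: toMarkdown(html) };
}
```

Injecting the reader and converter keeps the defaulting logic testable with stubs, which is one way to exercise both the no-selector and with-selector paths from the test plan without a live browser session.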

@shrey150
Contributor

I think this might do better as a skill (/markdown) given that it's not a primitive wrapping the browser. I'd also prefer we do that while we figure out the ergonomics. @derekmeegan what do you think?

@derekmeegan
Contributor Author

i honestly feel markdown extraction is a scraping primitive in the ai agent era but let's chat more synchronously tomorrow
