
just-scrape

Made with love by the ScrapeGraphAI team 💜

Demo Video

Command-line interface for ScrapeGraph AI: AI-powered web scraping, data extraction, search, and crawling.

Project Structure

just-scrape/
├── src/
│   ├── cli.ts                       # Entry point, citty main command + subcommands
│   ├── lib/
│   │   ├── env.ts                   # Zod-parsed env config (API key, debug, timeout)
│   │   ├── folders.ts               # API key resolution + interactive prompt
│   │   ├── scrapegraphai.ts         # SDK layer: all API functions
│   │   ├── schemas.ts               # Zod validation schemas
│   │   └── log.ts                   # Logger factory + syntax-highlighted JSON output
│   ├── types/
│   │   └── index.ts                 # Zod-derived types + ApiResult
│   ├── commands/
│   │   ├── smart-scraper.ts
│   │   ├── search-scraper.ts
│   │   ├── markdownify.ts
│   │   ├── crawl.ts
│   │   ├── sitemap.ts
│   │   ├── scrape.ts
│   │   ├── agentic-scraper.ts
│   │   ├── generate-schema.ts
│   │   ├── history.ts
│   │   ├── credits.ts
│   │   └── validate.ts
│   └── utils/
│       └── banner.ts                # ASCII banner + version from package.json
├── tests/
│   └── scrapegraphai.test.ts        # SDK layer tests (mocked fetch)
├── dist/                            # Build output (git-ignored)
│   └── cli.mjs                      # Bundled ESM with shebang
├── package.json
├── tsconfig.json
├── tsup.config.ts
├── biome.json
└── .gitignore

Installation

npm install -g just-scrape           # npm (recommended)
pnpm add -g just-scrape              # pnpm
yarn global add just-scrape          # yarn
bun add -g just-scrape               # bun
npx just-scrape --help               # or run without installing

Package: just-scrape on npm.

Coding Agent Skill

Use just-scrape as a skill for AI coding agents via Vercel's skills.sh:

bunx skills add https://github.com/ScrapeGraphAI/just-scrape

Browse the skill: skills.sh/scrapegraphai/just-scrape/just-scrape

Configuration

The CLI needs a ScrapeGraph API key. Get one at dashboard.scrapegraphai.com.

Four ways to provide it (checked in order):

  1. Environment variable: export SGAI_API_KEY="sgai-..."
  2. .env file: SGAI_API_KEY=sgai-... in project root
  3. Config file: ~/.scrapegraphai/config.json (see the sketch below)
  4. Interactive prompt: the CLI asks and saves to config
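
For the config file (option 3), the key name in the sketch below is an assumption, not the documented format; the simplest path is to let the interactive prompt (option 4) create the file for you:

# Hypothetical sketch of option 3; the exact JSON key may differ,
# so prefer letting the CLI write this file on first run
mkdir -p ~/.scrapegraphai
echo '{"apiKey":"sgai-..."}' > ~/.scrapegraphai/config.json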

Environment Variables

Variable                 Description                                    Default
SGAI_API_KEY             ScrapeGraph API key                            (none)
JUST_SCRAPE_API_URL      Override API base URL                          https://api.scrapegraphai.com/v1
JUST_SCRAPE_TIMEOUT_S    Request/polling timeout in seconds             120
JUST_SCRAPE_DEBUG        Set to 1 to enable debug logging to stderr     0
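
For example, to watch a long-running crawl you can raise the polling window and turn on debug logging (variables from the table above):

# Verbose stderr logs plus a 10-minute timeout for a big crawl
export JUST_SCRAPE_DEBUG=1
export JUST_SCRAPE_TIMEOUT_S=600
just-scrape crawl https://docs.example.com -p "Extract page titles" --max-pages 50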

JSON Mode (--json)

All commands support --json for machine-readable output. When set, the banner, spinners, and interactive prompts are suppressed and only minified JSON is written to stdout (which saves tokens when piped to AI agents).

just-scrape credits --json | jq '.remaining_credits'
just-scrape smart-scraper https://example.com -p "Extract data" --json > result.json
just-scrape history smartscraper --json | jq '.requests[].status'
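
Because the output is plain JSON on stdout, commands compose in ordinary shell pipelines. A sketch that gates a run on the remaining balance (.remaining_credits comes from the credits example above, .result from the Markdownify section):

# Only scrape when the account still has credits
if [ "$(just-scrape credits --json | jq '.remaining_credits')" -gt 0 ]; then
  just-scrape markdownify https://example.com --json | jq -r '.result' > page.md
fi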

Smart Scraper

Extract structured data from any URL using AI. docs

Usage

just-scrape smart-scraper <url> -p <prompt>                  # Extract data with AI
just-scrape smart-scraper <url> -p <prompt> --schema <json>  # Enforce output schema
just-scrape smart-scraper <url> -p <prompt> --scrolls <n>    # Infinite scroll (0-100)
just-scrape smart-scraper <url> -p <prompt> --pages <n>      # Multi-page (1-100)
just-scrape smart-scraper <url> -p <prompt> --stealth        # Anti-bot bypass (+4 credits)
just-scrape smart-scraper <url> -p <prompt> --cookies <json> --headers <json>
just-scrape smart-scraper <url> -p <prompt> --plain-text     # Plain text instead of JSON

Examples

# Extract product listings from an e-commerce page
just-scrape smart-scraper https://store.example.com/shoes -p "Extract all product names, prices, and ratings"

# Extract with a strict schema, scrolling to load more content
just-scrape smart-scraper https://news.example.com -p "Get all article headlines and dates" \
  --schema '{"type":"object","properties":{"articles":{"type":"array","items":{"type":"object","properties":{"title":{"type":"string"},"date":{"type":"string"}}}}}}' \
  --scrolls 5

# Scrape a JS-heavy SPA behind anti-bot protection
just-scrape smart-scraper https://app.example.com/dashboard -p "Extract user stats" \
  --stealth
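
Long schemas are awkward to quote inline; keeping the JSON in a file and passing it via command substitution works the same way:

# schema.json holds the same JSON you would otherwise paste after --schema
just-scrape smart-scraper https://news.example.com -p "Get all article headlines and dates" \
  --schema "$(cat schema.json)"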

Search Scraper

Search the web and extract structured data from results. docs

Usage

just-scrape search-scraper <prompt>                        # AI-powered web search
just-scrape search-scraper <prompt> --num-results <n>      # Sources to scrape (3-20, default 3)
just-scrape search-scraper <prompt> --no-extraction        # Markdown only (2 credits vs 10)
just-scrape search-scraper <prompt> --schema <json>        # Enforce output schema
just-scrape search-scraper <prompt> --stealth --headers <json>

Examples

# Research a topic across multiple sources
just-scrape search-scraper "What are the best Python web frameworks in 2025?" --num-results 10

# Get raw markdown from search results (cheaper)
just-scrape search-scraper "React vs Vue comparison" --no-extraction --num-results 5

# Structured output with schema
just-scrape search-scraper "Top 5 cloud providers pricing" \
  --schema '{"type":"object","properties":{"providers":{"type":"array","items":{"type":"object","properties":{"name":{"type":"string"},"free_tier":{"type":"string"}}}}}}'

Markdownify

Convert any webpage to clean markdown. docs

Usage

just-scrape markdownify <url>                              # Convert to markdown
just-scrape markdownify <url> --stealth                    # Anti-bot bypass (+4 credits)
just-scrape markdownify <url> --headers <json>             # Custom headers

Examples

# Convert a blog post to markdown
just-scrape markdownify https://blog.example.com/my-article

# Convert a JS-rendered page behind Cloudflare
just-scrape markdownify https://protected.example.com --stealth

# Pipe markdown to a file
just-scrape markdownify https://docs.example.com/api --json | jq -r '.result' > api-docs.md

Crawl

Crawl multiple pages and extract data from each. docs

Usage

just-scrape crawl <url> -p <prompt>                        # Crawl + extract
just-scrape crawl <url> -p <prompt> --max-pages <n>        # Max pages (default 10)
just-scrape crawl <url> -p <prompt> --depth <n>            # Crawl depth (default 1)
just-scrape crawl <url> --no-extraction --max-pages <n>    # Markdown only (2 credits/page)
just-scrape crawl <url> -p <prompt> --schema <json>        # Enforce output schema
just-scrape crawl <url> -p <prompt> --rules <json>         # Crawl rules (include_paths, same_domain)
just-scrape crawl <url> -p <prompt> --no-sitemap           # Skip sitemap discovery
just-scrape crawl <url> -p <prompt> --stealth              # Anti-bot bypass

Examples

# Crawl a docs site and extract all code examples
just-scrape crawl https://docs.example.com -p "Extract all code snippets with their language" \
  --max-pages 20 --depth 3

# Crawl only blog pages, skip everything else
just-scrape crawl https://example.com -p "Extract article titles and summaries" \
  --rules '{"include_paths":["/blog/*"],"same_domain":true}' --max-pages 50

# Get raw markdown from all pages (no AI extraction, cheaper)
just-scrape crawl https://example.com --no-extraction --max-pages 10

Sitemap

Get all URLs from a website's sitemap. docs

Usage

just-scrape sitemap <url>

Examples

# List all pages on a site
just-scrape sitemap https://example.com

# Pipe URLs to another tool
just-scrape sitemap https://example.com --json | jq -r '.urls[]'
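
Sitemap composes naturally with markdownify to snapshot a whole site. A sketch: the .result field is the one shown under Markdownify, and the basename-derived filenames are only illustrative:

# Convert every page listed in the sitemap to a markdown file
just-scrape sitemap https://example.com --json | jq -r '.urls[]' | while read -r url; do
  just-scrape markdownify "$url" --json | jq -r '.result' > "$(basename "$url").md"
done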

Scrape

Get raw HTML content from a URL. docs

Usage

just-scrape scrape <url>                                   # Raw HTML
just-scrape scrape <url> --stealth                         # Anti-bot bypass (+4 credits)
just-scrape scrape <url> --branding                        # Extract branding (+2 credits)
just-scrape scrape <url> --country-code <iso>              # Geo-targeting

Examples

# Get raw HTML of a page
just-scrape scrape https://example.com

# Scrape a geo-restricted page with anti-bot bypass
just-scrape scrape https://store.example.com --stealth --country-code DE

# Extract branding info (logos, colors, fonts)
just-scrape scrape https://example.com --branding

Agentic Scraper

Browser automation with AI: log in, click, navigate, and fill forms. docs

Usage

just-scrape agentic-scraper <url> -s <steps>               # Run browser steps
just-scrape agentic-scraper <url> -s <steps> --ai-extraction -p <prompt>
just-scrape agentic-scraper <url> -s <steps> --schema <json>
just-scrape agentic-scraper <url> -s <steps> --use-session # Persist browser session

Examples

# Log in and extract dashboard data
just-scrape agentic-scraper https://app.example.com/login \
  -s "Fill email with user@test.com,Fill password with secret,Click Sign In" \
  --ai-extraction -p "Extract all dashboard metrics"

# Navigate through a multi-step form
just-scrape agentic-scraper https://example.com/wizard \
  -s "Click Next,Select Premium plan,Fill name with John,Click Submit"

# Persistent session across multiple runs
just-scrape agentic-scraper https://app.example.com \
  -s "Click Settings" --use-session

Generate Schema

Generate a JSON schema from a natural language description.

Usage

just-scrape generate-schema <prompt>                       # AI generates a schema
just-scrape generate-schema <prompt> --existing-schema <json>

Examples

# Generate a schema for product data
just-scrape generate-schema "E-commerce product with name, price, ratings, and reviews array"

# Refine an existing schema
just-scrape generate-schema "Add an availability field" \
  --existing-schema '{"type":"object","properties":{"name":{"type":"string"},"price":{"type":"number"}}}'
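
A generated schema can in principle feed straight into smart-scraper. The .generated_schema key below is hypothetical; inspect the real --json output before relying on it:

# Hypothetical response field; check the actual generate-schema --json output first
schema="$(just-scrape generate-schema "Product with name and price" --json | jq -c '.generated_schema')"
just-scrape smart-scraper https://store.example.com/shoes -p "Extract products" --schema "$schema"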

History

Browse request history for any service. Interactive by default: use the arrow keys to navigate, select a request to view details, and pick "Load more" for infinite scroll.

Usage

just-scrape history <service>                              # Interactive browser
just-scrape history <service> <request-id>                 # Fetch specific request
just-scrape history <service> --page <n>                   # Start from page (default 1)
just-scrape history <service> --page-size <n>              # Results per page (default 10, max 100)
just-scrape history <service> --json                       # Raw JSON (pipeable)

Services: markdownify, smartscraper, searchscraper, scrape, crawl, agentic-scraper, sitemap

Examples

# Browse your smart-scraper history interactively
just-scrape history smartscraper

# Jump to a specific request by ID
just-scrape history smartscraper abc123-def456-7890

# Export crawl history as JSON
just-scrape history crawl --json --page-size 100 | jq '.requests[] | {id: .request_id, status}'
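
With --json, history also lends itself to quick aggregation. A sketch that tallies requests per status (field names from the example above; the possible status values are not documented here):

# Count requests per status, e.g. {"completed": 42, "failed": 3} (values illustrative)
just-scrape history smartscraper --json --page-size 100 \
  | jq '[.requests[].status] | group_by(.) | map({(.[0]): length}) | add'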

Credits

Check your credit balance.

just-scrape credits
just-scrape credits --json | jq '.remaining_credits'

Validate

Validate your API key (health check).

just-scrape validate

Contributing

From Source

Requires Bun and Node.js 22+.

git clone https://github.com/ScrapeGraphAI/just-scrape.git
cd just-scrape
bun install
bun run dev --help
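
To exercise the production bundle instead of the TS source, build and run the emitted file under plain Node (the dist path comes from the project structure above):

bun run build                        # Bundle ESM to dist/cli.mjs
node dist/cli.mjs --help             # Run the bundle directly with Node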

Tech Stack

Concern          Tool
Language         TypeScript 5.8
Dev Runtime      Bun
Build            tsup (esbuild)
CLI Framework    citty (unjs)
Prompts          @clack/prompts
Styling          chalk v5 (ESM)
Validation       zod v4
Env              dotenv
Lint / Format    Biome
Testing          Bun test (built-in)
Target           Node.js 22+, ESM-only

Scripts

bun run dev                          # Run CLI from TS source
bun run build                        # Bundle ESM to dist/cli.mjs
bun run lint                         # Lint + format check
bun run format                       # Auto-format
bun test                             # Run tests
bun run check                        # Type-check + lint

Testing

Tests mock all API calls via spyOn(globalThis, "fetch"); no network access and no API key are needed.

Covers: success paths, polling, HTTP error mapping (401/402/422/429/500), Zod validation, timeouts, and network failures.

License

ISC


Made with love by the ScrapeGraphAI team 💜