Bump firecrawl plugin version to 1.0.5 and expand the firecrawl-cli documentation. Clarifies that `search --scrape` already retrieves full page content (so avoid redundant scraping), adds a research-task example, warns against re-scraping URLs or using `--html` to re-extract metadata, and emphasizes reading existing scraped files before fetching more data. Changes in .claude-plugin/plugin.json and skills/firecrawl-cli/SKILL.md.
skills/firecrawl-cli/SKILL.md (20 additions, 0 deletions)
@@ -36,6 +36,8 @@ Follow this escalation pattern when fetching web data:

4. **Crawl** — You need bulk content from an entire site section (e.g., all docs pages).
5. **Browser** — Scrape didn't return the needed data because it's behind interaction (pagination, modals, form submissions, multi-step navigation). Open a browser session to click through and extract it.

**Note:** `search --scrape` already fetches full page content for every result. Don't scrape those URLs again individually — only scrape URLs that weren't part of the search results.
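
A minimal sketch of that rule, assuming a hypothetical query and doc URL (only `firecrawl search`, the `--scrape` flag, and `firecrawl scrape` are named in the skill text; everything else here is illustrative):

```sh
# One call fetches the search results AND the full page content for each hit.
firecrawl search "firecrawl crawl depth options" --scrape

# Redundant: the command above already scraped this result URL.
# firecrawl scrape https://docs.firecrawl.dev/features/crawl
```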
**Example: fetching API docs from a large documentation site**
Never use browser on sites with bot detection — it will be blocked. This includes Google, Bing, DuckDuckGo, and sites behind Cloudflare challenges or CAPTCHAs. Use `firecrawl search` for web searches instead.
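
As a hedged illustration of that fallback (the query string is invented; `firecrawl search` itself comes from the skill text):

```sh
# Don't drive a browser session at google.com or another bot-protected
# search engine; route the web search through the API instead.
firecrawl search "stripe api idempotency keys"
```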
@@ -458,6 +476,8 @@ firecrawl browser close --session <id>

## Reading Scraped Files

Always read and process the files you already have before fetching more data. Don't re-scrape a URL you already have content for.
NEVER read entire firecrawl output files at once unless explicitly asked or required - they're often 1000+ lines. Instead, use grep, head, or incremental reads. Determine values dynamically based on file size and what you're looking for.
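
A sketch of that incremental pattern, assuming a hypothetical output path (wc, grep, and head are standard Unix tools; nothing here is firecrawl-specific):

```sh
FILE=firecrawl-output/search-results.md  # hypothetical path; substitute your actual output file

wc -l "$FILE"                  # gauge size before deciding how to read
grep -n "rate limit" "$FILE"   # jump straight to the relevant sections
head -n 40 "$FILE"             # skim only the top when the structure is unknown
```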