Merged
8 changes: 5 additions & 3 deletions api-reference/endpoint/smartcrawler/start.mdx
@@ -36,8 +36,9 @@ Content-Type: `application/json`
"same_domain": "boolean"
},
"sitemap": "boolean",
-"stealth": "boolean"
-"webhook_url": str
+"stealth": "boolean",
+"webhook_url": "string",
+"wait_ms": "integer"
}
```

@@ -58,7 +59,8 @@ Content-Type: `application/json`
| rules | object | No | - | Crawl rules for filtering URLs. Object with optional fields: `exclude` (array of regex URL patterns), `include_paths` (array of path patterns to include, supports wildcards `*` and `**`), `exclude_paths` (array of path patterns to exclude, takes precedence over `include_paths`), `same_domain` (boolean, default: true). See Rules section below for details. |
| sitemap | boolean | No | false | Use sitemap.xml for discovery |
| stealth | boolean | No | false | Enable stealth mode to bypass bot protection using advanced anti-detection techniques. Adds +4 credits to the request cost |
-| webhook_url | str | No | None | Webhook URL to send the job result to. When provided, a signed webhook notification will be sent upon job completion. See [Webhook Signature Verification](#webhook-signature-verification) below.
+| webhook_url | string | No | None | Webhook URL to send the job result to. When provided, a signed webhook notification will be sent upon job completion. See [Webhook Signature Verification](#webhook-signature-verification) below. |
+| wait_ms | integer | No | 3000 | Milliseconds to wait before scraping each page. Useful for pages with heavy JavaScript rendering that need extra time to load. |

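As a rough illustration of how the optional fields in the table combine, the sketch below assembles them into a dictionary that could be merged into the request body. The helper name `crawl_options` is hypothetical and not part of any SDK, and the non-negative check on `wait_ms` is an assumption, since the table states only the type and the default of 3000.

```python
from typing import Any, Dict, Optional


def crawl_options(
    sitemap: bool = False,
    stealth: bool = False,
    webhook_url: Optional[str] = None,
    wait_ms: int = 3000,
) -> Dict[str, Any]:
    """Collect optional crawl fields, using the defaults listed in the table."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(wait_ms, bool) or not isinstance(wait_ms, int):
        raise TypeError("wait_ms must be an integer number of milliseconds")
    # Assumption: a negative wait is meaningless; the docs do not state a range.
    if wait_ms < 0:
        raise ValueError("wait_ms must be non-negative")

    options: Dict[str, Any] = {
        "sitemap": sitemap,
        "stealth": stealth,  # per the table, adds +4 credits to the request cost
        "wait_ms": wait_ms,
    }
    # Only include webhook_url when one is provided (its default is None).
    if webhook_url is not None:
        options["webhook_url"] = webhook_url
    return options


# Give a JavaScript-heavy site five seconds per page before scraping.
print(crawl_options(wait_ms=5000))
```

Leaving `webhook_url` out of the dictionary when it is unset keeps the payload minimal; whether the API also accepts an explicit `null` is not stated in this section.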
### Example
```json
2 changes: 2 additions & 0 deletions services/smartcrawler.mdx
@@ -107,6 +107,7 @@ curl -X 'POST' \
| rules | object | No | Crawl rules object with optional fields: `exclude` (array of regex URL patterns), `include_paths` (array of path patterns to include, supports wildcards `*` and `**`), `exclude_paths` (array of path patterns to exclude, takes precedence over `include_paths`), `same_domain` (boolean, default: true). See below for details. |
| sitemap | bool | No | Use sitemap.xml for discovery (default: false). |
| webhook_url | string | No | URL to receive webhook notification on job completion. |
+| wait_ms | int | No | Milliseconds to wait before scraping each page. Useful for pages with heavy JavaScript rendering that need extra time to load (default: 3000). |


<Note>
@@ -463,6 +464,7 @@ POST https://api.scrapegraphai.com/v1/crawl
| max_pages | int | No | Max pages to crawl |
| rules | object | No | Crawl rules object with optional fields: `exclude` (regex URL patterns), `include_paths` (path patterns to include), `exclude_paths` (path patterns to exclude), `same_domain` (boolean) |
| sitemap | bool | No | Use sitemap.xml |
+| wait_ms | int | No | Milliseconds to wait before scraping each page. Useful for pages with heavy JavaScript rendering (default: 3000). |

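A minimal sketch of a request that sets `wait_ms`, using only the Python standard library. The endpoint URL is the one shown in this section; the `SGAI-APIKEY` header name, the `url` body field, and the helper `build_crawl_request` are assumptions made for illustration and should be checked against the rest of the API reference. The request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Any, Dict

CRAWL_ENDPOINT = "https://api.scrapegraphai.com/v1/crawl"


def build_crawl_request(api_key: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the crawl endpoint."""
    return urllib.request.Request(
        CRAWL_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "SGAI-APIKEY": api_key,  # assumed header name
        },
        method="POST",
    )


request = build_crawl_request(
    "your-api-key",
    {
        "url": "https://example.com",  # assumed field, not shown in this table
        "max_pages": 5,
        "sitemap": False,
        "wait_ms": 5000,  # wait 5 s per page instead of the 3000 ms default
    },
)
# To actually send it: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```

Raising `wait_ms` lengthens every page fetch, so it is probably worth increasing only for sites whose content visibly depends on late-running JavaScript.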
#### Response Format
```json