Commit da8c9e4

feat: Add a quickstart tutorial, update existing documentation content, and modify VitePress configuration and logo.
1 parent 3a4befa commit da8c9e4

9 files changed

Lines changed: 688 additions & 121 deletions

.vitepress/config.ts

Lines changed: 2 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -135,9 +135,10 @@ export default defineConfig({
135135

136136
sidebar: [
137137
{
138-
text: 'Introduction',
138+
text: 'Getting Started',
139139
items: [
140140
{ text: 'Overview', link: '/docs/' },
141+
{ text: 'Quick Start Tutorial', link: '/docs/quickstart-tutorial' },
141142
{ text: 'Apify SDK Environment', link: '/docs/apify-sdk-environment' },
142143
],
143144
},

src/docs/api.md

Lines changed: 70 additions & 15 deletions
@@ -17,6 +17,8 @@ curl -H "Authorization: Bearer YOUR_TOKEN" \
1717
https://your-server.com/v2/datasets
1818
```
1919

20+
All resource endpoints are user-scoped — authenticated users can only access their own resources.
21+
2022
---
2123

2224
## Datasets
@@ -42,6 +44,15 @@ curl -X POST \
4244
https://your-server.com/v2/datasets/{id}/items
4345
```
4446

47+
### Retrieve Items
48+
49+
Supports pagination with `offset` and `limit` query parameters (max limit: 1000).
50+
51+
```bash
52+
curl -H "Authorization: Bearer $TOKEN" \
53+
"https://your-server.com/v2/datasets/{id}/items?offset=0&limit=100"
54+
```
55+
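The `offset` and `limit` parameters in the added "Retrieve Items" section can be wrapped in a small client-side helper. This is a minimal sketch, not part of the commit: `itemsPageUrl` and `pageOffsets` are hypothetical names, and clamping `limit` on the client (rather than letting the server reject it) is an assumption.

```typescript
// Documented maximum for the `limit` query parameter.
const MAX_ITEMS_LIMIT = 1000;

// Hypothetical helper: clamps `limit` into 1..1000 and `offset` to >= 0.
function clampLimit(limit: number): number {
  return Math.min(Math.max(1, Math.floor(limit)), MAX_ITEMS_LIMIT);
}

// Builds the URL for one page of dataset items.
function itemsPageUrl(baseUrl: string, datasetId: string, offset: number, limit: number): string {
  const safeOffset = Math.max(0, Math.floor(offset));
  const safeLimit = clampLimit(limit);
  return `${baseUrl}/v2/datasets/${encodeURIComponent(datasetId)}/items?offset=${safeOffset}&limit=${safeLimit}`;
}

// Lists the offsets needed to read `total` items in pages of `limit`.
function pageOffsets(total: number, limit: number): number[] {
  const step = clampLimit(limit);
  const offsets: number[] = [];
  for (let offset = 0; offset < total; offset += step) {
    offsets.push(offset);
  }
  return offsets;
}
```

Each offset from `pageOffsets` can be passed to `itemsPageUrl` to fetch the corresponding page.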
4556
---
4657

4758
## Key-Value Stores
@@ -53,6 +64,7 @@ Store arbitrary data by key.
5364
| `GET` | `/v2/key-value-stores` | List all stores |
5465
| `POST` | `/v2/key-value-stores` | Create a new store |
5566
| `GET` | `/v2/key-value-stores/{id}` | Get store details |
67+
| `DELETE` | `/v2/key-value-stores/{id}` | Delete a store |
5668
| `PUT` | `/v2/key-value-stores/{id}/records/{key}` | Set a record |
5769
| `GET` | `/v2/key-value-stores/{id}/records/{key}` | Get a record |
5870
| `DELETE` | `/v2/key-value-stores/{id}/records/{key}` | Delete a record |
@@ -68,19 +80,33 @@ Store arbitrary data by key.
6880

6981
Manage URLs to crawl with automatic deduplication.
7082

71-
| Method | Endpoint | Description |
72-
| -------- | --------------------------------------------------- | ----------------------- |
73-
| `GET` | `/v2/request-queues` | List all queues |
74-
| `POST` | `/v2/request-queues` | Create a new queue |
75-
| `POST` | `/v2/request-queues/{id}/requests` | Add requests to queue |
76-
| `POST` | `/v2/request-queues/{id}/head/lock` | Lock and fetch requests |
77-
| `DELETE` | `/v2/request-queues/{id}/requests/{requestId}/lock` | Release a lock |
78-
| `PUT` | `/v2/request-queues/{id}/requests/{requestId}` | Update request status |
83+
| Method | Endpoint | Description |
84+
| -------- | --------------------------------------------------- | ------------------------ |
85+
| `GET` | `/v2/request-queues` | List all queues |
86+
| `POST` | `/v2/request-queues` | Create a new queue |
87+
| `GET` | `/v2/request-queues/{id}` | Get queue details |
88+
| `DELETE` | `/v2/request-queues/{id}` | Delete a queue |
89+
| `GET` | `/v2/request-queues/{id}/head` | Get next pending requests |
90+
| `POST` | `/v2/request-queues/{id}/head/lock` | Lock and fetch requests |
91+
| `POST` | `/v2/request-queues/{id}/requests` | Add request to queue |
92+
| `POST` | `/v2/request-queues/{id}/requests/batch` | Batch add requests |
93+
| `GET` | `/v2/request-queues/{id}/requests/{requestId}` | Get request details |
94+
| `PUT` | `/v2/request-queues/{id}/requests/{requestId}` | Update request status |
95+
| `PUT` | `/v2/request-queues/{id}/requests/{requestId}/lock` | Prolong request lock |
96+
| `DELETE` | `/v2/request-queues/{id}/requests/{requestId}/lock` | Release a lock |
7997

8098
### Deduplication
8199

82100
Requests are deduplicated by `uniqueKey`. Adding a request with an existing `uniqueKey` is a no-op.
83101

102+
### Locking
103+
104+
The lock endpoint (`POST .../head/lock`) supports distributed crawling. Parameters:
105+
106+
- `lockSecs` — Lock duration in seconds (max 86400)
107+
- `limit` — Number of requests to fetch (max 1000)
108+
- `clientKey` — Unique identifier for the crawling client
109+
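The lock limits listed above can be checked on the client before a request is sent. This is a rough sketch under stated assumptions: the diff does not say whether `lockSecs`, `limit`, and `clientKey` travel as query parameters or in the body, so the query-string form here is a guess, as is the lower bound of 1. `lockHeadUrl` is a hypothetical name.

```typescript
interface LockParams {
  lockSecs: number;  // lock duration in seconds (documented max 86400)
  limit: number;     // number of requests to fetch (documented max 1000)
  clientKey: string; // unique identifier for the crawling client
}

const MAX_LOCK_SECS = 86400;
const MAX_LOCK_LIMIT = 1000;

// True when `value` is a whole number between 1 and `max` (lower bound is an assumption).
function isWholeInRange(value: number, max: number): boolean {
  return Math.floor(value) === value && value >= 1 && value <= max;
}

// Assumption: parameters are passed in the query string of POST .../head/lock.
function lockHeadUrl(baseUrl: string, queueId: string, params: LockParams): string {
  if (!isWholeInRange(params.lockSecs, MAX_LOCK_SECS)) {
    throw new RangeError(`lockSecs must be between 1 and ${MAX_LOCK_SECS}`);
  }
  if (!isWholeInRange(params.limit, MAX_LOCK_LIMIT)) {
    throw new RangeError(`limit must be between 1 and ${MAX_LOCK_LIMIT}`);
  }
  if (params.clientKey.length === 0) {
    throw new RangeError("clientKey must not be empty");
  }
  const query = `lockSecs=${params.lockSecs}&limit=${params.limit}&clientKey=${encodeURIComponent(params.clientKey)}`;
  return `${baseUrl}/v2/request-queues/${encodeURIComponent(queueId)}/head/lock?${query}`;
}
```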
84110
---
85111

86112
## Actors
@@ -96,18 +122,31 @@ Manage Actor definitions.
96122
| `DELETE` | `/v2/acts/{id}` | Delete an Actor |
97123
| `POST` | `/v2/acts/{id}/runs` | Start a new run |
98124

125+
### Input Validation
126+
127+
Actor create/update bodies are validated with the following constraints:
128+
129+
- `name` — 1-100 chars, alphanumeric with dots, dashes, underscores
130+
- `timeout` — Max 86400 seconds (24h)
131+
- `memory` — Max 16384 MB (16 GB)
132+
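The three constraints above can be mirrored in a client-side pre-check. The docs in this commit indicate the server validates with Zod; the sketch below uses plain TypeScript instead and only approximates those rules. `validateActorInput` is a hypothetical name, and treating `timeout` and `memory` as optional is an assumption.

```typescript
interface ActorInput {
  name: string;
  timeout?: number; // seconds; assumed optional
  memory?: number;  // MB; assumed optional
}

// 1-100 chars: letters, digits, dots, dashes, underscores.
const ACTOR_NAME_PATTERN = /^[A-Za-z0-9._-]{1,100}$/;
const MAX_TIMEOUT_SECS = 86400; // 24h
const MAX_MEMORY_MB = 16384;    // 16 GB

// Returns a list of human-readable problems; an empty list means the input passed.
function validateActorInput(input: ActorInput): string[] {
  const errors: string[] = [];
  if (!ACTOR_NAME_PATTERN.test(input.name)) {
    errors.push("name: must be 1-100 alphanumeric characters, dots, dashes, or underscores");
  }
  if (input.timeout !== undefined && input.timeout > MAX_TIMEOUT_SECS) {
    errors.push(`timeout: must not exceed ${MAX_TIMEOUT_SECS} seconds`);
  }
  if (input.memory !== undefined && input.memory > MAX_MEMORY_MB) {
    errors.push(`memory: must not exceed ${MAX_MEMORY_MB} MB`);
  }
  return errors;
}
```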
99133
---
100134

101135
## Runs
102136

103137
Monitor Actor executions.
104138

105-
| Method | Endpoint | Description |
106-
| ------ | --------------------------- | --------------------- |
107-
| `GET` | `/v2/actor-runs` | List all runs |
108-
| `GET` | `/v2/actor-runs/{id}` | Get run status |
109-
| `POST` | `/v2/actor-runs/{id}/abort` | Abort a running Actor |
110-
| `GET` | `/v2/actor-runs/{id}/log` | Get run logs |
139+
| Method | Endpoint | Description |
140+
| ------ | --------------------------------------------------------- | ------------------------------- |
141+
| `GET` | `/v2/actor-runs` | List all runs |
142+
| `GET` | `/v2/actor-runs/{id}` | Get run status |
143+
| `PUT` | `/v2/actor-runs/{id}` | Update run status |
144+
| `POST` | `/v2/actor-runs/{id}/abort` | Abort a running Actor |
145+
| `POST` | `/v2/actor-runs/{id}/resurrect` | Resurrect a failed run |
146+
| `GET` | `/v2/actor-runs/{id}/logs` | Get run logs |
147+
| `POST` | `/v2/actor-runs/{id}/logs` | Append log entry |
148+
| `GET` | `/v2/actor-runs/{id}/dataset/items` | Get run's dataset items |
149+
| `GET` | `/v2/actor-runs/{id}/key-value-store/records/{key}` | Get run's KV store record |
111150

112151
### Run Status Values
113152

@@ -136,6 +175,22 @@ All successful responses wrap data in a `data` field:
136175
}
137176
```
138177

178+
### Validation Errors
179+
180+
Invalid request bodies return a 400 with Zod validation details:
181+
182+
```json
183+
{
184+
"error": {
185+
"type": "validation_error",
186+
"message": "Validation failed",
187+
"details": [
188+
{ "path": ["name"], "message": "String must contain at least 1 character(s)" }
189+
]
190+
}
191+
}
192+
```
193+
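A client can turn the `details` array from this response into readable messages. This is a minimal sketch that assumes the body has exactly the shape shown in the example above; the interface and function names are hypothetical.

```typescript
interface ValidationIssue {
  path: (string | number)[];
  message: string;
}

interface ValidationErrorBody {
  error: {
    type: string;
    message: string;
    details: ValidationIssue[];
  };
}

// Joins each issue's path with dots and prefixes it to the message.
function formatValidationIssues(body: ValidationErrorBody): string[] {
  return body.error.details.map((issue) => `${issue.path.join(".")}: ${issue.message}`);
}

// Example using the response body shown in the docs.
const sampleBody: ValidationErrorBody = JSON.parse(
  '{"error":{"type":"validation_error","message":"Validation failed",' +
    '"details":[{"path":["name"],"message":"String must contain at least 1 character(s)"}]}}'
);
const sampleMessages = formatValidationIssues(sampleBody);
```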
139194
### Error Responses
140195

141196
```json
@@ -151,6 +206,6 @@ All successful responses wrap data in a `data` field:
151206
| --------- | ------------------------------ |
152207
| `400` | Bad request / validation error |
153208
| `401` | Authentication required |
154-
| `403` | Permission denied |
155209
| `404` | Resource not found |
210+
| `409` | Conflict (e.g., locked request)|
156211
| `500` | Internal server error |

src/docs/apify-sdk-environment.md

Lines changed: 2 additions & 2 deletions
@@ -60,10 +60,10 @@ npm start
6060

6161
```bash
6262
# Login to your server
63-
crawlee-cloud login --server https://your-server.com
63+
crc login --url https://your-server.com
6464

6565
# Push your Actor
66-
crawlee-cloud push my-actor
66+
crc push my-actor
6767
```
6868

6969
---

src/docs/cli.md

Lines changed: 91 additions & 29 deletions
@@ -105,10 +105,29 @@ crawlee-cloud status abc123 --watch --interval 5
105105
Authenticate with your Crawlee Cloud server.
106106

107107
```bash
108-
crawlee-cloud login --url https://your-server.com
108+
crawlee-cloud login [options]
109109
```
110110

111-
You'll be prompted to enter your API token. Credentials are stored in `~/.crawlee-cloud/config.json`.
111+
**Options:**
112+
113+
| Flag | Description |
114+
| -------------- | --------------- |
115+
| `--url, -u` | API base URL |
116+
| `--token, -t` | API token |
117+
118+
Without flags, you'll be prompted interactively. The token is validated against the server before saving.
119+
120+
**Examples:**
121+
122+
```bash
123+
# Interactive login (prompts for URL and token)
124+
crawlee-cloud login
125+
126+
# Non-interactive
127+
crawlee-cloud login --url https://your-server.com --token your-api-token
128+
```
129+
130+
Credentials are stored in `~/.crawlee-cloud/config.json`.
112131

113132
---
114133

@@ -124,48 +143,49 @@ crawlee-cloud push [actor-name]
124143

125144
| Flag | Description |
126145
| --------------- | ------------------------- |
127-
| `--version, -v` | Version tag for the build |
128-
| `--no-build` | Skip local build step |
146+
| `--tag, -t` | Docker image tag for the build |
147+
| `--no-build` | Skip local build step |
129148

130149
**Example:**
131150

132151
```bash
133152
cd my-actor
134-
crawlee-cloud push my-scraper --version 1.0.0
153+
crawlee-cloud push my-scraper --tag 1.0.0
135154
```
136155

137156
---
138157

139158
### `run`
140159

141-
Execute an Actor on the server.
160+
Run an Actor locally with local file storage.
142161

143162
```bash
144-
crawlee-cloud run <actor-name> [options]
163+
crawlee-cloud run [options]
145164
```
146165

147166
**Options:**
148167

149-
| Flag | Description |
150-
| -------------- | ------------------------ |
151-
| `--input, -i` | JSON input for the Actor |
152-
| `--input-file` | Path to JSON input file |
153-
| `--wait, -w` | Wait for completion |
154-
| `--timeout` | Max wait time (seconds) |
168+
| Flag | Description |
169+
| ------------- | ----------------------------------- |
170+
| `--input, -i` | JSON input or path to JSON file |
171+
| `--no-purge` | Do not purge storage before run |
155172

156173
**Examples:**
157174

158175
```bash
159-
# Run with inline input
160-
crawlee-cloud run my-scraper --input '{"url": "https://example.com"}'
176+
# Run in current directory
177+
cd my-actor
178+
crawlee-cloud run
161179

162-
# Run with input file
163-
crawlee-cloud run my-scraper --input-file ./input.json
180+
# Run with input
181+
crawlee-cloud run --input '{"url": "https://example.com"}'
164182

165-
# Run and wait for completion
166-
crawlee-cloud run my-scraper --wait --timeout 300
183+
# Keep previous storage data
184+
crawlee-cloud run --no-purge
167185
```
168186

187+
Local storage is created in `./storage/` with datasets, key-value stores, and request queues.
188+
169189
---
170190

171191
### `logs`
@@ -193,28 +213,70 @@ crawlee-cloud logs abc123 --follow
193213

194214
### `call`
195215

196-
Make direct API requests.
216+
Call a remote Actor on the platform and optionally wait for results.
197217

198218
```bash
199-
crawlee-cloud call <method> <path> [options]
219+
crawlee-cloud call <actor> [options]
200220
```
201221

202222
**Options:**
203223

204-
| Flag | Description |
205-
| ------------ | ------------------- |
206-
| `--data, -d` | Request body (JSON) |
224+
| Flag | Description |
225+
| ----------------- | ---------------------------------------------- |
226+
| `--input, -i` | Input JSON or path to JSON file |
227+
| `--env, -e` | Environment variable KEY=VALUE (repeatable) |
228+
| `--wait, -w` | Wait for run to finish |
229+
| `--timeout, -t` | Timeout in seconds (default: 3600) |
230+
| `--memory, -m` | Memory in MB (default: 1024) |
207231

208232
**Examples:**
209233

210234
```bash
211-
# List datasets
212-
crawlee-cloud call GET /v2/datasets
235+
# Call an Actor
236+
crawlee-cloud call my-scraper --input '{"url": "https://example.com"}'
237+
238+
# Call and wait for results
239+
crawlee-cloud call my-scraper --wait --input '{"url": "https://example.com"}'
240+
241+
# Call with environment variables (use -e multiple times)
242+
crc call my-actor -e KEY1=val1 -e KEY2=val2
243+
```
244+
245+
> **Tip:** The `-e` flag can be repeated to pass multiple environment variables in a single call.
246+
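Repeated `KEY=VALUE` arguments like these are typically folded into a single map. The sketch below illustrates one way to do that; it is not the CLI's actual implementation, `parseEnvPairs` is a hypothetical name, and splitting on the first `=` (so values may themselves contain `=`) is a design assumption.

```typescript
// Folds ["KEY1=val1", "KEY2=val2"] into { KEY1: "val1", KEY2: "val2" }.
function parseEnvPairs(pairs: string[]): Record<string, string> {
  const env: Record<string, string> = {};
  for (const pair of pairs) {
    const separator = pair.indexOf("=");
    // Reject entries with no "=" or an empty key.
    if (separator <= 0) {
      throw new Error(`Expected KEY=VALUE, got "${pair}"`);
    }
    // Later occurrences of the same key overwrite earlier ones.
    env[pair.slice(0, separator)] = pair.slice(separator + 1);
  }
  return env;
}
```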
247+
---
248+
249+
## Getting Your API Token
250+
251+
You need an API token to authenticate with Crawlee Cloud. There are two ways to get one:
252+
253+
### Via the Dashboard
254+
255+
1. Log in to the dashboard at `http://localhost:3001`
256+
2. Go to **Settings → API Keys**
257+
3. Create a new API key
258+
259+
### Via the API
260+
261+
First, obtain a JWT token by logging in:
262+
263+
```bash
264+
curl -X POST http://localhost:3000/v2/auth/login \
265+
-H "Content-Type: application/json" \
266+
-d '{"email":"admin@crawlee.cloud","password":"your-password"}'
267+
```
213268

214-
# Create a dataset
215-
crawlee-cloud call POST /v2/datasets --data '{"name": "my-data"}'
269+
Then create an API key using the JWT token:
270+
271+
```bash
272+
curl -X POST http://localhost:3000/v2/auth/api-keys \
273+
-H "Authorization: Bearer <token>" \
274+
-H "Content-Type: application/json" \
275+
-d '{"name":"my-key"}'
216276
```
217277

278+
Use the resulting API key as your token when running `crawlee-cloud login`.
279+
218280
---
219281

220282
## Configuration
@@ -223,7 +285,7 @@ Configuration is stored in `~/.crawlee-cloud/config.json`:
223285

224286
```json
225287
{
226-
"server": "https://your-server.com",
288+
"apiBaseUrl": "https://your-server.com",
227289
"token": "your-api-token"
228290
}
229291
```
