4 changes: 2 additions & 2 deletions .claude-plugin/marketplace.json
@@ -7,14 +7,14 @@
},
"metadata": {
"description": "Official Parallel Web Systems plugin for Claude Code - web search, content extraction, deep research, and data enrichment capabilities.",
"version": "0.2.0",
"version": "0.3.0",
"pluginRoot": "./"
},
"plugins": [
{
"name": "parallel",
"description": "Parallel CLI integration for web search, URL content extraction, deep research tasks, and bulk data enrichment powered by Parallel's AI-native web infrastructure.",
"version": "0.2.0",
"version": "0.3.0",
"source": "./",
"author": {
"name": "Parallel Web Systems",
2 changes: 1 addition & 1 deletion .claude-plugin/plugin.json
@@ -1,6 +1,6 @@
{
"name": "parallel",
"version": "0.2.0",
"version": "0.3.0",
"description": "Parallel Web Search MCP and Task API integration for Claude Code. Provides web search, content extraction, deep research tasks, and data enrichment capabilities.",
"author": {
"name": "Parallel Web Systems",
2 changes: 2 additions & 0 deletions README.md
@@ -82,6 +82,7 @@ Skills follow the [Agent Skills](https://agentskills.io/specification) specification
| **parallel-web-extract** | Extract content from URLs, articles, PDFs |
| **parallel-deep-research** | Comprehensive research and analysis |
| **parallel-data-enrichment** | Enrich lists of companies, people, products |
| **parallel-develop** | Bootstrap a Parallel API integration in your codebase (Python / TypeScript / cURL / MCP) |
| **setup** | Install CLI and authenticate |
| **status** | Check running research task status |
| **result** | Get completed research task result |
@@ -93,6 +94,7 @@ Skills follow the [Agent Skills](https://agentskills.io/specification) specification
/parallel:parallel-web-extract https://docs.parallel.ai
/parallel:parallel-deep-research competitive landscape of AI code assistants
/parallel:parallel-data-enrichment Apple, Microsoft, Google - get CEO names
/parallel:parallel-develop add a research agent to my Next.js app in typescript
/parallel:setup
```

84 changes: 84 additions & 0 deletions skills/parallel-develop/SKILL.md
@@ -0,0 +1,84 @@
---
name: parallel-develop
description: "Bootstrap a Parallel API integration in the user's codebase. Use when the user says 'integrate Parallel', 'add Parallel API', 'build with Parallel', 'use Parallel for web search / research / enrichment / monitoring / extraction / lead discovery', or asks for starter code that talks to api.parallel.ai. Produces install + env + working example code tailored to the user's language (Python / TypeScript / cURL / MCP) and the API that fits their use case."
user-invocable: true
argument-hint: <what you want to build — e.g. \"research agent in typescript\">
compatibility: Works in any agent/IDE. The generated code targets Python (`parallel-web`), TypeScript (`parallel-web`), raw cURL, or MCP clients (Cursor, Claude Desktop, VS Code, Claude Code).
allowed-tools: Read Write Edit Bash(ls:*) Bash(cat:*) Bash(mkdir:*) Bash(pip:*) Bash(npm:*) Bash(pnpm:*) Bash(uv:*) WebFetch
metadata:
author: parallel
---

# Parallel API Bootstrap

Build a Parallel integration for: $ARGUMENTS

## Step 1 — identify the client + use case

Parallel has five main APIs. Pick the one that fits the user's goal:

| Use case | API | When |
|----------|-----|------|
| Single web lookup, RAG retrieval | **Search** | Fast, LLM-optimized excerpts for a given objective + keywords |
| Deep research / multi-hop question | **Task** (processor `pro`/`ultra`) | Complex queries that need several retrieval hops and structured output |
| Enrich a known list (people, companies, products) | **Task** (processor `core`) | You have the entities; fill in columns |
| Discover NEW entities matching criteria | **FindAll** (beta) | Building a list from scratch |
| Track web changes on a schedule | **Monitor** (alpha) | News tracking, change alerts with webhooks |
| Convert a specific URL to LLM-friendly markdown | **Extract** | Single-URL ingestion, PDFs, JS-heavy pages |

Likewise, pick the client the user is working in:

- **Python** — official `parallel-web` SDK (top-level `client.search`, `client.extract`, `client.task_run`, `client.beta.findall`)
- **TypeScript** — official `parallel-web` SDK (top-level `client.search`, `client.extract`, `client.taskRun`; FindAll / Monitor via generic `client.post<T>`)
- **cURL** — any language; just HTTP + `x-api-key` header
- **MCP** — if the user wants an MCP client (Cursor / Claude Desktop / VS Code / Claude Code) to call Parallel's hosted Search + Task MCPs

If the user's message doesn't make the client + use case obvious, **ask them once** (AskUserQuestion or a plain clarifying question). Don't guess.

## Step 2 — emit the integration

Read the matching recipe file and follow its instructions verbatim. Each recipe covers install, env setup, the minimal working code, and the top best practices.

- Python → [references/python.md](references/python.md)
- TypeScript → [references/typescript.md](references/typescript.md)
- cURL → [references/curl.md](references/curl.md)
- MCP → [references/mcp.md](references/mcp.md)

After reading the recipe, write the example into the user's repo (a sensible path like `scripts/parallel_<use_case>.py` or `examples/parallel-<use-case>.ts`), update the nearest `requirements.txt` / `package.json` if needed, and walk the user through getting an API key at [platform.parallel.ai](https://platform.parallel.ai).

## Step 3 — point at the canonical docs for follow-ups

Parallel publishes an agent-friendly docs index at **https://docs.parallel.ai/llms.txt** — a single markdown file with the full API surface. If the user asks for anything beyond the recipe (webhooks, source policies, task groups, advanced schemas), fetch that URL first and pull in the relevant section rather than guessing.

For deep-dive API questions, the `/llms.txt` index is authoritative; do not invent parameter names.

### Canonical best-practice pages (link, don't paraphrase)

When the user asks "how do I do X correctly?", send them to the dedicated page instead of making things up:

- **Search** — [Best Practices](https://docs.parallel.ai/search/best-practices) · [Advanced Settings](https://docs.parallel.ai/search/advanced-search-settings) · [Modes](https://docs.parallel.ai/search/modes)
- **Extract** — [Best Practices](https://docs.parallel.ai/extract/best-practices) · [Advanced Settings](https://docs.parallel.ai/extract/advanced-extract-settings)
- **Task** — [Specify a Task](https://docs.parallel.ai/task-api/guides/specify-a-task) · [Choose a Processor](https://docs.parallel.ai/task-api/guides/choose-a-processor) · [Task Run Lifecycle](https://docs.parallel.ai/task-api/guides/execute-task-run) · [Webhooks](https://docs.parallel.ai/task-api/webhooks)
- **FindAll** — [Generators & Pricing](https://docs.parallel.ai/findall-api/core-concepts/findall-generator-pricing)
- **MCP** — [Quickstart](https://docs.parallel.ai/integrations/mcp/quickstart) · [Programmatic Use](https://docs.parallel.ai/integrations/mcp/programmatic-use)

## Guardrails

- **Snake_case everywhere.** Both the Python and TypeScript SDKs use **snake_case** body keys (`task_spec`, `output_schema`, `json_schema`, `run_id`, `event_types`). Do NOT camelCase these in TypeScript — the server will silently reject them, or the SDK will raise a type error.
- **Never** invent an endpoint version. Current versions: Search/Extract/Task at `/v1`, FindAll at `/v1beta`, Monitor at `/v1alpha`.
- **For runs > 30 s** (Task `pro`/`ultra`, FindAll `core`/`pro`, any Monitor event), prefer a **webhook** over polling. Pass `webhook={"url": "...", "event_types": [...]}` at creation.
- **Task output schemas are strict.** Every property must appear in `required`; for optional fields use a union like `{"type": ["string", "null"]}` instead of omitting. Always set `additionalProperties: false`. The root must be `{"type": "object"}` — arrays cannot be the root. Prefer flat schemas.
- **Always** give each Task/Enrichment field a *specific* description with exact format (e.g. `"MM-YYYY"`, `"USD"`, `"ISO 3166-1 alpha-2"`) and an explicit missing-value behavior (`"Return 'Not Available' if no source confirms"`).
- **Start FindAll with `generator="preview"`** to iterate on the objective and match_conditions before scaling up — it's cheap and fast. **If you get 0 matches, upgrade the generator before rewriting the query** — the issue is usually pool size, not query quality.
- **session_id** groups related Search + Extract calls into one logical task. Generate a fresh UUID per task and reuse it across calls. **client_model** tells the server which LLM will consume the excerpts so it can tune formatting.
- **When exposing Search/Extract as an LLM tool**, expose ONLY `objective` and `search_queries`. Exposing `advanced_settings` tempts the model to over-narrow and hurts recall.
- **Don't** use `parallel-cli` for these recipes — this skill is about writing integration code, not about invoking the CLI. For CLI workflows, use the `parallel-web-search`, `parallel-web-extract`, `parallel-deep-research`, or `parallel-data-enrichment` skills instead.
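
The schema guardrail above can be made concrete. A sketch of a compliant `output_schema` for an enrichment task (field names, formats, and descriptions are illustrative, not from the Parallel docs):

```json
{
  "type": "json",
  "json_schema": {
    "type": "object",
    "properties": {
      "ceo_name": {
        "type": "string",
        "description": "Full name of the current CEO. Return 'Not Available' if no source confirms."
      },
      "ipo_date": {
        "type": ["string", "null"],
        "description": "IPO date in MM-YYYY format; null if the company is private."
      }
    },
    "required": ["ceo_name", "ipo_date"],
    "additionalProperties": false
  }
}
```

Note that every property appears in `required`, the optional field uses a `["string", "null"]` union instead of being omitted, the root is an object, and `additionalProperties` is `false`.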

## When to choose a different skill

- User just wants to **run a web search right now** → use `parallel-web-search`.
- User just wants **one research task** run → use `parallel-deep-research`.
- User wants to **enrich a CSV right now** → use `parallel-data-enrichment`.
- User wants to **fetch a specific URL** → use `parallel-web-extract`.

This skill is for the *build-and-ship* case — when the user is adding Parallel to their own codebase.
173 changes: 173 additions & 0 deletions skills/parallel-develop/references/curl.md
@@ -0,0 +1,173 @@
# cURL recipe — raw HTTP

All calls use header `x-api-key: $PARALLEL_API_KEY`. Get a key at [platform.parallel.ai](https://platform.parallel.ai).

```bash
export PARALLEL_API_KEY="your-api-key"
```

## Snippets by use case

### Web Search — `POST /v1/search`

```bash
# Best practices:
# - Provide both "objective" and "search_queries" (2-3 short keyword queries).
# - Modes: "basic" = lowest latency, "advanced" = higher recall.
# - advanced_settings.source_policy restricts/blocks domains;
# advanced_settings.fetch_policy.max_age_seconds controls freshness;
# advanced_settings.location (ISO 3166 alpha-2) for geo-targeting.
curl -X POST https://api.parallel.ai/v1/search \
-H "Content-Type: application/json" \
-H "x-api-key: $PARALLEL_API_KEY" \
-d '{
"objective": "Find the latest information about <TOPIC>",
"search_queries": ["<query 1>", "<query 2>"],
"mode": "advanced",
"max_chars_total": 27000,
"advanced_settings": {
"max_results": 10,
"excerpt_settings": {"max_chars_per_result": 10000}
}
}'
```

### Research / structured task — `POST /v1/tasks/runs`

```bash
# Processors: lite/base (simple), core (~10 output fields), pro/ultra (deep).
# Append "-fast" for lower latency (e.g. "core-fast", "pro-fast").
# For pro/ultra tasks prefer a webhook over polling.

# 1. Create a task run and capture the run_id
RUN_ID=$(curl -s -X POST https://api.parallel.ai/v1/tasks/runs \
-H "Content-Type: application/json" \
-H "x-api-key: $PARALLEL_API_KEY" \
-d '{
"input": "<your research question>",
"task_spec": {
"output_schema": "A summary, key findings, and sources"
},
"processor": "core"
}' | jq -r '.run_id')

# 2. Get the result (?timeout=N in seconds; max 600. Re-poll on timeout.)
curl -X GET "https://api.parallel.ai/v1/tasks/runs/$RUN_ID/result?timeout=600" \
-H "x-api-key: $PARALLEL_API_KEY"
```
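
The "Re-poll on timeout" comment above can be sketched as a small generic helper that re-runs a command until it succeeds (the helper name is made up; this is not a documented Parallel pattern):

```shell
# poll_until_done MAX CMD...: run CMD until it exits 0, at most MAX times.
# Wrap the result fetch in it, e.g.:
#   poll_until_done 5 curl -sf \
#     "https://api.parallel.ai/v1/tasks/runs/$RUN_ID/result?timeout=600" \
#     -H "x-api-key: $PARALLEL_API_KEY"
# (curl -f makes a non-2xx response count as "not done yet")
poll_until_done() {
  local max_attempts="$1"; shift
  local i=1
  while [ "$i" -le "$max_attempts" ]; do
    "$@" && return 0   # command succeeded: result is ready
    i=$((i + 1))
  done
  return 1             # still not done after max_attempts tries
}
```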

### Data enrichment — same endpoint, structured schema

```bash
# Processors: lite/base (1-2 fields), core (up to ~10 fields), pro (complex).

RUN_ID=$(curl -s -X POST https://api.parallel.ai/v1/tasks/runs \
-H "Content-Type: application/json" \
-H "x-api-key: $PARALLEL_API_KEY" \
-d '{
"input": "<entity, e.g. Stripe>",
"task_spec": {
"output_schema": {
"type": "json",
"json_schema": {
"type": "object",
"properties": {
"founding_date": {
"type": "string",
"description": "Founding date in MM-YYYY format. Return \"Not Available\" if unknown."
},
"employee_count": {
"type": "string",
"description": "Estimated employee count as a range, e.g. \"500-1000\"."
},
"funding_sources": {
"type": "string",
"description": "Funding sources and total raised in USD."
}
},
"required": ["founding_date", "employee_count", "funding_sources"]
}
}
},
"processor": "core"
}' | jq -r '.run_id')

curl -X GET "https://api.parallel.ai/v1/tasks/runs/$RUN_ID/result?timeout=600" \
-H "x-api-key: $PARALLEL_API_KEY"
```

### Lead discovery — `POST /v1beta/findall/runs`

```bash
# Start with "preview" generator to validate your query (~10 candidates, fast, cheap).
# Generators: preview (test), base (broad), core (specific), pro (rare matches).

FINDALL_ID=$(curl -s -X POST https://api.parallel.ai/v1beta/findall/runs \
-H "Content-Type: application/json" \
-H "x-api-key: $PARALLEL_API_KEY" \
-d '{
"objective": "<e.g. AI startups that raised Series A in 2024>",
"entity_type": "companies",
"match_conditions": [
{
"name": "series_a_2024",
"description": "Company raised a Series A in 2024, confirmed by TechCrunch/Crunchbase/PR."
},
{
"name": "ai_focused",
"description": "Primary product must be AI-focused (LLMs, ML infra, or AI apps)."
}
],
"generator": "core",
"match_limit": 20
}' | jq -r '.findall_id')

# May take several minutes for core/pro. Use a webhook for large runs.
curl -X GET "https://api.parallel.ai/v1beta/findall/runs/$FINDALL_ID/result" \
-H "x-api-key: $PARALLEL_API_KEY"
```

### Web Monitoring — `POST /v1alpha/monitors`

```bash
# Natural-language query focused on intent, not keywords.
# Cadence: "hourly" (fast-moving), "daily" (most news), "weekly" (slow-changing).
# Don't include dates — Monitor tracks new updates automatically.
# "simulate_event": true forces an immediate test event during development.
curl -X POST https://api.parallel.ai/v1alpha/monitors \
-H "Content-Type: application/json" \
-H "x-api-key: $PARALLEL_API_KEY" \
-d '{
"query": "<e.g. latest news about AI regulation>",
"cadence": "daily",
"webhook": {
"url": "https://your-app.com/webhook",
"event_types": ["monitor.event.detected"]
}
}'
```

Response shape: `{ "monitor_id": "...", "status": "...", "cadence": "...", "query": "...", ... }`.

### Content extraction — `POST /v1/extract`

```bash
# Always provide an "objective" — extraction is LLM-focused, not a raw scrape.
# Up to 20 URLs per call. PDFs and JS-heavy pages are handled.
# "full_content": true only when you need the whole page as markdown.
# "fetch_policy": {"max_age_seconds": 3600} for cache-vs-live control.
curl -X POST https://api.parallel.ai/v1/extract \
-H "Content-Type: application/json" \
-H "x-api-key: $PARALLEL_API_KEY" \
-d '{
"urls": ["https://example.com/article"],
"objective": "<what to focus on in the page>",
"excerpt_settings": {"max_chars_per_result": 5000}
}'
```

## Error handling

- Non-2xx responses return `{ "error": { "message": "...", "type": "..." } }`. Check HTTP status first, then inspect the body.
- For 429 (rate limit), back off with exponential delay.
- For 5xx, retry with jitter. Runs that have been *created* are durable — refetch with the `run_id` / `findall_id` rather than recreating.
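
The backoff advice can be sketched as follows (helper name and bounds are illustrative; `RANDOM` assumes bash):

```shell
# retry_with_backoff MAX CMD...: retry CMD with exponential delay plus jitter.
retry_with_backoff() {
  local max="$1"; shift
  local delay=1 i=1
  while [ "$i" -le "$max" ]; do
    "$@" && return 0
    # exponential base delay plus up to `delay` seconds of random jitter
    sleep "$((delay + RANDOM % (delay + 1)))"
    delay=$((delay * 2))
    i=$((i + 1))
  done
  return 1
}
```

The same helper fits 429s; for a run that was already created, put the `run_id`/`findall_id` refetch inside the retried command rather than re-POSTing the creation call.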