diff --git a/AGENTS.md b/AGENTS.md
index 67f2277..5be0add 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -2,13 +2,17 @@
 ## Purpose
-This repo publishes a single Agent Skills document for Replicate.
+This repo publishes Agent Skills documents for Replicate.
-Keep it short and focused: a human- and agent-readable guide to discovering models, inspecting schemas, running predictions, and handling outputs.
+Keep it short and focused: human- and agent-readable guides for finding, comparing, running, building, and deploying models.
 ## Files that matter
-- `skills/replicate/SKILL.md` is the canonical skill.
+- `skills/find-models/SKILL.md` covers discovery workflows.
+- `skills/compare-models/SKILL.md` covers model evaluation.
+- `skills/run-models/SKILL.md` covers prediction workflows.
+- `skills/build-models/SKILL.md` covers Cog builds.
+- `skills/deploy-models/SKILL.md` covers deployments and scaling.
 - `.mcp.json` points to the remote MCP server.
 - `.claude-plugin/` contains marketplace metadata for Claude Code.
diff --git a/README.md b/README.md
index de3c278..ca2ce75 100644
--- a/README.md
+++ b/README.md
@@ -1,9 +1,17 @@
 # Replicate Skills
-A collection of [Agent Skills](https://agentskills.io) for building AI-powered apps with [Replicate](https://replicate.com).
+A collection of [Agent Skills](https://agentskills.io) for building AI-powered apps with [Replicate](https://replicate.com).
-Discover, compare, and run AI models using Replicate's API.
+Find, compare, run, build, and deploy models using Replicate and Cog.
+
+Skills included:
+
+- find-models
+- compare-models
+- run-models
+- build-models
+- deploy-models
 ## Installing
diff --git a/skills/build-models/SKILL.md b/skills/build-models/SKILL.md
new file mode 100644
index 0000000..51b4d95
--- /dev/null
+++ b/skills/build-models/SKILL.md
@@ -0,0 +1,26 @@
+---
+name: build-models
+description: Build Replicate models using Cog
+---
+
+## Docs
+
+- Cog docs: https://cog.run/llms.txt
+- Replicate docs: https://replicate.com/docs/llms.txt
+- HTTP API schema: https://api.replicate.com/openapi.json
+- Set an `Accept: text/markdown` header when requesting docs pages to get a Markdown response.
+
+## Workflow
+
+- Define your model in `cog.yaml` using the Cog schema.
+- Implement the Predictor interface in Python and wire inputs and outputs.
+- Build and test the image locally with Cog before pushing.
+- Use the Cog docs as the source of truth for `cog.yaml` and Predictor APIs.
+
+## Guidelines
+
+- Focus on the `cog.yaml` schema and the Predictor API in the Cog docs.
+- Cog is open source at https://github.com/replicate/cog if you need internals.
+- Review Replicate models that link GitHub repos to learn existing Cog patterns.
+- Use model repos as references for inputs, outputs, and packaging decisions.
+- Keep `cog.yaml` minimal and explicit about build and runtime dependencies.
diff --git a/skills/compare-models/SKILL.md b/skills/compare-models/SKILL.md
new file mode 100644
index 0000000..2cd8d98
--- /dev/null
+++ b/skills/compare-models/SKILL.md
@@ -0,0 +1,24 @@
+---
+name: compare-models
+description: Compare Replicate models for fit, cost, and reliability
+---
+
+## Docs
+
+- Reference docs: https://replicate.com/docs/llms.txt
+- HTTP API schema: https://api.replicate.com/openapi.json
+- Set an `Accept: text/markdown` header when requesting docs pages to get a Markdown response.
+
+## Workflow
+
+- Fetch model schemas and compare required inputs and outputs.
+- Compare pricing, speed, and reliability from model metadata.
+- Prefer official models when you need stable interfaces.
+- Use collections to narrow the shortlist before deep comparison.
+- Run a small set of predictions to compare output quality.
+
+## Guidelines
+
+- Verify output types match downstream requirements.
+- Official models have predictable output pricing and stable APIs.
+- Consider cold-start behavior for community models.
diff --git a/skills/deploy-models/SKILL.md b/skills/deploy-models/SKILL.md
new file mode 100644
index 0000000..8eeda27
--- /dev/null
+++ b/skills/deploy-models/SKILL.md
@@ -0,0 +1,25 @@
+---
+name: deploy-models
+description: Push models with Cog and configure Replicate deployments
+---
+
+## Docs
+
+- Cog docs: https://cog.run/llms.txt
+- Replicate docs: https://replicate.com/docs/llms.txt
+- HTTP API schema: https://api.replicate.com/openapi.json
+- Set an `Accept: text/markdown` header when requesting docs pages to get a Markdown response.
+
+## Workflow
+
+- Use Cog to build and push a model image.
+- Configure deployments in Replicate for hardware and scaling behavior.
+- Use the API schema as the source of truth for deployment fields.
+- Align deployment settings with expected throughput and cost.
+
+## Guidelines
+
+- Review models with GitHub repos in their metadata for deployment examples.
+- Keep deployment settings aligned with model performance and cost targets.
+- Prefer official models for stable deployment behavior.
+- Use deployments when you need consistent uptime and predictable latency.
diff --git a/skills/find-models/SKILL.md b/skills/find-models/SKILL.md
new file mode 100644
index 0000000..a0d2fa6
--- /dev/null
+++ b/skills/find-models/SKILL.md
@@ -0,0 +1,26 @@
+---
+name: find-models
+description: Find Replicate models and curated collections
+---
+
+## Docs
+
+- Reference docs: https://replicate.com/docs/llms.txt
+- HTTP API schema: https://api.replicate.com/openapi.json
+- Set an `Accept: text/markdown` header when requesting docs pages to get a Markdown response.
+
+## Workflow
+
+- Use search and collections endpoints from the API schema.
+- Prefer curated collections for vetted models.
+- Use the "official" collection when you need stable interfaces.
+- Check model metadata for inputs, outputs, and pricing.
+
+## Guidelines
+
+- Avoid listing all models via API; use targeted queries.
+- Collections are curated by Replicate staff.
+- Official models are maintained by Replicate and are always running.
+- Official models have stable interfaces and predictable output pricing.
+- Community models can have cold-start delays.
+- With always-on deployments of community models, you pay for model uptime.
diff --git a/skills/replicate/SKILL.md b/skills/replicate/SKILL.md
deleted file mode 100644
index 3cfd84b..0000000
--- a/skills/replicate/SKILL.md
+++ /dev/null
@@ -1,53 +0,0 @@
----
-name: replicate
-description: Discover, compare, and run AI models using Replicate's API
----
-
-## Docs
-
-- Reference docs: https://replicate.com/docs/llms.txt
-- HTTP API schema: https://api.replicate.com/openapi.json
-- MCP server: https://mcp.replicate.com
-- Set an `Accept: text/markdown` header when requesting docs pages to get a Markdown response.
-
-## Workflow
-
-Here's a common workflow for using Replicate's API to run a model:
-
-1. **Choose the right model** - Search with the API or ask the user
-2. **Get model metadata** - Fetch model input and output schema via API
-3. **Create prediction** - POST to /v1/predictions
-4. **Poll for results** - GET prediction until status is "succeeded"
-5. **Return output** - Usually URLs to generated content
-
-## Choosing models
-
-- Use the search and collections APIs to find and compare the best models. Do not list all the models via API, as it's basically a firehose.
-- Collections are curated by Replicate staff, so they're vetted.
-- Official models are in the "official" collection.
-- Use official models because they:
-  - are always running
-  - have stable API interfaces
-  - have predictable output pricing
-  - are maintained by Replicate staff
-- If you must use a community model, be aware that it can take a long time to boot.
-- You can create always-on deployments of community models, but you pay for model uptime.
-
-## Running models
-
-Models take time to run. There are three ways to run a model via API and get its output:
-
-1. Create a prediction, store its id from the response, and poll until completion.
-2. Set a `Prefer: wait` header when creating a prediction for a blocking synchronous response. Only recommended for very fast models.
-3. Set an HTTPS webhook URL when creating a prediction, and Replicate will POST to that URL when the prediction completes.
-
-Follow these guideliness when running models:
-
-- Use the "POST /v1/predictions" endpoint, as it supports both official and community models.
-- Every model has its own OpenAPI schema. Always fetch and check model schemas to make sure you're setting valid inputs.
-- Use HTTPS URLs for file inputs whenever possible. You can also send base64-encoded files, but they should be avoided.
-- Fire off multiple predictions concurrently. Don't wait for one to finish before starting the next.
-- Output file URLs expire after 1 hour, so back them up if you need to keep them, using a service like Cloudflare R2.
-- Webhooks are a good mechanism for receiving and storing prediction output.
-
-
diff --git a/skills/run-models/SKILL.md b/skills/run-models/SKILL.md
new file mode 100644
index 0000000..3cd6114
--- /dev/null
+++ b/skills/run-models/SKILL.md
@@ -0,0 +1,26 @@
+---
+name: run-models
+description: Run Replicate models via predictions and webhooks
+---
+
+## Docs
+
+- Reference docs: https://replicate.com/docs/llms.txt
+- HTTP API schema: https://api.replicate.com/openapi.json
+- Set an `Accept: text/markdown` header when requesting docs pages to get a Markdown response.
+
+## Workflow
+
+- Create a prediction with POST /v1/predictions.
+- Poll for completion, use a webhook, or set `Prefer: wait` for fast models.
+- Add a webhook URL at creation time when you want async delivery.
+- Read model schemas to validate inputs before sending requests.
+- Return output when the prediction status is "succeeded".
+
+## Guidelines
+
+- Use HTTPS URLs for file inputs; avoid base64 when possible.
+- POST /v1/predictions supports both official and community models.
+- Run predictions concurrently rather than serially.
+- Webhooks are a good way to receive and store outputs.
+- Output file URLs expire after 1 hour; back them up if needed.
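
As a concrete illustration of the run-models workflow above, here is a minimal sketch of the create-and-poll prediction loop. It uses only the Python standard library and assumes `REPLICATE_API_TOKEN` is set; the model version id and the `prompt` input are hypothetical placeholders, so validate real inputs against the model's own OpenAPI schema before sending them.

```python
# Minimal sketch of the run-models loop: create a prediction, poll until it
# reaches a terminal status, then read the output URLs. Field names follow the
# public HTTP API schema; double-check them against openapi.json.
import json
import os
import time
import urllib.request

API = "https://api.replicate.com/v1"
TOKEN = os.environ["REPLICATE_API_TOKEN"]
MODEL_VERSION = "<model-version-id>"  # placeholder; take this from the model's metadata


def api(method: str, path: str, body: dict | None = None) -> dict:
    """Small helper for authenticated JSON requests against the Replicate API."""
    request = urllib.request.Request(
        f"{API}{path}",
        data=json.dumps(body).encode() if body is not None else None,
        method=method,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# 1. Create the prediction. POST /v1/predictions works for official and community models.
prediction = api("POST", "/predictions", {
    "version": MODEL_VERSION,
    "input": {"prompt": "a watercolor painting of a lighthouse"},  # placeholder input
})

# 2. Poll until the prediction reaches a terminal status.
while prediction["status"] not in ("succeeded", "failed", "canceled"):
    time.sleep(2)
    prediction = api("GET", f"/predictions/{prediction['id']}")

# 3. Output is usually one or more URLs that expire after about an hour; persist them if needed.
print(prediction["status"], prediction.get("output"))
```

For anything beyond quick experiments, prefer the webhook flow described in the skill over tight polling.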
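
The build-models skill asks you to implement Cog's Predictor interface in Python. Below is a minimal, self-contained sketch; `BasePredictor` and `Input` come from the `cog` package as described in the Cog docs, while the toy string-reversal logic stands in for real inference code.

```python
# predict.py - a minimal Cog Predictor, wired up from cog.yaml with a line like
#   predict: "predict.py:Predictor"
# The reversal logic is a stand-in; replace it with real model loading and inference.
from cog import BasePredictor, Input


class Predictor(BasePredictor):
    def setup(self) -> None:
        # Load weights and other heavy resources here, once per container boot.
        pass

    def predict(self, text: str = Input(description="Text to echo back reversed")) -> str:
        # Inputs are declared with cog.Input so Cog can generate the model's OpenAPI schema.
        return text[::-1]
```

Keeping `setup` separate from `predict` is what lets a warm container serve many requests without reloading weights.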
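
The find-models skill recommends curated collections over listing every model; as a sketch, the snippet below fetches a single collection and prints its models. The `text-to-image` slug is an assumption; `GET /v1/collections` returns the current list of slugs.

```python
# Minimal sketch: inspect one curated collection instead of paging through the
# full model catalog. Assumes REPLICATE_API_TOKEN is set in the environment.
import json
import os
import urllib.request

request = urllib.request.Request(
    "https://api.replicate.com/v1/collections/text-to-image",  # slug is an assumption
    headers={"Authorization": f"Bearer {os.environ['REPLICATE_API_TOKEN']}"},
)
with urllib.request.urlopen(request) as response:
    collection = json.load(response)

for model in collection["models"]:
    # Model objects include owner, name, description, and run_count metadata.
    print(f"{model['owner']}/{model['name']}: {model.get('description') or ''}")
```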