chore: add flash agent skill co-located with source code #254
TimPietruskyRunPod wants to merge 24 commits into main
Conversation
Adds flash/SKILL.md rewritten around the unified Endpoint class API. Replaces the old skill in runpod/skills which documented the deprecated 8-class resource hierarchy. Co-locating the skill ensures it stays in sync with the codebase. Discoverable via `npx skills add runpod/flash`.
Remove content an agent doesn't need: architecture internals, full enum listings, verbose CLI option tables, redundant code patterns. Keep: constructor params, four modes, cloudpickle rules, gotchas. Point agents to source files for enum details they can read themselves.
- replace v1.6.0 content with eval-tested v1.7.0 skill
- remove non-existent flash login command
- fix GpuType.ANY to GpuGroup (GpuType has no ANY member)
- consolidate from four modes to three (QB, LB, client)
- all examples use Endpoint class exclusively
- scored 18/18 on eval assertions across 3 test prompts
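The GpuType.ANY → GpuGroup fix can be sketched with stand-in enums. These are illustrative stubs, not the real flash classes: the member names and values below are assumed, and only the absence of `GpuType.ANY` is taken from the commit notes above.

```python
from enum import Enum

# Stand-in enums mirroring the names in the flash source. GpuType models
# per-card SKUs and (per the fix above) has no ANY member; the catch-all
# lives on GpuGroup instead.
class GpuType(Enum):
    H100_80GB = "NVIDIA H100 80GB HBM3"  # example member, assumed name

class GpuGroup(Enum):
    ANY = "any"  # the member skills should reference for "any GPU"

# Referencing the non-existent member fails loudly, which is why the old
# skill's examples would break:
try:
    GpuType.ANY
except AttributeError:
    print("GpuType has no ANY member")

print(GpuGroup.ANY.value)
```

This is why an agent following the old skill produced code that crashed at import time rather than at deploy time.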
Force-pushed from c951deb to 16a0a2d (…auto-switch gotcha)
runpod-Henrik left a comment
PR #254 Review: chore: add flash agent skill co-located with source code
1. Bug: Fabricated CLI subcommands (high confidence — verified against cli/main.py and flash deploy --help)

The CLI section documents subcommands that don't exist:

```
flash deploy new staging     # ❌ not a real command
flash deploy send staging    # ❌ not a real command
flash deploy list staging    # ❌ not a real command
flash deploy info staging    # ❌ not a real command
flash deploy delete staging  # ❌ not a real command
```

The actual flash deploy is a single command with --env:

```
flash deploy --env staging           # build + deploy to staging
flash deploy --exclude torch,pkg2    # exclude packages
```

Environment management is via flash env:

```
flash env list
flash env create
flash env get
flash env delete
```

2. Issue: execution_timeout_ms missing from the constructor table
Endpoint.__init__ has execution_timeout_ms: int = 0 but it's absent from the skill's parameter table. This is user-facing (needed for long-running jobs) and was a known fix (executionTimeoutMs → execution_timeout_ms snake_case rename in 1.7.0).
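A minimal sketch of how the missing parameter would appear in a skill example. The dataclass below is a stand-in, not the real Endpoint class: only the `execution_timeout_ms: int = 0` signature is taken from the review above, everything else is assumed.

```python
from dataclasses import dataclass

# Stand-in mirroring only the fields discussed here; the real constructor
# in endpoint.py has many more parameters.
@dataclass
class Endpoint:
    name: str
    execution_timeout_ms: int = 0  # 0 = no per-job override (platform default)

# A long-running job raises the cap explicitly, e.g. a 10-minute limit:
ep = Endpoint(name="long-job", execution_timeout_ms=600_000)
print(ep.execution_timeout_ms)
```

Since agents copy constructor calls verbatim from the skill's table, omitting the parameter means long-running jobs silently keep the platform default.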
3. Question: "Auto GPU switching requires workers >= 5"
Gotcha #8 states this as a rule. Is this a documented platform policy? If not verified, it could mislead users into setting unnecessarily high worker counts.
4. Nit: ADA_80_PRO VRAM labeled "80GB" but includes H100 NVL
The source docstring says: "NVIDIA H100 PCIe, NVIDIA H100 80GB HBM3, NVIDIA H100 NVL". H100 NVL is 94GB, not 80GB. Worth noting the label is approximate.
5. Nit: Missing less-common constructor params
accelerate_downloads, datacenter, scaler_type, scaler_value are in __init__ but absent from the table. Fine to omit as advanced, but accelerate_downloads=False is a useful workaround for slow dep installs.
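The advanced parameters could be sketched the same way. Again a stand-in dataclass, not the real class: the parameter names come from the review above, but the defaults and types shown (including `accelerate_downloads=True`) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in listing the less-common parameters named above; defaults are
# guesses, not read from Endpoint.__init__.
@dataclass
class Endpoint:
    name: str
    accelerate_downloads: bool = True      # assumed default
    datacenter: Optional[str] = None
    scaler_type: Optional[str] = None
    scaler_value: Optional[int] = None

# The workaround the review mentions for slow dependency installs:
ep = Endpoint(name="slow-deps", accelerate_downloads=False)
print(ep.accelerate_downloads)
```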
Verdict: NEEDS WORK — The CLI section needs to be corrected before merge. An agent using this skill will generate flash deploy new/send/list commands that fail immediately.
🤖 Reviewed by Henrik's AI-Powered Bug Finder
- Replace made-up `flash deploy new/send/list/info/delete` with actual `flash deploy --env`, `flash env list/create/get/delete` commands
- Add `flash deploy --preview` for local Docker preview
- Add `execution_timeout_ms` to Endpoint constructor
We continue in runpod/skills#7
Summary
- `flash/SKILL.md` (588 lines) rewritten around the unified `Endpoint` class API
- Replaces the old skill in `runpod/skills` which documented the deprecated 8-class resource hierarchy (`LiveServerless`, `CpuLiveServerless`, etc.)
- Discoverable via `npx skills add runpod/flash`

Skill contents
- CLI commands (`flash login`, `flash init`, `flash run`, `flash deploy`)
- Constructor parameters (`endpoint.py`)
- CLI reference (`cli/main.py`)

Test plan
- `npx skills add runpod/flash` discovers the skill
- Examples checked against the worker sources (`gpu_worker.py`, `cpu_worker.py`, `lb_worker.py`)
- Parameter table checked against `Endpoint.__init__` in `endpoint.py`
- CLI section checked against `cli/main.py`