Summary
There is currently no way to discover available GPU types from the CLI without knowing instance type strings in advance. The documentation link in the `--gpu` help text (https://brev.dev/docs/reference/gpu) returns a 404.
I see that PR brevdev/brev-cli#289 is adding brev search gpu / brev search cpu — great! This issue is about the next layer: making search user-centric rather than catalog-centric.
The gap
The existing/incoming brev search surfaces catalog fields (type string, VRAM, price/hr, provider). What users actually have when they sit down is:
- A model name — not a VRAM number
- A budget for the session — not an hourly rate
- A need to know if something is available right now — not just `is_available: true` in the catalog (which is a stale field)
Requested additions
1. Model-aware search: derive the VRAM requirement from a HuggingFace model name.

   ```
   brev search gpu --model "Qwen/Qwen3-Coder-Next-FP8"
   # → Model needs ~75 GB VRAM (FP8). Showing single-GPU options with ≥75 GB:
   #
   # TYPE             VRAM   PRICE     BOOT    AVAILABLE
   # hyperstack_H100  80 GB  $2.28/hr  ~3 min  ✓ now
   # latitude_H100    80 GB  $2.39/hr  ~4 min  ✓ now
   ```

2. Session cost column: show the total cost for a given number of hours, not just the hourly rate.

   ```
   brev search gpu --vram 80 --hours 8
   # TYPE             VRAM   $/HR   8-HR TOTAL
   # hyperstack_H100  80 GB  2.28   18.24
   # latitude_H100    80 GB  2.39   19.12
   ```

3. Budget filter.

   ```
   brev search gpu --vram 80 --budget 30 --hours 8
   # → only show options where (price × hours) ≤ budget
   ```

4. Live availability check: distinguish catalog availability from actual current provisionability. The catalog `is_available` field does not reflect live capacity fluctuations. A `--now` flag or a success-rate signal would help users avoid types that appear available but fail at create time.

5. Print the exact `brev create` command at the bottom of the search output, so the user can copy-paste it without knowing the type-string format.
Why this matters
Users think in terms of tasks and constraints — not instance type strings. The current flow requires them to know the answer before asking the question. Search that starts from what the user knows (model, budget, time) rather than what the catalog exposes (type strings, $/hr) would significantly reduce time-to-first-GPU.