[FEATURE] GPU discovery: user-centric search (model-aware VRAM, session cost, budget filtering) #16

@dims

Summary

There is currently no way to discover available GPU types from the CLI without knowing instance type strings in advance. The documentation link in --gpu help text (https://brev.dev/docs/reference/gpu) returns a 404.

I see that PR brevdev/brev-cli#289 is adding brev search gpu / brev search cpu — great! This issue is about the next layer: making search user-centric rather than catalog-centric.

The gap

The existing/incoming brev search surfaces catalog fields (type string, VRAM, price/hr, provider). What users actually have when they sit down is:

  • A model name — not a VRAM number
  • A budget for the session — not an hourly rate
  • A need to know if something is available right now — not just is_available: true in the catalog (which is a stale field)

Requested additions

1. Model-aware search — derive VRAM requirement from a HuggingFace model name:

brev search gpu --model "Qwen/Qwen3-Coder-Next-FP8"
# → Model needs ~75 GB VRAM (FP8). Showing single-GPU options with ≥75 GB:
#
# TYPE                  VRAM    PRICE     BOOT    AVAILABLE
# hyperstack_H100       80 GB   $2.28/hr  ~3 min  ✓ now
# latitude_H100         80 GB   $2.39/hr  ~4 min  ✓ now
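
A minimal sketch of how the VRAM figure could be derived, assuming the parameter count and precision are already known from the model's metadata (nothing is fetched here). The function names, the catalog rows, and the flat 1.2 overhead factor are all hypothetical; real requirements also depend on context length and KV cache size.

```python
import math

# Hypothetical catalog rows; type names and VRAM values are illustrative only.
CATALOG = [
    {"type": "example_A", "vram_gb": 48},
    {"type": "example_B", "vram_gb": 80},
]

def estimate_vram_gb(params_billion, bits_per_param, overhead=1.2):
    """Rough estimate: weight size in GB times a flat overhead factor, rounded up."""
    weights_gb = params_billion * bits_per_param / 8  # 1e9 params x bytes per param
    # Round first so floating-point noise cannot push ceil() up by one.
    return math.ceil(round(weights_gb * overhead, 6))

def single_gpu_options(catalog, required_gb):
    """Keep only rows whose single-GPU VRAM covers the requirement."""
    return [row for row in catalog if row["vram_gb"] >= required_gb]

required = estimate_vram_gb(params_billion=30, bits_per_param=8)
print(required, [row["type"] for row in single_gpu_options(CATALOG, required)])
```

The overhead factor is the weakest part of this sketch; a real implementation would probably want it configurable per workload.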

2. Session cost column — show total cost for a given number of hours, not just hourly rate:

brev search gpu --vram 80 --hours 8
# TYPE               VRAM    $/HR      8-HR TOTAL
# hyperstack_H100    80 GB   $2.28     $18.24
# latitude_H100      80 GB   $2.39     $19.12
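
The total column is just price × hours. A small sketch using decimal arithmetic so the totals don't pick up float rounding; the prices are illustrative and `session_total` is a hypothetical helper, not anything in brev-cli.

```python
from decimal import Decimal

def session_total(price_per_hour, hours):
    """Total session cost; price is passed as a string to keep it exact."""
    return Decimal(price_per_hour) * hours

# Illustrative rows: (type, hourly price in USD).
rows = [("hyperstack_H100", "2.28"), ("latitude_H100", "2.39")]
for gpu_type, price in rows:
    print(f"{gpu_type:<18} ${price}/hr  ${session_total(price, 8)} for 8 hr")
```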

3. Budget filter:

brev search gpu --vram 80 --budget 30 --hours 8
# → only show options where (price × hours) ≤ budget
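
The filter rule above could look roughly like this. It is a sketch only: `within_budget` and the third offer are made up for illustration, and prices are assumed to be per-hour USD strings.

```python
from decimal import Decimal

def within_budget(offers, hours, budget):
    """Keep (type, price_per_hour) pairs where price x hours <= budget."""
    limit = Decimal(str(budget))
    return [(t, p) for t, p in offers if Decimal(p) * hours <= limit]

# Illustrative offers; "example_pricey" is hypothetical.
offers = [
    ("hyperstack_H100", "2.28"),
    ("latitude_H100", "2.39"),
    ("example_pricey", "4.50"),
]
cheap = within_budget(offers, hours=8, budget=30)
print([t for t, _ in cheap])
```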

4. Live availability check — distinguish catalog availability from actual current provisionability. The catalog is_available field does not reflect live capacity fluctuations. A --now flag or a success-rate signal would help users avoid types that appear available but fail at create time.
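
One way the success-rate signal could be computed, assuming some record of recent create attempts per type exists (the issue does not say where that data would come from). The names and the 0.8 threshold are hypothetical.

```python
from collections import defaultdict

def success_rates(attempts):
    """attempts: iterable of (instance_type, succeeded) pairs from recent creates."""
    totals = defaultdict(lambda: [0, 0])  # type -> [successes, attempts]
    for instance_type, succeeded in attempts:
        totals[instance_type][1] += 1
        if succeeded:
            totals[instance_type][0] += 1
    return {t: ok / n for t, (ok, n) in totals.items()}

def likely_available(attempts, threshold=0.8):
    """Types whose recent create success rate meets the threshold, sorted by name."""
    rates = success_rates(attempts)
    return sorted(t for t, rate in rates.items() if rate >= threshold)

# Made-up history for illustration.
history = [("type_a", True), ("type_a", True), ("type_b", True), ("type_b", False)]
print(likely_available(history))
```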

5. Print the exact brev create command at the bottom of search output so the user can copy-paste without knowing the type string format.
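
A sketch of building the copy-paste line. The argument layout (`brev create <name> --gpu <type>`) is an assumption based on the `--gpu` flag mentioned above, not a confirmed signature, and `create_command` is a hypothetical helper.

```python
import shlex

def create_command(name, gpu_type):
    """Build a shell-safe command line; the layout of `brev create` is assumed."""
    return shlex.join(["brev", "create", name, "--gpu", gpu_type])

print(create_command("my-workspace", "hyperstack_H100"))
```

Using `shlex.join` means a name containing spaces or shell metacharacters still pastes safely.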

Why this matters

Users think in terms of tasks and constraints — not instance type strings. The current flow requires them to know the answer before asking the question. Search that starts from what the user knows (model, budget, time) rather than what the catalog exposes (type strings, $/hr) would significantly reduce time-to-first-GPU.
