Summary
Hyperstack instances have a fixed ~100 GB root disk and a 750 GB ephemeral disk at /ephemeral. Neither fact is documented. For model-serving workloads this causes the root disk to fill mid-download with no clear error.
This is the docs companion to #17 (the API/bug side of the same issue).
What needs documenting
On the Hyperstack instance page:
- Root disk: ~100 GiB, fixed — the
diskStorage parameter has no effect
- Ephemeral disk: ~750 GiB at
/ephemeral, survives reboots, may be wiped on re-provision
For model-serving workloads:
# Wrong — fills the 100 GiB root disk mid-download for any model >50 GB
docker run -v ~/.cache/huggingface:/root/.cache/huggingface ...
# Correct
mkdir -p /ephemeral/huggingface
docker run -v /ephemeral/huggingface:/root/.cache/huggingface ...
Impact
A user downloading an 80 GB model (Llama 70B, Qwen3-Coder-Next-FP8, etc.) to the default HuggingFace cache fills the root disk mid-download. The container crashes with a generic engine error — the actual disk-full warning is buried in container logs. The instance must be deleted and re-created. An hour of GPU time is wasted.
This note should appear on the Hyperstack instance type page and in any quickstart involving large model downloads.