
[BUG] diskStorage request silently ignored — provisioned disk differs from requested with no warning #17

@dims

Summary

Requesting diskStorage: "500Gi" when creating a hyperstack_H100 instance succeeds (HTTP 200, instance created) but provisions a ~100 GiB root disk. The API response echoes back the requested value rather than the actual provisioned value. There is no error, no warning, and no indication that the request was not honored.

Note: I see this may be related to #12 (Crusoe disk issue). The problem appears to exist across providers.

Steps to reproduce

# Create a Hyperstack H100 instance requesting 500Gi disk
# (via API since CLI --gpu is broken for non-GCP types — see #15)

curl -X POST .../workspaces -d '{ "instanceType": "hyperstack_H100", "diskStorage": "500Gi", ... }'
# → HTTP 200, workspace created, status DEPLOYING

# SSH in after instance is RUNNING
df -h /
# Filesystem  Size  Used  Avail  Use%
# /dev/vda1    97G   16G    82G   16%   ← 97 GiB, not the requested 500 GiB
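
The manual `df -h /` check above could also be scripted so that a mismatch fails loudly at instance startup. The following is a minimal sketch, not part of any existing tooling: `parse_size` and `disk_shortfall` are hypothetical helper names, the suffix table assumes Kubernetes-style quantities such as `500Gi`, and the 5% tolerance is an arbitrary allowance for partition and filesystem overhead.

```python
import re
import shutil

# Binary (Ki, Mi, Gi, Ti) and decimal (K, M, G, T) suffixes.
# Assumption: diskStorage uses Kubernetes-style quantities such as "500Gi".
_UNITS = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "K": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def parse_size(text: str) -> int:
    """Convert a size string such as "500Gi" into a number of bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([KMGT]i?)?", text.strip())
    if not match:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    # A missing suffix means the value is already in bytes.
    return int(float(number) * _UNITS.get(unit, 1))


def disk_shortfall(requested: str, actual_bytes: int, tolerance: float = 0.05) -> int:
    """Return the number of missing bytes, or 0 if the request was honored.

    `tolerance` allows for partition-table and filesystem overhead.
    """
    wanted = parse_size(requested)
    if actual_bytes >= wanted * (1 - tolerance):
        return 0
    return wanted - actual_bytes


if __name__ == "__main__":
    # Compare the size of the root filesystem with what was requested.
    missing = disk_shortfall("500Gi", shutil.disk_usage("/").total)
    if missing:
        print(f"root disk is {missing / 1024**3:.0f} GiB smaller than requested")
    else:
        print("root disk matches the request")
```

Run on the instance described in this report, a check like this would flag the gap before any workload starts writing to the root disk.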

Downstream consequence

A user downloading an 80 GB model to ~/.cache/huggingface (the standard path) fills the 97 GiB root disk mid-download; per the df output above, only about 82 GiB is free on a fresh instance. The workload crashes with a generic engine initialization error. The actual disk-full warnings are buried 200 lines earlier in the container logs and are easy to miss.
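
Until the API reports real disk sizes, a workload could guard against this failure mode with a pre-flight free-space check that fails early and names the actual problem. This is a sketch under assumptions: `ensure_free_space` and `InsufficientDiskError` are hypothetical names, and the 10% headroom is an arbitrary margin for temporary files.

```python
import shutil


class InsufficientDiskError(RuntimeError):
    """Raised when a target directory cannot hold the planned download."""


def ensure_free_space(path: str, required_bytes: int, headroom: float = 0.10) -> int:
    """Return the free bytes at `path`, or raise if the download would not fit.

    `path` must already exist. `headroom` reserves extra room on top of
    `required_bytes` for temporary files written during the download.
    """
    free = shutil.disk_usage(path).free
    needed = int(required_bytes * (1 + headroom))
    if free < needed:
        raise InsufficientDiskError(
            f"{path} has {free / 1e9:.0f} GB free, "
            f"but about {needed / 1e9:.0f} GB is needed"
        )
    return free
```

Calling `ensure_free_space(cache_dir, 80 * 10**9)` before starting the download would surface a disk-space error up front instead of a generic engine initialization error later.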

The undocumented ephemeral disk

Hyperstack instances have a 750 GiB ephemeral disk mounted at /ephemeral that is completely undocumented. This is the correct location for large model caches, but users have no way to discover it without running df -h or lsblk on the instance.

NAME    SIZE  MOUNTPOINTS
vda     100G
└─vda1   99G  /
vdb     750G  /ephemeral   ← exists, not documented anywhere
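
As a workaround, a startup script could pick the cache location by available space rather than assuming the home directory. The sketch below is illustrative only: `pick_cache_root` is a hypothetical helper, and it assumes /ephemeral is writable by the current user, which this report does not confirm. `HF_HOME` is the environment variable that huggingface_hub reads for its cache root; it has to be set before the library is imported.

```python
import os
import shutil


def pick_cache_root(candidates, required_bytes):
    """Return the first existing directory with enough free space, else None."""
    for path in candidates:
        if os.path.isdir(path) and shutil.disk_usage(path).free >= required_bytes:
            return path
    return None


if __name__ == "__main__":
    # Prefer the large ephemeral disk, fall back to the home directory.
    root = pick_cache_root(["/ephemeral", os.path.expanduser("~")], 80 * 10**9)
    if root is None:
        raise SystemExit("no candidate directory has 80 GB free")
    # Must happen before huggingface_hub (or transformers) is imported.
    os.environ["HF_HOME"] = os.path.join(root, "huggingface")
    print("model cache root:", os.environ["HF_HOME"])
```

Data on /ephemeral should be treated as disposable; the name suggests it does not survive the instance, so only re-downloadable caches belong there.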

Expected

  • API returns an error if diskStorage exceeds the provider's maximum, or returns the actual provisioned disk size in the response body (not the requested value)
  • Catalog API includes disk constraints per instance type so clients can pre-validate
  • /ephemeral is documented on the Hyperstack instance type page
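
To illustrate the pre-validation point in the list above: if the catalog exposed disk constraints, a client could reject an impossible request before any instance is created. Everything in this sketch is hypothetical, including the `DiskConstraints` shape, the `CATALOG` mapping, and `validate_disk_request`; no such catalog fields exist today as far as this report shows. The numbers are taken from the lsblk output above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

GIB = 1024 ** 3


@dataclass(frozen=True)
class DiskConstraints:
    """Hypothetical per-instance-type disk limits a catalog could publish."""
    min_root_bytes: int
    max_root_bytes: int
    ephemeral_path: Optional[str] = None
    ephemeral_bytes: int = 0


# Hypothetical catalog entry; values mirror the lsblk output in this report.
CATALOG: Dict[str, DiskConstraints] = {
    "hyperstack_H100": DiskConstraints(100 * GIB, 100 * GIB, "/ephemeral", 750 * GIB),
}


def validate_disk_request(
    instance_type: str,
    requested_bytes: int,
    catalog: Dict[str, DiskConstraints] = CATALOG,
) -> List[str]:
    """Return a list of problems; an empty list means the request is acceptable."""
    limits = catalog.get(instance_type)
    if limits is None:
        return [f"no disk constraints published for {instance_type}"]
    problems = []
    if requested_bytes > limits.max_root_bytes:
        problems.append(
            f"requested {requested_bytes // GIB} GiB exceeds the "
            f"{limits.max_root_bytes // GIB} GiB root disk maximum"
        )
    elif requested_bytes < limits.min_root_bytes:
        problems.append(
            f"requested {requested_bytes // GIB} GiB is below the "
            f"{limits.min_root_bytes // GIB} GiB root disk minimum"
        )
    return problems
```

With this data, a 500 GiB request for hyperstack_H100 yields one problem and a 100 GiB request yields none.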

Suggested fix

API response — return actual provisioned disk:

"diskStorage": {
  "requested": "500Gi",
  "root": "100Gi",
  "warning": "hyperstack_H100 has a fixed 100 GiB root disk. Requested size not applied.",
  "ephemeral": [{ "path": "/ephemeral", "size": "750Gi" }]
}
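
A client could consume the proposed response shape along these lines. This is a sketch against the suggestion above, not against the current API: the field names (`requested`, `root`, `warning`, `ephemeral`) exist only in this proposal, and `disk_warnings` is a hypothetical helper. Any other shape, such as a plain echoed string, is treated as carrying no information.

```python
import json
from typing import List


def disk_warnings(response_body: str) -> List[str]:
    """Collect human-readable disk notes from a create-workspace response."""
    disk = json.loads(response_body).get("diskStorage")
    if not isinstance(disk, dict):
        # Not the proposed structured form, so nothing can be compared.
        return []
    notes = []
    if disk.get("warning"):
        notes.append(disk["warning"])
    if disk.get("requested") != disk.get("root"):
        notes.append(f"requested {disk.get('requested')}, root disk is {disk.get('root')}")
    for volume in disk.get("ephemeral", []):
        notes.append(f"ephemeral: {volume['size']} at {volume['path']}")
    return notes


# The proposed fragment from this report, wrapped in an enclosing object.
SAMPLE = json.dumps({
    "diskStorage": {
        "requested": "500Gi",
        "root": "100Gi",
        "warning": "hyperstack_H100 has a fixed 100 GiB root disk. Requested size not applied.",
        "ephemeral": [{"path": "/ephemeral", "size": "750Gi"}],
    }
})

if __name__ == "__main__":
    for note in disk_warnings(SAMPLE):
        print(note)
```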

CLI — print actual disk layout after create finishes:

✓ my-h100 is ready
  Root disk:  100 GiB
  Ephemeral:  750 GiB at /ephemeral  ← use this for large model caches (HuggingFace, etc.)

Documentation — add a provider page for Hyperstack noting:

  • Root disk is fixed at ~100 GiB (diskStorage parameter has no effect)
  • 750 GiB ephemeral disk is available at /ephemeral
  • For any workload downloading files >50 GB, always use /ephemeral

CLI version: v0.6.316 | Provider: Hyperstack (via Shadeform)
