DGX Spark support worked in 3.3.0, has been broken since 3.3.1 #449

@raziel2001au

DGX Spark support was added in 3.3.0 and everything worked, but as of 3.3.1 nvtop:

  • Detects used memory correctly
  • Detects total VRAM incorrectly (total VRAM should be total system RAM minus the RAM used by system processes; right now it always seems to report 121GB as available VRAM, which is not correct, as you'll see in the screenshots)
  • No longer displays the graph correctly, with the memory bar always showing 0% usage

Screenshot from my compile of the 3.3.0 tag:
Image

Screenshot from my compile of the 3.3.1 tag (but the same issues exist in 3.3.2 and main):
Image

These screenshots were taken one after the other during an AI workload in which about 60GB was used by other system processes. Note that the newer version incorrectly reports a total of 121GB of VRAM, when it should be closer to 64GB: VRAM and system RAM are shared, so as system RAM is consumed, total VRAM shrinks. The memory graph line is also completely broken, showing 0% memory usage.
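
To make the expected numbers concrete, here is a minimal Python sketch of the arithmetic described above. It is not nvtop's code, and the function names are hypothetical. It assumes the simple model from this report (total VRAM = total system RAM minus RAM used by other processes) and uses the rounded figures quoted here, so it lands at roughly 61GB, in the same range as the total I'd expect to see. The GPU usage value is a made-up placeholder, only there to show that the memory bar should not read 0% whenever the GPU is using memory.

```python
def effective_vram_total_gb(system_total_gb: float, used_by_other_gb: float) -> float:
    """Memory the GPU could still claim on a shared-memory system.

    Assumed model from this report, not nvtop's actual logic:
    total VRAM = total system RAM - RAM used by other system processes.
    """
    return max(system_total_gb - used_by_other_gb, 0.0)


def usage_percent(gpu_used_gb: float, total_gb: float) -> float:
    """GPU memory usage as a percentage of the total, guarding a zero total."""
    if total_gb <= 0:
        return 0.0
    return 100.0 * gpu_used_gb / total_gb


reported_total_gb = 121.0  # what 3.3.1 and newer show as total VRAM
other_usage_gb = 60.0      # approximate usage by other system processes
gpu_used_gb = 30.0         # hypothetical value, for illustration only

expected_total_gb = effective_vram_total_gb(reported_total_gb, other_usage_gb)
print(f"expected total VRAM: about {expected_total_gb:.0f} GB")
print(f"expected memory bar: about {usage_percent(gpu_used_gb, expected_total_gb):.0f} %")
```

Under this model, any non-zero GPU memory usage gives a non-zero percentage, which is why a memory bar pinned at 0% looks like a regression rather than a legitimately idle reading.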

It was working perfectly in 3.3.0, but in 3.3.1 and newer, nvtop is completely unusable on the DGX Spark.
