From 85b38e34133a4bee7fe2e8d77a3350fc31ba0bbe Mon Sep 17 00:00:00 2001
From: Sajid Ali
Date: Thu, 12 Mar 2026 11:32:44 -0400
Subject: [PATCH 1/2] highlight a common confusion

---
 .../02_getting_and_renewing_an_account.mdx | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/hpc/01_getting_started/02_getting_and_renewing_an_account.mdx b/docs/hpc/01_getting_started/02_getting_and_renewing_an_account.mdx
index 9de726f6e2..63f6092e6e 100644
--- a/docs/hpc/01_getting_started/02_getting_and_renewing_an_account.mdx
+++ b/docs/hpc/01_getting_started/02_getting_and_renewing_an_account.mdx
@@ -18,7 +18,12 @@ This section deals with the eligibility for getting HPC accounts, the process to
 - All **sponsored accounts** will be created for a period of 12 months, at which point a renewal process is required to continue to use the clusters
 - Faculty, students and staff from the **NYU School of Medicine** require the sponsorship of an eligible NYU faculty member to access the NYU HPC clusters
 - **Non-NYU Researchers** who are collaborating with NYU researchers must obtain an affiliate status before applying for an NYU HPC account
-- An HPC account gives you access to Torch, but an active allocation within the HPC projects management portal gives you access to a SLURM account which is needed to run jobs on Torch. More information on using the HPC project management portal can be found [here](./07_hpc_project_management_portal.mdx)
+
+:::
+
+:::warning
+
+An HPC account gives you access to Torch, but an active allocation within the HPC projects management portal gives you access to a SLURM account, which is needed to run jobs on Torch. More information on using the HPC project management portal can be found [here](./07_hpc_project_management_portal.mdx)
 
 :::
 

From 83b9a84350c79fad5c0e708b5a9c1c8e65d9e153 Mon Sep 17 00:00:00 2001
From: Sajid Ali
Date: Thu, 12 Mar 2026 11:37:21 -0400
Subject: [PATCH 2/2] highlight another common confusion

---
 docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md b/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md
index ec5560916a..8c9a47aa17 100644
--- a/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md
+++ b/docs/hpc/05_submitting_jobs/01_slurm_submitting_jobs.md
@@ -21,6 +21,12 @@ Jobs within the same partition cannot exceed their assigned resources (`QOSGrpGR
 
 Preemption allows non-stakeholders to temporarily use stakeholder resources (or a stakeholder group to temporarily use another group's resources). Stakeholders retain normal access to their own resources. If non-stakeholders (or other stakeholders) are using them, their jobs may be preempted (canceled) once stakeholders submit new jobs. Public users are allowed to use stakeholder resources only with preemption partitions. Refer to the section below for details on preemptible jobs.
 
+:::tip
+
+`QOSGrpGRES` indicates that there are currently no GPUs available in the partition; it does not reflect an issue with your individual account. In contrast, messages such as `QOSMaxMemoryPerUser` and `QOSMaxCpuPerUserLimit` indicate limits imposed on a user's account.
+
+:::
+
 ## Job Submission on Torch
 
 As stated in the tutorial, always request only the compute resources (e.g., GPUs, CPUs, memory) needed for the job. Requesting too many resources can prevent your job from being scheduled in a reasonable time. The `SLURM` scheduler will automatically dispatch jobs to all accessible GPU partitions that match resource requests.
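
The advice in PATCH 2/2 to request only the resources a job needs is applied through `#SBATCH` directives in the batch script. A hypothetical minimal example follows; the account name, resource sizes, and workload are placeholders, not Torch-specific values, and the `--account` line is where the SLURM account from the HPC project management portal (see PATCH 1/2) comes in.

```bash
#!/bin/bash
#SBATCH --job-name=demo            # placeholder job name
#SBATCH --account=my_project_acct  # hypothetical: the SLURM account from your HPC project allocation
#SBATCH --gres=gpu:1               # request only the GPUs the job will actually use
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00            # a tight, realistic walltime helps the scheduler place the job sooner

srun python train.py               # hypothetical workload
```

Trimming `--gres`, `--mem`, and `--time` to what the job measurably uses is what lets the scheduler backfill it into gaps rather than holding it for a large reservation.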
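
The distinction drawn in PATCH 2/2's tip (partition-wide `QOSGrpGRES` vs. per-user `QOSMax...` limits) shows up in the `REASON` column of `squeue`. A minimal sketch of checking it: the `squeue` format string uses standard Slurm specifiers, but the saved output below is a fabricated sample for illustration, not real Torch output.

```shell
# On the cluster you would inspect why your pending jobs are waiting with:
#   squeue -u "$USER" -o "%.10i %.9P %.8T %r"
# Here we filter a hypothetical saved copy of that output.
cat > /tmp/squeue_sample.txt <<'EOF'
     JOBID PARTITION    STATE REASON
    123456       gpu  PENDING QOSGrpGRES
    123457       gpu  PENDING QOSMaxMemoryPerUser
    123458       gpu  PENDING QOSMaxCpuPerUserLimit
EOF

# Partition-wide GPU contention: not a problem with your account
grep 'QOSGrpGRES' /tmp/squeue_sample.txt

# Per-user limits on your own account
grep 'QOSMax' /tmp/squeue_sample.txt
```

Jobs in the first category simply wait for GPUs to free up; jobs in the second may need smaller resource requests.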