Inconsistent Job Status Display in gridtk list After Resubmission Due to sacct Delay #17

@183amir

Description

When jobs are resubmitted to Slurm using gridtk resubmit, there is a delay before the updated job information appears in sacct. Since gridtk list relies on sacct internally, it continues to display the previous state of the job for some time after resubmission, while direct Slurm commands like squeue show the current accurate state.

Example demonstrating the issue:

$ gridtk resubmit
Resubmitted job 1
$ gridtk list
  job-id    slurm-id  nodes    state          job-name    output                    dependencies    command
--------  ----------  -------  -------------  ----------  ------------------------  --------------  ----------------------------------------------------------------------------------------
       1     2894248  node01   CANCELLED (0)  test-ff     logs/test-ff.2894248.out                  gridtk submit --time 0-8 --mem 32G test.sh
$ squeue --me
             JOBID PARTITION     NAME     USER ST       TIME        NODES  NODELIST(REASON)
           2894248       cpu   test-ff    amir   R       INVALID  1       mode01
$ gridtk list
  job-id    slurm-id  nodes    state          job-name    output                    dependencies    command
--------  ----------  -------  -------------  ----------  ------------------------  --------------  ----------------------------------------------------------------------------------------
       1     2894248  node01   CANCELLED (0)  test-ff     logs/test-ff.2894248.out                  gridtk submit --time 0-8 --mem 32G test.sh

As shown above, squeue correctly displays the job as running (status "R"), but gridtk list still shows it as "CANCELLED (0)" due to the delay in sacct updates.
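One possible workaround, sketched here under assumptions rather than taken from gridtk's code: query both sacct and squeue, and let the squeue state win for any job that is still in the queue. The function names `parse_pipe_table`, `merge_states`, and `query_states` are hypothetical and do not exist in gridtk; the Slurm flags used (`sacct -j -X --noheader --parsable2 --format`, `squeue --jobs --noheader --format` with `%i` and `%T`) are standard options.

```python
import subprocess


def parse_pipe_table(text):
    """Parse lines of the form 'jobid|STATE' into a {jobid: state} dict."""
    states = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        job_id, _, state = line.partition("|")
        states[job_id.strip()] = state.strip()
    return states


def merge_states(sacct_states, squeue_states):
    """Prefer squeue for queued/running jobs; fall back to sacct otherwise."""
    merged = dict(sacct_states)
    merged.update(squeue_states)  # squeue reflects the live scheduler state
    return merged


def query_states(job_ids, run=subprocess.run):
    """Hypothetical helper: return {slurm_id: state} for the given job ids.

    `run` is injectable so the logic can be exercised without a Slurm cluster.
    """
    ids = ",".join(str(j) for j in job_ids)
    sacct = run(
        ["sacct", "-j", ids, "-X", "--noheader", "--parsable2",
         "--format=JobID,State"],
        capture_output=True, text=True, check=True,
    )
    # squeue may exit non-zero when a job has already left the queue,
    # so its exit code is deliberately not checked here.
    squeue = run(
        ["squeue", "--jobs", ids, "--noheader", "--format=%i|%T"],
        capture_output=True, text=True, check=False,
    )
    return merge_states(
        parse_pipe_table(sacct.stdout), parse_pipe_table(squeue.stdout)
    )
```

With this approach, a resubmitted job that sacct still reports as cancelled would be listed with the state squeue reports, while finished jobs that no longer appear in squeue keep their sacct state. Array jobs and job steps are not handled in this sketch.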
