Skip to content

Releases: aws/aws-parallelcluster-node

AWS ParallelCluster v2.10.3

18 Mar 22:06

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster Node 2.10.3

This is associated with AWS ParallelCluster v2.10.3

CHANGES

  • There were no notable changes for this version.

AWS ParallelCluster v2.10.2

02 Mar 16:33

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster Node 2.10.2

This is associated with AWS ParallelCluster v2.10.2

CHANGES

  • There were no notable changes for this version.

AWS ParallelCluster v2.10.1

22 Dec 23:16

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster Node 2.10.1

This is associated with AWS ParallelCluster v2.10.1

ENHANCEMENTS

  • Improve error handling in slurm plugin processes when clustermgtd is down.

CHANGES

  • Increase max attempts when retrying on Route53 API call failures.

AWS ParallelCluster v2.10.0

18 Nov 16:21

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster Node 2.10.0

This is associated with AWS ParallelCluster v2.10.0

ENHANCEMENTS

  • Add new all_or_nothing_batch configuration parameter for slurm_resume script. When True, slurm_resume will
    succeed only if all the instances required by all the pending jobs in Slurm will be available.

CHANGES

  • CentOS 6 is no longer supported.
  • Optimize retrieval of nodes info from Slurm scheduler.
  • Improve retrieval of instance type info by using DescribeInstanceType API.
  • Increase timeout from 10 to 30 seconds when clustermgtd and computemgtd daemons invoke Slurm commands.

BUG FIXES

  • Retrieve the right number of compute instance slots when instance type is updated.
  • Fix a bug that was causing clustermgtd and computemgtd sleep interval to be incorrectly computed when
    system timezone is not set to UTC.

AWS ParallelCluster v2.9.1

15 Sep 14:38

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster Node 2.9.1.

This is associated with AWS ParallelCluster v2.9.1.

CHANGES

  • There were no notable changes for this version.

AWS ParallelCluster v2.9.0

11 Sep 18:53

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster Node 2.9.0

This is associated with AWS ParallelCluster v2.9.0

ENHANCEMENTS

  • Add support for multiple queues and multiple instance types feature with the Slurm scheduler.
    • Replace the previously available scaling components with: clustermgtd daemon that takes care of handling compute fleet management operations, included the processing of health check failures coming from EC2 and cluster start/stop operations; slurm_resume and slurm_suspend scripts that integrate with Slurm power saving plugin; computemgtd daemon that monitors the health of the system from the compute nodes.
    • Replace Auto Scaling Group with plain EC2 APIs when provisioning cluster nodes.
    • Register cluster nodes in a Route53 private hosted zone when DNS resolution is enabled for the cluster.
    • Register mapping between Slurm node names and EC2 instances in DynamoDB table.
    • Create log files for the new components in /var/log/parallelcluster/ dir.

AWS ParallelCluster v2.8.1

04 Aug 15:54

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster Node 2.8.1.

This is associated with AWS ParallelCluster v2.8.1.

CHANGES

  • There were no notable changes for this version.

AWS ParallelCluster v2.8.0

24 Jul 00:53

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster Node 2.8.0.

This is associated with AWS ParallelCluster v2.8.0

ENHANCEMENTS

  • Dynamically generate the architecture-specific portion of the path to the SGE binaries directory.

AWS ParallelCluster v2.7.0

19 May 08:37

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster Node 2.7.0

This is associated with AWS ParallelCluster v2.7.0

ENHANCEMENTS

  • sqswatcher: The daemon is now compatible with VPC Endpoints so that SQS messages can be passed without traversing the public internet.

AWS ParallelCluster v2.6.1

09 Apr 23:04

Choose a tag to compare

We're excited to announce the release of AWS ParallelCluster 2.6.1.

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Improved the management of SQS messages and retries to speed-up recovery times when failures occur.

CHANGES

  • Do not launch a replacement for an unhealthy or unresponsive node until this is terminated. This makes cluster slower at provisioning new nodes when failures occur but prevents any temporary over-scaling with respect to the expected capacity.
  • Increase parallelism when starting slurmd on compute nodes that join the cluster from 10 to 30.
  • Reduce the verbosity of messages logged by the node daemons.
  • Do not dump logs to /home/logs when nodewatcher encounters a failure and terminates the node. CloudWatch can be used to debug such failures.
  • Reduce the number of retries for failed REMOVE events in sqswatcher.

BUG FIXES

  • Fixed a bug in the ordering and retrying of SQS messages that was causing, under certain circumstances of heavy load, the scheduler configuration to be left in an inconsistent state.
  • Delete from queue the REMOVE events that are discarded due to hostname collision with another event fetched as part of the same sqswatcher iteration.

Support

Need help / have a feature request?
AWS Support: https://console.aws.amazon.com/support/home
ParallelCluster Issues tracker on GitHub: https://github.com/aws/aws-parallelcluster
The HPC Forum on the AWS Forums page: https://forums.aws.amazon.com/forum.jspa?forumID=192