Releases: aws/aws-parallelcluster

AWS ParallelCluster v2.11.9

02 Dec 12:18

We're excited to announce the release of AWS ParallelCluster 2.11.9

Upgrade

How to upgrade?

sudo pip install aws-parallelcluster==2.11.9

BUG FIXES

  • Prevent updating vpc_security_group_id when a managed FSx for Lustre file system is configured in the cluster.
    Doing so would result in file system deletion and potential data loss.

AWS ParallelCluster v3.3.1

03 Dec 00:49

We're excited to announce the release of AWS ParallelCluster 3.3.1

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

CHANGES

  • Allow the use of official product AMIs even after the two-year EC2 deprecation period.
  • Increase the memory size of the ParallelCluster API Lambda to 2048 MB to reduce the cold start penalty and avoid timeouts.

BUG FIXES

  • Prevent managed FSx for Lustre file systems from being replaced during a cluster update by no longer supporting changes to the compute fleet subnet ID.
  • Apply the DeletionPolicy defined on shared storage during cluster update operations as well.

AWS ParallelCluster v2.11.8

15 Nov 01:36

We're excited to announce the release of AWS ParallelCluster 2.11.8

Upgrade

How to upgrade?

sudo pip install aws-parallelcluster==2.11.8

CHANGES

  • Upgrade Intel MPI Library to 2021.6.0.602.
  • Upgrade EFA installer to 1.19.0
    • Efa-driver: efa-1.16.0-1
    • Efa-config: efa-config-1.11-1
    • Efa-profile: efa-profile-1.5-1
    • Libfabric-aws: libfabric-aws-1.16.0-1
    • Rdma-core: rdma-core-41.0-2
    • Open MPI: openmpi40-aws-4.1.4-3
  • Upgrade Python runtime used by Lambda functions in AWS Batch integration to python3.9.

BUG FIXES

  • Prevent cluster tags from being changed during an update, because this is not supported.

AWS ParallelCluster v3.1.5

16 Nov 13:54

We're excited to announce the release of AWS ParallelCluster 3.1.5

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

CHANGES

  • Upgrade EFA installer to 1.18.0
    • Efa-driver: efa-1.16.0-1
    • Efa-config: efa-config-1.11-1
    • Efa-profile: efa-profile-1.5-1
    • Libfabric-aws: libfabric-aws-1.16.0~amzn4.0-1
    • Rdma-core: rdma-core-41.0-2
    • Open MPI: openmpi40-aws-4.1.4-2
  • Add lambda:ListTags and lambda:UntagResource to ParallelClusterUserRole used by ParallelCluster API stack for cluster update.
  • Upgrade Intel MPI Library to 2021.6.0.602.
  • Upgrade NVIDIA driver to version 470.141.03.
  • Upgrade NVIDIA Fabric Manager to version 470.141.03.

BUG FIXES

  • Fix a Slurm issue that prevented the termination of idle nodes.

AWS ParallelCluster v3.3.0

02 Nov 15:06

We're excited to announce the release of AWS ParallelCluster 3.3.0

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Add the ability to specify multiple EC2 instance types for the same compute resource.
  • Add support for adding and removing shared storage at cluster update by updating the SharedStorage configuration.
  • Add new configuration parameter DeletionPolicy for EFS and FSx for Lustre shared storage to support storage retention.
  • Add new configuration section Scheduling/SlurmSettings/Database to enable accounting functionality in Slurm.
  • Add support for On-Demand Capacity Reservations and Capacity Reservations Resource Groups.
  • Add new configuration parameter in Imds/ImdsSettings to specify the IMDS version to support in a cluster or build image infrastructure.
  • Add support for Networking/PlacementGroup in the SlurmQueues/ComputeResources section.
  • Add support for instances with multiple network interfaces that allow only one ENI per device.
  • Improve validation of networking for external EFS file systems by checking the CIDR block in the attached security group.
  • Add validator to check if configured instance types support placement groups.
  • Configure NFS threads to be min(256, max(8, num_cores * 4)) to ensure better stability and performance.
  • Move NFS installation to build time to reduce configuration time.
  • Enable server-side encryption for the EcrImageBuilder SNS topic created when deploying ParallelCluster API and used to notify on docker image build events.
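
The NFS thread formula listed above, min(256, max(8, num_cores * 4)), can be checked with a short sketch. The function name `nfs_thread_count` is hypothetical and only illustrates the arithmetic; it is not taken from the ParallelCluster code base.

```python
def nfs_thread_count(num_cores: int) -> int:
    """Apply the formula from the release note: min(256, max(8, num_cores * 4))."""
    # Lower bound of 8 threads, upper bound of 256, otherwise 4 threads per core.
    return min(256, max(8, num_cores * 4))


if __name__ == "__main__":
    for cores in (1, 2, 16, 96):
        print(f"{cores} cores -> {nfs_thread_count(cores)} NFS threads")
```

Under this formula a 1-core or 2-core node gets the floor of 8 threads, a 16-core node gets 64, and anything with 64 or more cores is capped at 256.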

CHANGES

  • Change behaviour of SlurmQueues/Networking/PlacementGroup/Enabled: now it creates a different managed placement
    group for each compute resource instead of a single managed placement group for all compute resources.
  • Add support for PlacementGroup/Name as the preferred naming method.
  • Move head node tags from Launch Template to instance definition to avoid head node replacement on tags updates.
  • Disable multithreading through a script executed by cloud-init rather than through CpuOptions set in the Launch Template.
  • Upgrade Python to version 3.9 and NodeJS to version 16 in API infrastructure, API Docker container and cluster Lambda resources.
  • Remove support for Python 3.6 in aws-parallelcluster-batch-cli.
  • Upgrade Slurm to version 22.05.5.
  • Upgrade NVIDIA driver to version 470.141.03.
  • Upgrade NVIDIA Fabric Manager to version 470.141.03.
  • Upgrade NVIDIA CUDA Toolkit to version 11.7.1.
  • Upgrade Python used in ParallelCluster virtualenvs from 3.7.13 to 3.9.15.
  • Upgrade EFA installer to version 1.18.0.
  • Upgrade NICE DCV to version 2022.1-13300.
  • Allow for suppressing the SingleSubnetValidator for Queues.

BUG FIXES

  • Fix validation of filters parameter in ListClusterLogStreams command to fail when incorrect filters are passed.
  • Fix validation of parameter SharedStorage/EfsSettings: now validation fails when FileSystemId is specified
    along with other SharedStorage/EfsSettings parameters, whereas it was previously ignoring them.
  • Fix cluster update when changing the order of SharedStorage together with other changes in the configuration.
  • Fix UpdateParallelClusterLambdaRole in the ParallelCluster API to upload logs to CloudWatch.
  • Fix Cinc not using the local CA certificates bundle when installing packages before any cookbooks are executed.
  • Fix a hang when upgrading Ubuntu via pcluster build-image with Build:UpdateOsPackages:Enabled:true set.
  • Fix parsing of YAML cluster configuration by failing on duplicate keys.
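
The EfsSettings validation fix above can be pictured with a minimal sketch. It assumes the settings are already parsed into a plain dictionary and treats every key other than FileSystemId as conflicting, which is a simplification of the rule stated in the release note. The function name `validate_efs_settings` and the file system ID are hypothetical.

```python
def validate_efs_settings(efs_settings: dict) -> list:
    """Return a list of error messages; an empty list means the settings pass."""
    if "FileSystemId" not in efs_settings:
        # A new file system is being defined, so other parameters are expected.
        return []
    # Assumption: any parameter given next to FileSystemId counts as a conflict.
    extra = sorted(key for key in efs_settings if key != "FileSystemId")
    if extra:
        return ["FileSystemId cannot be combined with: " + ", ".join(extra)]
    return []


if __name__ == "__main__":
    existing_only = {"FileSystemId": "fs-0123456789abcdef0"}  # placeholder ID
    mixed = {"FileSystemId": "fs-0123456789abcdef0", "Encrypted": True}
    print(validate_efs_settings(existing_only))
    print(validate_efs_settings(mixed))
```

Returning a list of messages instead of raising on the first problem is a design choice for the sketch only; it makes the previously ignored parameters visible to the caller.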

AWS ParallelCluster v3.2.1

03 Oct 08:59

We're excited to announce the release of AWS ParallelCluster 3.2.1

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Improve the logic to associate the host routing tables to the different network cards to better support EC2 instances with several NICs.

CHANGES

  • Upgrade NVIDIA driver to version 470.141.03.
  • Upgrade NVIDIA Fabric Manager to version 470.141.03.
  • Disable cron job tasks man-db and mlocate, which may have a negative impact on node performance.
  • Upgrade Intel MPI Library to 2021.6.0.602.
  • Upgrade Python from 3.7.10 to 3.7.13 in response to a security risk.

BUG FIXES

  • Avoid failing on DescribeCluster when cluster configuration is not available.

AWS ParallelCluster v3.2.0

27 Jul 17:48
fdc0dfd

We're excited to announce the release of AWS ParallelCluster 3.2.0

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Add support for memory-based job scheduling in Slurm
    • Configure compute nodes real memory in the Slurm cluster configuration.
    • Add new configuration parameter Scheduling/SlurmSettings/EnableMemoryBasedScheduling to enable memory-based scheduling in Slurm.
    • Add new configuration parameter Scheduling/SlurmQueues/ComputeResources/SchedulableMemory to override default value of the memory seen by the scheduler on compute nodes.
  • Improve flexibility on cluster configuration updates to avoid the stop and start of the entire cluster whenever possible.
    • Add new configuration parameter Scheduling/SlurmSettings/QueueUpdateStrategy to set the preferred strategy to adopt for compute nodes needing a configuration update and replacement.
  • Improve failover mechanism over available compute resources when hitting insufficient capacity issues with EC2 instances. Disable compute nodes for a configurable amount of time (default 10 min) when a node launch fails due to insufficient capacity.
  • Add support to mount existing FSx for ONTAP and FSx for OpenZFS file systems.
  • Add support to mount multiple instances of existing EFS, FSx for Lustre, FSx for ONTAP, and FSx for OpenZFS file systems.
  • Add support for FSx for Lustre Persistent_2 deployment type when creating a new file system.
  • Prompt user to enable EFA for supported instance types when using pcluster configure wizard.
  • Add support for rebooting compute nodes via Slurm.
  • Improve handling of Slurm power states to also account for manual powering down of nodes.
  • Add NVIDIA GDRCopy 2.3 into the product AMIs to enable low-latency GPU memory copy.
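
The insufficient-capacity failover described above amounts to a time-based cooldown: after a failed launch, the affected capacity is skipped for a while and other options are tried first. The sketch below is one way to picture that behaviour, not the ParallelCluster implementation. The class `CapacityCooldown`, its methods, and the idea of tracking the cooldown per compute resource name are all assumptions made for illustration; only the 10-minute default comes from the release note.

```python
import time

# The release note states a default of 10 minutes.
DEFAULT_TIMEOUT_SECONDS = 10 * 60


class CapacityCooldown:
    """Track compute resources that are temporarily skipped after a failed launch."""

    def __init__(self, timeout_seconds=DEFAULT_TIMEOUT_SECONDS, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._disabled_until = {}

    def record_insufficient_capacity(self, compute_resource):
        """Disable a compute resource until the timeout has elapsed."""
        self._disabled_until[compute_resource] = self._clock() + self._timeout

    def is_available(self, compute_resource):
        """Return True if the resource was never disabled or its cooldown expired."""
        deadline = self._disabled_until.get(compute_resource)
        if deadline is None:
            return True
        if self._clock() >= deadline:
            del self._disabled_until[compute_resource]
            return True
        return False

    def pick(self, candidates):
        """Return the first available candidate, or None if all are cooling down."""
        for candidate in candidates:
            if self.is_available(candidate):
                return candidate
        return None
```

A caller would invoke `record_insufficient_capacity` when a launch fails and `pick` when choosing where to place the next node; once the timeout passes, the disabled resource becomes eligible again without any manual reset.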

CHANGES

  • Upgrade EFA installer to version 1.17.2
    • EFA driver: efa-1.16.0-1
    • EFA configuration: efa-config-1.10-1
    • EFA profile: efa-profile-1.5-1
    • Libfabric: libfabric-aws-1.16.0~amzn2.0-1
    • RDMA core: rdma-core-41.0-2
    • Open MPI: openmpi40-aws-4.1.4-2
  • Upgrade NICE DCV to version 2022.0-12760.
  • Upgrade NVIDIA driver to version 470.129.06.
  • Upgrade NVIDIA Fabric Manager to version 470.129.06.
  • Change default EBS volume types from gp2 to gp3 for both the root and additional volumes.
  • Changes to FSx for Lustre file systems created by ParallelCluster:
    • Change the default deployment type to Scratch_2.
    • Change the Lustre server version to 2.12.
  • Do not require PlacementGroup/Enabled to be set to true when passing an existing PlacementGroup/Id.
  • Add parallelcluster:cluster-name tag to all the resources created by ParallelCluster.
  • Do not allow setting PlacementGroup/Id when PlacementGroup/Enabled is explicitly set to false.
  • Add lambda:ListTags and lambda:UntagResource to ParallelClusterUserRole used by ParallelCluster API stack for cluster update.
  • Restrict IPv6 access to IMDS to root and cluster admin users only, when configuration parameter HeadNode/Imds/Secured is true (the default).
  • With a custom AMI, use the AMI root volume size instead of the ParallelCluster default of 35 GiB. The value can be changed in the cluster configuration file.
  • Automatically disable the compute fleet when the configuration parameter Scheduling/SlurmQueues/ComputeResources/SpotPrice
    is lower than the minimum required Spot request fulfillment price.
  • Show requested_value and current_value values in the change set when adding or removing a section during an update.
  • Disable aws-ubuntu-eni-helper service in DLAMI to avoid conflicts with configure_nw_interface.sh when configuring instances with multiple network cards.
  • Remove support for Python 3.6.
  • Set MTU to 9001 for all the network interfaces when configuring instances with multiple network cards.
  • Remove the trailing dot when configuring the compute node FQDN.
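
The two PlacementGroup rules in this list (an existing Id no longer requires Enabled to be true, but an Id combined with Enabled explicitly set to false is rejected) can be summarised in a small sketch. The function name `validate_placement_group`, the error text, and the placement group name are hypothetical; the real validator may differ.

```python
def validate_placement_group(placement_group: dict) -> list:
    """Return error messages for a PlacementGroup section given as a dict."""
    errors = []
    has_id = bool(placement_group.get("Id"))
    # None means Enabled was not set at all, which differs from an explicit False.
    enabled = placement_group.get("Enabled")
    if has_id and enabled is False:
        errors.append(
            "PlacementGroup/Id cannot be set when PlacementGroup/Enabled is false"
        )
    return errors


if __name__ == "__main__":
    print(validate_placement_group({"Id": "my-existing-pg"}))
    print(validate_placement_group({"Id": "my-existing-pg", "Enabled": False}))
```

The sketch distinguishes "not set" from "explicitly false", since the release note only rejects the explicit case.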

BUG FIXES

  • Fix the default behavior to skip the ParallelCluster validation and test steps when building a custom AMI.
  • Fix file handle leak in computemgtd.
  • Fix a race condition that sporadically caused launched instances to be terminated immediately because they were not yet available in the EC2 DescribeInstances response.
  • Fix support for DisableSimultaneousMultithreading parameter on instance types with Arm processors.
  • Fix ParallelCluster API stack update failure when upgrading from a previous version. Add resource pattern used for the ListImagePipelineImages action in the EcrImageDeletionLambdaRole.
  • Fix ParallelCluster API by adding missing permissions needed to import/export from S3 when creating an FSx for Lustre storage.

AWS ParallelCluster v2.11.7

13 May 16:46

We're excited to announce the release of AWS ParallelCluster 2.11.7

Upgrade

How to upgrade?

sudo pip install aws-parallelcluster==2.11.7

CHANGES

  • Upgrade Slurm to version 20.11.9.

AWS ParallelCluster v3.1.4

16 May 19:57

We're excited to announce the release of AWS ParallelCluster 3.1.4

Upgrade

How to upgrade?

sudo pip install --upgrade aws-parallelcluster

ENHANCEMENTS

  • Add validation for DirectoryService/PasswordSecretArn to fail in case the secret does not exist.

CHANGES

  • Upgrade Slurm to version 21.08.8-2.
  • Build Slurm with JWT support.
  • Do not require PlacementGroup/Enabled to be set to true when passing an existing PlacementGroup/Id.
  • Add lambda:TagResource to ParallelClusterUserRole used by ParallelCluster API stack for cluster creation and image creation.

BUG FIXES

  • Fix the ability to export a cluster's logs when using the export-cluster-logs command with the --filters option.
  • Fix AWS Batch Docker entrypoint to use /home shared directory to coordinate Multi-node-Parallel job execution.

AWS ParallelCluster v2.11.6

19 Apr 13:27

We're excited to announce the release of AWS ParallelCluster 2.11.6

Upgrade

How to upgrade?

sudo pip install aws-parallelcluster==2.11.6

ENHANCEMENTS

  • Improve exception management in case of missing networking.

CHANGES

  • OS package updates and security fixes.