d. Create a Cluster Config

Now that you have installed AWS ParallelCluster and set up the foundation, you can create a configuration file to build a simple HPC cluster. This file is generated in your home directory.

Generate the cluster with the following settings:

  • Head node and compute nodes: m5.2xlarge and c5n.18xlarge instances, respectively. You can change the instance types if you like, but you may run into EC2 limits that prevent you from creating instances or cap how many you can create.
  • AWS ParallelCluster (since version 2.9) supports multiple instance types and multiple queues.
  • We use a placement group in this lab. A placement group will spin up instances close together, in a single network spine, inside one physical data center located in a specific Availability Zone to maximize the bandwidth and reduce the latency between instances.
  • In this lab, the cluster has 0 compute nodes when starting and a maximum of 2 instances. AWS ParallelCluster will grow and shrink between the min and max limits based on the cluster utilization and job queue backlog.
  • A gp2 Amazon EBS volume is attached to the head node and then shared through NFS so that the compute nodes can mount it on /shared. This is generally a good location to store applications or scripts. Keep in mind that the /home directory is shared over NFS as well.
  • Slurm is used as the job scheduler.
  • We disable Intel Hyper-Threading by setting DisableSimultaneousMultithreading: true in the configuration file.
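
To make the scaling limits and the Hyper-Threading setting concrete, here is a small arithmetic sketch of how many physical cores the queue can offer at full scale. It assumes the published figure of 72 vCPUs for c5n.18xlarge and two hardware threads per core; the variable names are illustrative only and are not used elsewhere in this lab.

```shell
# Illustrative arithmetic only; these variable names are not part of the lab.
VCPUS_PER_NODE=72     # vCPUs on one c5n.18xlarge
THREADS_PER_CORE=2    # hardware threads per core when Hyper-Threading is on
MAX_NODES=2           # upper scaling limit used in this lab

# With Hyper-Threading disabled, each node exposes one thread per physical core.
cores_per_node=$(( VCPUS_PER_NODE / THREADS_PER_CORE ))
max_cores=$(( cores_per_node * MAX_NODES ))

echo "Cores per compute node: ${cores_per_node}"
echo "Cores at full scale:    ${max_cores}"
```

If you change the compute instance type, adjust the vCPU count accordingly before sizing your jobs.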

For more details about the AWS ParallelCluster configuration options, see the AWS ParallelCluster User Guide.

For now, paste the following commands in your terminal:

  1. First, make sure all the required environment variables from the previous section are set. Source the env_vars file generated earlier in your working directory:
source env_vars
  1. You can have a look at them by running:
echo ${AWS_REGION}
echo ${INSTANCES}
echo ${SSH_KEY_NAME}
echo ${VPC_ID}
echo ${SUBNET_ID}
echo ${CUSTOM_AMI}
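
Reading six echo lines by eye makes it easy to miss an empty value. As an optional extra, the following sketch defines a hypothetical helper, check_env_vars (not part of the workshop or of any AWS tool), that reports which of the named variables are unset or empty.

```shell
# Hypothetical helper: print the name of every listed variable that is
# unset or empty, and return 1 if at least one is missing.
check_env_vars() {
    _missing=0
    for _name in "$@"; do
        # Look up the value of the variable whose name is stored in _name.
        eval "_value=\${${_name}:-}"
        if [ -z "${_value}" ]; then
            echo "MISSING: ${_name}" >&2
            _missing=1
        fi
    done
    return "${_missing}"
}
```

A possible invocation after sourcing env_vars is: check_env_vars AWS_REGION INSTANCES SSH_KEY_NAME VPC_ID SUBNET_ID CUSTOM_AMI. No output means every variable has a value.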
  1. Retrieve the NCAR WRF v4 AMI

NCAR provides an Amazon Machine Image (AMI) that contains a compiled version of WRF v4. You will leverage this AMI to run WRF on a test case in the next section of this lab.

Your env_vars file already contains the AMI ID you need. For reference, the following command shows how this AMI ID was retrieved:

# Select the most recently created image whose name matches the filter
CUSTOM_AMI=$(aws ec2 describe-images --owners 280472923663 \
    --query 'Images[*].{ImageId:ImageId,CreationDate:CreationDate}' \
    --filters "Name=name,Values=*-amzn2-parallelcluster-3.1.2-wrf-4.2.2-*" \
    --region "${AWS_REGION}" \
    | jq -r 'sort_by(.CreationDate)[-1] | .ImageId')
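
If you run the lookup yourself, keep in mind that when no image matches the filter, the jq expression prints the literal string null rather than failing. The sketch below defines a hypothetical helper, is_ami_id, that checks whether a value has the usual AMI ID shape (ami- followed by 8 or 17 hexadecimal characters). It only checks the format; it does not confirm that the image exists in your Region.

```shell
# Hypothetical helper: succeed only if the argument looks like an AMI ID.
is_ami_id() {
    printf '%s\n' "$1" | grep -Eq '^ami-([0-9a-f]{8}|[0-9a-f]{17})$'
}
```

For example, is_ami_id "${CUSTOM_AMI}" || echo "CUSTOM_AMI does not look like an AMI ID" flags an empty or null result before it ends up in the cluster configuration.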

  1. Build the custom config file for ParallelCluster
cat > my-cluster-config.yaml << EOF
HeadNode:
  InstanceType: m5.2xlarge
  Ssh:
    KeyName: ${SSH_KEY_NAME}
  Networking:
    SubnetId: ${SUBNET_ID}
  LocalStorage:
    RootVolume:
      Size: 50
  Iam:
    AdditionalIamPolicies:
      - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
  Dcv:
    Enabled: true
  Imds:
    Secured: true
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: queue0
      ComputeResources:
        - Name: queue0-c5n18xlarge
          MinCount: 0
          MaxCount: 2
          InstanceType: c5n.18xlarge
          DisableSimultaneousMultithreading: true
          Efa:
            Enabled: true
      Networking:
        SubnetIds:
          - ${SUBNET_ID}
        PlacementGroup:
          Enabled: true
      ComputeSettings:
        LocalStorage:
          RootVolume:
            Size: 50
Region: ${AWS_REGION}
Image:
  Os: alinux2
  CustomAmi: ${CUSTOM_AMI}
SharedStorage:
  - Name: Ebs0
    StorageType: Ebs
    MountDir: /shared
    EbsSettings:
      VolumeType: gp2
      DeletionPolicy: Delete
      Size: 50

EOF
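
Because the EOF delimiter is unquoted, the shell substitutes ${SSH_KEY_NAME}, ${SUBNET_ID}, ${AWS_REGION} and ${CUSTOM_AMI} while writing the file, so an unset variable silently leaves an empty value behind. The following sketch defines a hypothetical helper, config_has_empty_values, that lists lines where a key or list item is followed only by whitespace. It is a rough text check under that assumption, not a YAML validator.

```shell
# Hypothetical helper: print (with line numbers) every line that ends in
# "key: " or "- " followed by nothing, which is what an unset variable
# leaves behind. Succeeds if at least one such line is found.
config_has_empty_values() {
    grep -nE '(:|-)[[:space:]]+$' "$1"
}
```

For example, config_has_empty_values my-cluster-config.yaml && echo "Fix the variables above and regenerate the file" prints the offending lines, if any. Parent keys such as HeadNode: are not reported because nothing, not even a space, follows their colon.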

Now, you are ready to launch a cluster! Proceed to the next section.