GPU accelerator configuration
The accelerator role allows users to set up the AMD ROCm platform or the NVIDIA CUDA toolkit. These tools let workloads make use of the GPUs installed on the target nodes.
Enter all required parameters in input/accelerator_config.yml.
Note
- Nodes provisioned using the Omnia provision tool do not require a Red Hat subscription to run accelerator.yml on RHEL target nodes.
- For RHEL target nodes not provisioned by Omnia, ensure that a Red Hat subscription is enabled on every target node.
- If cuda_toolkit_path is provided in input/provision_config.yml and NVIDIA GPUs are available on the target nodes, CUDA packages are deployed after provisioning, without user intervention, during the execution of provision.yml.
- AMD ROCm driver installation is not supported by Omnia on Rocky cluster nodes.
To install all the latest GPU drivers and toolkits, run:
cd accelerator
ansible-playbook accelerator.yml -i inventory
(where inventory consists of manager, cluster and login nodes)
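Before a long playbook run, it can help to confirm that the inventory file exists, parses, and that its hosts are reachable. The wrapper below is a minimal sketch and is not part of Omnia: the function name run_accelerator is hypothetical, and it assumes it is invoked from the accelerator directory with Ansible installed. It uses only the standard ansible-inventory --graph and ansible -m ping commands ahead of the playbook call shown above.

```shell
# run_accelerator: hypothetical preflight wrapper (not shipped with Omnia).
# Usage: run_accelerator [inventory_file]   (defaults to ./inventory)
run_accelerator() {
  local inventory="${1:-inventory}"

  # Stop early if the inventory file is missing or empty
  if [ ! -s "$inventory" ]; then
    echo "inventory file '$inventory' is missing or empty" >&2
    return 1
  fi

  # Show the hosts and groups exactly as Ansible parses them
  ansible-inventory -i "$inventory" --graph || return 1

  # Confirm that every host in the inventory is reachable
  ansible all -i "$inventory" -m ping || return 1

  # Run the accelerator playbook as described above
  ansible-playbook accelerator.yml -i "$inventory"
}
```

After sourcing the function, calling run_accelerator inventory performs the checks and then runs the playbook. If any check fails, the playbook is not started.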
The following configurations take place when running accelerator.yml:
- Servers with AMD GPUs are identified, and the latest GPU drivers and ROCm platform are downloaded and installed.
- Servers with NVIDIA GPUs are identified, and the specified CUDA toolkit is downloaded and installed.
- For the rare servers with both NVIDIA and AMD GPUs installed, all of the drivers and toolkits listed above are installed on the server.
- Servers with neither type of GPU are skipped.
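The four cases above can be previewed on a node before running the playbook. The helper below is only an illustration and is not how Omnia itself detects GPUs: the function name classify_gpus is hypothetical, and it assumes lspci-style text on standard input, with GPUs reported as VGA, 3D, or Display controllers. Note that an AMD integrated graphics device would also be reported as amd.

```shell
# classify_gpus: hypothetical helper (not Omnia's detection logic).
# Reads lspci-style text on stdin and prints one of: both, amd, nvidia, none.
classify_gpus() {
  local listing gpus has_amd=no has_nvidia=no
  listing="$(cat)"

  # Keep only display-class devices so AMD CPUs or chipsets are not counted
  gpus="$(printf '%s\n' "$listing" |
    grep -Ei 'VGA compatible controller|3D controller|Display controller' || true)"

  if printf '%s\n' "$gpus" | grep -qi 'NVIDIA'; then
    has_nvidia=yes
  fi
  if printf '%s\n' "$gpus" | grep -Eqi 'Advanced Micro Devices|AMD/ATI'; then
    has_amd=yes
  fi

  # Map the two flags onto the four cases described above
  case "$has_amd:$has_nvidia" in
    yes:yes) echo both ;;
    yes:no)  echo amd ;;
    no:yes)  echo nvidia ;;
    *)       echo none ;;
  esac
}
```

On a target node, lspci | classify_gpus prints which case applies. A result of none corresponds to a server that the playbook would skip.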
If you have any feedback about Omnia documentation, please reach out at omnia.readme@dell.com.