Input parameters for the cluster

These parameters are located in input/omnia_config.yml, input/security_config.yml, input/telemetry_config.yml and [optional] input/storage_config.yml.

Caution

Do not remove or comment out any lines in the input/omnia_config.yml, input/security_config.yml and [optional] input/storage_config.yml files.

omnia_config.yml

Parameters for Kubernetes

Variables

Details

scheduler_type

string

Required

  • Job scheduler to be installed across all nodes in the cluster.

  • To install Slurm, provide scheduler_type: "slurm"

  • To install Kubernetes, provide scheduler_type: "k8s"

  • To install both Slurm and Kubernetes, provide scheduler_type: "slurm,k8s"

Default value: "slurm"

k8s_version

string

Required

  • Kubernetes version.

  • Required when scheduler_type: "k8s"

    Choices:

    • "1.19.3" <- default

    • "1.16.7"

k8s_cni

string

Required

  • Kubernetes SDN network.

  • Required when scheduler_type: "k8s"

    Choices:

    • "calico" <- default

    • "flannel"

k8s_pod_network_cidr

string

Required

  • Kubernetes pod network CIDR.

  • Make sure this value does not overlap with any of the host networks.

  • Required when scheduler_type: "k8s"

    Default value: "10.244.0.0/16"
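The overlap requirement above can be checked before deployment. The following is a minimal sketch using Python's standard ipaddress module; the function name and the host network list are hypothetical and are not part of Omnia.

```python
import ipaddress


def overlapping_networks(pod_cidr, host_networks):
    """Return the host networks that overlap the pod network CIDR."""
    pod_net = ipaddress.ip_network(pod_cidr)
    return [
        net for net in host_networks
        if pod_net.overlaps(ipaddress.ip_network(net))
    ]


# Hypothetical host networks, for illustration only.
host_networks = ["10.5.0.0/16", "172.29.0.0/16"]

# The documented default pod CIDR against the example host networks.
print(overlapping_networks("10.244.0.0/16", host_networks))
# A pod CIDR chosen inside one of the host networks is reported.
print(overlapping_networks("10.5.128.0/17", host_networks))
```

An empty list means the chosen k8s_pod_network_cidr does not collide with any of the networks listed.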

docker_username

string

Optional

  • Username for the Docker Hub account.

  • A Kubernetes secret will be created and patched to the service account in the default namespace. This Kubernetes secret can be used to pull images from private repositories.

  • This value is optional but recommended to avoid Docker pull limit issues.

  • The first character of the string should be a letter.

docker_password

string

Optional

  • Password for the Docker Hub account.

  • This value is mandatory if docker_username is provided.

  • The first character of the string should be a letter.

ansible_config_file_path

string

Required

  • Path to the directory hosting the Ansible config file (ansible.cfg).

  • This directory is on the host running Ansible, if Ansible is installed using dnf.

  • If Ansible is installed using pip, this path should be set.

    Default value: /etc/ansible

enable_omnia_nfs

boolean [1]

Required

  • Boolean indicating whether no parallel file system is running in the environment, in which case a shared file system (NFS/BeeGFS) is used to create the home directory/Kubernetes share directory.

  • When this variable is true, Omnia will create its own NFS share and mount omnia_usrhome_share on all the nodes.

    Choices:

    • true <- default

    • false

omnia_usrhome_share

string

Required

  • Path to the directory that will be shared across all nodes in the cluster.

  • If enable_omnia_nfs: true, an NFS share will be created at this path.

  • If enable_omnia_nfs: false, set this variable to the path of the parallel file system (NFS/BeeGFS) running in the system.

Default value: "/home/omnia-share"

Parameters for Slurm setup

Variables

Details

scheduler_type

string

Required

  • Job scheduler to be installed across all nodes in the cluster.

  • To install Slurm, provide scheduler_type: "slurm"

  • To install Kubernetes, provide scheduler_type: "k8s"

  • To install both Slurm and Kubernetes, provide scheduler_type: "slurm,k8s"

Default value: "slurm"

mariadb_password

string

Optional

  • Password used for the Slurm database.

  • The password must not contain any of these characters: - \ ' "

  • The password should be at least 8 characters long.

  • Required when scheduler_type: "slurm".

  • The first character of the string should be a letter.

    Default value: "password"
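The password rules above can be sketched as a small pre-flight check. This is an illustration only, not Omnia's own validation: the function name is hypothetical, and the forbidden set used here (- \ ' ") is an assumption that should be adjusted to match your Omnia release.

```python
# Assumed forbidden characters: hyphen, backslash, single quote, double quote.
FORBIDDEN_CHARS = set("-\\'\"")


def password_problems(password, min_length=8):
    """Return a list of rule violations; an empty list means the password passes."""
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if any(ch in FORBIDDEN_CHARS for ch in password):
        problems.append("contains a forbidden character")
    # str.isalpha() also accepts non-ASCII letters; tighten this if needed.
    if not password[:1].isalpha():
        problems.append("does not start with a letter")
    return problems


print(password_problems("Omnia2024pass"))
print(password_problems("1short"))
```

The min_length argument lets the same check be reused for parameters that document a different minimum length.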

ansible_config_file_path

string

Required

  • Path to the directory hosting the Ansible config file (ansible.cfg).

  • This directory is on the host running Ansible, if Ansible is installed using dnf.

  • If Ansible is installed using pip, this path should be set.

    Default value: /etc/ansible

enable_omnia_nfs

boolean [1]

Required

  • Boolean indicating whether no parallel file system is running in the environment, in which case a shared file system (NFS/BeeGFS) is used to create the home directory/Kubernetes share directory.

  • When this variable is true, Omnia will create its own NFS share and mount omnia_usrhome_share on all the nodes.

    Choices:

    • true <- default

    • false

omnia_usrhome_share

string

Required

  • Path to the directory that will be shared across all nodes in the cluster.

  • If enable_omnia_nfs: true, an NFS share will be created at this path.

  • If enable_omnia_nfs: false, set this variable to the path of the parallel file system (NFS/BeeGFS) running in the system.

Default value: "/home/omnia-share"

security_config.yml

Parameters for FreeIPA

Parameter

Details

freeipa_required

boolean [1] Required

Boolean indicating whether FreeIPA is required or not.

Choices:

  • true <- Default

  • false

realm_name

string Required

Sets the intended realm name.

Default value: OMNIA.TEST

directory_manager_password

string Required

  • Password authenticating admin-level access to the directory for system management tasks.

  • It will be added to the instance of the directory server created for IPA.

  • Required length: 8 characters.

  • The password must not contain any of these characters: - \ ' "

  • The first character of the string should be a letter.

kerberos_admin_password

string Required

  • Password for the "admin" user of the IPA server on RockyOS.

  • The first character of the string should be a letter.

domain_name

string Required

Sets the intended domain name

Default value: omnia.test

Parameters for LDAP

Parameter

Details

ldap_required

boolean [1] Optional

Boolean indicating whether an LDAP client is required or not.

Choices:

  • false <- Default

  • true

domain_name

string Optional

Sets the intended domain name.

Default value: omnia.test

ldap_server_ip

string Optional

LDAP server IP. Required if ldap_required is true. There should be an explicit LDAP server running on this IP.

ldap_connection_type

string Optional

For a TLS connection, provide a valid certificate path. For an SSL connection, ensure port 636 is open.

Default value: TLS

ldap_ca_cert_path

string Optional

This variable accepts the server certificate path. Make sure the certificate is present at the path provided. The certificate should have a .pem or .crt extension. This variable is mandatory if the connection type is TLS.

Default value: /etc/openldap/certs/omnialdap.pem

user_home_dir

string Optional

This variable accepts the user home directory path for LDAP configuration. If an NFS mount is created for user home directories, make sure you provide the mounted home directory path of the LDAP users.

Default value: /home

ldap_bind_username

string Optional

If the LDAP server is configured with a bind DN, provide the bind DN user here. If this value is not provided when bind is configured on the server, LDAP authentication fails. Omnia does not validate this input, so ensure that it is valid.

Default value: admin

ldap_bind_password

string Optional

  • If the LDAP server is configured with a bind DN, provide the bind DN password here. If this value is not provided when bind is configured on the server, LDAP authentication fails. Omnia does not validate this input, so ensure that it is valid.

  • The first character of the string should be a letter.

enable_secure_login_node

boolean [1] Optional

Boolean value deciding whether security features are enabled on the Login Node.

Choices:

  • false <- Default

  • true

storage_config.yml

Name

Details

nfs_client_params

JSON list Optional

If NFS client services are to be deployed, enter the required configuration here in JSON format. The server_ip provided should have an explicit NFS server running. If left blank, no NFS configuration takes place. Possible values include:

  1. Single NFS file system: A single file system from a single NFS server is mounted.

     Sample value:

     - { server_ip: xx.xx.xx.xx, server_share_path: "/mnt/share", client_share_path: "/mnt/client", client_mount_options: "nosuid,rw,sync,hard,intr" }

  2. Multiple Mount NFS file system: Multiple file systems from a single NFS server are mounted.

     Sample values:

     - { server_ip: xx.xx.xx.xx, server_share_path: "/mnt/server1", client_share_path: "/mnt/client1", client_mount_options: "nosuid,rw,sync,hard,intr" }

     - { server_ip: xx.xx.xx.xx, server_share_path: "/mnt/server2", client_share_path: "/mnt/client2", client_mount_options: "nosuid,rw,sync,hard,intr" }

  3. Multiple NFS file systems: Multiple file systems are mounted from multiple servers.

     Sample values:

     - { server_ip: zz.zz.zz.zz, server_share_path: "/mnt/share1", client_share_path: "/mnt/client1", client_mount_options: "nosuid,rw,sync,hard,intr" }

     - { server_ip: xx.xx.xx.xx, server_share_path: "/mnt/share2", client_share_path: "/mnt/client2", client_mount_options: "nosuid,rw,sync,hard,intr" }

     - { server_ip: yy.yy.yy.yy, server_share_path: "/mnt/share3", client_share_path: "/mnt/client3", client_mount_options: "nosuid,rw,sync,hard,intr" }

Default value: { server_ip: , server_share_path: , client_share_path: , client_mount_options: }

beegfs_support

boolean Optional

This variable is used to install beegfs-client on compute and manager nodes

Choices:

  • false <- Default

  • true

beegfs_rdma_support

boolean Optional

Set this variable to true if the cluster has RDMA-capable network hardware (e.g., InfiniBand).

Choices:

  • false <- Default

  • true

beegfs_ofed_kernel_modules_path

string Optional

The path where separate OFED kernel modules are installed.

Default value: "/usr/src/ofa_kernel/default/include"

beegfs_mgmt_server

string Required

BeeGFS management server IP. Note: The provided IP should have an explicit BeeGFS management server running.

beegfs_mounts

string Optional

BeeGFS client file system mount location. If storage_config.yml is being used to change the BeeGFS mount location, set beegfs_unmount_client to true.

Default value: "/mnt/beegfs"

beegfs_unmount_client

boolean [1] Optional

Changing this value to true will unmount the running instance of the BeeGFS client. It should only be used when decommissioning BeeGFS, changing the mount location, or changing the BeeGFS version.

Choices:

  • false <- Default

  • true

beegfs_client_version

string Optional

BeeGFS client version needed on compute and manager nodes.

Default value: 7.2.6

beegfs_version_change

boolean [1] Optional

Use this variable to change the BeeGFS version on the target nodes.

Choices:

  • false <- Default

  • true

beegfs_secret_storage_filepath

string Required

  • The file path (including the file name) where the connauthfile is placed.

  • Required for BeeGFS version >= 7.2.7

    Default value: /home/connauthfile
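The version condition above needs a numeric comparison rather than a string comparison (as strings, "7.10.0" would sort before "7.2.7"). A minimal sketch, assuming beegfs_client_version is a purely numeric dotted string; the function name is hypothetical:

```python
def needs_connauthfile(beegfs_client_version):
    """True when the BeeGFS version is 7.2.7 or later, compared part by part."""
    version = tuple(int(part) for part in beegfs_client_version.split("."))
    return version >= (7, 2, 7)


# The documented default client version and the first version that needs the file.
print(needs_connauthfile("7.2.6"))
print(needs_connauthfile("7.2.7"))
```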

telemetry_config.yml

Parameters

Parameter

Details

idrac_telemetry_support

boolean [1]

Required

  • Enables iDRAC telemetry support and visualizations.

  • Values:

    • false <- Default

    • true

Note

When idrac_telemetry_support is true, mysqldb_user, mysqldb_password and mysqldb_root_password become mandatory.

omnia_telemetry_support

boolean [1]

Required

  • Starts or stops Omnia telemetry.

  • If omnia_telemetry_support is true, at least one of collect_regular_metrics, collect_health_check_metrics or collect_gpu_metrics should also be true for any metrics to be collected.

  • If omnia_telemetry_support is false, telemetry acquisition will be stopped.

  • Values:

    • false <- Default

    • true
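The relationship between omnia_telemetry_support and the three collect_* flags can be sketched as follows. This is an illustration under the assumption that the YAML file has already been loaded into a dictionary; the function name is hypothetical and is not part of Omnia.

```python
METRIC_GROUPS = (
    "collect_regular_metrics",
    "collect_health_check_metrics",
    "collect_gpu_metrics",
)


def collects_any_metrics(config):
    """True when telemetry is enabled and at least one metric group is on."""
    return bool(config.get("omnia_telemetry_support")) and any(
        config.get(group, False) for group in METRIC_GROUPS
    )


# Telemetry enabled but every metric group disabled: nothing would be collected.
config = {
    "omnia_telemetry_support": True,
    "collect_regular_metrics": False,
    "collect_health_check_metrics": False,
    "collect_gpu_metrics": False,
}
print(collects_any_metrics(config))
```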

visualization_support

boolean [1]

Required

  • Enables visualizations.

  • Values:

    • false <- Default

    • true

Note

When visualization_support is true, grafana_username and grafana_password become mandatory.
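The two notes above describe values that become mandatory only when a flag is enabled. A minimal sketch of that dependency, assuming the configuration has been loaded into a dictionary (the table and function names are hypothetical, and only the requirements stated in the two notes are covered):

```python
CONDITIONAL_REQUIREMENTS = {
    "idrac_telemetry_support": [
        "mysqldb_user",
        "mysqldb_password",
        "mysqldb_root_password",
    ],
    "visualization_support": ["grafana_username", "grafana_password"],
}


def missing_conditional_values(config):
    """List values that are mandatory because their controlling flag is true."""
    missing = []
    for flag, required in CONDITIONAL_REQUIREMENTS.items():
        if config.get(flag):
            missing.extend(name for name in required if not config.get(name))
    return missing


# Hypothetical input: iDRAC telemetry is on but only the user name is set.
config = {
    "idrac_telemetry_support": True,
    "visualization_support": False,
    "mysqldb_user": "omnia",
}
print(missing_conditional_values(config))
```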

appliance_k8s_pod_net_cidr

string

Required

  • Kubernetes pod network CIDR for appliance k8s network.

  • Make sure this value does not overlap with any of the host networks.

  • Default value: "192.168.0.0/16"

pod_external_ip_start_range

string

Required

  • The start of the range that the load balancer will use to assign IPs to K8s services in the admin NIC subnet configured on the control plane.

  • The first and second octets (x,y) are not used/validated by Omnia. These values are internally calculated based on the value of admin_nic_subnet in input/provision_config.yml.

  • For example, if pod_external_ip_start_range is "x.y.240.100", pod_external_ip_end_range is "x.y.240.105", and the admin_nic_subnet provided in provision_config.yml is 10.5.0.0, then pod_external_ip_start_range will be 10.5.240.100 and pod_external_ip_end_range will be 10.5.240.105.

Note

Make sure the IP range is not assigned to any node in the cluster.

  • Default value: "x.y.240.100"

pod_external_ip_end_range

string

Required

  • The end of the range that the load balancer will use to assign IPs to K8s services in the admin NIC subnet configured on the control plane.

  • The first and second octets (x,y) are not used/validated by Omnia. These values are internally calculated based on the value of admin_nic_subnet in input/provision_config.yml.

  • To create a meaningful range, the third octet of pod_external_ip_end_range should equal or exceed the third octet of pod_external_ip_start_range. If the third octets are equal, the fourth octet of pod_external_ip_end_range should exceed the fourth octet of pod_external_ip_start_range.

  • For example, if pod_external_ip_start_range is "x.y.240.100", pod_external_ip_end_range is "x.y.240.105", and the admin_nic_subnet provided in provision_config.yml is 10.5.0.0, then pod_external_ip_start_range will be 10.5.240.100 and pod_external_ip_end_range will be 10.5.240.105.

Note

Make sure the IP range is not assigned to any node in the cluster.

  • Default value: "x.y.240.105"
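The octet substitution and the range rule described for the two pod_external_ip parameters can be sketched as follows. This is an illustration of the documented behaviour, not Omnia's internal code; both function names are hypothetical.

```python
def resolve_pod_ip(template, admin_nic_subnet):
    """Take the first two octets from admin_nic_subnet and the last two from template."""
    subnet_octets = admin_nic_subnet.split(".")
    template_octets = template.split(".")
    return ".".join(subnet_octets[:2] + template_octets[2:])


def is_meaningful_range(start, end):
    """Compare only the third and fourth octets; x and y are ignored."""
    start_tail = tuple(int(octet) for octet in start.split(".")[2:])
    end_tail = tuple(int(octet) for octet in end.split(".")[2:])
    # Tuple comparison: the third octet decides first, the fourth breaks ties.
    return end_tail > start_tail


# Values taken from the example in the parameter description.
print(resolve_pod_ip("x.y.240.100", "10.5.0.0"))
print(resolve_pod_ip("x.y.240.105", "10.5.0.0"))
print(is_meaningful_range("x.y.240.100", "x.y.240.105"))
```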

timescaledb_user

string

Required

  • Username used to access timescaleDB.

  • The username must not contain any of these characters: - \ ' "

  • The length of the username should be at least 2 characters.

timescaledb_password

string

Required

  • Password used to access timescaleDB.

  • The password must not contain any of these characters: - \ ' "

  • The length of the password should be at least 2 characters.

  • The first character of the string should be a letter.

idrac_username

string

Optional

  • Username used to authenticate to iDRAC.

  • The username must not contain any of these characters: - \ ' "

  • Required if idrac_telemetry_support is true.

idrac_password

string

Optional

  • Password used to authenticate to iDRAC.

  • The password must not contain any of these characters: - \ ' "

  • Required if idrac_telemetry_support is true.

  • The first character of the string should be a letter.

mysqldb_user

string

Optional

  • Username used to authenticate to mysqldb.

  • The username must not contain any of these characters: - \ ' "

  • The length of the username should be at least 2 characters.

  • Required if idrac_telemetry_support is true.

mysqldb_password

string

Optional

  • Password used to authenticate to mysqldb.

  • The password must not contain any of these characters: - \ ' "

  • The length of the password should be at least 2 characters.

  • Required if idrac_telemetry_support is true.

  • The first character of the string should be a letter.

mysqldb_root_password

string

Optional

  • Password used to authenticate to mysqldb as a root user.

  • The password must not contain any of these characters: - \ ' "

  • The length of the password should be at least 2 characters.

  • Required if idrac_telemetry_support is true.

  • The first character of the string should be a letter.

omnia_telemetry_collection_interval

integer

Required

  • This variable denotes the time interval (seconds) of telemetry data collection from required compute nodes.

  • Range (seconds): 60-3600 [1 minute to 1 hour]

  • Default value: 300

collect_regular_metrics

boolean [1]

Required

  • Enables collection of the metrics in the regular metric group.

  • For a list of regular metrics collected, click here.

  • Values:

    • true <- Default

    • false

collect_health_check_metrics

boolean [1]

Required

  • Enables collection of the metrics in the health check metric group.

  • For a list of health metrics collected, click here.

  • Values:

    • true <- Default

    • false

collect_gpu_metrics

boolean [1]

Required

  • Enables collection of GPU-related metrics.

  • For a list of GPU metrics collected, click here.

  • Values:

    • true <- Default

    • false

fuzzy_offset

integer

Required

  • Sets a time interval, in seconds, over which cluster nodes stagger the start of telemetry collection so that they do not congest the admin network.

  • Each node generates a random number between 0 and fuzzy_offset and waits that many seconds before it first starts collecting telemetry data.

  • Default value (seconds): 60

  • For large clusters, a higher value is recommended.

  • This value should be less than or equal to the value of omnia_telemetry_collection_interval but greater than or equal to 60.
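The staggered start described above can be pictured with a short sketch. It is not Omnia's implementation: the function name is hypothetical, and whole-second delays are an assumption, since the text only says each node picks a random number between 0 and fuzzy_offset.

```python
import random


def initial_delay(fuzzy_offset, rng=random):
    """Pick a start delay in whole seconds between 0 and fuzzy_offset, inclusive."""
    return rng.randint(0, fuzzy_offset)


# With the default fuzzy_offset of 60, ten simulated nodes spread their first
# collection over up to a minute instead of all starting at the same moment.
delays = [initial_delay(60) for _ in range(10)]
print(sorted(delays))
```

A larger fuzzy_offset widens the window over which nodes start, which is why a higher value suits large clusters.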

metric_collection_timeout

integer

Required

  • This variable is used to define data collection timeout period in seconds.

  • Default value: 5

  • This value should be less than the value of omnia_telemetry_collection_interval but greater than 0.
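The bounds stated for omnia_telemetry_collection_interval, fuzzy_offset and metric_collection_timeout depend on one another, so it can help to check them together. A minimal sketch with a hypothetical function name:

```python
def timing_problems(interval, fuzzy_offset, timeout):
    """Check the three timing values (all in seconds) against the documented bounds."""
    problems = []
    if not 60 <= interval <= 3600:
        problems.append(
            "omnia_telemetry_collection_interval must be between 60 and 3600"
        )
    if not 60 <= fuzzy_offset <= interval:
        problems.append(
            "fuzzy_offset must be at least 60 and at most the collection interval"
        )
    if not 0 < timeout < interval:
        problems.append(
            "metric_collection_timeout must be above 0 and below the collection interval"
        )
    return problems


# The documented defaults: interval 300, fuzzy_offset 60, timeout 5.
print(timing_problems(300, 60, 5))
```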

grafana_username

string

Optional

  • The username for the Grafana UI.

  • The username should be at least 5 characters long.

  • The username must not contain any of these characters: - \ ' "

  • Mandatory when visualization_support is true.

grafana_password

string

Optional

  • The password for the Grafana UI.

  • The password should be at least 5 characters long.

  • The password must not contain any of these characters: - \ ' "

  • The password cannot be set to 'admin'.

  • The first character of the string should be a letter.

  • Mandatory when visualization_support is true.

mount_location

string

Optional

  • Location where the Grafana persistent volume will be created.

  • If telemetry is in use, all telemetry-related files will also be stored here, and both the timescale and mysql databases will be mounted to this location.

  • The path must end with '/'.

  • Default value: "/opt/omnia/telemetry/"

Click here for more information on FreeIPA, LDAP, Telemetry, BeeGFS, or NFS.

If you have any feedback about Omnia documentation, please reach out at omnia.readme@dell.com.