Deploying Cluster Bring-Up WEB Framework
This section describes how to deploy Cluster Bring-Up Web on a Linux machine.
Python 3.6 or greater is required on the host where the framework is to be deployed.
The system that runs the cluster bring-up framework must satisfy the following requirements:
- At least 4 GB of memory
- At least 2 CPU cores
- At least 30 GB of disk space
- A running Kubernetes cluster
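A quick way to verify these prerequisites is with standard Linux and kubectl commands; the following is a minimal sketch (generic commands only, no project-specific tooling):

```bash
# Verify the host against the minimum requirements listed above.
free -h               # at least 4 GB of memory
nproc                 # at least 2 CPU cores
df -h /               # at least 30 GB of free disk space
python3 --version     # Python 3.6 or greater
kubectl get nodes     # Kubernetes is running and reachable
```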
Supported Operating Systems
- CentOS 8 or later 64-bit (x86)
- Red Hat Enterprise Linux 8.2 or later 64-bit (x86)
- Ubuntu 20.04 or later 64-bit (x86)
Supported Deployment Platforms
NVIDIA currently supports running cluster bring-up framework as a containerized application using Docker images deployed to a Kubernetes cluster.
In the following sections, you'll find deployment details and instructions for a Kubernetes platform.
Deploying via Kubernetes
This section describes how to deploy Cluster Bring-Up WEB in a Kubernetes cluster.
The installation is performed by using a virtual machine (VM) image which includes COT with all its dependencies.
Installation with Image
This section shows how to install and deploy the cluster bring-up in offline mode, which requires the user to download and restore a machine image with most of the dependencies already present on the machine.
The following is a list of requirements that must be met:
- Clonezilla version 188.8.131.52
For offline installation, perform the following steps:
- Download the tar image file located here.
- Move the downloaded file to the data center and untar the file.
- Restore the image on your machine via Clonezilla. See section Restore Image for procedure.
- Log into the installation machine as the root user with the password "password".
- Make sure Kubernetes is running on the machine.
- Change directory to the location of the installation script.
- Run the installation script with the offline mode option (see the installation script options below).
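For reference, the offline flow looks roughly as follows; the archive name, extraction path, and the offline option spelling are assumptions, so substitute the actual file name and the offline flag from the installation script options described later in this section:

```bash
# Illustrative sketch only -- file names, paths, and the offline flag are assumptions.
tar -xf cot-image.tar         # untar the downloaded image file in the data center
# ...restore the image with Clonezilla and log in as root...
kubectl get nodes             # confirm Kubernetes is running on the machine
cd /path/to/installation/script
./install.sh --offline        # run the installation script in offline mode
```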
As part of the installation process, an image with Kubernetes and AWX-Operator already present must be restored on a machine. To restore, the Clonezilla software must be utilized.
Restore VM Using Hypervisor
The Clonezilla restoration procedure can also be used for virtualization.
The following subsections provide the list of virtualization solutions that are supported.
Kernel-based Virtual Machine, or KVM, is a full virtualization solution for Linux on x86 hardware containing virtualization extensions. Using KVM, users can run multiple VMs running unmodified Linux or Windows images. Each VM has private virtualized hardware: A network card, disk, graphics adapter, etc.
The following is a list of required dependencies:
- virt-manager application
Follow these steps to restore the image on a VM. Each step has a name prepended which indicates from which machine to perform the action:
- On the machine running the hypervisor, check that there is enough space in the root filesystem and in the directories used in the following steps.
- On the machine running the hypervisor, download the Clonezilla ISO and move it to the /tmp directory.
- On the machine running the hypervisor, create a new directory in the /images directory with the name of the newly created machine.
- On the machine running the hypervisor, create a 65 GB disk image (see the preparation sketch after this procedure).
- On the machine running the hypervisor, open the Virtual Manager GUI.
- In the Virtual Manager GUI, click the "Create a virtual machine" icon on the top left.
- Create a new VM (5 steps):
- Select "Local install media".
- For "Choose ISO", select the Clonezilla ISO placed in
/tmp, uncheck "Automatically detect from the installation media", type and select the OS of choice (must be supported).
- Memory: 4096; CPUs: 2
- For "Select or create custom storage" and browse to the image disk created earlier.
- Type in a unique machine name and check the "Customize configuration before install" box
- Click "Finish".
- In the Virtual Manager GUI, change the boot order:
- Open the settings of the VM you are restoring onto.
- Go to "Boot Options".
- Check the "Clonezilla CDROM" box which is linked to the Clonezilla ISO from step 2 above.
- Click the up arrow to move it up in the boot order.
- Click "Apply".
- Click "Begin Installation".
- After restarting the machine, the Clonezilla software will boot. Follow these steps to successfully restore the image:
- Select "Clonezilla live".
- Type the IP address of the machine which stores the untarred file from step 2 of section "Installation Steps".
- Port stays at "22" (default ssh).
- Keep "root" as user.
- Type the directory path which stores the untarred file from step 2 of section "Installation Steps".
- Type the password to connect to the machine.
- Mode: Beginner.
- Select the name of your image.
- Select the name of your storage.
- Yes, check.
- Power off.
- In the Virtual Manager GUI, select "Change Boot Order". Then move disk image created in step 4 to the top of the list ahead of Clonezilla (CDROM).
- In the Virtual Manager GUI, select "Force off" and "Start VM".
- After booting, log in as the root user with the password "password".
- (Restore) Change the name of the machine, since it still has the cloned machine's name configured.
- (Restore) If no Internet access is available on the machine, change the network interface in use.
- (Restore) Reboot the machine → reboot.
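For reference, the hypervisor-side preparation at the start of this procedure can also be done from a shell, as sketched below, assuming the qemu-img tool that accompanies KVM/virt-manager; the VM name, ISO file name, and paths are illustrative:

```bash
# Illustrative sketch -- VM name, ISO file name, and paths are assumptions.
df -h / /tmp /images                                         # check available space
mv clonezilla-live.iso /tmp/                                 # place the Clonezilla ISO in /tmp
mkdir /images/new-vm-name                                    # directory named after the new machine
qemu-img create -f qcow2 /images/new-vm-name/disk.qcow2 65G  # 65 GB disk image
```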
Restore on Bare Metal
This section explains how to restore the image on a physical computer server.
ProLiant DL380p Gen8
- Connect to the machine's remote management interface (iLO for HPE).
- Mount/add Clonezilla ISO via: Virtual Drives → Image File CDROM → Select Clonezilla ISO
- Reset the machine: Power Switch → Reset.
- Boot via Clonezilla ISO: Press F11 on startup → select CDROM Clonezilla ISO for boot.
- Continue from step 9 of section "Restore VM Using Hypervisor" to the end.
For additional information on HPE's remote management, visit HPE's support website.
The installation script, install.sh, performs the following operations:
- Creates a new virtual environment for installation
- Ensures the dependencies for the installer are installed
- Deploys cluster bring-up WEB framework on Kubernetes platform
- Deploys cluster bring-up AWX framework on Kubernetes platform
- Configures AWX resources for cluster orchestration
Make sure you are in the folder containing the installation script before running it.
The following options are available for the installation script:
|Specify path to hosts file that contains hostnames for the inventory|
|Specify end-host list expression that represents hostnames for the inventory|
|Specify hostname to be a member of the ib_host_manager group|
|Specify username to authenticate against the hosts|
|Specify password (encoded in base64) to authenticate against the hosts|
|Run the installation script in offline mode. Supported only when using the COT image.|
|Specify the path to the configuration file to incorporate into the installation|
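For illustration, a hedged sketch of an invocation follows; the option names (--hosts, --ib-host-manager, --user, --password) and the host name ib-node-03 are assumptions, so use the actual options from the table above:

```bash
# Illustrative invocation -- option names and ib-node-03 are assumptions;
# ib-node-01 is placed in the ib_host_manager group for the In-Band operations.
./install.sh \
  --hosts ib-node-01,ib-node-03,ib-node-05 \
  --ib-host-manager ib-node-01 \
  --user admin \
  --password "$(echo -n 'secret' | base64)"
```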
In this example, three hosts, ib-node-05 among them, are added to the inventory. In addition, the ib-node-01 host is configured to be a member of the ib_host_manager group for the In-Band operations.
This section provides the required information to add a YAML configuration file during the installation process.
Currently, the configuration file only supports adding inventory variables so that they are included in the IB Cluster Inventory variable list when AWX loads for the first time.
The YAML file must consist of an extra_variables parent key paired with a dictionary value. That dictionary must include an inventory_vars key, which also has its own dictionary value consisting of key-value pairs that are added to the inventory variables.
YAML configuration file example:
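The block below is a minimal sketch of such a file; anotherVar comes from the example discussed next, while someVar and both values are illustrative placeholders:

```yaml
# Minimal sketch -- someVar and the values are illustrative placeholders.
extra_variables:
  inventory_vars:
    someVar: value1
    anotherVar: value2
```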
In this example, there are two variables, one of which is anotherVar, that will be added to the inventory variables list in AWX.
Example usage with the configuration file flag:
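A hedged sketch, assuming the configuration file is passed with a --config option (the option name and file name are assumptions):

```bash
# Illustrative only -- the configuration option name and file name are assumptions.
./install.sh --config ./cot_config.yaml
```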
After AWX loads for the first time, both variables, anotherVar among them, appear in the IB Cluster Inventory variable list.
Upgrading Framework Script
The upgrade.sh script upgrades the COT containers and configuration files, including the COT API itself, while preserving the existing data.
To upgrade the COT:
- Download the tar.gz upgrade file from the COT download center.
- Extract the upgrade file.
- Run the upgrade.sh script located in the extracted folder.
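A minimal sketch of this flow (the archive and folder names are assumptions):

```bash
# Illustrative only -- archive and folder names are assumptions.
tar -xzf cot-upgrade.tar.gz   # extract the upgrade file
cd cot-upgrade                # enter the extracted folder
./upgrade.sh                  # upgrade COT while preserving existing data
```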
This section details the operations that can be performed once the installation process concludes.
The available actions include install, uninstall, update, export, and import.
The install and uninstall operations must be performed via the installation script.
The update command allows updating certain components of the Cluster Bring-up Tool.
The update command relies on the cot_dir argument, which refers to the path of the folder extracted from the tar.gz file.
Specify the path of the folder extracted from the new tar.gz file.
The tool uses the data inside the folder as the new data for the update operation.
|Update the ansible playbooks|
|Update the AWX templates (job templates and workflows). This updates the ansible playbooks as a pre-task.|
|Update the COT client (on the |
|Get AWX URL and credentials|
|Get file server URL and files folder|
|Get the REST API URL|
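For illustration, a hedged sketch of an update invocation; the entry point, subcommand spelling, and component name are assumptions, while cot_dir is the argument described above:

```bash
# Illustrative only -- entry point, subcommand, and component name are assumptions.
<cot-entry-point> update playbooks --cot_dir /path/to/new/extracted/folder
```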
The export operation allows creating a snapshot of the data within an existing COT environment. This may be used to transport the data between environments.
Directory path to save the snapshot. Default:
List of components to export, separated by spaces. Default:
This command builds a snapshot containing the playbooks and the database of the current COT environment. The .tar.gz snapshot file produced is saved to the specified directory path.
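A hedged sketch of an export invocation; the entry point and option names are assumptions, and the component names follow from the description above:

```bash
# Illustrative only -- entry point and option names are assumptions.
<cot-entry-point> export --path /tmp/cot-snapshots --components playbooks database
```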
The import operation allows importing data of a given snapshot into an existing COT environment.
Adds the file server files from the snapshot to the existing files in the file server of the COT environment.
Without this flag, the files in the file server are overridden.
Path to snapshot file.
List of components to import, separated by spaces.
If not provided, the command imports the data of all the components contained in the snapshot.
This command imports the file server files and the database content from the snapshot into the COT environment. The file server files from the snapshot are added to the files that already exist in the file server.
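A hedged sketch of an import invocation matching this description; the entry point, option names, component names, and the add-files flag are assumptions:

```bash
# Illustrative only -- entry point, option names, and the add-files flag are assumptions.
<cot-entry-point> import --path /tmp/cot-snapshots/snapshot.tar.gz --add-files --components fileserver database
```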