Creating a BaseOS Image#

This chapter provides instructions for provisioning a software image with the NVIDIA BaseOS Software in BCM.

The NVIDIA BaseOS Software delivers the essential software stack for compute nodes, featuring optimized system configurations, enhanced drivers, comprehensive diagnostics, and advanced monitoring tools. It is derived from the same software stack included in the DGX OS ISO.

Important

The BCM installation already includes a pre-configured BaseOS image if the “DGX” option was selected on the download page.

Additional BaseOS Software releases are also available from the Enterprise Support Site (requires a subscription). Refer to Import a BaseOS Archive File for instructions on locating and downloading the BaseOS images for BCM.

Provisioning the NVIDIA BaseOS Software#

BCM provides two similar commands for creating a disk image:

  • cm-create-image

  • cm-image create

cm-create-image is typically used to integrate an existing software image directory with BCM. It is often used when an image directory has already been prepared manually, or has been copied from a local image or from another cluster. The image is closely related to the headnode’s distribution. Refer to Base Command Manager 11 Admin Guide - Section 11.6.2.

cm-image create provides more generic options for creating images; cm-image is a wrapper around cm-create-image that provides additional options. It is particularly relevant when creating images for a distribution or processor architecture that differs from the headnode’s. Refer to Base Command Manager 11 Admin Guide - Section 11.7.1.

These commands support the following options for provisioning the BaseOS software:

  • --dgx

    Enabling this option applies the following changes to an image:

    • Add the additional BaseOS repositories

    • Install system optimizations

    • Install GPU driver, DOCA OFED, and other software

  • --dgx-type <dgx-type>

    Replace <dgx-type> with the appropriate DGX system type:

    • dgx_a100 - for DGX A100 systems

    • dgx_h100 - for DGX H100 systems

    • dgx_h200 - for DGX H200 systems

    • dgx_b200 - for DGX B200 systems

    • dgx_gb200 - for DGX GB200 systems

    Note

    BCM also provides the post-installation tool bcm-pod-setup (Using the BCM Post-Installation Tool), which applies kernel configurations similar to the --dgx-type option. Because bcm-pod-setup is intended as a one-time setup tool for a cluster, use the --dgx-type option when creating an image, even if bcm-pod-setup is used during deployment.

  • --no-cm-cuda-repo

    The BaseOS installation already adds the CUDA repository, so use --no-cm-cuda-repo to avoid conflicts. (This option is not required when using --cmdvd.)

  • --skipdist

    When this flag is used, the tool will skip installing packages from the Linux distribution repositories and rely on, for example, the BCM ISO.

Refer to the BCM Documentation for in-depth information and additional options.
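
The options above can be combined in a small wrapper. The following is a minimal sketch and not part of BCM: the function name build_baseos_cmd is hypothetical, and the default source directory /cm/images/default-image is an assumption. It only validates the DGX type against the values listed above and prints a cm-create-image command line for review; it does not run anything.

```shell
# Hypothetical helper (not part of BCM): validate a DGX type against the
# values listed in this chapter and print a cm-create-image command.
build_baseos_cmd() {
    dgx_type="$1"                                # for example dgx_h100
    image_name="$2"                              # for example baseos-image
    source_dir="${3:-/cm/images/default-image}"  # assumed default location

    case "$dgx_type" in
        dgx_a100|dgx_h100|dgx_h200|dgx_b200|dgx_gb200) ;;
        *)
            echo "unsupported DGX type: $dgx_type" >&2
            return 1
            ;;
    esac

    # Print only; the command is not executed here.
    echo "cm-create-image --dgx --dgx-type $dgx_type --imagename $image_name --fromdir $source_dir --no-cm-cuda-repo"
}
```

For example, build_baseos_cmd dgx_h100 baseos-image prints the command for a DGX H100 image, while an unknown type returns a non-zero status before any command is produced.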

Note

The examples in this chapter typically use the long form of the command arguments for clarity, for example, --imagename instead of the more practical short form -n.

Download a BaseOS Software Release#

You can download versions of BaseOS Software release from the Enterprise Support Site.

  1. Download the BaseOS image from the NVIDIA Download Center.

    Search for the release announcement for the desired DGX OS (BaseOS) version or follow these steps to navigate to the release announcement:

    1. Navigate to Home/Products/Servers and Workstations/DGX

    2. Select Go to Downloads and go to the Download Center (you might need to log in again).

    3. Select “Servers and Workstations” and the DGX system you are using.

    4. You can find a list of BaseOS releases under OS.

    5. Use Download to go to the release announcement.

    The release announcement provides links to the available BaseOS tar.gz images for BCM.

  2. Create a system-specific image from the archive. Assuming the downloaded file was baseos-image.tgz, use:

    $ cm-create-image --fromarchive baseos-image.tgz [--dgx-type <dgx-type>] --no-cm-cuda-repo --add-only
    

Import a BaseOS Archive File#

BCM supports importing an image that was generated on a different cluster, even one with a different architecture.

  1. Import the image to BCM:

    Use cm-create-image to create the new image baseos-image and import the content from an archive file (tar.gz or tgz). Add --skipdist if the image already includes BCM. Enter c when prompted to finalize the process:

    $ cm-create-image --fromarchive <archive-image.tgz> [--dgx-type <dgx-type>] --no-cm-cuda-repo [--skipdist] --imagename baseos-image
    
    Running validate base tar........................ [  OK  ]
    Running sanity check............................. [  OK  ]
    ...
    
    Continue(c)/Exit(e)? c
    
    Finalize base distribution....................... [ OK ]
    ...
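
Before starting the import above, it can help to confirm that the downloaded archive is intact. The following is a minimal sketch and not part of BCM: the function name check_baseos_archive is hypothetical, and it only uses tar to list the archive contents without extracting them.

```shell
# Hypothetical pre-check (not part of BCM): confirm that the archive is a
# readable gzip-compressed tar file before passing it to --fromarchive.
check_baseos_archive() {
    archive="$1"

    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi

    # tar -tzf lists the contents without extracting anything.
    if ! tar -tzf "$archive" >/dev/null 2>&1; then
        echo "not a readable tar.gz archive: $archive" >&2
        return 1
    fi

    echo "archive looks readable: $archive"
}
```

A non-zero return status indicates a missing or damaged file, in which case downloading the archive again is likely quicker than debugging a failed import.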
    

Create the BaseOS image from the BCM default image#

BCM provides multiple options for provisioning a BaseOS image from the default-image:

  • Provision BaseOS from public repositories

    Use the following command to create a BaseOS image from the default BCM image, retrieving the latest BaseOS Software from public repositories:

    $ cm-create-image --dgx [--dgx-type <dgx-type>] --imagename baseos-image --fromdir /cm/images/default-image --no-cm-cuda-repo
    
  • Provision BaseOS from the software included in the BCM ISO

    Use cm-create-image with the additional --cmdvd option to install the software from the BCM ISO, which avoids upgrading the BaseOS software from external repositories. The following command expects the BCM ISO to be located in /root/:

    $ cm-create-image --dgx [--dgx-type <dgx-type>] --cmdvd /root/bcm-11.0-ubuntu2404.iso --imagename baseos-image --fromdir /cm/images/default-image
    

    This option creates the temporary repository configuration /etc/apt/sources.list.d/cm-dvd.list with a higher priority to prohibit retrieving newer packages from external repositories. This file is removed after the image has been created.
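
Because the temporary cm-dvd.list file described above is expected to be removed once the image has been created, a quick check can confirm that nothing was left behind, for example after an interrupted run. This is a minimal sketch and not part of BCM: the function name check_cm_dvd_list_removed is hypothetical, and whether the file lives on the headnode or inside the image directory is an assumption, so the root directory is passed as an argument.

```shell
# Hypothetical check (not part of BCM): confirm that the temporary repository
# file is gone. Pass a root directory (for example an image directory), or
# no argument to check the running system.
check_cm_dvd_list_removed() {
    root="${1:-}"
    list_file="$root/etc/apt/sources.list.d/cm-dvd.list"

    if [ -e "$list_file" ]; then
        echo "temporary repository file still present: $list_file" >&2
        return 1
    fi

    echo "no leftover cm-dvd.list under ${root:-/}"
}
```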

Apply additional modifications to the BaseOS image#

After provisioning the BaseOS image, you can customize the image by installing additional software or upgrading existing components.

  • Upgrading BaseOS Software components

    Before assigning an image to the nodes, refer to Installing or Upgrading BaseOS Components for instructions on installing or upgrading components in the BaseOS image.

  • Adding Additional Software and Scripts

    Copy any additional scripts or files directly to the image directory: /cm/images/baseos-image/. All changes to the image directory are automatically applied to the nodes during the provisioning process.

  • Assigning the Image to Nodes

    After creating and customizing the BaseOS image, assign the image to nodes or node categories as described in Assigning Images to Nodes and Post Installation Configurations.
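
For the step of adding scripts to the image directory, the copy can be sketched as follows. This is a minimal sketch and not part of BCM: the function name add_script_to_image and the example destination path are hypothetical, and it relies only on standard mkdir, cp, and chmod.

```shell
# Hypothetical helper (not part of BCM): copy a script into an image
# directory and make it executable, creating parent directories as needed.
add_script_to_image() {
    script="$1"      # path to the local script
    image_dir="$2"   # for example /cm/images/baseos-image
    dest_rel="$3"    # destination inside the image, for example usr/local/bin/my-check.sh

    if [ ! -d "$image_dir" ]; then
        echo "image directory not found: $image_dir" >&2
        return 1
    fi

    dest="$image_dir/$dest_rel"
    mkdir -p "$(dirname "$dest")" &&
        cp "$script" "$dest" &&
        chmod 0755 "$dest"
}
```

Refusing to create a missing image directory is deliberate: a mistyped image name would otherwise silently produce a new directory that no node uses.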

Creating a BaseOS image for a different processor architecture#

BCM supports running commands in a cross-architecture environment emulating the processor of the target architecture.

All provisioning options listed in the sections above apply equally in cross-architecture environments as long as the base images and additional software are available for the target architecture.

Note

Creating or modifying an image for a different processor architecture requires emulating the processor and can be much slower than running natively on the same architecture.

The suggested alternative is to create and manage the BaseOS image in a BCM instance running on the same architecture as the target cluster.

BCM requires the following directories on the headnode in a cross-architecture environment:

  • /cm/images/default-image-ubuntu2404-aarch64

  • /cm/images/dgx-image-ubuntu2404-aarch64

  • /cm/node-installer-ubuntu2404-aarch64

  • /cm/shared-ubuntu2404-aarch64
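
The presence of the paths listed above can be verified with a short check. This is a minimal sketch and not part of BCM: the function name check_cross_arch_dirs is hypothetical, and the optional prefix argument exists only so the check can be tried against a scratch directory.

```shell
# Hypothetical check (not part of BCM): report which of the paths listed
# above are missing. The optional first argument is a prefix for testing;
# omit it to check the real paths on the headnode.
check_cross_arch_dirs() {
    prefix="${1:-}"
    missing=0

    for dir in \
        /cm/images/default-image-ubuntu2404-aarch64 \
        /cm/images/dgx-image-ubuntu2404-aarch64 \
        /cm/node-installer-ubuntu2404-aarch64 \
        /cm/shared-ubuntu2404-aarch64
    do
        if [ -d "$prefix$dir" ]; then
            echo "found:   $dir"
        else
            echo "missing: $dir"
            missing=$((missing + 1))
        fi
    done

    # Succeed only when every path exists.
    [ "$missing" -eq 0 ]
}
```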

There are three options to obtain an image for the target processor architecture:

  • Download a BaseOS release for the target processor architecture from the Enterprise Support Site (requires a subscription).

  • Export the BCM default-image from a BCM headnode with the same processor architecture as the target compute node.

  • Create the image from the Linux distribution that is included in the BCM ISO for the target architecture. Refer to Creating a default BCM image from a distribution for more details.

Creating a default BCM image from a distribution#

BCM offers the option to create an image from a default Linux distribution file system.

Note

Creating an image from a distribution is typically only required to install a different Linux distribution than the headnode’s or for cross-architecture environments.

Important

Creating the default image for BCM from a distribution for a target architecture that differs from the headnode’s can take a long time.

These instructions assume that you create and manage the BaseOS image in a BCM instance running on the same architecture as the target cluster, and then transfer the image to the target cluster.

  1. Download the ISO with the target architecture from the BCM Customer Portal.

  2. Mount the ISO on the headnode, for example, to /mnt:

    $ mount bcm-11.0-ubuntu2404-dgx-os-7.1.iso /mnt -o loop
    
  3. Use cm-image create to create all required images. The --dgx option instructs the command to also provision the BaseOS Software. Optionally, add --dgx-type <dgx-type> to specify the DGX system type.

    $ cm-image create all --dgx --arch aarch64 --distro ubuntu2404 --source /mnt --add-only

  4. Copy the images to the following locations on the headnode of the target cluster:

    • /cm/images/default-image-ubuntu2404-aarch64

    • /cm/images/dgx-image-ubuntu2404-aarch64

    • /cm/node-installer-ubuntu2404-aarch64

    • /cm/shared-ubuntu2404-aarch64

If the images were created on a separate cluster, follow the instructions in Import a BaseOS Archive File to import them on the target cluster.