Importing Artifacts into the Air-Gapped Environment#

After you transfer the bundle to the head node of your air-gapped environment, load the container images into your private registry and make the Helm charts and files available for installation.

Pushing container images to your registry#

Images in the bundle use the OCI directory layout. Use a tool such as skopeo or crane to push them to your private registry, preserving the image path so that registry mirroring continues to work.

Example using skopeo:

TARGET_REGISTRY="my-registry.local"
# Each OCI layout directory is bundle/images/<source-registry>/<image-path>/<tag>.
find bundle/images -name "index.json" -exec dirname {} \; | \
while read img_dir; do
  rel_path="${img_dir#bundle/images/}"
  image_with_tag="${rel_path#*/}"   # drop the source registry component
  tag="${image_with_tag##*/}"
  image_path="${image_with_tag%/*}"
  echo "Pushing ${image_path}:${tag}"
  skopeo copy --all \
    "oci:${img_dir}" \
    "docker://${TARGET_REGISTRY}/${image_path}:${tag}"
done
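The path parsing in the loop can be traced with a concrete entry (the image path below is hypothetical, chosen only to illustrate the parameter expansions):

```shell
# Trace of the path parsing used in the push loop, with a hypothetical
# bundle entry; real directories come from the find over index.json files.
img_dir="bundle/images/nvcr.io/nvidia/cuda/12.4.1"
rel_path="${img_dir#bundle/images/}"    # nvcr.io/nvidia/cuda/12.4.1
image_with_tag="${rel_path#*/}"         # nvidia/cuda/12.4.1 (source registry dropped)
tag="${image_with_tag##*/}"             # 12.4.1
image_path="${image_with_tag%/*}"       # nvidia/cuda
echo "${image_path}:${tag}"             # what lands under TARGET_REGISTRY
```

The source registry host is deliberately dropped so that the image lands under your registry with its original path, which is the layout containerd-style registry mirrors expect.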

Authenticate to your private registry using the standard tools (for example, skopeo login my-registry.local or crane auth login my-registry.local).

Note: BCM can host a local registry that mirrors images from the internet. If you use the BCM registry, set TARGET_REGISTRY to master.cm.cluster:5000. For details, see section 3.2 of the BCM containerization manual.

Staging Helm charts#

You can either push charts to an OCI-compatible registry or install from local files:

  • Push to a Helm OCI registry:

    REGISTRY="oci://my-registry.local/helm-charts"
    
    for chart in bundle/helm/*.tgz; do
      echo "Pushing $chart"
      helm push "$chart" "$REGISTRY"
    done
    

Note: BCM can also host Helm charts in its local registry. If you use the BCM registry, set REGISTRY to oci://master.cm.cluster:5000/helm-charts. For details, see section 3.2 of the BCM containerization manual.
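For the local-files alternative, helm can install a chart directly from its .tgz (helm install <release> <chart.tgz>). The snippet below, using a hypothetical chart filename, derives a release name from the archive name, which is convenient when scripting such installs:

```shell
# Derive a release name from a chart archive name (filename is hypothetical);
# the echoed command can then be run against the local .tgz.
chart="bundle/helm/example-chart-1.2.3.tgz"
base="${chart##*/}"     # example-chart-1.2.3.tgz
release="${base%-*}"    # strip the trailing -<version>.tgz
echo "helm install ${release} ${chart}"
```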

Uploading AHR artifacts to the air-gapped head node#

After transferring the AHR bundle to the air-gapped head node, run the upload_ahr_dependencies.sh script to push container images to the local registry, stage Helm charts and packages, and install supporting files.

Before running the upload script, load the Docker module:

module load docker

Prerequisites:

  • Root access on the head node.

  • python3 on PATH (version 3.11 or higher) with the pyyaml package installed.

  • docker — required unless --skip-images is passed (load with module load docker).
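A quick preflight for these prerequisites might look like the following (purely illustrative; it only reports, it does not fix anything):

```shell
# Report whether the upload prerequisites are present; informational only.
report="$(python3 - <<'EOF'
import shutil, sys
print("python3 {}.{} (need >= 3.11)".format(*sys.version_info[:2]))
try:
    import yaml  # the pyyaml prerequisite
    print("pyyaml: OK")
except ImportError:
    print("pyyaml: MISSING")
print("docker:", "OK" if shutil.which("docker") else "MISSING (fine with --skip-images)")
EOF
)"
echo "$report"
```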

The script reads registry and authentication settings from scripts/airgap/upload_config.yaml:

registry_url: "master.cm.cluster:5000"
registry_ca_cert: "/path/to/registry/ca.crt"
registry_username: admin                   # optional
registry_password: ""                      # optional
skip_images: false
skip_charts: false
skip_packages: false
skip_files: false

Quick start:

sudo ./upload_ahr_dependencies.sh ./bundle

Usage:

sudo ./upload_ahr_dependencies.sh <bundle_dir> [options]

Argument            Description
<bundle_dir>        Path to the downloaded bundle directory (required).
--config <path>     Path to a custom configuration YAML (default: scripts/airgap/upload_config.yaml).
--skip-images       Skip pushing Docker images to the registry.
--skip-charts       Skip pushing Helm charts.
--skip-packages     Skip copying .deb packages.
--skip-files        Skip installing files (OpenTofu binary, runbooks).

Examples:

# Upload everything
sudo ./upload_ahr_dependencies.sh ./bundle

# Upload only packages and files (skip images and charts)
sudo ./upload_ahr_dependencies.sh ./bundle --skip-images --skip-charts

The upload process runs five steps in order:

  1. Images — Loads each .tar from bundle/images/, retags for the local registry, and pushes. Images that already exist remotely are skipped.

  2. Charts — Copies each .tgz from bundle/charts/ to /cm/local/apps/autonomous-hardware-recovery/var/charts/.

  3. Packages — Copies .deb files from bundle/packages/ to /cm/local/apps/autonomous-hardware-recovery/var/packages/.

    Note

    The upload script copies a package only if it is not already installed on the head node or in any software image. This selective approach avoids overwriting existing packages. Because the download script captures the entire dependency tree, the bundle contains all transitive dependencies and should satisfy every package requirement during installation.

  4. Files — Extracts the OpenTofu binary from bundle/terraform/bin/tofu_*.zip into /cm/local/apps/autonomous-hardware-recovery/bin/tofu and copies runbook archives to /cm/local/apps/autonomous-hardware-recovery/var/runbooks/downloaded/.

  5. Plugin install — Installs the cm-setup-ahr-*.deb package from the packages directory using dpkg -i, deploying the AHR cm-setup plugin into the BCM site-packages.

A summary is printed when the upload completes. If any step fails, the script provides a retry command targeting only the failed categories. Logs are written to /var/log/nmc-autonomous-hardware-recovery-airgap-upload.log.
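The retry command is simply the original invocation with a --skip-* flag for every category that succeeded. Composing one by hand can be sketched as follows (the failure set here is hypothetical):

```shell
# Compose a retry that reruns only the failed categories by skipping the rest.
# "packages files" is a hypothetical failure set for illustration.
failed="packages files"
retry="sudo ./upload_ahr_dependencies.sh ./bundle"
for cat in images charts packages files; do
  case " $failed " in
    *" $cat "*) ;;                      # failed: leave it enabled so it reruns
    *) retry="$retry --skip-$cat" ;;    # succeeded: skip on retry
  esac
done
echo "$retry"
```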

Expected plugin folder structure#

After upload, the AHR plugin directory has the following layout:

/cm/local/apps/autonomous-hardware-recovery/
├── bin/
│   └── tofu
├── etc/
│   ├── autonomous-hardware-recovery.key
│   └── autonomous-hardware-recovery.pem
└── var/
    ├── charts/
    │   └── shoreline-onprem-backend-29.1.*.tgz
    ├── packages/
    │   ├── shoreline_29.1.*-enroot.deb
    │   ├── datacenter-gpu-manager-4-*.deb
    │   └── ...
    └── runbooks/
        └── downloaded/
            └── nmc-ahr-2.3.*.tgz

Setting up PID dependencies#

The following steps install PID dependencies on the head node and into the compute/GPU node software image. Run these steps once — the installation targets both the head node and the software image in a single pass. The examples below use amd64 artifacts; arm64 variants are noted in separate blocks where applicable.
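Since the amd64 and arm64 blocks below differ only in the architecture string, you can derive it once on the target node and substitute it into the commands. The mapping below is a sketch covering only the two architectures these steps support:

```shell
# Map the kernel architecture to the naming used in the artifact filenames below.
case "$(uname -m)" in
  x86_64)  ARCH=amd64 ;;
  aarch64) ARCH=arm64 ;;
  *)       echo "unsupported architecture: $(uname -m)" >&2; exit 1 ;;
esac
echo "nvdebug-linux-${ARCH}-2.0.1.tar.gz"   # e.g. the nvdebug tarball for this node
```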

  1. nvdebug — install to /opt/nvidia/nvdebug/bin/:

    unzip -o nvdebug_v2.0.1.zip -d nvdebug_v2.0.1_extract
    

    For amd64:

    tar xzf nvdebug_v2.0.1_extract/nvdebug_v2.0.1/nvdebug-linux-amd64-2.0.1.tar.gz -C nvdebug_v2.0.1_extract/
    mkdir -p /opt/nvidia/nvdebug/bin
    cp -rf nvdebug_v2.0.1_extract/nvdebug/* /opt/nvidia/nvdebug/bin/
    chmod +x /opt/nvidia/nvdebug/bin/nvdebug
    

    For arm64:

    tar xzf nvdebug_v2.0.1_extract/nvdebug_v2.0.1/nvdebug-linux-arm64-2.0.1.tar.gz -C nvdebug_v2.0.1_extract/
    mkdir -p /opt/nvidia/nvdebug/bin
    cp -rf nvdebug_v2.0.1_extract/nvdebug/* /opt/nvidia/nvdebug/bin/
    chmod +x /opt/nvidia/nvdebug/bin/nvdebug
    
  2. nvssvt — install to /cm/shared/apps/nvssvt/:

    unzip -o nvssvt-v1.7.1.zip -d nvssvt_extract
    mkdir -p nvssvt_v1.7.1
    

    For amd64:

    tar xzf nvssvt_extract/v1.7.1/nvssvt-release-1.7.1-amd64.tar.gz -C nvssvt_v1.7.1/
    mkdir -p /cm/shared/apps/nvssvt/1.7.1
    cp -rf nvssvt_v1.7.1/* /cm/shared/apps/nvssvt/1.7.1/
    chmod +x /cm/shared/apps/nvssvt/1.7.1/nvssvt
    

    For arm64:

    tar xzf nvssvt_extract/v1.7.1/nvssvt-release-1.7.1-arm64.tar.gz -C nvssvt_v1.7.1/
    mkdir -p /cm/shared/apps/nvssvt/1.7.1
    cp -rf nvssvt_v1.7.1/* /cm/shared/apps/nvssvt/1.7.1/
    chmod +x /cm/shared/apps/nvssvt/1.7.1/nvssvt
    
  3. nvlmapper — install to /cm/shared/nvlmapper/ and /tmp/nvlmapper/:

    unzip -o nvlmapper_archive_v14.zip -d nvlmapper_extract
    

    For amd64:

    tar xzf nvlmapper_extract/release_14.0_amd64.tgz -C nvlmapper_extract/
    mkdir -p /cm/shared/nvlmapper/14
    cp -rf nvlmapper_extract/dist_amd64/* /cm/shared/nvlmapper/14/
    chmod +x /cm/shared/nvlmapper/14/nvlmapper
    mkdir -p /tmp/nvlmapper/14
    cp -rf nvlmapper_extract/dist_amd64/* /tmp/nvlmapper/14/
    chmod +x /tmp/nvlmapper/14/nvlmapper
    

    For arm64:

    tar xzf nvlmapper_extract/release_14.0_arm64.tgz -C nvlmapper_extract/
    mkdir -p /cm/shared/nvlmapper/14
    cp -rf nvlmapper_extract/dist_arm64/* /cm/shared/nvlmapper/14/
    chmod +x /cm/shared/nvlmapper/14/nvlmapper
    mkdir -p /tmp/nvlmapper/14
    cp -rf nvlmapper_extract/dist_arm64/* /tmp/nvlmapper/14/
    chmod +x /tmp/nvlmapper/14/nvlmapper
    
  4. partnerdiag mfg switch L10 — install to /cm/shared/partnerdiag_switch/ and /tmp/partnerdiag/:

    tar xzf 629-9K36F-00MV-FLD-42174.tgz
    
    mkdir -p /cm/shared/partnerdiag_switch/42174
    cp -rf 629-9K36F-00MV-FLD-42174/* /cm/shared/partnerdiag_switch/42174/
    chmod +x /cm/shared/partnerdiag_switch/42174/partnerdiag
    
    mkdir -p /tmp/partnerdiag/42174
    cp -rf 629-9K36F-00MV-FLD-42174/* /tmp/partnerdiag/42174/
    chmod +x /tmp/partnerdiag/42174/partnerdiag
    
  5. partnerdiag mfg computer L10 — install to /cm/shared/partnerdiag/:

    tar xzf 629-24975-0000-FLD-50896-rev13.tgz
    
    mkdir -p /cm/shared/partnerdiag/50896-rev13
    cp -rf 629-24975-0000-FLD-50896-rev13/* /cm/shared/partnerdiag/50896-rev13/
    chmod +x /cm/shared/partnerdiag/50896-rev13/partnerdiag
    
  6. EUD — extract to /cm/shared/eud/, copy the .deb into the software image, and install:

    Note

    Replace <gpu-node-image> in the paths below with the actual BCM software image name for your GPU nodes (for example, dgx-gb200-slurm). Do not use the literal string default-image.

    mkdir -p /cm/shared/eud
    tar xzf EUD_580.126.12.tar.gz -C /cm/shared/eud/
    
    mkdir -p /cm/images/<gpu-node-image>/pid/eud
    

    For amd64:

    cp -f /cm/shared/eud/EUD_580.126.12/nvidia-diagnostic-local-repo-ubuntu2404-580.126.12-mode1_1.0-1_amd64.deb \
      /cm/images/<gpu-node-image>/pid/eud/
    
    systemd-nspawn --directory=/cm/images/<gpu-node-image> --chdir=/pid/eud \
      bash -c "dpkg -i /pid/eud/nvidia-diagnostic-local-repo-ubuntu2404-580.126.12-mode1_1.0-1_amd64.deb"
    
    systemd-nspawn --directory=/cm/images/<gpu-node-image> --chdir=/root \
      bash -c "dpkg -i /var/nvidia-diagnostic-local-repo-ubuntu2404-580.126.12-mode1/nvidia-diagnostic-580_580.126.12-1_amd64.deb"
    
    systemd-nspawn --directory=/cm/images/<gpu-node-image> \
      bash -c "cp /var/nvidia-diagnostic-local-repo-ubuntu2404-580.126.12-mode1/nvidia-diagnostic-local-*-keyring.gpg /usr/share/keyrings/"
    

    For arm64:

    cp -f /cm/shared/eud/EUD_580.126.12/nvidia-diagnostic-local-tegra-repo-ubuntu2404-580.126.12-mode1_1.0-1_arm64.deb \
      /cm/images/<gpu-node-image>/pid/eud/
    
    systemd-nspawn --directory=/cm/images/<gpu-node-image> --chdir=/pid/eud \
      bash -c "dpkg -i /pid/eud/nvidia-diagnostic-local-tegra-repo-ubuntu2404-580.126.12-mode1_1.0-1_arm64.deb"
    
    systemd-nspawn --directory=/cm/images/<gpu-node-image> --chdir=/root \
      bash -c "dpkg -i /var/nvidia-diagnostic-local-tegra-repo-ubuntu2404-580.126.12-mode1/nvidia-diagnostic-580_580.126.12-1_arm64.deb"
    
    systemd-nspawn --directory=/cm/images/<gpu-node-image> \
      bash -c "cp /var/nvidia-diagnostic-local-tegra-repo-ubuntu2404-580.126.12-mode1/nvidia-diagnostic-local-*-keyring.gpg /usr/share/keyrings/"
    

    After installing the packages, update the software image:

    cmsh
    [headnode]% device
    [headnode->device]% imageupdate -w -c <gpu-node-image>