NVIDIA UFM Enterprise User Manual v6.21.0

Installing UFM Infra Using Rootless with Podman

The UFM Infra feature introduces a structured architecture where services are divided into two categories, each deployed differently based on functionality:

  • UFM Infra: A set of persistent infrastructure services that run on all nodes. These services support system-level operations and ensure distributed availability.

  • UFM Enterprise: Services that run exclusively on the master node, responsible for management, orchestration, and user-facing functionality.

For more information on the UFM Infra architecture, refer to UFM Infra.

Prerequisites

  1. Download the UFM and plugins bundle tar file to /tmp.

  2. Extract the contents using the command:

    tar -xvf <bundle tar>

This archive (tar file) includes the following components:

  • Relevant UFM container image

  • Relevant FAST-API container image

  • Relevant Infra container image (for internal Redis usage). Refer to Redis-Related Configuration for more information.

  • Deployment script, titled deploy_rootless_ufm.sh

  • README for the deploy script

  • Default plugin bundle for UFM

  • UFM-HA package
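Before going further, it can help to confirm that the downloaded bundle really contains the deploy script. The following is a minimal sketch, not part of the product: the function name bundle_has_deploy_script is hypothetical, and only the file name deploy_rootless_ufm.sh is taken from the list above.

```shell
# Hypothetical helper: succeed if the archive lists deploy_rootless_ufm.sh.
# It only lists the archive contents (tar -t); nothing is extracted.
bundle_has_deploy_script() {
    local bundle="$1"
    tar -tf "$bundle" | grep -Eq '(^|/)deploy_rootless_ufm\.sh$'
}

# Example (the bundle path is a placeholder, as in the extraction step):
#   bundle_has_deploy_script "/tmp/<bundle tar>" && echo "deploy script found"
```

Listing rather than extracting keeps the check cheap and leaves /tmp untouched.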

To enable the UFM Infra feature, UFM HA must be installed in a new mode (`external-storage`), using a new product (`enterprise-multinode`).

Additionally, NFS must be configured as follows:

NFS Setup Prerequisites

  • Select a dedicated NFS server to host the shared directories.

  • Create a shared directory on the NFS server for UFM configuration and logs.

  • Install the NFS client on each UFM node if not already present.

Enable HA ports in firewall

If firewall rules block non-standard ports, open the HA ports so that the high-availability services can communicate with each other on the HA nodes. To do so, run these commands:

firewall-cmd --permanent --add-service=high-availability
# or
firewall-cmd --add-service=high-availability
# and then reload the rules
firewall-cmd --reload
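To confirm that the rule took effect, the list of open services can be checked after the reload. This is a minimal sketch: service_listed is a hypothetical helper that only inspects a space-separated list, such as the output of firewall-cmd --list-services for the default zone.

```shell
# Hypothetical helper: succeed if the wanted service appears in a
# space-separated service list (for example, `firewall-cmd --list-services`).
service_listed() {
    local services="$1" wanted="$2" svc
    for svc in $services; do
        [ "$svc" = "$wanted" ] && return 0
    done
    return 1
}

# Example, run as root on an HA node after the reload:
#   service_listed "$(firewall-cmd --list-services)" high-availability \
#       || echo "high-availability service is not open" >&2
```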


Create and Mount the UFM Directory

Note

At this stage, apply step 2 (Mount the UFM directory) only on the master machine.

The other nodes are mounted later, as described in Running in HA Mode.

  1. Create the UFM directory:

    mkdir -p /opt/ufm/files/

  2. Mount the UFM directory:

    • If using NFS 4.2:

      mount -t nfs4 -o context="system_u:object_r:container_file_t:s0" <server>:/shared_folder /opt/ufm/files

    • If using NFS 3:

      mount -t nfs -o vers=3,context="system_u:object_r:container_file_t:s0" <server>:/shared_folder /opt/ufm/files

  3. Ensure the NFS version and mount options are compatible with the NFS server.

  4. Verify that the following HA packages are installed: pcs, pacemaker, and corosync. Install them if they are missing.

  5. Follow the HA installation steps in Run the HA Installation.
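The two mount variants above differ only in the filesystem type and the vers option, so they can be generated from a single helper. The sketch below is illustrative only: ufm_mount_cmd and UFM_CONTEXT are hypothetical names, the server name is supplied by the caller, and the SELinux context, export path, and mount point are copied from the steps above.

```shell
# SELinux context copied from the mount steps above.
UFM_CONTEXT='system_u:object_r:container_file_t:s0'

# Hypothetical helper: print (not run) the mount command for an NFS version.
ufm_mount_cmd() {
    local version="$1" server="$2"
    case "$version" in
        4.2) printf 'mount -t nfs4 -o context="%s" %s:/shared_folder /opt/ufm/files\n' \
                 "$UFM_CONTEXT" "$server" ;;
        3)   printf 'mount -t nfs -o vers=3,context="%s" %s:/shared_folder /opt/ufm/files\n' \
                 "$UFM_CONTEXT" "$server" ;;
        *)   echo "NFS version not covered by this sketch: $version" >&2
             return 1 ;;
    esac
}

# Example: print the command, run it as root, then confirm the mount.
#   ufm_mount_cmd 4.2 "<server>"
#   findmnt /opt/ufm/files    # shows source and filesystem type if mounted
```

Printing the command instead of running it lets the operator review the options against the NFS server before mounting.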

Run the HA Installation

Follow the HA installation instructions at UFM High-Availability Installation and Configuration.

When running the HA installation script, use the following command:

./install.sh -p enterprise-multinode -l /opt/ufm/files

  • The -l flag must always point to the shared directory path: /opt/ufm/files

  • The DRBD disk argument does not need to be provided to the installation script.
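Before running install.sh, a short pre-flight check can catch a missing HA package or an unmounted shared directory. This is a sketch only: missing_commands is a hypothetical helper, and the binary names pcs, pacemakerd, and corosync are assumptions about what the HA packages named earlier install.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example pre-flight on an HA node (binary names are assumptions):
#   missing_commands pcs pacemakerd corosync
#   mountpoint -q /opt/ufm/files || echo "/opt/ufm/files is not mounted" >&2
```

An empty result from missing_commands means every listed command was found.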

Deploy Script Information

The deploy_rootless_ufm.sh script is a standalone utility that deploys all required components on a single UFM node (for both standalone and HA setups).

Note

If you plan to run UFM-Enterprise with SELinux in enforcing mode, SELinux must be enabled and set to enforcing at the OS level before installation. The installer will apply the necessary configurations only if enforcing mode is detected; otherwise, SELinux-related setup will be skipped.

Usage:

./docker_ubuntu/rootless_ufm/deploy_rootless_ufm.sh [[--install] | [--uninstall]] [OPTIONS]

Description

This script performs the following tasks:

  • Creates a user/group for UFM (default: ufmadm:ufmadm with UID/GID 733).

  • Uses port 8443 (non-privileged) and configures firewall rules if needed.

  • Grants access permissions to umadX (based on selected IB interface).

  • Installs and runs the UFM container as an unprivileged user.

  • Configures and loads custom podman-ufm.socket and podman-ufm.service into systemd.

  • Adds the necessary SELinux configuration if enforcing mode is detected.
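After an installation, one of these results that is easy to verify is the UFM user. The sketch below assumes the default name and UID listed above (ufmadm, 733); user_has_uid is a hypothetical helper built on the standard id command.

```shell
# Hypothetical helper: succeed if the user exists and has the expected UID.
user_has_uid() {
    local user="$1" expected="$2" actual
    actual=$(id -u "$user" 2>/dev/null) || return 1
    [ "$actual" = "$expected" ]
}

# Example, using the defaults listed above:
#   user_has_uid ufmadm 733 \
#       || echo "ufmadm is missing or has a different UID" >&2
```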

Available Options

Option                          Description
--install                       Install UFM as an unprivileged user (default).
--uninstall                     Uninstall UFM and all related configurations.
--ib-interface <INTERFACE>      IB fabric interface (default: ib0).
--mgmt-interface <INTERFACE>    Management interface (default: system route or eth0).
--local-certs-dir <DIRECTORY>   Directory containing SSL certs (optional).
--user / --uid                  UFM user name and UID (default: ufmadm, 733).
--group / --gid                 UFM group name and GID (default: ufmadm, 733).
--skip-user                     Skip user/group creation or removal.
-h or --help                    Show help information.


Example: Install with Defaults

./docker_ubuntu/rootless_ufm/deploy_rootless_ufm.sh


Example: Install with Custom Options

./docker_ubuntu/rootless_ufm/deploy_rootless_ufm.sh --install --user <user_name> --uid <uid> --group <group_name>

To install using all default settings, call the install script:

deploy_rootless_ufm.sh --install

The script performs the following:

  • Verifies Podman is installed.

  • Loads UFM, Redis, and FAST-API images.

  • Deploys a standalone, rootless UFM instance in UFM Infra mode.

To start UFM as a standalone instance, run:

systemctl daemon-reload
systemctl start ufm-infra
systemctl start ufm-enterprise
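For a standalone instance, it may be useful to wait until both units report active before continuing. The following is a minimal sketch: wait_for_active and the IS_ACTIVE_CMD override are hypothetical, and the default check relies on the standard systemctl is-active --quiet command.

```shell
# Hypothetical helper: poll until a unit reports active, or give up.
# IS_ACTIVE_CMD exists only so the check can be swapped out (for example in a
# dry run); by default it is `systemctl is-active --quiet`.
wait_for_active() {
    local unit="$1" tries="${2:-30}" i=0
    while [ "$i" -lt "$tries" ]; do
        ${IS_ACTIVE_CMD:-systemctl is-active --quiet} "$unit" && return 0
        i=$((i + 1))
        sleep 1
    done
    echo "$unit did not become active after $tries checks" >&2
    return 1
}

# Example, after the start commands above (standalone only):
#   wait_for_active ufm-infra && wait_for_active ufm-enterprise
```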

Running in HA Mode

Note

Do not manually start any services.

  1. Ensure UFM and UFM-HA are installed on all nodes as described in the above sections.

  2. Mount /opt/ufm/files on all standby nodes, as described in step 2 (Mount the UFM directory) of Create and Mount the UFM Directory.

  3. On one node, edit the HA configuration file:

    /etc/ufm_ha/ha_nodes.cfg

    Fill in the parameters for each node:

    [Node.1]
    # valid role options: master/standby
    role = master
    # Mandatory
    primary_ip =
    # Mandatory if dual_link = true
    secondary_ip =

    [Node.2]
    role = standby
    primary_ip =
    secondary_ip =

    [Node.3]
    role = standby
    primary_ip =
    secondary_ip =

  4. Ensure the file sync mode is set to external-storage, and that the shared file system is mounted prior to HA configuration.

    [FileSync]
    # valid options are: drbd/external-storage
    # in case of external-storage the user MUST mount the files system PRIOR to ha configuration
    mode = external-storage

  5. Copy the edited file to all nodes at the same path.

  6. Configure the cluster, starting from standby nodes and ending with the master node:

    ufm_ha_cluster config -p <password>

    Note

    Use the same password on all nodes.

  7. After finishing the configuration on all nodes, run:

    ufm_ha_cluster status

  8. Start the cluster:

    ufm_ha_cluster start

  9. Check cluster status again to ensure all services have started successfully.
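Because the same ha_nodes.cfg is copied to every node, a quick sanity check before distributing it can save a failed configuration run. The sketch below is not a UFM tool: check_ha_nodes_cfg is a hypothetical helper, and it assumes the layout shown in steps 3 and 4, with one key = value pair per line and comments on their own lines.

```shell
# Hypothetical helper: sanity-check an ha_nodes.cfg-style file.
# Checks: exactly one master, a primary_ip for every node, and
# FileSync mode set to external-storage.
check_ha_nodes_cfg() {
    awk '
        /^[ \t]*#/ { next }                  # skip comment lines
        /^[ \t]*\[/ {                        # section header, e.g. [Node.1]
            section = $0
            sub(/^[ \t]*\[/, "", section)
            sub(/\].*$/, "", section)
            if (section ~ /^Node\./) nodes[section] = 1
            next
        }
        /=/ {                                # key = value line
            key = $0; sub(/=.*/, "", key); gsub(/[ \t]/, "", key)
            val = $0; sub(/^[^=]*=/, "", val)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (section ~ /^Node\./ && key == "role" && val == "master") masters++
            if (section ~ /^Node\./ && key == "primary_ip" && val != "") has_ip[section] = 1
            if (section == "FileSync" && key == "mode") mode = val
        }
        END {
            ok = 1
            if (masters != 1) { print "expected exactly one master, found " (masters + 0); ok = 0 }
            for (n in nodes) if (!(n in has_ip)) { print n ": primary_ip is empty"; ok = 0 }
            if (mode != "external-storage") { print "FileSync mode is not external-storage"; ok = 0 }
            exit (ok ? 0 : 1)
        }
    ' "$1"
}

# Example, run on the node where the file was edited:
#   check_ha_nodes_cfg /etc/ufm_ha/ha_nodes.cfg && echo "ha_nodes.cfg looks consistent"
```

The helper prints one line per problem and returns a non-zero status if any check fails, so it can gate the copy in step 5.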

© Copyright 2025, NVIDIA. Last updated on May 8, 2025.