UFM Enterprise Appliance OS Upgrade

NVIDIA UFM Enterprise Appliance Software User Manual v1.4.1

This section provides a step-by-step guide for upgrading the UFM Enterprise Appliance operating system.

Each UFM Enterprise Appliance software release includes an additional tar file with the -omu.tar suffix (OMU stands for OS Manufacture and Upgrade). This tar file can be used to re-manufacture the server and to upgrade the operating system and software on the server.

  1. Copy the OMU tar file to a temporary directory on the server.
    UFM-APPLIANCE - ufm-appliance-<version>-<revision>-omu.tar

  2. Extract the contents of the tar file to /tmp.

    tar vxf ./ufm-appliance-<version>-<revision>-omu.tar -C /tmp/

  3. Change to the extracted directory.

    cd /tmp/ufm-appliance-<version>-<revision>-omu

  4. An upgrade script and an ISO file are included in the extracted directory.

    # ls -1 ./
    ufm-os-upgrade.sh
    ufm-appliance-<version>-<revision>.iso

    The upgrade script's help output lists the available flags:

    # ufm-os-upgrade.sh --help
    ufm-os-upgrade.sh will upgrade and install OS packages.

    IMPORTANT!!! a reboot is mandatory after the finalization of this script,
    kernel and kernel models will not work properly until the server is rebooted.

    Additional SW installations will be automatically invoked after reboot,
    a message will pop on all open terminals with the installation status:
    "UFM-OS-FIRSTBOOT-FAILURE" - if installation is failed.
    "UFM-OS-FIRSTBOOT-SUCCESS" - if installation succeeded.

    Additional info will be available in "/var/log/ufm_os_upgrade_@@UFM-OS-VERSION@@.log" log file.
    Upgrade steps status information can be viewed in "/var/log/ufm_os_upgrade_@@UFM-OS-VERSION@@_status.log" log file.

    Syntax: ufm-os-upgrade.sh [options]

    options
    -d,--debug     debug info will be visible on the screen.

    -r,--reboot    Automatically reboot the server when upgrade is finished.
                   P.S. if secure boot is enabled and a new certificate is enrolled
                   the server will not automatically reboot even if this flag is set.

    -y,--yes       Will not prompt for user acknowledgements, use with CAUTION
                   user prompts will be assumed as answered yes.

    -h,--help      print this help message.

    Important

    System reboot is mandatory once the upgrade procedure is completed. The -r flag can be used to automatically reboot the server at the end of the upgrade. Note that some kernel modules may not work properly until the server is rebooted.
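
Before running the upgrade, the extracted directory from the steps above can be sanity-checked. The following is a minimal sketch, not part of the appliance software: check_omu_dir is a hypothetical helper name, and the check only confirms that the upgrade script and at least one ISO file are present, as the directory listing above suggests.

```shell
#!/bin/sh
# Sketch: sanity-check an extracted OMU directory before running the upgrade.
# check_omu_dir is a hypothetical helper, not shipped with the appliance.
check_omu_dir() {
    dir="$1"
    # The upgrade script must exist in the extracted directory.
    if [ ! -f "$dir/ufm-os-upgrade.sh" ]; then
        echo "missing: $dir/ufm-os-upgrade.sh" >&2
        return 1
    fi
    # At least one ISO image is expected next to the script.
    for iso in "$dir"/*.iso; do
        [ -f "$iso" ] && return 0
    done
    echo "no ISO file found in $dir" >&2
    return 1
}
```

For example, check_omu_dir /tmp/ufm-appliance-<version>-<revision>-omu returns a non-zero status and prints the reason if either file is missing.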

  1. Stop the UFM service by running the following command:

    systemctl stop ufm-enterprise.service

  2. Run the upgrade script.

    Warning

    System reboot is mandatory once the upgrade procedure is completed. The -r flag can be used to automatically reboot the server.
    The --appliance-sw-upgrade flag CANNOT be supplied to upgrade the UFM Enterprise Appliance SW.

    The -y flag can be supplied to skip user questions. The flag does not reboot the server on its own; for automatic reboot, combine it with the -r flag.
    If a secure boot certificate is updated or installed, the script does not reboot automatically even if the -y and -r flags are provided, because adding certificates requires manual user intervention at boot (after the upgrade).
    When prompted during the boot procedure, there is a 10-second window to press any key and enter the server root password in order to import the certificate. Further details are available in Appendix - Secure Boot Activation and Deactivation.
    In the following example, the server reboots automatically when the upgrade is finished.

    ./ufm-os-upgrade.sh -y -r

  3. If a secure boot certificate is installed or upgraded, the following warning is displayed:

    OS1.png

    In that case the server does not reboot automatically, and manual configuration is required at boot (a 10-second prompt appears during boot). To continue with the upgrade procedure, manually reboot the server as instructed in Appendix - Secure Boot Activation and Deactivation.

  4. After the reboot procedure is complete, a systemd service (ufm-os-firstboot.service) runs the remainder of the upgrade procedure. Once it completes, a message with the status is sent to all open terminals:
    "UFM-OS-FIRSTBOOT-FAILURE" - if the installation failed.
    "UFM-OS-FIRSTBOOT-SUCCESS" - if the installation succeeded.
    Example:

    OS2.png


    To manually check the status, run systemctl status ufm-os-firstboot.service. If the service has already finished, the command reports an error stating that there is no such service. In that case, check the log /var/log/ufm-os-firstboot.log instead.

    systemctl status ufm-os-firstboot.service

    Example:

    OS3.png
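
The status check in step 4 can be scripted by looking for the two status markers quoted above. The following is a minimal sketch: firstboot_result is a hypothetical helper name, and it only classifies whatever text it is given. Whether /var/log/ufm-os-firstboot.log contains these exact marker strings is an assumption that should be verified on the appliance.

```shell
#!/bin/sh
# Sketch: classify first-boot status text by the markers named in this section.
# firstboot_result is a hypothetical helper, not shipped with the appliance.
firstboot_result() {
    # Reads text on stdin and prints FAILURE, SUCCESS, or UNKNOWN.
    input=$(cat)
    case "$input" in
        # Failure is checked first so that a failure is never masked.
        *UFM-OS-FIRSTBOOT-FAILURE*) echo FAILURE ;;
        *UFM-OS-FIRSTBOOT-SUCCESS*) echo SUCCESS ;;
        *) echo UNKNOWN ;;
    esac
}
```

Assuming the log records the same markers, firstboot_result < /var/log/ufm-os-firstboot.log prints the outcome. UNKNOWN means neither marker was found, for example because the service is still running.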

In an HA deployment, upgrade the stand-by node first and then the master node. The upgrade of each node is similar to the standalone (SA) instructions.

If the stand-by node is unavailable, the upgrade can be run on the master node only. However, some additional steps are required after the appliance is upgraded.

Warning

If a secure boot certificate needs to be updated or installed, the script stops and asks the user to install the secure boot certificate. Secure boot does not have to be active (although it is highly recommended), but the certificate must be installed or updated by the user before proceeding with the upgrade.

The upgrade script verifies at startup that the certificate is up to date, and stops if the certificate needs to be installed or updated.

  1. [On the stand-by Node]: Copy and extract the OMU tar file to a temporary directory.

  2. [On the stand-by Node]: Run the upgrade script.

    Warning

    System reboot is mandatory once the upgrade procedure is completed. The -r flag can be used to automatically reboot the server.

    The flag CANNOT be supplied to upgrade the UFM Enterprise Appliance SW.

    The -y flag can be supplied to skip user questions. The flag does not reboot the server on its own; for automatic reboot, combine it with the -r flag.

    In the following example the server auto reboots once the upgrade procedure is completed:

    cd /tmp/ufm-appliance-<version>-<revision>-omu
    ./ufm-os-upgrade.sh -y -r

  3. If the -r flag was not provided, reboot the server when the script finishes. The script asks whether to reboot; if the answer is No, a manual reboot is required.
    To manually reboot the server:

    reboot now

  4. After the reboot procedure is complete, a systemd service (ufm-os-firstboot.service) runs the remainder of the upgrade procedure. Once it completes, a message with the status is sent to all open terminals:
    "UFM-OS-FIRSTBOOT-FAILURE" - if the installation failed.
    "UFM-OS-FIRSTBOOT-SUCCESS" - if the installation succeeded.
    Example:

    OS4.png

    To manually check the status, run systemctl status ufm-os-firstboot.service. If the service has already finished, the command reports an error stating that there is no such service. In that case, check the log /var/log/ufm-os-firstboot.log instead.

    systemctl status ufm-os-firstboot.service

    Example:

    OS5.png

  5. After the stand-by node has finished the upgrade, check the HA cluster status:

    ufm_ha_cluster status

    OS6.png

    All the nodes in the cluster should be online, and the current node should remain the stand-by (Secondary in DRBD_ROLE).

  6. [On the Master Node]: Fail over UFM to the stand-by node (the upgraded node becomes the master and the current node becomes the stand-by).

    ufm_ha_cluster failover

    Wait until all UFM resources are up and running on the upgraded node.

  7. Repeat the procedure on the un-upgraded node (which is now acting as the stand-by).
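
The role check in step 5 can be scripted before triggering the fail-over in step 6. The following is a minimal sketch: is_drbd_secondary is a hypothetical helper name, and it assumes that the status output prints the role name on the same line as the DRBD_ROLE label. The real layout of the ufm_ha_cluster status output is not reproduced here, so adjust the match accordingly.

```shell
#!/bin/sh
# Sketch: confirm from "ufm_ha_cluster status" output that a DRBD_ROLE line
# reports Secondary (stand-by).
# ASSUMPTION: the role name is on the same line as the DRBD_ROLE label.
# is_drbd_secondary is a hypothetical helper, not shipped with the appliance.
is_drbd_secondary() {
    # Reads status text on stdin; succeeds if any DRBD_ROLE line
    # mentions Secondary, and fails otherwise.
    grep 'DRBD_ROLE' | grep -q 'Secondary'
}
```

A possible use on the upgraded stand-by node is ufm_ha_cluster status | is_drbd_secondary && echo "node is still stand-by". The helper only inspects text, so it does not replace checking that all nodes are online.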

© Copyright 2023, NVIDIA. Last updated on Sep 5, 2023.