Install and Configure NetQ#

NetQ NVLink (previously NMX-M) provides a single interface for managing NVLink switches and collecting their telemetry. NetQ is deployed on Kubernetes, alongside the other components that make up Mission Control.

NetQ Kubernetes Setup#

NetQ Permanent License Generation and Application Guide#

When NetQ is installed, the system receives an evaluation license that is valid for 60 days. After the evaluation license expires, REST API access is blocked until a new license is applied.
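To plan the permanent license ahead of time, it can help to work out when the 60-day evaluation period ends. The sketch below assumes the period starts on the installation date and that GNU date is available (as on Ubuntu); the function name and the sample date are illustrative only.

```shell
# Print the date on which a 60-day evaluation license would lapse.
# Assumption: the evaluation period is counted from the install date.
# Requires GNU date for the "-d" relative-date syntax.
eval_license_expiry() {
    install_date="$1"      # ISO date, for example 2025-01-01 (hypothetical)
    days="${2:-60}"        # evaluation period stated in this guide
    date -u -d "${install_date} +${days} days" +%F
}
```

For example, `eval_license_expiry 2025-01-01` prints `2025-03-02`.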

Generating a License File#

Before you generate the license file, you need to do the following:

  • Prepare a list of servers with the MAC address of each server on which you plan to install the NetQ software.

  • Have access to the NVIDIA Licensing Portal (NLP) with valid credentials.

To generate the license file, follow the steps below:

  1. Access the NVIDIA Licensing Portal

    • Go to the NVIDIA Licensing Portal (NLP).

    • Log in using your credentials.

  2. Navigate to Network Entitlements

    • Click on the Network Entitlements tab.

    • You’ll see a list of all your software product serial licenses, license information, and status.

  3. Select and Activate License

    • Select the license you want to activate.

    • Click on the “Actions” button.

  4. Configure MAC Addresses

    • In the MAC Address field, enter the MAC address of the delegated license-registered host.

    • If applicable, in the HA MAC Address field, enter your High Availability (HA) server MAC address.

    • Note: If more than one NIC is installed on the server, you can use the MAC address of any of them.

  5. Generate and Download License

    • Click on Generate License File to create the license key file for the software.

    • Click on Download License File and save it on your local computer.
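The MAC addresses requested in step 4 can be collected on each server before you open the portal. The following is a minimal sketch that reads the Linux sysfs tree; the function name is hypothetical, and the optional argument exists only so that the function can be pointed at a different directory.

```shell
# List "interface MAC" pairs for the network interfaces on this server.
# Reads /sys/class/net by default (Linux only).
list_macs() {
    net_dir="${1:-/sys/class/net}"
    for iface in "$net_dir"/*; do
        name=$(basename "$iface")
        # Skip the loopback device and entries without a readable address file.
        [ "$name" = "lo" ] && continue
        [ -r "$iface/address" ] || continue
        printf '%s %s\n' "$name" "$(cat "$iface/address")"
    done
}
```

Run `list_macs` on each server and record one of the printed addresses for the MAC Address field.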

Important Notes about License Regeneration#

When you regenerate a license, you need to keep the following in mind:

  • If you replace your NIC or server, repeat the process of generating the license to set new MAC addresses.

  • You can only regenerate a license two times.

  • To regenerate the license after that, contact NVIDIA Sales Administration at enterprisesupport@nvidia.com.

Download the NetQ install package#

The NetQ install package can be downloaded from the NVIDIA Licensing Portal (NLP).

Downloading NetQ#

To download the package, follow the steps below:

  1. Go to the NVIDIA Licensing Portal (NLP) and log in using your credentials.

  2. Click on Software Downloads, filter the product family to NetQ, find the relevant version, and download the Appliance platform package.

  3. Click on Download.

  4. Save the file on your local drive.

  5. Click Close.

  6. Copy the .tar.gz file to the BCM head node:

rsync -azP <path-to-tar.gz-file> root@bcm11-head-01:/root

To download the Debian packages, use the following link: https://download.nvidia.com/cumulus/apps3.cumulusnetworks.com/repos/deb/pool/netq-5.0/

Find and download the relevant apps and agents packages for Ubuntu 24 (ub24) and the relevant CPU architecture (arm or amd).
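If you are unsure which CPU architecture applies, the machine name reported by the node can be mapped to a Debian architecture label. This is a sketch only: it assumes the packages follow the usual Debian naming (amd64 and arm64), and the function name is hypothetical.

```shell
# Map a kernel machine name (as printed by `uname -m`) to the Debian
# architecture label. With no argument, the local machine is inspected.
deb_arch() {
    machine="${1:-$(uname -m)}"
    case "$machine" in
        x86_64)         echo amd64 ;;
        aarch64|arm64)  echo arm64 ;;
        *)              echo "unsupported machine type: $machine" >&2; return 1 ;;
    esac
}
```

On a Debian-based node, `dpkg --print-architecture` reports the same label directly.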

Installing NetQ#

This section describes how to install NetQ on a BCM-managed Kubernetes cluster.

Prerequisites#

  • Kubernetes 1.33 is installed

  • Nginx ingress is installed

  • Three nodes that each meet the minimum hardware requirements (4 TB of disk space, 512 GB of free RAM, and 48 CPU cores)

  • A reserved IP address (to be used as the virtual IP)

Note that NetQ requires 512 GB of free memory, not 512 GB of installed memory. On a cluster whose nodes have exactly 512 GB of RAM installed, the memory check is therefore likely to fail; in that case, you may pass the --skip-netq-prerequisites-checks flag to the command. Before doing so, validate that the system has the required CPU core count.
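Before deciding whether to skip the prerequisite checks, you can inspect the memory and core count on a node yourself. The sketch below is not the installer's own check: it assumes that MemAvailable in /proc/meminfo is a reasonable stand-in for free memory, and it reports binary gigabytes. The function and variable names are hypothetical.

```shell
# Thresholds taken from the prerequisites in this guide.
REQ_MEM_GB=512
REQ_CORES=48

# Available memory in whole GB, read from a meminfo-style file.
# Assumption: MemAvailable approximates "free" memory; the installer
# may measure it differently.
avail_mem_gb() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# Succeed only if both the memory and the core count meet the thresholds.
meets_prereqs() {
    mem_gb="$1"
    cores="$2"
    [ "$mem_gb" -ge "$REQ_MEM_GB" ] && [ "$cores" -ge "$REQ_CORES" ]
}
```

On a candidate node, `meets_prereqs "$(avail_mem_gb)" "$(nproc)" && echo "node looks suitable"` combines both checks.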

Installation#

Start the installation wizard using the cm-mission-control-setup command.

Select the NetQ installation here:

NetQ installation selection screen

Choose the related Kubernetes cluster:

Kubernetes cluster selection screen

Choose the node category for the nodes where NetQ will be installed (usually the category for k8s-admin control plane nodes):

Node category selection screen

If you leave the category empty, you can instead select three individual nodes:

Node selection screen showing 3 nodes option

Provide the NetQ overlay name and priority (in most cases ‘default’ can be used):

NetQ overlay name and priority configuration screen

Provide a virtual IP for the cluster. This must be an unused IP address from the same subnet that is assigned to the default interface of your master and worker nodes.

Virtual IP configuration screen for the cluster
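Whether a candidate virtual IP falls inside the node subnet can be checked with integer arithmetic before you enter it in the wizard. The following is a minimal sketch for IPv4 in plain shell; the function names and the sample addresses are hypothetical.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1              # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside CIDR block $2 (for example 10.0.0.0/24).
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Build the netmask: the top "bits" bits set, limited to 32 bits.
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

For example, `ip_in_cidr 10.0.0.5 10.0.0.0/24` succeeds, while `ip_in_cidr 10.0.1.5 10.0.0.0/24` fails. A subnet match does not show that the address is unused; confirm that separately, for example against your IP address management records.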

Provide the paths to the NetQ tarball and the Debian packages:

NetQ tarball and debian package paths configuration screen

Select NVL Mode as the NetQ deployment mode:

NetQ deployment mode selection screen showing NVL Mode option

Set the Kong (NMX API) username and password:

Kong username configuration screen

Kong password configuration screen

Choose the storage path for Longhorn (replicated storage system). This must be a path with at least 4 TB of available space. On the k8s-admin node, run df -H to check storage availability; /local/longhorn may be a suitable option.

Longhorn storage path configuration screen
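The 4 TB requirement for the Longhorn path can also be checked in a script rather than by reading the df -H output by eye. This sketch uses the POSIX df options and treats 4 TB as 4 × 10^12 bytes, matching the units that df -H reports; the function and variable names are hypothetical.

```shell
# 4 TB in bytes, using powers of 1000 as `df -H` does.
FOUR_TB=4000000000000

# Succeed if the filesystem holding path $1 has at least $2 bytes available.
# `df -P -k` reports available space in 1024-byte blocks in column 4.
has_free_bytes() {
    avail_kb=$(df -P -k "$1" | awk 'NR == 2 { print $4 }')
    [ $(( avail_kb * 1024 )) -ge "$2" ]
}
```

On the k8s-admin node, `has_free_bytes /local/longhorn "$FOUR_TB" && echo "enough space"` checks the path suggested above.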

Save and deploy:

Save and deploy confirmation screen

Then allow the installation to run to completion.

Configure Longhorn to not be a default storageclass#

After NetQ is installed, Longhorn is set as a default storageclass. Longhorn should be used only for the NetQ components and not as general-purpose cluster storage, so its default flag needs to be removed.

To configure Longhorn to not be a default storageclass, use the following steps:

  1. Get the current storageclasses:

kubectl get storageclass
NAME                      PROVISIONER                                      RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)      cluster.local/local-path-provisioner             Delete          WaitForFirstConsumer   true                   25h
longhorn (default)        driver.longhorn.io                               Delete          Immediate              true                   24h
longhorn-no-replication   driver.longhorn.io                               Delete          Immediate              true                   24h
longhorn-static           driver.longhorn.io                               Delete          Immediate              true                   24h
shoreline-local-path-sc   cluster.local/shoreline-local-path-provisioner   Delete          WaitForFirstConsumer   true                   18h
  2. Run the following command to patch the Longhorn storageclass:

kubectl patch storageclass longhorn -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'

  3. List the storageclasses again and confirm that longhorn is no longer marked as default:

kubectl get storageclass
NAME                      PROVISIONER                                      RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)      cluster.local/local-path-provisioner             Delete          WaitForFirstConsumer   true                   26h
longhorn                  driver.longhorn.io                               Delete          Immediate              true                   25h
longhorn-no-replication   driver.longhorn.io                               Delete          Immediate              true                   25h
longhorn-static           driver.longhorn.io                               Delete          Immediate              true                   25h
shoreline-local-path-sc   cluster.local/shoreline-local-path-provisioner   Delete          WaitForFirstConsumer   true                   19h
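The before-and-after comparison can be reduced to a list of the storageclasses that carry the (default) marker. The helper below is a small sketch that filters the table printed by kubectl get storageclass; the function name is hypothetical.

```shell
# Print the names of storageclasses flagged "(default)" in the table
# produced by `kubectl get storageclass`, read from standard input.
default_storageclasses() {
    awk 'NR > 1 && $2 == "(default)" { print $1 }'
}
```

Running `kubectl get storageclass | default_storageclasses` after the patch should list local-path but not longhorn.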

Post-Installation Validation#

Run the command:

kubectl get pods -A

Ensure that all pods are in the Running or Completed state.
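On a busy cluster the full pod list is long, so it can be easier to print only the pods that need attention. The following sketch filters on the STATUS column of the kubectl output; the function name is hypothetical.

```shell
# Print "namespace/name status" for every pod whose STATUS column is
# neither Running nor Completed.
# Expects the output of `kubectl get pods -A --no-headers` on standard input.
unhealthy_pods() {
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}
```

Run it as `kubectl get pods -A --no-headers | unhealthy_pods`; empty output means that every pod is in one of the two expected states.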

Connect to:

https://<Virtual IP>:30443/nmx/swag/index.html

Use the rw-user or ro-user credentials and the password set during the installation.
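Reachability of the page can also be checked from a terminal before you open a browser. This is a sketch under assumptions: the -k option skips certificate verification on the assumption that the endpoint presents a self-signed certificate, no credentials are sent, and the function names are hypothetical.

```shell
# Build the URL of the API page for a given virtual IP.
swagger_url() {
    printf 'https://%s:30443/nmx/swag/index.html\n' "$1"
}

# Print the HTTP status code returned by that page.
# -k: skip certificate verification (assumes a self-signed certificate).
swagger_status() {
    curl -k -s -o /dev/null -w '%{http_code}' --max-time 10 "$(swagger_url "$1")"
}
```

Call `swagger_status <Virtual IP>` with your own address. A status of 000 means that curl could not connect at all.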

Uninstall NetQ#

Run cm-mission-control-setup and select the “NVIDIA Mission Control NetQ uninstallation” option (this option appears if NetQ is installed).

The wizard prompts for confirmation.

Note

Files in the Longhorn directory must be deleted manually after the uninstallation on all nodes where NetQ ran.
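Because the cleanup is manual and destructive, it is safer to generate the commands first and review them before running anything. The sketch below only prints one command per node. The node names are hypothetical, root SSH access is assumed, and the path must be the Longhorn storage path that you chose during installation.

```shell
# Print (do not run) one cleanup command per node, for review.
# Refuses an empty path or "/" so that no dangerous command is produced.
longhorn_cleanup_cmds() {
    dir="$1"
    shift
    case "$dir" in
        ""|"/") echo "refusing path: '$dir'" >&2; return 1 ;;
    esac
    for node in "$@"; do
        printf 'ssh root@%s "rm -rf -- %s/*"\n' "$node" "$dir"
    done
}
```

For example, `longhorn_cleanup_cmds /local/longhorn node001 node002` prints two ssh commands that you can inspect and then run by hand.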