Deploying NVIDIA Converged Accelerator

NVIDIA BlueField DPU BSP v4.7.0
Info

It is recommended to upgrade your BlueField product to the latest available software and firmware versions to benefit from new features and the latest bug fixes.

This section assumes that you have installed the BlueField OS BFB on your NVIDIA® Converged Accelerator using any of the following guides:

NVIDIA® CUDA® (GPU driver) must be installed in order to use the GPU. For information on how to install CUDA on your Converged Accelerator, refer to the NVIDIA CUDA Installation Guide for Linux.

After installing the BFB, select the mode in which your NVIDIA Converged Accelerator operates:

  • Standard (default) – the NVIDIA® BlueField® DPU and the GPU operate separately (GPU is owned by the host)

  • BlueField-X – the GPU is exposed to the DPU and is no longer visible on the host (GPU is owned by the DPU)

Note

You need to know your device name (e.g., mt41686_pciconf0) to run the commands in this section. To find it, use the MST tool, which is installed by default on the DPU.

Run:

mst status -v

Example output:

MST modules:
------------
    MST PCI module is not loaded
    MST PCI configuration module loaded
PCI devices:
------------
DEVICE_TYPE          MST                            PCI       RDMA     NET          NUMA
BlueField2(rev:1)    /dev/mst/mt41686_pciconf0.1    3b:00.1   mlx5_1   net-ens1f1   0

BlueField2(rev:1)    /dev/mst/mt41686_pciconf0      3b:00.0   mlx5_0   net-ens1f0   0
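If you script this step, the device names can be pulled out of the `mst status -v` text. The following is a minimal sketch, assuming the output has the shape of the example above; the function name `mst_device_names` is hypothetical and not part of any NVIDIA tool.

```python
import re

def mst_device_names(status_output):
    """Return the MST device names found in `mst status -v` text."""
    # Device nodes appear as /dev/mst/<name>; in the example the name is made of
    # word characters with an optional ".1" suffix for the second port.
    return re.findall(r"/dev/mst/([\w.]+)", status_output)

# The two device rows from the example output.
sample = (
    "BlueField2(rev:1) /dev/mst/mt41686_pciconf0.1 3b:00.1 mlx5_1 net-ens1f1 0\n"
    "BlueField2(rev:1) /dev/mst/mt41686_pciconf0 3b:00.0 mlx5_0 net-ens1f0 0\n"
)
print(mst_device_names(sample))
```

The name without the `.1` suffix (mt41686_pciconf0 in the example) is the one used as `<device-name>` in the commands that follow.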

BlueField-X Mode

  1. Run the following command from the host:

    mlxconfig -d /dev/mst/<device-name> s PCI_DOWNSTREAM_PORT_OWNER[4]=0xF

  2. Perform a BlueField system-level reset for the mlxconfig settings to take effect.

Standard Mode

To return the DPU from BlueField-X mode to Standard mode:

  1. Run the following command from the host:

    mlxconfig -d /dev/mst/<device-name> s PCI_DOWNSTREAM_PORT_OWNER[4]=0x0

  2. Perform a BlueField system-level reset for the mlxconfig settings to take effect.
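The two mode-switch commands differ only in the value written to PCI_DOWNSTREAM_PORT_OWNER[4] (0xF for BlueField-X, 0x0 for Standard). As an illustration, the sketch below builds the command line for either mode without running it. The helper name `mlxconfig_set_mode_argv` and the mode labels used as dictionary keys are hypothetical.

```python
import shlex

# Values taken from the commands in this section.
MODE_VALUES = {"standard": "0x0", "bluefield-x": "0xF"}

def mlxconfig_set_mode_argv(device_name, mode):
    """Build (but do not run) the mlxconfig command that selects a mode."""
    value = MODE_VALUES[mode.lower()]  # raises KeyError for an unknown mode
    return [
        "mlxconfig",
        "-d", f"/dev/mst/{device_name}",
        "s", f"PCI_DOWNSTREAM_PORT_OWNER[4]={value}",
    ]

# Print a shell-quoted form of the command for inspection.
print(shlex.join(mlxconfig_set_mode_argv("mt41686_pciconf0", "BlueField-X")))
```

Whichever value is written, the BlueField system-level reset is still required before the new mode takes effect.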

To verify the configured mode, run the following command from the host or BlueField:

$ sudo mlxconfig -d /dev/mst/<device-name> q PCI_DOWNSTREAM_PORT_OWNER[4]

  • Example of Standard mode output:

    Device #1:
    ----------
    [...]
    Configurations:                          Next Boot
        PCI_DOWNSTREAM_PORT_OWNER[4]         DEVICE_DEFAULT(0)

  • Example of BlueField-X mode output:

    Device #1:
    ----------
    [...]
    Configurations:                          Next Boot
        PCI_DOWNSTREAM_PORT_OWNER[4]         EMBEDDED_CPU(15)
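The two example outputs can be told apart by the number in parentheses after PCI_DOWNSTREAM_PORT_OWNER[4]. The following sketch maps that number to a mode name. It assumes the query output looks like the examples above, and the function name `mode_from_query` is hypothetical.

```python
import re

# Numeric values as shown in the example outputs: 0 and 15 (0xF).
OWNER_TO_MODE = {0: "Standard", 15: "BlueField-X"}

def mode_from_query(output):
    """Return the mode named by an mlxconfig query, or None if the field is absent."""
    match = re.search(r"PCI_DOWNSTREAM_PORT_OWNER\[4\]\s+\w+\((\d+)\)", output)
    if match is None:
        return None
    # Any value other than 0 or 15 is reported as "Unknown".
    return OWNER_TO_MODE.get(int(match.group(1)), "Unknown")

print(mode_from_query("PCI_DOWNSTREAM_PORT_OWNER[4]   DEVICE_DEFAULT(0)"))
print(mode_from_query("PCI_DOWNSTREAM_PORT_OWNER[4]   EMBEDDED_CPU(15)"))
```

Note that the value is listed under "Next Boot", so it reflects the configuration that applies after the next reset.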

The following are example outputs for when the DPU is configured to BlueField-X mode.

The GPU is no longer visible from the host:

root@host:~# lspci | grep -i nv
None

The GPU is now visible from the DPU:

ubuntu@dpu:~$ lspci | grep -i nv
06:00.0 3D controller: NVIDIA Corporation GA20B8 (rev a1)

Getting GPU Firmware Version

smbpbi: (See SMBPBI spec)

root@dpu:~# i2cset -y 3 0x4f 0x5c 0x05 0x08 0x00 0x80 s
root@dpu:~# i2cget -y 3 0x4f 0x5c ip 5
5: 0x04 0x05 0x08 0x00 0x5f
root@dpu:~# i2cget -y 3 0x4f 0x5d ip 5
5: 0x04 0x39 0x32 0x2e 0x30

root@dpu:~# i2cset -y 3 0x4f 0x5c 0x05 0x08 0x01 0x80 s
root@dpu:~# i2cget -y 3 0x4f 0x5c ip 5
5: 0x04 0x05 0x08 0x01 0x5f
root@dpu:~# i2cget -y 3 0x4f 0x5d ip 5
5: 0x04 0x30 0x2e 0x36 0x42

root@dpu:~# i2cset -y 3 0x4f 0x5c 0x05 0x08 0x02 0x80 s
root@dpu:~# i2cget -y 3 0x4f 0x5c ip 5
5: 0x04 0x05 0x08 0x02 0x5f
root@dpu:~# i2cget -y 3 0x4f 0x5d ip 5
5: 0x04 0x2e 0x30 0x30 0x2e

root@dpu:~# i2cset -y 3 0x4f 0x5c 0x05 0x08 0x03 0x80 s
root@dpu:~# i2cget -y 3 0x4f 0x5c ip 5
5: 0x04 0x05 0x08 0x03 0x5f
root@dpu:~# i2cget -y 3 0x4f 0x5d ip 5
5: 0x04 0x30 0x31 0x00 0x00

39 32 2e 30 30 2e 36 42 2e 30 30 2e 30 31 00 00 → 92.00.6B.00.01
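The last line of the transcript joins the data bytes of the four 0x5d reads and interprets them as ASCII. The sketch below reproduces that decoding. It assumes the leading 0x04 in each response is a count of the data bytes that follow, which matches the transcript but is not confirmed here against the SMBPBI specification. The function name `decode_version` is hypothetical.

```python
def decode_version(responses):
    """Join the data bytes of each response and decode them as ASCII text."""
    data = bytearray()
    for response in responses:
        count = response[0]                      # assumed: number of data bytes that follow
        data.extend(response[1:1 + count])
    return data.rstrip(b"\x00").decode("ascii")  # drop the trailing zero padding

# The four responses read from register 0x5d in the transcript.
responses = [
    [0x04, 0x39, 0x32, 0x2e, 0x30],
    [0x04, 0x30, 0x2e, 0x36, 0x42],
    [0x04, 0x2e, 0x30, 0x30, 0x2e],
    [0x04, 0x30, 0x31, 0x00, 0x00],
]
print(decode_version(responses))  # prints 92.00.6B.00.01
```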


Updating GPU Firmware

root@dpu:~# scp root@10.23.201.227:/<path-to-fw-bin>/1004_0230_891__92006B0001-dbg-ota.bin /tmp/gpu_images/
root@10.23.201.227's password:
1004_0230_891__92006B0001-dbg-ota.bin          100%  384KB 384.4KB/s   00:01

root@dpu:~# cat /tmp/gpu_images/progress.txt
TaskState="Running"
TaskStatus="OK"
TaskProgress="50"

root@dpu:~# cat /tmp/gpu_images/progress.txt
TaskState="Running"
TaskStatus="OK"
TaskProgress="50"

root@dpu:~# cat /tmp/gpu_images/progress.txt
TaskState=Frimware update succeeded.
TaskStatus=OK
TaskProgress=100
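When polling the update from a script, the contents of progress.txt can be parsed into key/value pairs. The following sketch assumes the file holds one `Key=Value` entry per line with optional double quotes, as in the transcript above. The function names `parse_progress` and `is_finished` are hypothetical.

```python
def parse_progress(text):
    """Parse progress.txt content into a dict, stripping optional double quotes."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '='
            fields[key.strip()] = value.strip().strip('"')
    return fields

def is_finished(fields):
    """Treat TaskProgress of 100 as completion (an assumption based on the transcript)."""
    return fields.get("TaskProgress") == "100"

# File contents as shown in the transcript, copied verbatim.
running = 'TaskState="Running"\nTaskStatus="OK"\nTaskProgress="50"\n'
done = "TaskState=Frimware update succeeded.\nTaskStatus=OK\nTaskProgress=100\n"

print(parse_progress(running), is_finished(parse_progress(running)))
print(parse_progress(done), is_finished(parse_progress(done)))
```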


© Copyright 2024, NVIDIA. Last updated on May 9, 2024.