NVIDIA DOCA Management Service Guide

1.0

This guide provides instructions on how to use the DOCA Management Service on top of NVIDIA® BlueField® Networking Platform or ConnectX® Network Adapters.

Note

DMS is currently supported at Alpha level.

DOCA Management Service (DMS) is a single point of entry for configuring and operating NVIDIA BlueField and ConnectX devices. DMS wraps NVIDIA's scripts and tools behind an industry-standard API defined by the OpenConfig community. The user can configure BlueField or ConnectX for any mode, either locally (SSH) or remotely (gRPC). This makes it easier to migrate and bootstrap any customer onto any NVIDIA network device.

DMS exposes configurable BlueField/ConnectX parameters over an external interface so that a management station can automate the configuration of NVIDIA network adapters. The exposed interface provides a uniform approach to BlueField/ConnectX device configuration and hides the details of the internal tools used to configure BlueField or ConnectX features.

DMS has a client-server architecture. A daemon handles the discovery of resources and waits for commands from clients. The user can use DMSc (the DMS client), which is delivered as part of DMS, or use or create any other client.

Info

Please refer to the OpenConfig site for an explanation of the OpenConfig protocol.

The Yang models describe a configuration tree that is easy to navigate, so any "config leaf" can be located using XPath. Most of the gNMI/gNOI protocol usage is common with the OpenConfig community, and commands are transferred over gRPC.

Note

The DOCA Yang model is experimental.

Note

The gNMI Subscribe mechanism for streaming telemetry is not currently supported.

Info

DMS can run either on the host machine where BlueField or ConnectX devices are installed or on BlueField Arm itself (when BlueField is operating in DPU mode).

DMS requires DOCA to be installed on the target system where the DMS service runs:

  • DMS for Host - requires DOCA for Host package to be installed on the host system (with doca-networking or doca-all profiles).

  • DMS for DPU (BlueField Arm) - requires DOCA Image to be installed on BlueField Arm.

Please follow these instructions to install DOCA: NVIDIA DOCA Installation Guide for Linux.

Note

DMS supports only Linux-based environments today.

DMS has 3 major components:

  • DMSd – Server – DMS server inside the BlueField or on the host with an NVIDIA PCIe device

  • DMSc – Client – DOCA provides an OpenConfig client. Customers can choose to use this client, any other open-source client, or develop their own (gRPC-based) client.

  • Yang files – Yang model files contain the data model used to configure the BlueField device, including NVIDIA-specific extensions to the common OpenConfig YANG models.

OpenConfig consists of 2 main protocols:

  • gNMI – gRPC Network Management Interface, a protocol for configuring a network device.

  • gNOI – gRPC Network Operations Interface, a protocol for performing operational commands on a network device (e.g., provision, upgrade, reboot).

The following is an architectural diagram of DMS:

[Figure: DMS architecture diagram]

The following diagram presents the DMS modes of operation. Because the DMS client can operate from anywhere, four deployments are possible:

  1. Both DMS client and server components are deployed on the Host

  2. Both DMS client and server components are deployed on DPU (BlueField Arm)

  3. DMS server component is deployed on the Host, while DMS client is deployed remotely (connecting to DMS server over management network)

  4. DMS server component is deployed on DPU (BlueField Arm), while DMS client is deployed remotely (connecting to DMS server over management network)

[Figure: DMS modes of operation]

To see the full list of flags, use the help flag (e.g., dmsd -help, dmsd -h).

General Flags

  • -bind_address <string> – Bind to <address>:<port> or just :<port> (default is :9339). Can be localhost for local use case, or an IP address for remote use case.

  • -v <value> – log level for V logs

  • -target_pci <string> – The target PCIe address (e.g., 03:00). Auto-selected if only one NVIDIA network device is present; otherwise, the PCIe address must be specified.

Security Flags

-auth <string> – This flag has 3 options:

  • Shadow

    • Zero-touch: the admin does not need to create a dedicated additional user for DMS (the OS user is reused)

    • Reads the hashed password in real time on each client request

    • Use flags -username -shadow

    • Example: -username root -shadow /etc/shadow

    • To disable: -noauth flag

  • Credentials

    • Admin must set a strong password

    • Use flags -username -password

    • Example: -username root -password <password>

    • To disable: -noauth flag

    • The password flag can be left empty to invoke a password prompt at daemon startup

  • Certificate File

    • The most secure option, based on (m)TLS

    • Example: -ca /tmp/ca.crt -ca_key /tmp/ca.key

    • To disable: -notls option

Provisioning Flags

  • -target_pci <string> – The target PCIe address (e.g., 03:00). Auto-selected if only one NVIDIA network device is present; otherwise, the PCIe address must be specified.

  • -image_folder <string> – The folder where images are installed. Images can be copied directly to this folder to avoid transferring them over the network. Default folder: /tmp/dms.

  • -chunk_size_ack <uint> – The chunk size of the image, in bytes, after which the server responds with a TransferResponse (default: 12000000)
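The flags above can be assembled programmatically when DMSd is launched from a provisioning script. The following is a minimal Python sketch, not an official tool: the helper name build_dmsd_command is hypothetical, only flags listed in this guide are used, and whether this exact combination suits a given deployment is an assumption to verify with dmsd -help.

```python
import shlex


def build_dmsd_command(bind_address=":9339", target_pci=None,
                       username=None, shadow=None, image_folder=None):
    """Assemble a dmsd argument list from flags listed in this guide.

    Hypothetical helper: it only builds the command line, it does not run it.
    """
    argv = ["dmsd", "-bind_address", bind_address]
    if target_pci is not None:
        argv += ["-target_pci", target_pci]
    if username is not None:
        argv += ["-username", username]
    if shadow is not None:
        argv += ["-shadow", shadow]
    if image_folder is not None:
        argv += ["-image_folder", image_folder]
    return argv


# Shadow authentication for a local-only server on one specific device.
cmd = build_dmsd_command(bind_address="localhost:9339", target_pci="03:00",
                         username="root", shadow="/etc/shadow")
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids shell quoting problems if it is later passed to subprocess.run.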

gNMI Command

In DMSc, the gNMI part is powered by the gNMIc project.

Info

For more information, please refer to the gNMIc documentation.

dmsc -a localhost:9339 -u root -p <password> --file /opt/mellanox/doca/service/dms/yang <command>

Prompt mode with autocomplete options can be invoked using the prompt command.

Get Request

Get requests are served in real time, without caching. The get command requires the Yang XPath of the requested leaf, as in the following example:

dmsc <flags> get --path /interfaces/interface[name=p0]/config/mtu
[
  {
    "source": "localhost:9339",
    "timestamp": 1712485149723248511,
    "time": "2024-04-07T10:19:09.723248511Z",
    "updates": [
      {
        "Path": "interfaces/interface[name=p0]/config/mtu",
        "values": {
          "interfaces/interface/config/mtu": "1500"
        }
      }
    ]
  }
]

Info

Parameters are inserted into the path as keys in square brackets. In this example, [name=p0] indicates the interface name (p0).
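When a get request is issued from a script, the JSON response usually needs to be reduced to the leaf value. The following Python sketch parses the response shown in the get example and extracts the MTU. The function name extract_values is hypothetical, and the sketch assumes other leaves return the same response layout.

```python
import json

# Response text copied from the get example in this section.
response_text = """
[
  {
    "source": "localhost:9339",
    "timestamp": 1712485149723248511,
    "time": "2024-04-07T10:19:09.723248511Z",
    "updates": [
      {
        "Path": "interfaces/interface[name=p0]/config/mtu",
        "values": {"interfaces/interface/config/mtu": "1500"}
      }
    ]
  }
]
"""


def extract_values(text):
    """Map each update's keyed path (with [name=...]) to its leaf value."""
    result = {}
    for notification in json.loads(text):
        for update in notification.get("updates", []):
            # "values" maps the key-less path to the value; keep the keyed path.
            for value in update["values"].values():
                result[update["Path"]] = value
    return result


values = extract_values(response_text)
# The value arrives as a string in this response, so convert it explicitly.
mtu = int(values["interfaces/interface[name=p0]/config/mtu"])
print(mtu)
```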


Set Request

Note

Some values configured with set commands cannot currently be read back with get commands.

Set requests happen immediately, invoking tools to configure the OS.

Set commands require the Yang XPath of the target leaf, as in the following example:

dmsc <flags> set --update /interfaces/interface[name=p0]/config/mtu:::int:::9216
{
  "source": "localhost:9339",
  "time": "1970-01-01T00:00:00Z",
  "results": [
    {
      "operation": "UPDATE",
      "path": "interfaces/interface[name=p0]/config/mtu"
    }
  ]
}

Info

Parameters are inserted into the path as keys in square brackets. In this example, [name=p0] indicates the interface name (p0).

Note

The path, the value type, and the value must be separated by the ::: delimiter (i.e., <path>:::<type>:::<value>).

Note

Currently, only the --update flag is supported in set.
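Because the --update argument packs three fields into one string, generating it from code avoids delimiter mistakes. The following is a minimal Python sketch. The helper name format_update is hypothetical, and it assumes that none of the three fields contains the ::: delimiter itself.

```python
DELIMITER = ":::"


def format_update(path, value_type, value):
    """Build the <path>:::<type>:::<value> string passed to --update."""
    fields = [str(path), str(value_type), str(value)]
    for field in fields:
        if DELIMITER in field:
            raise ValueError(f"field must not contain {DELIMITER!r}: {field!r}")
    return DELIMITER.join(fields)


# Reproduces the argument used in the set example in this section.
update_arg = format_update("/interfaces/interface[name=p0]/config/mtu", "int", 9216)
print(["set", "--update", update_arg])
```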

It is also possible to supply a list of set commands in a JSON file:

dmsc <flags> set --request-file req.json

req.json example:

{
  "updates": [
    {
      "path": "/interfaces/interface[name=p0]/config/mtu",
      "value": 9216,
      "encoding": "uint"
    },
    {
      "path": "/interfaces/interface[name=p0]/config/enabled",
      "value": true,
      "encoding": "bool"
    }
  ]
}
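A request file like the one above can be generated rather than written by hand, which helps when the same settings are applied to several interfaces. The following Python sketch reproduces the req.json example. The helper name build_request is hypothetical, and the field names are taken from the example rather than from a formal schema.

```python
import json


def build_request(updates):
    """Build the request document from (path, value, encoding) tuples."""
    return {
        "updates": [
            {"path": path, "value": value, "encoding": encoding}
            for path, value, encoding in updates
        ]
    }


request = build_request([
    ("/interfaces/interface[name=p0]/config/mtu", 9216, "uint"),
    ("/interfaces/interface[name=p0]/config/enabled", True, "bool"),
])

# json.dumps renders Python True as JSON true, matching the example file.
request_text = json.dumps(request, indent=2)
print(request_text)
```

Writing request_text to req.json and passing that file to set --request-file would then be equivalent to using the hand-written example.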

gNOI Commands

In DMSc, the gNOI part is powered by the gNOIc project. For full details, refer to the gNOIc documentation.

dmsc -a localhost --port 9339 --tls-cert client.crt --tls-key client.key <command>

Prompt mode with autocomplete options can be invoked using the prompt command.

All commands are blocking unless specified otherwise.

OS

The following subsections present actions for provisioning a new DOCA Image (BFB) or firmware on BlueField.

Install

This command transmits the file from the client to the server and authenticates the file's validity:

dmsc <flags> os install --version <free_text_version> --pkg <bfb|cfg|fw path>
dmsc <flags> os install --version 2_7_0 --pkg DOCA_2.7.0_Ubuntu.bfb
dmsc <flags> os install --version 2_7_0 --pkg config.cfg
dmsc <flags> os install --version 1_3_5_custom.bfb --pkg custom.bfb

If the file authenticates successfully, it is saved to the folder specified by the -image_folder flag (default /tmp/dms). The file extension is autodetected and appended automatically if none is provided in the --version field. Users may also copy the file to the folder manually and invoke the command with the file extension in --version to authenticate the file. No file transfer is initiated if a file with the specified version and extension already exists in the folder.

Activate

The activate command deploys the BFB bundle/firmware to the hardware:

dmsc <flags> os activate --version 2_7_0 # Invoke all files under 2_7_0 name
dmsc <flags> os activate --version "2_7_0.bfb;0_0_1.cfg;24_29_0046.fw"

The --version flag provides a version to search for in the folder specified by the -image_folder flag (default /tmp/dms). If no extension is provided, the command uses all files under the version name.

To activate specific files, pass their names to the --version flag separated by semicolons.
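The semicolon-separated form can be produced from a list of file names. The following is a minimal Python sketch. The helper name activate_version_arg is hypothetical, and rejecting names that contain a semicolon is a precaution added here rather than documented DMS behavior.

```python
def activate_version_arg(names):
    """Join version or file names into one --version value."""
    if not names:
        raise ValueError("at least one name is required")
    for name in names:
        if ";" in name:
            raise ValueError(f"name must not contain ';': {name!r}")
    return ";".join(names)


# Reproduces the value used in the activate example in this section.
version_arg = activate_version_arg(["2_7_0.bfb", "0_0_1.cfg", "24_29_0046.fw"])
print(["os", "activate", "--version", version_arg])
```

Passing the value as a single list element, as printed here, removes the need for the shell quotes that protect the semicolons on the command line.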

Note

After running the command to activate firmware, firmware reset is automatically invoked.


Verify

The verify command retrieves the firmware and BFB bundle versions:

dmsc <flags> os verify

The return value consists of both versions separated by a semicolon.

Note

Currently, the BFB bundle can only be retrieved if it was installed via DMS.
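A script that consumes the verify result needs to split the returned string. The following Python sketch does only that. The sample string is hypothetical, and the order of the two versions is not stated in this guide, so the sketch returns the parts without labeling them.

```python
def split_versions(returned_value):
    """Split the verify return value on ';' and drop empty parts."""
    return [part.strip() for part in returned_value.split(";") if part.strip()]


# Hypothetical return value, shown only to illustrate the format.
sample = "24_29_0046;2_7_0"
parts = split_versions(sample)
print(parts)
```

Dropping empty parts tolerates one of the two versions being missing, for example when the BFB bundle was not installed via DMS.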

System

The following subsections provide actions for rebooting the BFB bundle/firmware on the BlueField.

Note

The Alpha version does not support components.

Reboot Status

Check whether a reboot operation is in progress:

dmsc <flags> system reboot-status

The value returned is false if the system is active. It is true if the system is in reboot status.

If the status cannot be retrieved, the status appears as a failure and the message field indicates what the issue is.

Reboot

Reboot the BlueField Arm and firmware.

dmsc <flags> system reboot --delay 10s

This command is not blocking and returns immediately.

The flag --delay specifies the time interval to wait before invoking the reset.
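Since reboot returns immediately, automation typically polls reboot-status until the system reports that it is active again. The following Python sketch shows one way to structure that loop. The function wait_for_reboot and its parameters are hypothetical, and the status source is injected as a callable so the sketch stays independent of how dmsc is invoked and how its output is parsed.

```python
import time


def wait_for_reboot(is_rebooting, timeout=600.0, interval=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until is_rebooting() returns False or the timeout expires.

    is_rebooting: callable returning True while a reboot is in progress,
    e.g. a wrapper around `dmsc <flags> system reboot-status`.
    Returns True if the system became active, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if not is_rebooting():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demonstration with a fake status source: two "rebooting" answers, then active.
answers = iter([True, True, False])
finished = wait_for_reboot(lambda: next(answers), sleep=lambda seconds: None)
print(finished)
```

When a --delay is used, the status may still read false until the reset actually starts, so a real caller would likely wait at least that long before beginning to poll.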

© Copyright 2024, NVIDIA. Last updated on May 7, 2024.