NVIDIA DOCA UROM Service Guide

1.0

This guide provides instructions on how to use the DOCA UROM Service on top of the NVIDIA® BlueField® networking platform.

The DOCA UROM service provides a framework for offloading significant portions of HPC software stack directly from the host and to the BlueField device.

Using a daemon, the service handles the discovery of resources, the coordination between the host and BlueField, and the spawning, management, and teardown of the BlueField workers themselves.

image-2024-3-7_12-16-1-version-1-modificationdate-1709806560297-api-v2.png

The first step in initiating an offload request involves the UROM host application establishing a connection with the UROM service. Upon receiving the plugin discovery command, the UROM service responds by providing the application with a list of plugins available on the BlueField. The application then attaches the plugin IDs that correspond to the desired workers to their network identifiers. Finally, the service triggers UROM worker plugin instances on the BlueField to execute the parallel computing tasks. Within the service's Kubernetes pod, workers are spawned by the daemon in response to these offload requests. Each computation can utilize either a single library or multiple computational libraries.

Before deploying the UROM service container, ensure that the following prerequisites are satisfied:

  • Allocate huge pages as needed by DOCA (this requires root privileges):

    Copy
    Copied!
                

    $ sudo echo 2048 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

  • Or alternatively:

    Copy
    Copied!
                

    $ sudo echo '2048' | sudo tee -a /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages $ sudo mkdir /mnt/huge $ sudo mount -t hugetlbfs nodev /mnt/huge

For information about the deployment of DOCA containers on top of the BlueField, refer to the NVIDIA BlueField Container Deployment Guide.

Service-specific configuration steps and deployment instructions can be found under the service's container page.

Plugin Discovery and Reporting

When the application initiates a connection request to the DOCA UROM Service, the daemon reads the UROM_PLUGIN_PATH environment variable. This variable stores directory paths to .so files for the plugins with multiple paths separated by semicolons. The daemon scans these paths sequentially and tries loading each .so file. Once the daemon finishes the scan, it reports the available BlueField plugins to the host application.

The host application gets the list of available plugins as a list of doca_urom_service_plugin_info structures:

Copy
Copied!
            

struct doca_urom_service_plugin_info { uint64_t id; // Unique ID to send commands to the plugin uint64_t version; // Plugin version char plugin_name[DOCA_UROM_PLUGIN_NAME_MAX_LEN]; // .so filename };

The UROM daemon is responsible for generating unique identifiers for the plugins, which are necessary to enable the worker to distinguish between different plugin tasks.

Loading Plugin in Worker

During the spawning of UROM workers by the UROM daemon, the daemon attaches a list of desired plugins in the worker command line. Each plugin is passed in a format of so_path:id.

As part of worker bootstrapping, the flow iterates all .so files and tries to load them by using dlopen system call and look for urom_plugin_get_iface() symbol to get the plugin operations interface.

Yaml File

The .yaml file downloaded from NGC can be easily edited according to users' needs:

Copy
Copied!
            

env: # Service-Specific command line arguments - name: SERVICE_ARGS value: "-l 60 -m 4096" - name: UROM_PLUGIN_PATH value: "/opt/mellanox/doca/samples/doca_urom/plugins/worker_sandbox/;/opt/mellanox/doca/samples/doca_urom/plugins/worker_graph/"

  • The SERVICE_ARGS are the runtime arguments received by the service:

    • -l, --log-level <value> – sets the (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>

    • --sdk-log-level – sets the SDK (numeric) log level for the program <10=DISABLE, 20=CRITICAL, 30=ERROR, 40=WARNING, 50=INFO, 60=DEBUG, 70=TRACE>

    • -m, --max-msg-size – specify UROM communication channel maximum message size

  • The UROM_PLUGIN_PATH is an env variable that stores directory paths to .so files for the plugins

For each plugin on the BlueField, it is necessary to add a volume mount inside the service container. For example:

Copy
Copied!
            

volumes: - name: urom-sandbox-plugin hostPath: path: /opt/mellanox/doca/samples/doca_urom/plugins/worker_sandbox     type: DirectoryOrCreate ... volumeMounts: - mountPath: /opt/mellanox/doca/samples/doca_urom/plugins/worker_sandbox     name: urom-sandbox-plugin


When troubleshooting a container deployment issues, it is highly recommended to follow the deployment steps and tips found in the "Review Container Deployment" section of the NVIDIA BlueField Container Deployment Guide .

One could also check the /var/log/doca/urom log files for more details about the running cycles of service components (daemon and workers).

The log file name for workers is urom_worker_<pid>_dev.log and for the daemon it is urom_daemon_dev.log.

Pod is Marked as "Ready" and No Container is Listed

Error

When deploying the container, the pod's STATE is marked as Ready and an image is listed, however, no container can be seen running:

Copy
Copied!
            

$ sudo crictl pods POD ID CREATED STATE NAME NAMESPACE ATTEMPT RUNTIME 3162b71e67677 4 seconds ago Ready               doca-urom-my-dpu                               default 0 (default)   $ sudo crictl images IMAGE TAG IMAGE ID SIZE k8s.gcr.io/pause 3.2                 2a060e2e7101d 487kB nvcr.io/nvidia/doca/doca_urom 1.0.0-doca2.7.0     2af1e539eb7ab 86.8MB   $ sudo crictl ps CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD


Solution

In most cases, the container did start but immediately exited. This could be checked using the following command:

Copy
Copied!
            

$ sudo crictl ps -a CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD 556bb78281e1d       2af1e539eb7ab       6 seconds ago Exited doca-urom 1                   3162b71e67677       doca-urom-my-dpu

Should the container fail (i.e., reporting a state of Exited), it is recommended to examine the UROM's main log at /var/log/doca/urom/urom_daemon_dev.log.

In addition, for a short period of time after termination, the container logs could also be viewed using the container's ID:

Copy
Copied!
            

$ sudo crictl logs 556bb78281e1d ...

Pod is Not Listed

Error

When placing the container's YAML file in the Kubelet's input folder, the service pod is not listed in the list of pods:

Copy
Copied!
            

$ sudo crictl pods POD ID CREATED STATE NAME NAMESPACE ATTEMPT RUNTIME


Solution

In most cases, the pod has not started because of the absence of the requested hugepages. This can be verified using the following command:

Copy
Copied!
            

$ sudo journalctl -u kubelet -e. . .  Oct 04 12:12:19 <my-dpu> kubelet[2442376]: I1004 12:12:19.905064 2442376 predicate.go:103] "Failed to admit pod, unexpected error while attempting to recover from admission failure" pod="default/doca-urom-service-<my-dpu>" err="preemption: error finding a set of pods to preempt: no set of running pods found to reclaim resources: [(res: hugepages-2Mi, q: 104563999874), ]"

© Copyright 2024, NVIDIA. Last updated on May 7, 2024.