DOCA Documentation v3.3.0

DPL Container Deployment

The DPL Runtime Service supports BlueField devices (DPU mode) and ConnectX-9 (Host mode).

Note

There are a few differences when preparing the system to run the DPL Runtime Service in DPU mode (BlueField) versus Host mode (ConnectX-9). These differences are outlined in the sections below where applicable.

This section is specific to DPU mode when running on BlueField devices.

Setting BlueField to DPU Mode

BlueField must run in DPU mode to use the DPL Runtime Service. For details on how to change modes, see: BlueField Modes of Operation.

Determining Your BlueField Variant

Your BlueField may be installed in a host server or it may be a standalone server.

If your BlueField is a standalone server, ignore the parts that mention the host server or SR-IOV; you may still use Scalable Functions (SFs).

Setting Up DPU Management Access and Updating BlueField-Bundle

These pages provide detailed information about DPU management access, software installation, and updates:

Note

Systems with a host server typically use RShim (i.e., the tmfifo_net0 interface). Standalone systems must use the OOB interface option for management access.


Changing the eSwitch to switchdev Mode

Info

Do this before creating SR-IOV Virtual Functions. If Virtual Functions already exist for the interface, remove them before changing the mode.

The DPL Runtime Service can only start if the eSwitch is in switchdev mode. If it's not, an error will be logged on startup and the process will exit.

If the platform is a BlueField in DPU mode, run this command in the DPU shell; otherwise (e.g., ConnectX-9 in Host mode), use the host shell.

Your BlueField DPU may already be configured in switchdev mode after the BFB installation. If so, this step is unnecessary.

Find the PCI address of the interface that you'd like to use with the DPL Runtime Service and use the following command (replace pci/<addr> with the correct value):

Example


sudo devlink dev eswitch set pci/0000:03:00.0 mode switchdev

Here are a few options for commands that may help you find your PCI address:

  • lspci -D

  • mst status -v

  • ip -d link

  • ethtool -i <interface name>
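As a concrete example, the PCI address appears in the bus-info field of ethtool -i output. This is a minimal sketch of pulling that field out; the helper name and interface name are illustrative, not part of any tool:

```shell
# Extract the PCI address from `ethtool -i` output (bus-info field).
# Live usage (interface name is an example):  ethtool -i eth2 | bus_info
bus_info() {
  awk -F': ' '/^bus-info:/ { print $2 }'
}

# Demonstration on sample output:
sample='driver: mlx5_core
bus-info: 0000:03:00.0'
printf '%s\n' "$sample" | bus_info   # prints: 0000:03:00.0
```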

Note

devlink settings are not persistent across reboots.
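To confirm the change took effect, you can read the mode back with devlink dev eswitch show. The check helper below is a hypothetical sketch of validating that output (the PCI address is an example):

```shell
# On a live system, query the current eSwitch mode:
#   sudo devlink dev eswitch show pci/0000:03:00.0
# The output contains "mode switchdev" once the change is active.

# Hypothetical check on such output:
is_switchdev() {
  case "$1" in
    *'mode switchdev'*) return 0 ;;
    *) return 1 ;;
  esac
}

is_switchdev 'pci/0000:03:00.0: mode switchdev inline-mode none encap-mode basic' && echo 'switchdev active'
```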


Enabling Multiport eSwitch Mode (Optional)

Info

This step is optional and depends on your DPL program and setup needs.

Multiport eSwitch mode allows for traffic forwarding between multiple physical ports and their VFs/SFs (e.g., between p0 and p1).

Before enabling this mode:

  1. Ensure LAG_RESOURCE_ALLOCATION is enabled in firmware:

    Example


    sudo mlxconfig -d 0000:03:00.0 s LAG_RESOURCE_ALLOCATION=1

    Info

    Refer to the Using mlxconfig guide for more information.

  2. After reboot or firmware reset, enable esw_multiport mode:

    Example


    sudo devlink dev param set pci/0000:03:00.0 name esw_multiport value 1 cmode runtime

Note

devlink settings are not persistent across reboots.
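After setting the parameter, it can be read back with devlink dev param show. The helper below is a hypothetical sketch of checking that output; the exact formatting may vary by kernel version:

```shell
# On a live system, read back the parameter (PCI address is an example):
#   sudo devlink dev param show pci/0000:03:00.0 name esw_multiport
# A runtime value of "true" (or 1) indicates multiport eSwitch mode is on.

# Hypothetical check on such output:
multiport_on() {
  case "$1" in
    *'value true'*|*'value 1'*) return 0 ;;
    *) return 1 ;;
  esac
}
```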


Creating SR-IOV Virtual Functions

To use SR-IOV, first create Virtual Functions (VFs) on the host server:

Example


sudo -s   # enter sudo shell
echo 4 > /sys/class/net/eth2/device/sriov_numvfs
exit      # exit sudo shell

Note

Entering a sudo shell is necessary because sudo only applies to the echo command, and not the redirection (>), which would otherwise result in "Permission denied."

This example creates 4 VFs under Physical Function eth2. Adjust the number as needed.

Info

If a PF already has VFs and you'd like to change the number, first set it to 0 before applying the new value.
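The reset-then-set sequence can be wrapped in a small helper. This is a sketch (the helper name is illustrative, and the sysfs path for PF eth2 is an example):

```shell
# Set the VF count via a sysfs attribute, resetting it to 0 first when a
# different nonzero count is already configured (required by the driver).
set_numvfs() {
  path="$1"; count="$2"
  cur=$(cat "$path")
  [ "$cur" = "$count" ] && return 0   # already at the requested count
  if [ "$cur" != "0" ]; then
    echo 0 > "$path"                  # existing VFs must be removed first
  fi
  echo "$count" > "$path"
}

# Live usage (as root): set_numvfs /sys/class/net/eth2/device/sriov_numvfs 8
```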


Creating Scalable Functions (Optional)

Info

This step is optional and depends on your DPL program and setup needs.

For more information, see the BlueField Scalable Function User Guide, TODO: CX9.

If you create SFs, refer to their representors in the configuration file.

Downloading Container Resources from NGC

Start by downloading and installing the ngc-cli tools.

For example:

  • For DPU mode, download the ARM ngc-cli tool:

    Example


    wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.5.2/files/ngccli_arm64.zip -O ngccli_arm64.zip
    unzip ngccli_arm64.zip

  • For Host mode, download the appropriate ngc-cli tool for your system architecture:

    Example for x86_64


    wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.5.2/files/ngccli_linux.zip -O ngccli_linux.zip
    unzip ngccli_linux.zip

Once the ngc-cli tool has been downloaded, use it to download the latest dpl_rt_service resources:


./ngc-cli/ngc registry resource download-version "nvidia/doca/dpl_rt_service"

This creates a directory named in the format dpl_rt_service_va.b.c-docax.y.z, where a.b.c is the DPL Runtime Service version number and x.y.z is the DOCA version number.

For example: dpl_rt_service_v1.2.0-doca3.1.0.
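If a script needs the two version fields separately, they can be split out of the directory name with standard shell parameter expansion; for example:

```shell
# Split a resource directory name of the form
# dpl_rt_service_v<a.b.c>-doca<x.y.z> into its two version fields.
dir="dpl_rt_service_v1.2.0-doca3.1.0"

rest="${dir#dpl_rt_service_v}"   # strip the fixed prefix -> "1.2.0-doca3.1.0"
svc_ver="${rest%-doca*}"         # DPL Runtime Service version -> "1.2.0"
doca_ver="${rest##*-doca}"       # DOCA version -> "3.1.0"

echo "service=$svc_ver doca=$doca_ver"   # prints: service=1.2.0 doca=3.1.0
```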

Info

You can find available versions at NGC Catalog.

Info

Each release includes a kubelet.d YAML file that is used by the dpl_rt_service_ctl.sh script for retrieving the correct container image for either DPU or Host mode.


Running the Preparation Script

Run the dpl_system_setup.sh script to configure the system:

  • For DPU mode:


    cd dpl_rt_service_va.b.c-docax.y.z
    chmod +x ./scripts/dpl_system_setup.sh
    sudo ./scripts/dpl_system_setup.sh
    sudo systemctl restart kubelet.service
    sudo systemctl restart containerd.service

    Warning

    For DPU mode, restarting kubelet and containerd is required whenever hugepages configuration changes for the changes to take effect.

  • For Host mode, specify the ConnectX device(s) that should be configured for DPL use using the --dev option (this option can be repeated):


    cd dpl_rt_service_va.b.c-docax.y.z
    chmod +x ./scripts/dpl_system_setup.sh
    sudo ./scripts/dpl_system_setup.sh --dev 0000:08:00.0

The dpl_system_setup.sh script performs the following:

  • Configures mlxconfig values:

    • FLEX_PARSER_PROFILE_ENABLE=4

    • PROG_PARSE_GRAPH=true

    • SRIOV_EN=1

  • Enables SR-IOV

  • Sets up initial DPL Runtime Service configuration folder at /etc/dpl_rt_service/

  • Configures hugepages

Info

The dpl_system_setup.sh script takes optional arguments to control the hugepages configuration.

For DPU mode, if you configure more than 4GB of hugepages, you must also raise the limit in dpl_rt_service.yaml under spec->resources->limits->hugepages-2Mi.
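To inspect the resulting hugepages configuration (and work out whether the 4GB YAML limit applies), /proc/meminfo can be queried on any Linux system; for example:

```shell
# Show the configured hugepages and their total size (Linux only).
total=$(awk '/^HugePages_Total/ { print $2 }' /proc/meminfo)
size_kb=$(awk '/^Hugepagesize/ { print $2 }' /proc/meminfo)
total=${total:-0}; size_kb=${size_kb:-0}

echo "hugepages: ${total} pages x ${size_kb} kB = $(( total * size_kb / 1024 )) MiB"
# e.g. 2048 pages x 2048 kB = 4096 MiB, the default hugepages-2Mi limit for DPU mode.
```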


Editing the Configuration Files

Create a configuration file for each device, based on the provided template config file.

See DPL Service Configuration for details.

For example:

Example


sudo cp /etc/dpl_rt_service/devices.d/NAME.conf.template /etc/dpl_rt_service/devices.d/1000.conf
# Then update /etc/dpl_rt_service/devices.d/1000.conf as needed:
sudo vim /etc/dpl_rt_service/devices.d/1000.conf

Warning

You must create at least one device configuration file.

Otherwise, the DPL Runtime Service Container will not be able to start.


Firewall Configuration to Open gRPC Server Ports

The DPL Runtime Service runs several gRPC servers, each listening on a dedicated TCP port and supporting a corresponding DPL Developer tool.

Make sure that these ports are accessible from the system(s) where you plan to run the DPL Developer tools; each tool connects to the DPL Runtime Service through the corresponding gRPC server's TCP port.

The needed ports are configurable (see server_tcp_port settings at DPL Service Configuration). By default they have the following values:

gRPC server            TCP Port
P4 Runtime             9559
DPL Admin              9600
DPL Nspect/Debugger    9560

Example for allowing the ports on RHEL-9


sudo firewall-cmd --permanent --add-port=9559/tcp
sudo firewall-cmd --permanent --add-port=9600/tcp
sudo firewall-cmd --permanent --add-port=9560/tcp
sudo firewall-cmd --reload

# List configurations to confirm the ports were allowed:
sudo firewall-cmd --list-all


Starting the DPL Runtime Service Container

Once your configuration files are ready, use the dpl_rt_service_ctl.sh script to start the container:

Note

Before running the script for the first time, you must grant it execute permission:


sudo chmod +x ./scripts/dpl_rt_service_ctl.sh


sudo ./scripts/dpl_rt_service_ctl.sh --start

Info

For DPU mode, the script copies the YAML file into the /etc/kubelet.d/ directory, which triggers automatic creation and startup of the DPL RT Service pod and container.

For Host mode, the script will start a Docker container named dpl-rt-service.

Allow a few minutes for the container to start. To monitor status:

  • For DPU mode:

    • Check logs:


      sudo journalctl -u kubelet --since -5m

    • List images:


      sudo crictl images

    • List pods:


      sudo crictl pods

  • For Host mode:

    • Check logs:


      sudo docker logs dpl-rt-service

    • List images:


      sudo docker images

  • View runtime logs:


    /var/log/doca/dpl_rt_service/dpl_rtd.log

Note

If the container fails to start due to configuration errors, the log file at /var/log/doca/dpl_rt_service/dpl_rtd.log might be empty or missing the relevant error messages.

In that case, view the container logs using the relevant tool:

For DPU:


sudo crictl logs $(sudo crictl ps -a | grep dpl-rt-service | awk '{print $1}')

For Host:


sudo docker logs dpl-rt-service


Stopping the DPL Runtime Service Container

Stop the container by using the dpl_rt_service_ctl.sh script:


sudo ./scripts/dpl_rt_service_ctl.sh --stop

Info

For DPU mode, the script will remove the YAML file from the /etc/kubelet.d/ directory.

For Host mode, the script will stop the Docker container named dpl-rt-service.

To confirm the pod is gone (this might take a few seconds to complete):


# For DPU:
sudo crictl pods | grep dpl-rt-service

# For Host:
sudo docker ps | grep dpl-rt-service


Restarting the DPL Runtime Service After Configuration Changes

Once the DPL Runtime Service container is up and running, any change to a file under the /etc/dpl_rt_service/ configuration folder requires restarting the container for the changes to take effect.

Perform the following steps to restart the container:

  1. Stop the container.


    sudo ./scripts/dpl_rt_service_ctl.sh --stop

  2. Wait for the container to stop.

  3. Start the container:


    sudo ./scripts/dpl_rt_service_ctl.sh --start
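The stop/wait/start sequence above can be sketched as a small script. DPL_CTL and pod_running below are placeholders to adapt to your setup (crictl for DPU mode, docker for Host mode); run the real thing as root:

```shell
# Restart sketch: stop, wait until the container is actually gone, then start.
DPL_CTL=${DPL_CTL:-./scripts/dpl_rt_service_ctl.sh}

pod_running() {
  # DPU-mode check; for Host mode use: docker ps | grep -q dpl-rt-service
  crictl pods 2>/dev/null | grep -q dpl-rt-service
}

restart_dpl() {
  "$DPL_CTL" --stop
  while pod_running; do
    sleep 2   # poll until the pod disappears
  done
  "$DPL_CTL" --start
}
```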

End-to-End Installation Steps

Note

Replace device IDs and filenames as appropriate for your setup.

DPU Example


# Download NGC CLI tool:
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.5.2/files/ngccli_arm64.zip -O ngccli_arm64.zip
unzip ngccli_arm64.zip

# Download the DPL Runtime Service Resources bundle:
./ngc-cli/ngc registry resource download-version "nvidia/doca/dpl_rt_service"

# Prepare DPU and restart services:
cd dpl_rt_service_va.b.c-docax.y.z
chmod +x ./scripts/dpl_system_setup.sh
sudo ./scripts/dpl_system_setup.sh
sudo systemctl restart kubelet.service
sudo systemctl restart containerd.service

# Create a device configuration file with relevant interfaces info:
sudo cp /etc/dpl_rt_service/devices.d/NAME.conf.template /etc/dpl_rt_service/devices.d/1000.conf
sudo vim /etc/dpl_rt_service/devices.d/1000.conf

# Launch the Pod and container:
sudo ./scripts/dpl_rt_service_ctl.sh --start

Host Example


# Download NGC CLI tool:
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/4.5.2/files/ngccli_linux.zip -O ngccli_linux.zip
unzip ngccli_linux.zip

# Download the DPL Runtime Service Resources bundle:
./ngc-cli/ngc registry resource download-version "nvidia/doca/dpl_rt_service"

# Prepare the host:
cd dpl_rt_service_va.b.c-docax.y.z
chmod +x ./scripts/dpl_system_setup.sh
sudo ./scripts/dpl_system_setup.sh --dev 0000:08:00.0

# Create a device configuration file with relevant interfaces info:
sudo cp /etc/dpl_rt_service/devices.d/NAME.conf.template /etc/dpl_rt_service/devices.d/1000.conf
sudo vim /etc/dpl_rt_service/devices.d/1000.conf

# Launch the Docker container:
sudo ./scripts/dpl_rt_service_ctl.sh --start


For additional troubleshooting steps and deeper explanations, refer to BlueField Container Deployment Guide.

Checkpoint commands:

  • View recent kubelet logs (DPU only):

    sudo journalctl -u kubelet --since -5m

  • View logs of the dpl-rt-service container (helpful if /var/log/doca/dpl_rt_service/dpl_rtd.log is missing or incomplete):

    For DPU: sudo crictl logs $(sudo crictl ps -a | grep dpl-rt-service | awk '{print $1}')
    For Host: sudo docker logs dpl-rt-service

  • List pulled container images:

    For DPU: sudo crictl images
    For Host: sudo docker images

  • List all created pods (DPU only):

    sudo crictl pods

  • List running containers:

    For DPU: sudo crictl ps
    For Host: sudo docker ps

  • View DPL service logs:

    /var/log/doca/dpl_rt_service/dpl_rtd.log

Make sure the following conditions are met before or during deployment:

  • VFs were created before deploying the container (if using SR-IOV)

  • All required configuration files exist under /etc/dpl_rt_service/, are correctly named, and include valid device IDs

  • Network interface names and MTU settings match the physical and virtual network topology

  • Firmware is up to date and matches DOCA compatibility requirements

  • For DPU-mode deployments, BlueField is operating in DPU mode; verify with sudo mlxconfig -d <pci-device> q

© Copyright 2026, NVIDIA. Last updated on Mar 2, 2026