Date: 2025-11-20
DPL Container Deployment
Setting BlueField to DPU Mode
BlueField must run in DPU mode to use the DPL Runtime Service. For details on how to change modes, see: BlueField Modes of Operation.
Determining Your BlueField Variant
Your BlueField may be installed in a host server or it may be a standalone server.
If your BlueField is a standalone server, ignore the parts that mention the host server or SR-IOV. You may still use Scalable Functions (SFs) if your BlueField is a standalone server.
Setting Up DPU Management Access and Updating BlueField-Bundle
These pages provide detailed information about DPU management access, software installation, and updates:
Systems with a host server typically use RShim (i.e., the tmfifo_net0 interface). Standalone systems must use the OOB interface option for management access.
Creating SR-IOV Virtual Functions (Host Server)
To use SR-IOV, first create Virtual Functions (VFs) on the host server:
sudo -s    # enter sudo shell
echo 4 > /sys/class/net/eth2/device/sriov_numvfs
exit       # exit sudo shell
Entering a sudo shell is necessary because sudo applies only to the echo command and not to the redirection (>). Prefixing the single command with sudo would therefore fail with "Permission denied."
This example creates 4 VFs under Physical Function (PF) eth2. Adjust the number as needed.
If a PF already has VFs and you'd like to change the number, first set it to 0 before applying the new value.
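The reset-to-zero rule above can be wrapped in a small helper. This is a minimal sketch, not part of the BlueField tooling: the function name set_num_vfs is hypothetical, and it takes the sriov_numvfs file path as an argument so it can be tried against an ordinary file before touching sysfs.

```shell
#!/bin/sh
# Hypothetical helper (not part of the BlueField tooling): set the VF count,
# clearing an existing non-zero count first.
# Usage: set_num_vfs <path-to-sriov_numvfs> <count>
set_num_vfs() {
    numvfs_file="$1"
    wanted="$2"
    current=$(cat "$numvfs_file") || return 1

    # Nothing to do if the requested count is already applied.
    [ "$current" = "$wanted" ] && return 0

    # An existing non-zero count must be reset to 0 before a new value is written.
    if [ "$current" != "0" ]; then
        echo 0 > "$numvfs_file" || return 1
    fi
    if [ "$wanted" != "0" ]; then
        echo "$wanted" > "$numvfs_file" || return 1
    fi
}
```

Run it from the sudo shell described above, for example: set_num_vfs /sys/class/net/eth2/device/sriov_numvfs 4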
Creating Scalable Functions (Optional)
This step is optional and depends on your DPL program and setup needs.
For more information, see the BlueField Scalable Function User Guide.
If you create SFs, refer to their representors in the configuration file.
Enabling Multiport eSwitch Mode (Optional)
This step is optional and depends on your DPL program and setup needs.
Multiport eSwitch mode allows traffic forwarding between multiple physical ports and their VFs/SFs (e.g., between p0 and p1).
Before enabling this mode:
Ensure LAG_RESOURCE_ALLOCATION is enabled in firmware:
sudo mlxconfig -d 0000:03:00.0 s LAG_RESOURCE_ALLOCATION=1
Info: Refer to the Using mlxconfig guide for more information.
After reboot or firmware reset, enable esw_multiport mode:
sudo devlink dev param set pci/0000:03:00.0 name esw_multiport value 1 cmode runtime
Note: devlink settings are not persistent across reboots.
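Because the two commands above share a PCI address and must run in a fixed order, a dry-run helper can make the sequence explicit for your own device. The sketch below only prints the commands from this section with the address substituted; the function name print_esw_multiport_steps is hypothetical.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print, in order, the commands from this section
# for one PCI device. It only prints; it does not change any setting.
# Usage: print_esw_multiport_steps <pci-address>   (e.g. 0000:03:00.0)
print_esw_multiport_steps() {
    pci="$1"
    # Accept only the domain:bus:device.function form, e.g. 0000:03:00.0.
    if ! printf '%s\n' "$pci" | grep -Eq '^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'; then
        echo "invalid PCI address: $pci" >&2
        return 1
    fi
    echo "sudo mlxconfig -d $pci s LAG_RESOURCE_ALLOCATION=1"
    echo "# reboot or reset firmware here, then:"
    echo "sudo devlink dev param set pci/$pci name esw_multiport value 1 cmode runtime"
}
```

Since devlink settings do not survive a reboot, the last printed command has to be reissued after every restart of the DPU.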
Downloading Container Resources from NGC
Start by downloading and installing the ngc-cli tools.
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.58.0/files/ngccli_arm64.zip -O ngccli_arm64.zip
unzip ngccli_arm64.zip
./ngc-cli/ngc registry resource download-version "nvidia/doca/dpl_rt_service"
This creates a directory such as dpl_rt_service_v1.2.0-doca3.1.0.
You can find available versions at NGC Catalog.
Each release includes a kubelet.d YAML file pointing to the correct container image for automatic download.
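The exact directory name depends on the release that was downloaded. As a minimal sketch, assuming the directory follows the dpl_rt_service_v* pattern shown above and that GNU sort (with its -V version-sort option) is available, the hypothetical helper below prints the highest-versioned bundle directory.

```shell
#!/bin/sh
# Hypothetical helper: print the highest-versioned dpl_rt_service_v* directory
# under a parent directory (default: the current directory).
# Relies on the -V (version sort) option of GNU sort.
latest_dpl_bundle() {
    parent="${1:-.}"
    found=""
    for d in "$parent"/dpl_rt_service_v*/; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$d" ] || continue
        found="$found${d%/}
"
    done
    [ -n "$found" ] || return 1
    printf '%s' "$found" | sort -V | tail -n 1
}
```

For example, cd "$(latest_dpl_bundle)" enters the newest bundle in the current directory.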
Running the Preparation Script
Run the script to configure the DPU for DPL use:
cd dpl_rt_service_va.b.c-docax.y.z
chmod +x ./scripts/dpl_dpu_setup.sh
sudo ./scripts/dpl_dpu_setup.sh
sudo systemctl restart kubelet.service
sudo systemctl restart containerd.service
Restarting kubelet and containerd is required whenever the hugepages configuration changes, so that the changes take effect.
This script:
Configures mlxconfig values:
FLEX_PARSER_PROFILE_ENABLE=4
PROG_PARSE_GRAPH=true
SRIOV_EN=1
Enables SR-IOV
Sets up /etc/dpl_rt_service/
Configures hugepages
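To confirm that the hugepages step had an effect, the kernel's counters can be read from /proc/meminfo. This is a sketch under an assumption: it only checks that HugePages_Total is non-zero, because the number of pages the script reserves is not stated here. Both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the HugePages_Total value from a meminfo-style
# file (default: /proc/meminfo).
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: succeed only if at least one hugepage is reserved.
hugepages_configured() {
    total=$(hugepages_total "$1")
    [ -n "$total" ] && [ "$total" -gt 0 ]
}
```

For example, hugepages_configured || echo "no hugepages reserved" reports a missing reservation on the running system.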
Editing the Configuration Files
You must create at least one device configuration file. For example:
sudo cp /etc/dpl_rt_service/devices.d/NAME.conf.template /etc/dpl_rt_service/devices.d/1000.conf
Then edit /etc/dpl_rt_service/devices.d/1000.conf as needed.
See DPL Service Configuration for details.
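A plain cp silently overwrites an already-edited configuration file. The sketch below is a guarded variant of the copy step; the function name new_device_conf is hypothetical, while the template file name is the one used above.

```shell
#!/bin/sh
# Hypothetical helper: create <devices_dir>/<name>.conf from NAME.conf.template,
# refusing to overwrite an existing file.
# Usage: new_device_conf <devices_dir> <name>
new_device_conf() {
    devices_dir="$1"
    name="$2"
    template="$devices_dir/NAME.conf.template"
    target="$devices_dir/$name.conf"

    if [ ! -f "$template" ]; then
        echo "missing template: $template" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "already exists, not overwriting: $target" >&2
        return 1
    fi
    cp "$template" "$target"
}
```

Run as root, new_device_conf /etc/dpl_rt_service/devices.d 1000 reproduces the copy shown above.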
Starting the DPL Runtime Service Pod
Once your configuration files are ready, copy the file configs/dpl_rt_service.yaml from the directory that you pulled with the ngc-cli into /etc/kubelet.d:
sudo cp ./configs/dpl_rt_service.yaml /etc/kubelet.d/
Allow a few minutes for the pod to start.
To monitor status:
Check logs:
sudo journalctl -u kubelet --since -5m
List images:
sudo crictl images
List pods:
sudo crictl pods
View runtime logs:
/var/log/doca/dpl_rt_service/dpl_rtd.log
Note: If the container fails to start due to configuration errors, view logs with:
sudo crictl logs $(sudo crictl ps -a | grep dpl-rt-service | awk '{print $1}')
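Instead of re-running the list command by hand while the pod starts, a polling loop can wait for it. This is a minimal sketch: wait_until and dpl_pod_listed are hypothetical names, the pod is matched by the dpl-rt-service string used in the grep above, and any timeout value is your own choice.

```shell
#!/bin/sh
# Hypothetical helper: re-run a command until it succeeds or a timeout expires.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() {
    timeout="$1"
    interval="$2"
    shift 2
    elapsed=0
    while ! "$@"; do
        # Give up once the allotted time has been used.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Succeeds once crictl lists a pod whose line contains dpl-rt-service.
dpl_pod_listed() {
    sudo crictl pods | grep -q dpl-rt-service
}
```

For example, wait_until 300 5 dpl_pod_listed checks every 5 seconds for up to 5 minutes. A listed pod is not necessarily healthy, so the logs above remain the place to confirm a clean start.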
Restarting the Pod After Configuration Changes
Remove the YAML file:
sudo rm -fv /etc/kubelet.d/dpl_rt_service.yaml
Wait for the pod to stop.
Re-copy the YAML file to restart:
sudo cp ./configs/dpl_rt_service.yaml /etc/kubelet.d/
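The three restart steps can be combined into one function. This is a sketch, not a supported tool: restart_static_pod is a hypothetical name, the caller supplies the command that tells whether the pod is still listed, and the limit of 60 checks at 2-second intervals is an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper: restart a kubelet static pod by removing its manifest,
# waiting for the pod to disappear, then copying the manifest back.
# Usage: restart_static_pod <source_yaml> <manifest_dir> <pod-still-listed-command> [args...]
restart_static_pod() {
    src="$1"
    manifest_dir="$2"
    shift 2
    rm -fv "$manifest_dir/$(basename "$src")" || return 1

    # Poll while the "pod still listed" command keeps succeeding.
    # 60 tries x 2 s is an assumed limit, not a documented value.
    tries=0
    while "$@"; do
        tries=$((tries + 1))
        [ "$tries" -ge 60 ] && return 1
        sleep 2
    done

    cp "$src" "$manifest_dir/"
}
```

From a root shell, one possible call is: restart_static_pod ./configs/dpl_rt_service.yaml /etc/kubelet.d sh -c 'crictl pods | grep -q dpl-rt-service'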
End-to-End Installation Steps
# Download NGC CLI and container bundle
wget --content-disposition https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.58.0/files/ngccli_arm64.zip -O ngccli_arm64.zip
unzip ngccli_arm64.zip
./ngc-cli/ngc registry resource download-version "nvidia/doca/dpl_rt_service"
# Prepare DPU and restart services
cd dpl_rt_service_va.b.c-docax.y.z
chmod +x ./scripts/dpl_dpu_setup.sh
sudo ./scripts/dpl_dpu_setup.sh
sudo systemctl restart kubelet.service
sudo systemctl restart containerd.service
# Configure the service
sudo cp /etc/dpl_rt_service/devices.d/NAME.conf.template /etc/dpl_rt_service/devices.d/1000.conf
# Edit the file above
# Launch the pod
sudo cp ./configs/dpl_rt_service.yaml /etc/kubelet.d/
Replace device IDs and filenames as appropriate for your setup.
Stopping the DPL Runtime Service kubelet Pod
Stop the pod by removing its kubelet YAML file:
sudo /bin/rm -fv /etc/kubelet.d/dpl_rt_service.yaml
Then confirm the pod is gone:
sudo crictl pods | grep dpl-rt-service
For additional troubleshooting steps and deeper explanations, refer to BlueField Container Deployment Guide.
Checkpoint
Command
View recent kubelet logs
View logs of the
Helpful if
List pulled container images
List all created pods
List running containers
View DPL service logs
Make sure the following conditions are met before or during deployment:
VFs were created before deploying the container (if using SR-IOV)
All required configuration files exist under /etc/dpl_rt_service/, are correctly named, and include valid device IDs
Network interface names and MTU settings match the physical and virtual network topology
Firmware is up to date and matches DOCA compatibility requirements
BlueField is operating in the correct mode (DPU mode); verify this with sudo mlxconfig -d <pci-device> q