Date: 2026-03-02

The sample shows how to enable the Accurate Send Scheduling (or wait-on-time) feature in a GPUNetIO application. Accurate Send Scheduling is the ability of an NVIDIA NIC to send packets at a future time according to application-provided timestamps.

Note: This feature is supported on ConnectX-7 and later.

This sample demonstrates how to send packets from the GPU using Accurate Send Scheduling by calling the high-level doca_gpu_dev_eth_txq_wait_send function with a BLOCK execution scope.

Info: This NVIDIA blog post offers an example of how this feature has been used in 5G networks.

Before starting the sample, it is important to properly synchronize the CPU clock with the NIC clock. This way, timestamps provided by the system clock are synchronized with the time in the NIC.

For this purpose, at least the phc2sys service must be used. To install it on an Ubuntu system:

    sudo apt install linuxptp

To start the phc2sys service properly, a service unit file must be created at /lib/systemd/system/phc2sys.service. Assuming the network interface is ens6f0:

    [Unit]
    Description=Synchronize system clock or PTP hardware clock (PHC)
    Documentation=man:phc2sys

    [Service]
    Restart=always
    RestartSec=5s
    Type=simple
    ExecStart=/bin/sh -c "taskset -c 15 /usr/sbin/phc2sys -s /dev/ptp$(ethtool -T ens6f0 | grep PTP | awk '{print $4}') -c CLOCK_REALTIME -n 24 -O 0 -R 256 -u 256"

    [Install]
    WantedBy=multi-user.target
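The ExecStart line above derives the PTP device path from the output of ethtool -T by picking the fourth field of the line containing "PTP". The same lookup can be sketched in Python to check which device would be selected before enabling the service. This is a minimal sketch: the function name and the abridged sample output are illustrative, and it assumes ethtool reports the clock as a line of the form "PTP Hardware Clock: N".

```python
import re

# Abridged, illustrative `ethtool -T` output; the clock index is an assumed value.
SAMPLE_ETHTOOL_OUTPUT = """\
Time stamping parameters for ens6f0:
PTP Hardware Clock: 3
"""


def ptp_device_from_ethtool(output):
    """Return the /dev/ptpN path named in `ethtool -T` output, or None.

    Hypothetical helper mirroring `grep PTP | awk '{print $4}'`.
    """
    match = re.search(r"^PTP Hardware Clock:\s*(\d+)\s*$", output, re.MULTILINE)
    return f"/dev/ptp{match.group(1)}" if match else None


print(ptp_device_from_ethtool(SAMPLE_ETHTOOL_OUTPUT))
```

If the function returns None, the interface likely does not expose a PTP hardware clock, and the phc2sys command in the unit file would be given an invalid device path.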

The phc2sys service can now be started:

    sudo systemctl stop systemd-timesyncd
    sudo systemctl disable systemd-timesyncd
    sudo systemctl daemon-reload
    sudo systemctl start phc2sys.service

To check the status of phc2sys:

    $ sudo systemctl status phc2sys.service

Output:

    ● phc2sys.service - Synchronize system clock or PTP hardware clock (PHC)
         Loaded: loaded (/lib/systemd/system/phc2sys.service; disabled; vendor preset: enabled)
         Active: active (running) since Mon 2023-04-03 10:59:13 UTC; 2 days ago
           Docs: man:phc2sys
       Main PID: 337824 (sh)
          Tasks: 2 (limit: 303788)
         Memory: 560.0K
            CPU: 52min 8.199s
         CGroup: /system.slice/phc2sys.service
                 ├─337824 /bin/sh -c "taskset -c 15 /usr/sbin/phc2sys -s /dev/ptp\$( ethtool -T enp23s0f1np1 | grep PTP | awk '{print \$4}' ) -c CLOCK_REALTIME -n 24 -O 0 -R >
                 └─337829 /usr/sbin/phc2sys -s /dev/ptp3 -c CLOCK_REALTIME -n 24 -O 0 -R 256 -u 256

    Apr 05 16:35:52 doca-vr-045 phc2sys[337829]: [457395.040] CLOCK_REALTIME rms 8 max 18 freq +110532 +/- 27 delay 770 +/- 3
    Apr 05 16:35:53 doca-vr-045 phc2sys[337829]: [457396.071] CLOCK_REALTIME rms 8 max 20 freq +110513 +/- 30 delay 769 +/- 3
    Apr 05 16:35:54 doca-vr-045 phc2sys[337829]: [457397.102] CLOCK_REALTIME rms 8 max 18 freq +110527 +/- 30 delay 769 +/- 3
    Apr 05 16:35:55 doca-vr-045 phc2sys[337829]: [457398.130] CLOCK_REALTIME rms 8 max 18 freq +110517 +/- 31 delay 769 +/- 3
    Apr 05 16:35:56 doca-vr-045 phc2sys[337829]: [457399.159] CLOCK_REALTIME rms 8 max 19 freq +110523 +/- 32 delay 770 +/- 3
    Apr 05 16:35:57 doca-vr-045 phc2sys[337829]: [457400.191] CLOCK_REALTIME rms 8 max 20 freq +110528 +/- 33 delay 770 +/- 3
    Apr 05 16:35:58 doca-vr-045 phc2sys[337829]: [457401.221] CLOCK_REALTIME rms 8 max 19 freq +110512 +/- 38 delay 770 +/- 3
    Apr 05 16:35:59 doca-vr-045 phc2sys[337829]: [457402.253] CLOCK_REALTIME rms 9 max 20 freq +110538 +/- 47 delay 770 +/- 4
    Apr 05 16:36:00 doca-vr-045 phc2sys[337829]: [457403.281] CLOCK_REALTIME rms 8 max 21 freq +110517 +/- 38 delay 769 +/- 3
    Apr 05 16:36:01 doca-vr-045 phc2sys[337829]: [457404.311] CLOCK_REALTIME rms 8 max 17 freq +110526 +/- 26 delay 769 +/- 3
    ...
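The rms and max fields in the phc2sys summary lines above report the clock offset, which the linuxptp documentation gives in nanoseconds. A short Python sketch can scan such lines and flag a poorly synchronized clock. The function names and the 100 ns threshold are hypothetical choices for illustration, not values recommended by the sample.

```python
import re

# Matches the summary lines phc2sys logs for CLOCK_REALTIME, e.g.
# "... CLOCK_REALTIME rms 8 max 18 freq +110532 +/- 27 delay 770 +/- 3"
SUMMARY_RE = re.compile(r"CLOCK_REALTIME rms\s+(\d+) max\s+(\d+)")

# Three lines copied from the status output shown in this section.
SAMPLE_LOG = [
    "Apr 05 16:35:52 doca-vr-045 phc2sys[337829]: [457395.040] CLOCK_REALTIME rms 8 max 18 freq +110532 +/- 27 delay 770 +/- 3",
    "Apr 05 16:35:59 doca-vr-045 phc2sys[337829]: [457402.253] CLOCK_REALTIME rms 9 max 20 freq +110538 +/- 47 delay 770 +/- 4",
    "Apr 05 16:36:00 doca-vr-045 phc2sys[337829]: [457403.281] CLOCK_REALTIME rms 8 max 21 freq +110517 +/- 38 delay 769 +/- 3",
]


def worst_offsets(log_lines):
    """Return (largest rms, largest max) over all summary lines, or None."""
    samples = []
    for line in log_lines:
        match = SUMMARY_RE.search(line)
        if match:
            samples.append((int(match.group(1)), int(match.group(2))))
    if not samples:
        return None
    return max(s[0] for s in samples), max(s[1] for s in samples)


def is_synchronized(log_lines, max_offset_ns=100):
    """True if every summary line stays within max_offset_ns (assumed threshold)."""
    worst = worst_offsets(log_lines)
    return worst is not None and worst[1] <= max_offset_ns


print(worst_offsets(SAMPLE_LOG), is_synchronized(SAMPLE_LOG))
```

In practice the input lines could come from journalctl -u phc2sys.service; an acceptable threshold depends on how precise the send schedule needs to be.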

At this point, the system and NIC clocks are synchronized so timestamps provided by the CPU are correctly interpreted by the NIC.

Warning: The timestamps you get may not reflect the real time and date. To get that, you must properly configure the ptp4l service with an external grandmaster clock on the system. Doing so is outside the scope of this sample.





To build a given sample, run the following command. If you downloaded the sample from GitHub, update the path in the first line to reflect the location of the sample file:

    cd /opt/mellanox/doca/samples/doca_gpunetio/gpunetio_send_wait_time
    meson build
    ninja -C build

The sample sends 8 bursts of 32 raw Ethernet packets of 1kB each to a dummy Ethernet address, 10:11:12:13:14:15, in a timed way. The NIC is programmed to send one burst every t nanoseconds (command-line option -t).
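To make the timing arithmetic concrete, the following Python sketch computes one send timestamp per burst. It assumes bursts are spaced exactly t nanoseconds apart from a chosen base time; the helper names and the one-second start delay are hypothetical, and the actual sample may derive its timestamps differently.

```python
import time

NUM_BURSTS = 8          # bursts sent by the sample
PACKETS_PER_BURST = 32  # packets in each burst


def burst_timestamps(start_ns, interval_ns, num_bursts=NUM_BURSTS):
    """Return one absolute send time (in ns) per burst, spaced interval_ns apart."""
    return [start_ns + i * interval_ns for i in range(num_bursts)]


def total_packets(num_bursts=NUM_BURSTS, packets_per_burst=PACKETS_PER_BURST):
    """Total number of packets sent across all bursts."""
    return num_bursts * packets_per_burst


# time.time_ns() reads the system wall clock, which phc2sys keeps aligned
# with the NIC clock. The 1 s start delay is an assumed safety margin.
start_ns = time.time_ns() + 1_000_000_000
schedule = burst_timestamps(start_ns, interval_ns=5_000_000)  # -t 5000000
print(total_packets(), schedule[-1] - schedule[0])
```

With -t 5000000, the last burst is scheduled 35 milliseconds after the first, and the 256 packets in total match the count reported in the sample log.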

The following example programs a system with GPU PCIe address ca:00.0 and NIC PCIe address 17:00.0 to send 32 packets every 5 milliseconds:

    $ sudo ./build/doca_gpunetio_send_wait_time -n 17:00.0 -g ca:00.0 -t 5000000
    [09:22:54:165778][1316878][DOCA][INF][gpunetio_send_wait_time_main.c:195][main] Starting the sample
    [09:22:54:438260][1316878][DOCA][INF][gpunetio_send_wait_time_main.c:224][main] Sample configuration:
        GPU ca:00.0
        NIC 17:00.0
        Timeout 5000000ns
    EAL: Detected CPU lcores: 128
    ...
    EAL: Probe PCI driver: mlx5_pci (15b3:a2d6) device: 0000:17:00.0 (socket 0)
    [09:22:54:819996][1316878][DOCA][INF][gpunetio_send_wait_time_sample.c:607][gpunetio_send_wait_time] Wait on time supported mode: DPDK
    EAL: Probe PCI driver: gpu_cuda (10de:20b5) device: 0000:ca:00.0 (socket 1)
    [09:22:54:830212][1316878][DOCA][INF][gpunetio_send_wait_time_sample.c:252][create_tx_buf] Mapping send queue buffer (0x0x7f48e32a0000 size 262144B) with legacy nvidia-peermem mode
    [09:22:54:832462][1316878][DOCA][INF][gpunetio_send_wait_time_sample.c:657][gpunetio_send_wait_time] Launching CUDA kernel to send packets
    [09:22:54:842945][1316878][DOCA][INF][gpunetio_send_wait_time_sample.c:664][gpunetio_send_wait_time] Waiting 10 sec for 256 packets to be sent
    [09:23:04:883309][1316878][DOCA][INF][gpunetio_send_wait_time_sample.c:684][gpunetio_send_wait_time] Sample finished successfully
    [09:23:04:883339][1316878][DOCA][INF][gpunetio_send_wait_time_main.c:239][main] Sample finished successfully

To verify that packets are actually sent at the right time, use a packet sniffer on the other side (e.g., tcpdump):

    $ sudo tcpdump -i enp23s0f1np1 -A -s 64
    17:12:23.480318 IP5 (invalid)
    Sent from DOCA GPUNetIO...........................
    ....
    17:12:23.480368 IP5 (invalid)
    Sent from DOCA GPUNetIO...........................
    17:12:23.485321 IP5 (invalid)
    Sent from DOCA GPUNetIO...........................
    ...
    17:12:23.485369 IP5 (invalid)
    Sent from DOCA GPUNetIO...........................
    17:12:23.490278 IP5 (invalid)
    Sent from DOCA GPUNetIO...........................
    ...

The output should show a jump of approximately 5 milliseconds every 32 packets.

Note: tcpdump may add latency when capturing packets and reporting the receive timestamp, so the reported gap between bursts of 32 packets may be smaller than expected, especially with short intervals such as 500 microseconds (-t 500000).
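The gap between bursts can also be checked programmatically rather than by eye. The sketch below parses the timestamps tcpdump prints at the start of each packet line and reports the time between the first packets of consecutive bursts. The function names are hypothetical, the parser assumes the default HH:MM:SS.microseconds timestamp format, and it does not handle captures that cross midnight.

```python
import re
from datetime import datetime

# Default tcpdump timestamp at the start of a packet line, e.g. "17:12:23.480318".
TS_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6})\s")

# Timestamped lines from the abridged capture above: the first and last packet
# of the first two bursts, then the first packet of the third burst.
SAMPLE_CAPTURE = [
    "17:12:23.480318 IP5 (invalid)",
    "Sent from DOCA GPUNetIO...........................",
    "17:12:23.480368 IP5 (invalid)",
    "17:12:23.485321 IP5 (invalid)",
    "17:12:23.485369 IP5 (invalid)",
    "17:12:23.490278 IP5 (invalid)",
]


def packet_times_us(lines):
    """Extract capture timestamps as microseconds since midnight."""
    times = []
    for line in lines:
        match = TS_RE.match(line)
        if match:
            t = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            seconds = (t.hour * 60 + t.minute) * 60 + t.second
            times.append(seconds * 1_000_000 + t.microsecond)
    return times


def burst_start_gaps_us(times_us, packets_per_burst=32):
    """Return the gaps (in us) between the first packets of consecutive bursts."""
    starts = times_us[::packets_per_burst]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


# The abridged capture keeps only two packets per burst, hence packets_per_burst=2.
print(burst_start_gaps_us(packet_times_us(SAMPLE_CAPTURE), packets_per_burst=2))
```

On a complete capture, leave packets_per_burst at 32 and compare each gap against the -t value converted to microseconds, allowing for the sniffer latency described in the note above.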



