DOCA Firefly Service Guide

This guide provides instructions on how to use the DOCA Firefly service container on top of NVIDIA® BlueField® DPU.

Introduction

DOCA Firefly Service provides precision time protocol (PTP) based time syncing services to the BlueField DPU.

PTP is a protocol used to synchronize clocks in a network. When used in conjunction with hardware support, PTP is capable of sub-microsecond accuracy, which is far better than what is normally obtainable with network time protocol (NTP). PTP support is divided between the kernel and user space. The ptp4l program implements the PTP boundary clock and ordinary clock. With hardware time stamping, it is used to synchronize the PTP hardware clock to the master clock.

[Figure: DOCA Firefly architecture diagram]

Requirements

Some of the features provided by Firefly require specific BlueField DPU hardware capabilities:

PTP – Supported by all BlueField DPUs
PPS – Requires BlueField DPU with PPS capabilities
SyncE – Requires converged card BlueField DPUs

Failure to run PPS due to missing hardware support will be noted in the service's output. However, the service will continue to run the timing services it can provide on the available hardware.

Firmware Version

Firmware version must be 24.34.1002 or higher.

BlueField BSP Version

Supported BlueField image versions are 3.9.0 and higher.

Embedded Mode

Configuring Firmware Settings on DPU for Embedded Mode

Set the DPU to embedded mode (default mode):

            sudo mlxconfig -y -d 03:00.0 s INTERNAL_CPU_MODEL=1

Enable the real time clock (RTC):

            sudo mlxconfig -d 03:00.0 set REAL_TIME_CLOCK_ENABLE=1

Perform a graceful shutdown and power cycle the DPU to apply the configuration.

You may check the DPU mode using the following command:

            sudo mlxconfig -d 03:00.0 q | grep INTERNAL_CPU_MODEL
# Example output
         INTERNAL_CPU_MODEL                  EMBEDDED_CPU(1)
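
The mode check above can also be scripted. The following is a minimal sketch, assuming the query output keeps the two-column layout of the example; the helper name cpu_model_from_query is hypothetical and not part of DOCA or mlxconfig.

```shell
# Hypothetical helper (not part of DOCA): reads `mlxconfig ... q` output on
# stdin and prints the value column of the INTERNAL_CPU_MODEL row.
cpu_model_from_query() {
    awk '$1 == "INTERNAL_CPU_MODEL" { print $2 }'
}

# Sample line copied from the example output above.
printf '%s\n' '         INTERNAL_CPU_MODEL                  EMBEDDED_CPU(1)' | cpu_model_from_query

# On a DPU, the same filter can replace grep:
#   sudo mlxconfig -d 03:00.0 q | cpu_model_from_query
```

Matching on the first column rather than on a substring avoids picking up unrelated rows that merely mention the parameter name.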

Ensuring OVS Hardware Offload

DOCA Firefly requires that hardware offload is activated in Open vSwitch (OVS). This is enabled by default as part of the BFB image installed on the DPU.

To verify the hardware offload configuration in OVS:

            sudo ovs-vsctl get Open_vSwitch . other_config | grep hw-offload
# Example output
         {hw-offload="true"}

If inactive:

Activate hardware offloading by running:

            sudo ovs-vsctl set Open_vSwitch . other_config:hw-offload=true

Restart the OVS service:

            sudo /etc/init.d/openvswitch-switch restart

Perform a graceful shutdown and power cycle the DPU to apply the configuration.
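
The verification step can be wrapped in a small check that decides whether the activation steps are needed. This is a sketch only: the helper name hw_offload_enabled is hypothetical, and it assumes the other_config value is printed in the format shown in the example output.

```shell
# Hypothetical helper (not part of OVS or DOCA): succeeds only when the
# other_config text passed as $1 contains hw-offload="true".
hw_offload_enabled() {
    case "$1" in
        *'hw-offload="true"'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample value copied from the example output above.
if hw_offload_enabled '{hw-offload="true"}'; then
    echo "hardware offload is active"
else
    echo "hardware offload is inactive"
fi

# On a DPU, pass the live value instead of the sample:
#   hw_offload_enabled "$(sudo ovs-vsctl get Open_vSwitch . other_config)"
```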

Helper Scripts

Firefly's deployment contains two scripts that help with the configuration steps required for the network interface in embedded mode:

scripts/doca_firefly/<firefly-version>/prepare_for_embedded_mode.sh
scripts/doca_firefly/<firefly-version>/set_new_sf.sh

The latest DOCA Firefly version is 1.7.0.

Both scripts are included as part of DOCA's container resource which can be downloaded according to the instructions in the NVIDIA DOCA Container Deployment Guide. For more information about the structure of the DOCA container resource, refer to section "Structure of NGC Resource" in the deployment guide.

Note

Due to technical limitations of the NGC resource, both scripts are provided without execute (+x) permissions. This can be resolved by running the following command:

            chmod +x scripts/doca_firefly/<firefly-version>/*.sh

prepare_for_embedded_mode.sh

This script automates all the steps mentioned in section "Setting Up Network Interfaces for DPU Mode" and configures a freshly installed BFB image to the settings required by DOCA Firefly.

Notes:

The script deletes all previous OVS settings and creates a single OVS bridge that matches the definitions in section "Setting Up Network Interfaces for DPU Mode"
The script should only be run once when connecting to the DPU for the first time or after a power cycle
The only manual step required after using this script is configuring the IP address for the created network interface (step 5 in section "Setting Up Network Interfaces for DPU Mode")
The script automatically uses port 0 (p0). Configurations for port 1 should be done manually based on the commands listed in sections "set_new_sf.sh" and "Setting Up Network Interfaces for DPU Mode".

Script arguments:

SF number (checks if already exists)

Examples:

Prepare OVS settings using an SF indexed 4:

            chmod +x ./*.sh
./prepare_for_embedded_mode.sh 4

The script makes use of set_new_sf.sh as a helper script.

set_new_sf.sh

Creates a new SF and marks it as "trusted".

Script arguments:

PCIe address
SF number (checks if already exists)
MAC address (if absent, a random address is generated)

Examples:

Create SF with number "4" over port 0 of the DPU:

            ./set_new_sf.sh 0000:03:00.0 4

Create SF with number "5" over port 0 of the DPU and a specific MAC address:

            ./set_new_sf.sh 0000:03:00.0 5 aa:bb:cc:dd:ee:ff

Create SF with number "4" over port 1 of the DPU:

            ./set_new_sf.sh 0000:03:00.1 4

The first two examples should work out of the box for a BlueField-2 device and create SF4 and SF5 respectively.

Setting Up Network Interfaces for DPU Mode

Create a trusted SF to be used by the service according to the Scalable Function Setup Guide.

Note

The following instructions assume that the SF has been created using index 4.

Create the required OVS settings as shown in the architecture diagram:

            $ sudo ovs-vsctl add-br uplink
$ sudo ovs-vsctl add-port uplink p0
$ sudo ovs-vsctl add-port uplink en3f0pf0sf4
# This port is needed to ensure we have traffic host<->network as well
$ sudo ovs-vsctl add-port uplink pf0hpf

Verify the OVS settings:

            sudo ovs-vsctl show
    Bridge uplink
        Port pf0hpf
            Interface pf0hpf
        Port en3f0pf0sf4
            Interface en3f0pf0sf4
        Port p0
            Interface p0
        Port uplink
            Interface uplink
                type: internal

Enable TX timestamping on the SF interface (not the representor):

            # tx port timestamp offloading
sudo ethtool --set-priv-flags enp3s0f0s4 tx_port_ts on

Enable the interface and set an IP address for it:

            # configure ip for the interface:
sudo ifconfig enp3s0f0s4 <ip-addr> up

Configure OVS to support TX timestamping over this SF and multicast traffic in general:

            # Multicast-related definitions
$ sudo ovs-vsctl set Bridge uplink mcast_snooping_enable=true
$ sudo ovs-vsctl set Bridge uplink other_config:mcast-snooping-disable-flood-unregistered=true
$ sudo ovs-vsctl set Port p0 other_config:mcast-snooping-flood=true
$ sudo ovs-vsctl set Port p0 other_config:mcast-snooping-flood-reports=true
# PTP-related definitions
$ sudo ovs-ofctl add-flow uplink in_port=en3f0pf0sf4,udp,tp_src=319,actions=output:p0
$ sudo ovs-ofctl add-flow uplink in_port=p0,udp,tp_src=319,actions=output:en3f0pf0sf4
$ sudo ovs-ofctl add-flow uplink in_port=en3f0pf0sf4,udp,tp_src=320,actions=output:p0
$ sudo ovs-ofctl add-flow uplink in_port=p0,udp,tp_src=320,actions=output:en3f0pf0sf4

Note

If your OVS bridge uses a name other than uplink, make sure that the used name is reflected in the ovs-vsctl and ovs-ofctl commands. For instance:

            $ sudo ovs-vsctl set Bridge <bridge-name> mcast_snooping_enable=true
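
To avoid editing each command by hand when the bridge or SF differs, the four PTP flow rules can be generated from parameters. The sketch below only prints the commands so they can be reviewed before running; the helper name print_ptp_flows and the bridge name br-ptp are hypothetical. UDP ports 319 and 320 carry PTP event and general messages respectively.

```shell
# Hypothetical helper: prints (does not run) the four PTP flow commands for
# a given bridge, SF representor, and uplink port.
print_ptp_flows() {
    bridge=$1
    sf_rep=$2
    uplink_port=$3
    for udp_port in 319 320; do   # PTP event (319) and general (320) messages
        echo "sudo ovs-ofctl add-flow $bridge in_port=$sf_rep,udp,tp_src=$udp_port,actions=output:$uplink_port"
        echo "sudo ovs-ofctl add-flow $bridge in_port=$uplink_port,udp,tp_src=$udp_port,actions=output:$sf_rep"
    done
}

# Example with a hypothetical bridge name and the SF representor used above.
print_ptp_flows br-ptp en3f0pf0sf4 p0
```

Once the printed commands look correct, they can be executed by piping the output to sh.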

Separated Mode

Configuring Firmware Settings on DPU for Separated Mode

Set the BlueField mode of operation to "Separated":

            sudo mlxconfig -y -d 03:00.0 s INTERNAL_CPU_MODEL=0

Enable RTC:

            sudo mlxconfig -d 03:00.0 set REAL_TIME_CLOCK_ENABLE=1

Perform a graceful shutdown and power cycle the DPU to apply the configuration.

You may check the BlueField's operation mode using the following command:

            sudo mlxconfig -d 03:00.0 q | grep INTERNAL_CPU_MODEL
# Example output
         INTERNAL_CPU_MODEL                  SEPARATED_HOST(0)

Setting Up Network Interfaces for Separated Mode

Make sure that p0 is not connected to an OVS bridge:

            sudo ovs-vsctl show

Enable TX timestamping on the p0 interface:

            # TX port timestamp offloading (assuming PTP interface is p0)
sudo ethtool --set-priv-flags p0 tx_port_ts on

Enable the interface and set an IP address for it:

            # Configure IP for the interface
sudo ifconfig p0 <ip-addr> up

Host-based Deployment

Host-based deployment requires the same configuration described under section "Separated Mode".

Service Deployment

DPU Deployment

For information about the deployment of DOCA containers on top of the BlueField DPU, refer to NVIDIA DOCA Container Deployment Guide.

Service-specific configuration steps and deployment instructions can be found under the service's container page.

Note

DOCA Firefly can also be deployed on DPUs not connected to the Internet. For instructions, refer to the relevant section in the NVIDIA DOCA Container Deployment Guide.

Host Deployment

DOCA Firefly has a version adapted for host-based deployments. For more information about the deployment of DOCA containers on top of a host, refer to the NVIDIA BlueField DPU Container Deployment Guide.

The following is the docker command for deploying DOCA Firefly on the host:

            sudo docker run --privileged --net=host -v /var/log/doca/firefly:/var/log/firefly -v /etc/firefly:/etc/firefly -e PTP_INTERFACE='eth2' -it nvcr.io/nvidia/doca/doca_firefly:1.7.0-doca3.0.0-host /entrypoint.sh

Where:

Additional YAML settings can be passed to the container as environment variables by adding further -e key-value pairs, as done with PTP_INTERFACE above
The container tag should be the desired tag as listed on DOCA Firefly's NGC page
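
As an illustration of passing several settings, the following sketch assembles the docker command from a list of KEY=VALUE pairs and prints it for review. The helper name firefly_docker_cmd is hypothetical, and the sketch assumes values that contain no spaces or shell metacharacters.

```shell
# Hypothetical helper: prints (does not run) the docker command from above
# with one -e option per KEY=VALUE argument.
firefly_docker_cmd() {
    image=$1
    shift
    env_opts=""
    for pair in "$@"; do
        env_opts="$env_opts -e $pair"
    done
    echo "sudo docker run --privileged --net=host -v /var/log/doca/firefly:/var/log/firefly -v /etc/firefly:/etc/firefly$env_opts -it $image /entrypoint.sh"
}

# Same image tag as above, with one additional YAML setting as an example.
firefly_docker_cmd nvcr.io/nvidia/doca/doca_firefly:1.7.0-doca3.0.0-host \
    PTP_INTERFACE=eth2 PHC2SYS_STATE=disable
```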

Configuration

All modules within the service have configuration files that allow customizing various settings, both general and PTP-related.

Built-In Config File

Each profile has its own base PTP configuration file for ptp4l. For example, the Media profile PTP configuration file is ptp4l-media.conf.

The built-in PTP configuration files can be found in section "PTP Profile Default Config Files". For ease-of-use, those files are provided as part of DOCA's container resource as downloaded from NGC and are placed under Firefly's configs directory (scripts/doca_firefly/<firefly version>/configs).

Note

When using a built-in configuration file, Firefly uses the files as stored within the container itself in the /etc/linuxptp directory. The configuration files included in the NGC resource are only provided for ease of access. Modifying them does not impact the configuration used in practice by the container. Instead, updates to the configuration should be done as described in the following sections.

Custom Config File

Instead of using a profile's base config file, users can create a file of their own for each of the modules.

To set a custom config file, users should place their config file in the directory /etc/firefly and set the config file name in DOCA Firefly's YAML file.

For example, to set a custom linuxptp config file, the user can set the parameter PTP_CONFIG_FILE in the YAML file:

- name: PTP_CONFIG_FILE
  value: my_custom_ptp.conf

In this example, my_custom_ptp.conf should be placed at /etc/firefly/my_custom_ptp.conf.

Note

A config file must not define values for the UDS-related addresses (/var/run/ptp4l and /var/run/ptp4lro), as those will impact internal container behavior. Such settings will prompt a warning and will be ignored when preparing the finalized configuration (see the following sections for details).

Overriding Specific Config File Parameters

Instead of replacing the entire config file, users may opt to override specific parameters. This can be done using the following variable syntax in the YAML file: CONF_<TYPE>_<SECTION>_<PARAMETER_NAME>.

TYPE – either PTP, MONITOR, PHC2SYS, SYNCE, or SERVO
SECTION – the section in the config file that the parameter should be placed in

Note

If the specified section does not already exist in the config file, a new section is created unless it refers to a PTP network interface that has not been included in the PTP_INTERFACE YAML field.

PARAMETER_NAME – the config parameter name as it should appear in the config file

Note

If the parameter name already exists in the config file, then the value is changed according to the value provided in the .yaml file. If the parameter name does not already exist in the config file, then it is added.

For example, the following variable in the YAML file definition changes the value of the parameter priority1 under section global in the PTP config file to 64.

- name: CONF_PTP_global_priority1
  value: "64"

Note

Configuring unicast_master_table through the YAML file is not supported due to the structure of the table (i.e., multiple entries sharing the same key).
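
The naming scheme can be applied mechanically. The sketch below builds the YAML entry from its parts and rejects a TYPE outside the supported list; the helper name conf_override_entry is hypothetical and only produces text, it does not modify any file.

```shell
# Hypothetical helper: prints a YAML environment entry following the
# CONF_<TYPE>_<SECTION>_<PARAMETER_NAME> naming scheme.
conf_override_entry() {
    conf_type=$1
    section=$2
    param=$3
    value=$4
    case "$conf_type" in
        PTP|MONITOR|PHC2SYS|SYNCE|SERVO) ;;
        *) echo "unsupported type: $conf_type" >&2; return 1 ;;
    esac
    printf -- '- name: CONF_%s_%s_%s\n  value: "%s"\n' \
        "$conf_type" "$section" "$param" "$value"
}

# Reproduces the priority1 example above.
conf_override_entry PTP global priority1 64
```

The value is always quoted so that YAML treats it as a string, matching the example entry.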

Ensuring and Debugging Correctness of Config Files

The previous sections describe two layers of configuration file definitions:

Basic configuration file – either a built-in config file or a custom config file
Values added or overridden through the YAML file

In practice, there are slightly more layers in place. Their precedence, listed from lowest to highest, is as follows:

Default configuration values of the PTP program (ptp4l for instance) – holds values of all available configuration options
Your chosen configuration file – contains a subset of options
Definitions from the YAML file – narrower subset
Firefly mandatory values

When combining the supplied configuration file with the definitions from the YAML file, Firefly goes over those definitions and checks them against a predefined set of configuration options:

Warning only – warns if a certain value leads to known issues in a supported deployment scenario
Override – container-internal definitions that should not be set by the user and will be overridden by Firefly

Suitable log messages are provided in either case:

            # Example for a warning
2023-01-31 11:55:13 - Firefly - Config - INFO    - Missing explicit definition "fault_reset_interval", verifying default value instead: "4"
2023-01-31 11:55:13 - Firefly - Config - WARNING - Value "4" for definition "fault_reset_interval" will be invalid in Embedded Mode, expected a value lesser or equal to "1"
2023-01-31 11:55:13 - Firefly - Config - WARNING - Continuing with invalid value
# Example for an override
2023-01-31 11:21:00 - Firefly - Config - WARNING - Invalid value "/var/run/ptp4l2" for definition "uds_address", expected "/var/run/ptp4l"
2023-01-31 11:21:00 - Firefly - Config - INFO    - Setting definition "uds_address" value to the following: "/var/run/ptp4l"

At the end of this process, an updated configuration file is generated by Firefly to be used later by the various time providers. To avoid accidental modification of a user-supplied configuration file or permission issues, the finalized file is generated within the container under the /tmp directory.

For instance, if using a custom configuration file named my_custom_ptp.conf under the /etc/firefly directory on the DPU, the updated file will reside within the container at the following path: /tmp/my_custom_ptp.conf.

For troubleshooting possible issues with the configuration file, one can do one of the following:

Connect to the container directly, as explained in the bullet on debugging the finalized configuration file under "Troubleshooting".

Map the container's /tmp directory to the DPU using the built-in support in the YAML file:

Before the change:

    # Uncomment when debugging the finalized configuration files used - Part #1
    #- name: debug-firefly-volume
    #  hostPath:
    #    path: /tmp/firefly
    #    type: DirectoryOrCreate
  containers:
    ...
      volumeMounts:
      - name: logs-firefly-volume
        mountPath: /var/log/firefly
      - name: conf-firefly-volume
        mountPath: /etc/firefly
      # Uncomment when debugging the finalized configuration files used - Part #2
      #- name: debug-firefly-volume
      #  mountPath: /tmp

After the change:

    # Uncomment when debugging the finalized configuration files used - Part #1
    - name: debug-firefly-volume
      hostPath:
        path: /tmp/firefly
        type: DirectoryOrCreate
  containers:
    ...
      volumeMounts:
      - name: logs-firefly-volume
        mountPath: /var/log/firefly
      - name: conf-firefly-volume
        mountPath: /etc/firefly
      # Uncomment when debugging the finalized configuration files used - Part #2
      - name: debug-firefly-volume
        mountPath: /tmp

Note

The finalized configuration file keeps the sections and config options in the same order as they appear in the original file, yet the file is stripped of blank lines and comment lines. This should be taken into consideration when directly accessing it during a debugging session.

Description

Providers

DOCA Firefly Service uses the following third-party providers to provide time syncing services:

Linuxptp – Version v4.2
- PTP – PTP service, provided by the ptp4l program
- PHC2SYS – OS time calibration, provided by the phc2sys program
Testptp
- PPS – PPS settings service

In addition, DOCA Firefly Service also makes use of the following NVIDIA modules:

SyncE
- SYNCE – Synchronous Ethernet Daemon (synced)
Firefly
- MONITOR – Firefly PTP Monitor
- SERVO – Firefly PTP Servo

Each of the providers can be enabled, disabled, or set to use the setting defined by the configuration profile:

YAML setting – <provider name>_STATE
Supported values – enable, disable, defined_by_profile

Note

For the default profile settings per provider, refer to the table under section "Profiles".

An example YAML setting for specifically disabling the phc2sys provider is the following:

- name: PHC2SYS_STATE
  value: "disable"

Note

The defined_by_profile setting is only available for well-defined profiles. As such, it cannot be used when the custom profile is selected. For more information about the profile settings, refer to the table under section "Profiles".
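
A typo in a state value is easy to miss, so it can help to validate the value before writing it to the YAML file. The following is a minimal sketch; the helper name provider_state_entry is hypothetical, and it does not check whether defined_by_profile is valid for the selected profile.

```shell
# Hypothetical helper: prints the YAML entry for <provider name>_STATE after
# checking the value against the three supported settings.
provider_state_entry() {
    provider=$1
    state=$2
    case "$state" in
        enable|disable|defined_by_profile) ;;
        *) echo "unsupported state: $state" >&2; return 1 ;;
    esac
    printf -- '- name: %s_STATE\n  value: "%s"\n' "$provider" "$state"
}

# Reproduces the phc2sys example above.
provider_state_entry PHC2SYS disable
```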

Profiles

DOCA Firefly Service includes profiles that represent common use cases, each providing a different default configuration:

	Default	Media	Telco (L2)	Custom
Purpose	Any user that requires PTP	Media productions	Telco networks	Custom configuration for a dedicated user scenario
PTP	Enabled	Enabled	Enabled	No default. Enable/disable should be set by the user.
PTP profile	PTP default profile	SMPTE 2059-2	G.8275.1	Set by the user
PTP Client/Server ¹	Both	Client-only	Both	Set by the user
PHC2SYS	Enabled	Enabled	Enabled	No default. Enable/disable should be set by the user.
PPS (in/out)	Enabled	Enabled	Enabled	No default. Enable/disable should be set by the user.
PTP Monitor	Disabled	Disabled	Disabled	No default. Enable/disable should be set by the user.
SyncE	Disabled	Disabled	Enabled	No default. Enable/disable should be set by the user.
Servo	Disabled	Disabled	Disabled	No default. Enable/disable should be set by the user.

¹ Client-only is only relevant to a single PTP interface. If more than one PTP interface is provided in the YAML file, both modes are enabled.

Outputs

Container Output

While running, the full output of the DOCA Firefly Service container can be viewed using the following command:

            sudo crictl logs <CONTAINER-ID>

Where CONTAINER-ID can be retrieved using the following command:

            sudo crictl ps

For example, in the following output, the container ID is 8f368b98d025b.

            $ sudo crictl ps
CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID              POD
8f368b98d025b       289809f312b4c       2 seconds ago       Running             doca-firefly        0                   5af59511b4be4       doca-firefly-some-computer-name

The output of the container depends on the services supported by the hardware, the configuration, and the selected profile. However, every configuration runs PTP, so when DOCA Firefly is running successfully, expect to see the line "Running ptp4l".

The following is an example of the expected container output when running the default profile on a DPU that supports PPS:

2023-09-07 14:04:23 - Firefly - Init    - INFO     - Starting DOCA Firefly - Version 1.4.0
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Selected features:
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] PTP     - Enabled - ptp4l will be used
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] MONITOR - Enabled - PTP Monitor will be used
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] PHC2SYS - Enabled - phc2sys will be used
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [-] SyncE   - Disabled
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [-] SERVO   - Disabled
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] PPS     - Enabled - testptp will be used (if supported by hardware)
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Going to analyze the configuration files
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Requested the following PTP interface: p0
2023-09-07 14:04:23 - Firefly
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Starting PPS configuration
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] PPS is supported by hardware
2023-09-07 14:04:23 - Firefly - Init    - INFO     - set pin function okay
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] PPS in - Activated
2023-09-07 14:04:23 - Firefly - Init    - INFO     - set pin function okay
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] PPS out - Activated
2023-09-07 14:04:23 - Firefly - Init    - INFO     - name mlx5_pps0 index 0 func 1 chan 0
2023-09-07 14:04:23 - Firefly - Init    - INFO     - name mlx5_pps1 index 1 func 2 chan 0
2023-09-07 14:04:23 - Firefly - Init    - INFO     - periodic output request okay
2023-09-07 14:04:23 - Firefly
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Running ptp4l
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Running Firefly PTP Monitor
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Running phc2sys

The following is an example of the expected container output when running the default profile on a DPU that does not support PPS:

            2023-09-07 14:04:23 - Firefly - Init    - INFO     - Starting DOCA Firefly - Version 1.3.0
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Selected features:
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] PTP     - Enabled - ptp4l will be used
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] MONITOR - Enabled - PTP Monitor will be used
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] PHC2SYS - Enabled - phc2sys will be used
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [-] SyncE   - Disabled
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [-] SERVO   - Disabled
2023-09-07 14:04:23 - Firefly - Init    - INFO     - [+] PPS     - Enabled - testptp will be used (if supported by hardware)
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Going to analyze the configuration files
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Requested the following PTP interface: p0
2023-09-07 14:04:23 - Firefly
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Starting PPS configuration
2023-09-07 14:04:23 - Firefly - Init    - WARNING  - [-] PPS capability is missing, seems that the card doesn't support PPS
2023-09-07 14:04:23 - Firefly - Init    - INFO     - capabilities:
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   50000000 maximum frequency adjustment (ppb)
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 programmable alarms
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 external time stamp channels
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 programmable periodic signals
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 pulse per second
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 programmable pins
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 cross timestamping
2023-09-07 14:04:23 - Firefly
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Running ptp4l
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Running Firefly PTP Monitor
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Running phc2sys

Firefly Output

On top of the container's log, Firefly defines an additional, non-volatile log that can be found in /var/log/doca/firefly/firefly.log.

This file contains the same output described in section "Container Output" and is useful for debugging deployment errors should the container stop its execution.

Note

To avoid disk space issues, the /var/log/doca/firefly/firefly.log file only contains the log from Firefly's initialization, and not the logs of the rest of the modules (ptp4l, phc2sys, etc.) or that of the PTP monitor. The latter is still included in the container log and can be inspected using the command sudo crictl logs <CONTAINER-ID>.

ptp4l Output

The ptp4l output can be found in the file /var/log/doca/firefly/ptp4l.log.

Example output:

            ptp4l[192710.691]: rms 1 max 1 freq -114506 +/- 0 delay -15 +/- 0
ptp4l[192712.692]: rms 6 max 9 freq -114501 +/- 3 delay -15 +/- 0
ptp4l[192714.692]: rms 7 max 9 freq -114511 +/- 3 delay -13 +/- 0
ptp4l[192716.692]: rms 5 max 7 freq -114502 +/- 1 delay -13 +/- 0
ptp4l[192718.693]: rms 4 max 6 freq -114509 +/- 2 delay -13 +/- 0
ptp4l[192720.693]: rms 3 max 3 freq -114506 +/- 2 delay -13 +/- 0
ptp4l[192722.694]: rms 4 max 6 freq -114510 +/- 3 delay -12 +/- 0
ptp4l[192724.694]: rms 5 max 7 freq -114510 +/- 5 delay -12 +/- 1
ptp4l[192726.695]: rms 4 max 5 freq -114508 +/- 3 delay -11 +/- 0
ptp4l[192728.695]: rms 6 max 9 freq -114504 +/- 4 delay -11 +/- 0
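
For a quick health check over a longer log, the summary lines can be condensed into a sample count and the largest rms value. This is a sketch that assumes the line layout shown in the example output; the helper name ptp4l_rms_summary is hypothetical.

```shell
# Hypothetical helper: reads ptp4l summary lines on stdin and prints the
# number of samples and the largest "rms" value seen.
ptp4l_rms_summary() {
    awk '$2 == "rms" { n++; if ($3 + 0 > worst) worst = $3 + 0 }
         END { printf "samples=%d worst_rms=%d\n", n, worst }'
}

# Three lines copied from the example output above.
sample='ptp4l[192710.691]: rms 1 max 1 freq -114506 +/- 0 delay -15 +/- 0
ptp4l[192712.692]: rms 6 max 9 freq -114501 +/- 3 delay -15 +/- 0
ptp4l[192714.692]: rms 7 max 9 freq -114511 +/- 3 delay -13 +/- 0'
printf '%s\n' "$sample" | ptp4l_rms_summary

# Against the log file:
#   ptp4l_rms_summary < /var/log/doca/firefly/ptp4l.log
```

Lines that are not summary lines (for example, port state changes) are skipped because their second field is not "rms".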

phc2sys Output

The phc2sys output can be found in the file /var/log/doca/firefly/phc2sys.log.

Example output:

            phc2sys[1873325.928]: reconfiguring after port state change
phc2sys[1873325.928]: selecting CLOCK_REALTIME for synchronization
phc2sys[1873325.928]: selecting enp3s0f0s4 as the master clock
phc2sys[1873325.928]: CLOCK_REALTIME phc offset      1378 s2 freq -165051 delay    255
phc2sys[1873326.928]: CLOCK_REALTIME phc offset      1378 s2 freq -163673 delay    240
phc2sys[1873327.928]: port 62b785.fffe.0c9369-1 changed state
phc2sys[1873327.929]: CLOCK_REALTIME phc offset        14 s2 freq -164624 delay    255
phc2sys[1873328.936]: CLOCK_REALTIME phc offset        89 s2 freq -164545 delay    240
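
The offset column is usually the value of interest in this log. The sketch below extracts it and reports the largest absolute offset, assuming the value directly follows the word "offset" as in the example; the helper name phc2sys_max_abs_offset is hypothetical.

```shell
# Hypothetical helper: reads phc2sys output on stdin and prints the largest
# absolute value that follows the word "offset".
phc2sys_max_abs_offset() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "offset") {
                v = $(i + 1) + 0
                if (v < 0) v = -v
                if (v > worst) worst = v
            }
        }
    }
    END { print worst + 0 }'
}

# Lines copied from the example output above.
sample='phc2sys[1873326.928]: CLOCK_REALTIME phc offset      1378 s2 freq -163673 delay    240
phc2sys[1873327.928]: port 62b785.fffe.0c9369-1 changed state
phc2sys[1873327.929]: CLOCK_REALTIME phc offset        14 s2 freq -164624 delay    255'
printf '%s\n' "$sample" | phc2sys_max_abs_offset
```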

SyncE Output

The SyncE output can be found in the file /var/log/doca/firefly/synced.log.

Example output:

            INFO     [05/09/2023 05:11:01.493414]: SyncE Group #0: is in TRACKING holdover acquired mode on p0, frequency_diff: 0 (ppb)
INFO     [05/09/2023 05:11:02.502963]: SyncE Group #0: is in TRACKING holdover acquired mode on p0, frequency_diff: -113 (ppb)
INFO     [05/09/2023 05:11:03.512491]: SyncE Group #0: is in TRACKING holdover acquired mode on p0, frequency_diff: 37 (ppb)

Note

The verbosity of the output from the SYNCE module is limited by default. To set the output to be more verbose, set the verbose option to 1 (True).

Before:

            # Example #4 - Overwrite the value of verbose in the [global] section of the SyncE configuration file.
#- name: CONF_SYNCE_global_verbose
#  value: "1"

After:

            # Example #4 - Overwrite the value of verbose in the [global] section of the SyncE configuration file.
- name: CONF_SYNCE_global_verbose
  value: "1"

Firefly Servo Output

The Firefly servo output can be found in the file /var/log/doca/firefly/servo.log.

Example output:

            2024-03-18 09:04:22 - Firefly - SERVO  - INFO     - offset   +8 +/- 2  freq   -5.66 +/- 0.41  delay -48 +/- 2
2024-03-18 09:04:24 - Firefly - SERVO  - INFO     - offset   +4 +/- 2  freq   -6.35 +/- 0.36  delay -47 +/- 2
2024-03-18 09:04:26 - Firefly - SERVO  - INFO     - offset   +2 +/- 2  freq   -6.75 +/- 0.41  delay -47 +/- 1
2024-03-18 09:04:28 - Firefly - SERVO  - INFO     - offset   +0 +/- 2  freq   -6.97 +/- 0.35  delay -47 +/- 1
2024-03-18 09:04:30 - Firefly - SERVO  - INFO     - offset   +0 +/- 3  freq   -7.30 +/- 0.60  delay -47 +/- 1
2024-03-18 09:04:33 - Firefly - SERVO  - INFO     - offset   +1 +/- 2  freq   -6.93 +/- 0.41  delay -47 +/- 1
2024-03-18 09:04:35 - Firefly - SERVO  - INFO     - offset   +1 +/- 2  freq   -6.81 +/- 0.48  delay -47 +/- 1
2024-03-18 09:04:37 - Firefly - SERVO  - INFO     - offset   +2 +/- 2  freq   -6.76 +/- 0.52  delay -48 +/- 2

Tx Timestamping Support on DPU Mode

When the BlueField is operating in DPU mode, additional OVS configuration is required as mentioned in step 6 of section "Setting Up Network Interfaces for DPU Mode". This configuration achieves the following:

Proper support for incoming/outgoing multicast traffic
Enabling Tx timestamping

Firefly only gets the packet timestamping for outgoing PTP messages (Tx timestamping) when they are offloaded to the hardware. As such, when working with OVS, users must ensure this traffic flow is properly recognized and offloaded. If offloading does not take place, Firefly gets stuck in a fault loop while waiting to receive the Tx timestamp events:

            ptp4l[2912.797]: timed out while polling for tx timestamp
ptp4l[2912.797]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[2912.797]: port 1 (enp3s0f0s4): send sync failed
ptp4l[2923.528]: timed out while polling for tx timestamp
ptp4l[2923.528]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[2923.528]: port 1 (enp3s0f0s4): send sync failed

Resolving this issue requires the following:

Activating hardware offloading in OVS
Adding OpenFlow rules that ensure OVS properly recognizes the traffic and offloads it to the hardware
Setting the fault_reset_interval configuration value to ensure timely recovery from the fault induced by the first packet always being handled in software (until the rule is offloaded to hardware). Firefly therefore requires a fault_reset_interval value of 1 or less and raises a warning if an improper value is detected. The built-in profiles already use a suitable value.

When these configurations are in order, Firefly includes a report for a single fault during boot, but recovers from it and continues as usual:

            ptp4l[3715.687]: timed out while polling for tx timestamp
ptp4l[3715.687]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[3715.687]: port 1 (enp3s0f0s4): send delay request failed
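
To tell the expected single fault apart from a fault loop, the timeouts in the ptp4l output can be counted. The following is a sketch; the helper name tx_timestamp_timeouts is hypothetical, and treating more than one timeout as suspicious is an assumption based on the description above rather than a documented threshold.

```shell
# Hypothetical helper: counts Tx timestamp timeouts in ptp4l output read
# from stdin.
tx_timestamp_timeouts() {
    awk '/timed out while polling for tx timestamp/ { n++ } END { print n + 0 }'
}

# Lines copied from the example output above.
sample='ptp4l[3715.687]: timed out while polling for tx timestamp
ptp4l[3715.687]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[3715.687]: port 1 (enp3s0f0s4): send delay request failed'

count=$(printf '%s\n' "$sample" | tx_timestamp_timeouts)
if [ "$count" -le 1 ]; then
    echo "$count timeout(s): consistent with the single boot-time fault"
else
    echo "$count timeouts: possible fault loop, check OVS offloading"
fi

# Against the log file:
#   tx_timestamp_timeouts < /var/log/doca/firefly/ptp4l.log
```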

Troubleshooting Tx Timestamp Issues

As explained earlier, several layers must be in place for Tx timestamping to work as Firefly requires. The following commands help debug the state of each layer:

Inspect the OpenFlow rules:

$ sudo ovs-ofctl dump-flows uplink
cookie=0x0, duration=4075.576s, table=0, n_packets=2437, n_bytes=209582, udp,in_port=en3f0pf0sf4,tp_src=319 actions=output:p0
cookie=0x0, duration=4075.549s, table=0, n_packets=1216, n_bytes=109420, udp,in_port=p0,tp_src=319 actions=output:en3f0pf0sf4
cookie=0x0, duration=4075.521s, table=0, n_packets=13, n_bytes=1242, udp,in_port=en3f0pf0sf4,tp_src=320 actions=output:p0
cookie=0x0, duration=4074.604s, table=0, n_packets=3034, n_bytes=297376, udp,in_port=p0,tp_src=320 actions=output:en3f0pf0sf4
cookie=0x0, duration=4075.856s, table=0, n_packets=184, n_bytes=12901, priority=0 actions=NORMAL

Inspect hardware TC rules while DOCA Firefly is deployed (the rules age out after 10 seconds without traffic):

$ sudo tc -s -d filter show dev en3f0pf0sf4 egress
filter ingress protocol ip pref 4 flower chain 0 
filter ingress protocol ip pref 4 flower chain 0 handle 0x1 
  eth_type ipv4
  ip_proto udp
  src_port 320
  ip_flags nofrag
  in_hw in_hw_count 1
	action order 1: mirred (Egress Redirect to device p0) stolen
 	index 3 ref 1 bind 1 installed 7 sec used 7 sec
 	Action statistics:
	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
	backlog 0b 0p requeues 0
	cookie bec8bd6ede4e86341e9045a6edb58ca2
	no_percpu
 
filter ingress protocol ip pref 4 flower chain 0 handle 0x2 
  eth_type ipv4
  ip_proto udp
  src_port 319
  ip_flags nofrag
  in_hw in_hw_count 1
	action order 1: mirred (Egress Redirect to device p0) stolen
 	index 4 ref 1 bind 1 installed 6 sec used 6 sec
 	Action statistics:
	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
	backlog 0b 0p requeues 0
	cookie c568d97efd400de98608fbbf86ccdf3c
	no_percpu

Note

If no TC rules are present when Firefly is running, this usually indicates that hardware offloading is disabled at the OVS level, in which case it should be activated as explained under "Ensuring OVS Hardware Offload".
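
The TC inspection above can also be scripted. The following is a minimal sketch, assuming the output format shown in the example (one "src_port" line and one "in_hw" line per flower filter); the function name is hypothetical. It reports which UDP source ports have a filter marked as offloaded to hardware, so that the PTP ports (319 and 320) can be verified at a glance.

```python
import re


def offloaded_udp_ports(tc_output):
    """Return the set of UDP source ports whose flower filters are marked in_hw."""
    ports = set()
    current_port = None
    for line in tc_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("filter "):
            current_port = None  # a new filter entry begins
        match = re.match(r"src_port (\d+)$", stripped)
        if match:
            current_port = int(match.group(1))
        elif stripped.startswith("in_hw") and current_port is not None:
            ports.add(current_port)  # "not_in_hw" lines do not match this prefix
    return ports


SAMPLE = """filter ingress protocol ip pref 4 flower chain 0 handle 0x1
  eth_type ipv4
  ip_proto udp
  src_port 320
  ip_flags nofrag
  in_hw in_hw_count 1
filter ingress protocol ip pref 4 flower chain 0 handle 0x2
  eth_type ipv4
  ip_proto udp
  src_port 319
  ip_flags nofrag
  in_hw in_hw_count 1
"""
print(sorted(offloaded_udp_ports(SAMPLE)))
```

An empty result while Firefly is running points to the same conclusion as the note above: offloading is most likely not active.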

PTP

Firefly uses the ptp4l utility to handle the Precision Time Protocol (IEEE 1588).

Through the YAML file, users can configure the network interfaces used for the protocol:

# Network interfaces to be used (For multiple interfaces use a space (" ") separated list)
- name: PTP_INTERFACE
  # Set according to used interfaces on the local setup
  value: "p0"

Before the deployment of the container, users should configure this field to point at the desired network interface(s) configured in the previous steps.

PHC2SYS

Firefly uses the phc2sys utility to synchronize the OS's clock to the accurate time stamps received by ptp4l.

Through the YAML file, users can configure the command-line arguments used by the phc2sys program:

- name: PHC2SYS_ARGS
  value: "-a -r"

Firefly adds the following command-line arguments on top of the user-selected flags:

Use of chosen configuration file (empty configuration file by default, or user-supplied file if specified in the YAML file)
Redirection of output to a log file using the -m command line option

Note

phc2sys must use the same domainNumber setting used by ptp4l. If the user does not set the same domainNumber, Firefly sets it automatically.

Note

phc2sys is only able to accurately sync the clock of the hosting environment (usually the DPU, but may also be the host if deployed there) if other timing services, such as NTP, are disabled.

For instance, on Ubuntu 22.04, users must ensure that the NTP timing service is disabled by running:

systemctl stop systemd-timesyncd

SYNCE

Firefly uses the proprietary synced utility to implement the Synchronous Ethernet protocol, aimed at ensuring synchronization of the clock's frequency with the reference clock. Once achieved, both clocks are declared as "syntonized".

Through the YAML file, users can configure the network interfaces used for the protocol:

# Network interfaces to be used (For multiple interfaces use a space (" ") separated list)
- name: SYNCE_INTERFACE
  # Set according to used interfaces on the local setup
  value: "p0"

Before the deployment of the container, one should configure this field to point at the desired network interface(s) configured in the previous steps.

DOCA includes synced support for the "dpll" backend, which adds support for SFs and VFs and is used by default. If DOCA detects that the system does not support it, it automatically falls back to the "mft" backend.

Note

With kernel versions older than 6.8 or BlueField Platform Software versions older than 2.8.0, only PFs are supported, and only with the "mft" backend.

The backend option can be explicitly set using the YAML file by uncommenting the following lines:

Before

# Example #5 - Explicitly specify the used backend in the [global] section of the SyncE configuration file.
#- name: CONF_SYNCE_global_backend
#  # Options are "mft"/"dpll". If nothing is specified in YAML, "dpll" is taken as the default
#  value: "mft"

After

# Example #5 - Explicitly specify the used backend in the [global] section of the SyncE configuration file.
- name: CONF_SYNCE_global_backend
  # Options are "mft"/"dpll". If nothing is specified in YAML, "dpll" is taken as the default
  value: "mft"

The following is an example of the OVS commands required to route the SyncE-related traffic when using an SF on top of the "dpll" backend:

$ sudo ovs-ofctl add-flow uplink dl_dst=01:80:c2:00:00:02,in_port=en3f0pf0sf4,actions=p0
$ sudo ovs-ofctl add-flow uplink dl_dst=01:80:c2:00:00:02,in_port=p0,actions=en3f0pf0sf4
$ sudo ovs-ofctl add-flow uplink dl_dst=01:80:c2:00:00:02,actions=controller

Info

This example uses the same OVS settings used earlier in the guide:

uplink – bridge name
en3f0pf0sf4 – SF representor
p0 – PF interface in use (port 0)

If your deployment uses different values, make sure to adjust the above commands accordingly.

If the kernel version does not yet support this feature, and SF/VF are used, the following error is printed:

...
mlx5 DPLL kernel support appears to be missing
Falling back to MFT tools backend
...

If this error is shown, only PFs can be used, and synced falls back to using the "mft" backend.

PTP Monitor

PTP monitor periodically queries for various PTP-related information and prints it to the container's log.

The following is a sample output of this tool:

gmIdentity:                48:B0:2D:FF:FE:5C:4D:24 (48b02d.fffe.5c4d24)
portIdentity:              48:B0:2D:FF:FE:5C:53:44 (48b02d.fffe.5c5344-1)
port_state:                Active
domainNumber:              2
master_offset:             avg:	1	max:	-8	rms:	3
gmPresent:                 true
ptp_stable:                Recovered
UtcOffset:                 37
timeTraceable:             0
frequencyTraceable:        0
grandmasterPriority1:      128
gmClockClass:              248
gmClockAccuracy:           0x6
grandmasterPriority2:      128
gmOffsetScaledLogVariance: 0xffff
ptp_time (TAI):            Thu Sep  7 11:22:50 2023
ptp_time (UTC adjusted):   Thu Sep  7 11:22:13 2023
system_time (UTC):         Thu Sep  7 11:22:13 2023
ptp_ports:                 48:B0:2D:FF:FE:5C:53:44 (48b02d.fffe.5c5344-1) - Slave
error_count:               1
last_err_time (UTC):       Thu Sep  7 09:55:48 2023

Among others, this monitoring provides the following information:

Details about the Grandmaster the DPU is syncing with
Current PTP timestamp
Health information such as connection errors during execution and whether they have been recovered from

PTP monitoring is disabled by default and can be activated by replacing the disable value with the IP address for the monitor server to use:

- name: MONITOR_STATE
  value: "<IP address for the monitoring server>"

Once activated, the information can be viewed from the container using the following command:

sudo crictl logs --tail=<MONITOR_LINE_NUMBER> <CONTAINER-ID>

Note

MONITOR_LINE_NUMBER is equal to 20 lines + the number of PTP interfaces supplied via the PTP_INTERFACE field.
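
As an illustration of the rule above, the value can be derived from the same string that is passed in the PTP_INTERFACE field. This is a minimal sketch; the helper name is hypothetical and not part of Firefly.

```python
def monitor_line_number(ptp_interface):
    """20 fixed lines plus one line per interface listed in PTP_INTERFACE."""
    return 20 + len(ptp_interface.split())


print(monitor_line_number("p0"))     # single interface
print(monitor_line_number("p0 p1"))  # two interfaces
```

For a single interface this yields 21, which matches the length of the sample monitor output shown earlier.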

It is recommended to use the following watch command to actively monitor the PTP state:

sudo watch -n 1 crictl logs --tail=<MONITOR_LINE_NUMBER> <CONTAINER-ID>

When triaging deployment issues, additional logging information can be found in the monitor's developer logs at /var/log/doca/firefly/firefly_monitor_dev.log.

Note

The monitoring feature connects to ptp4l's local UDS server to query the necessary information. This is why the configuration manager prevents users from modifying the uds_address and uds_ro_address fields used by ptp4l within the container.

Configuration

The PTP monitor supports configuration options which are passed through a dedicated configuration file like the rest of DOCA Firefly's modules. The built-in monitor configuration file can be found in the section "PTP Monitor". For ease of use, the file is also provided as part of DOCA's container resource as downloaded from NGC.

"Firefly Modules Configuration Options" contains a complete explanation of each of the configuration options alongside their default values.

To set a custom config file, users should place their config file in the directory /etc/firefly and set the config file name in DOCA Firefly's YAML file.

- name: MONITOR_CONFIG_FILE
  value: my_custom_monitor.conf

In this example, my_custom_monitor.conf should be placed at /etc/firefly/my_custom_monitor.conf.

Time Representations (PTP Time vs System Time)

Under most deployment scenarios, the PTP time shown by the monitor is presented according to the International Atomic Time (TAI) standard, while the system time most commonly uses Coordinated Universal Time (UTC). Due to the differences between these time representation models, the monitor provides two different time readings (each marked accordingly):

...
UtcOffset:                 37
...
ptp_time (TAI):            Thu Sep  7 11:22:50 2023
ptp_time (UTC adjusted):   Thu Sep  7 11:22:13 2023
system_time (UTC):         Thu Sep  7 11:22:13 2023

This difference (37 seconds in the above example) is intentional and stems from the leap seconds that separate UTC from TAI. The current offset is indicated by the UtcOffset field that is also included in the monitor's report.
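
The relation between the readings can be reproduced by hand. The following minimal sketch (the helper name is hypothetical) subtracts the UtcOffset value from the TAI reading of the example above and arrives at the UTC-adjusted reading.

```python
from datetime import datetime, timedelta


def tai_to_utc(ptp_time_tai, utc_offset_seconds):
    """Shift a TAI timestamp back by the reported UtcOffset to obtain UTC."""
    return ptp_time_tai - timedelta(seconds=utc_offset_seconds)


# Values taken from the monitor output above.
ptp_time_tai = datetime(2023, 9, 7, 11, 22, 50)
utc_adjusted = tai_to_utc(ptp_time_tai, 37)
print(utc_adjusted.strftime("%H:%M:%S"))  # 11:22:13
```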

Monitor Server

In addition to printing the monitoring data to the container's standard output available through the container logs, the monitoring data is also exposed through a gRPC server that clients can subscribe to. This allows a monitoring client on the host to subscribe to monitor events from the service running on top of the DPU, thus providing better visibility.

The following diagram presents the recommended deployment architecture for connecting the monitoring client (on the host) to the monitor server (on the DPU).

monitor-arch-version-1-modificationdate-1743579652037-api-v2.png

Based on the above, when activating the monitor feature, the user must provide the IP address to be used by the monitor server:

- name: MONITOR_STATE
  value: "<IP address for the monitoring server>"

Users can choose to only view the monitoring events through the container logs without connecting to the monitoring server. In this case, it is recommended to configure the local host IP address (127.0.0.1) in the YAML file to avoid exposing it to an unwanted network.

Monitor Client

The required files for the monitor client are available under the service's dedicated NGC resource "scripts" directory.

To run the monitor client, provide the following settings:

- name: MONITOR_STATE
  value: "<IP address for the monitoring server>"
- name: MONITOR_CLIENT_TYPE
  value: "vanilla"/"phc2sys"

The vanilla type only monitors the server, while the phc2sys type also runs phc2sys on the host to synchronize the host's system clock with the server's clock.

Note

The monitor client vanilla type can run on both DPU and host, while the phc2sys type can only run on the host.

When using the phc2sys type, you must also provide the interface the server is running on:

- name: MONITOR_CLIENT_PHC2SYS_INTERFACE
  value: "<interface_name>"

The monitor phc2sys client output will also include the following additional lines:

Host information:
Current system time:         Tue Apr  8 10:28:10 2025
Current phc time (Timezone): Tue Apr  8 10:28:10 2025
Current phc time (UTC):      2025-4-8 10:28:10.409982752 UTC

Example command line for executing the monitor client inside a container on the host:

sudo docker run --privileged --net=host -v /var/log/doca/firefly:/var/log/firefly -v /etc/firefly:/etc/firefly -e PTP_INTERFACE='eth2' -e MONITOR_STATE='192.168.0.1' -e MONITOR_CLIENT_TYPE='vanilla' -it nvcr.io/nvidia/doca/doca_firefly:1.5.0-doca2.9.0-host /entrypoint.sh

Example command line for executing the Python-based monitor client from a Linux host:

$ sudo pip3 install click protobuf grpcio
$ ./doca_firefly_monitor_client.py <ip-address-for-the-monitoring-server>

Note

Reference source files and the .proto file used for Firefly's monitor are placed under the src/ directory within the NGC resource.

Telemetry Export

On top of allowing clients to subscribe to the feed of monitoring events, the PTP Monitor server also supports an active export of the events to DOCA Telemetry Service (DTS). In such a scenario, users should deploy DTS to the machine on which DOCA Firefly is deployed (host/DPU) and then activate "telemetry export" using the following steps:

Enable the needed file mounts through the YAML file:

Before

...
    # Uncomment when using the telemetry features with DTS - Part #1
    #- name: ipc-sockets-volume
    #  hostPath:
    #    path: /opt/mellanox/doca/services/telemetry/ipc_sockets
    #    type: DirectoryOrCreate
    #- name: shared-memory
    #  hostPath:
    #    path: /dev/shm/telemetry
    #    type: DirectoryOrCreate
  containers:
    - name: doca-firefly
...
      # Uncomment when debugging the finalized configuration files used - Part #2
      #- name: debug-firefly-volume
      #  mountPath: /tmp
      # Uncomment when using the telemetry features with DTS - Part #2
      #- name: ipc-sockets-volume
      #  mountPath: /opt/mellanox/doca/services/telemetry/ipc_sockets
      #- name: shared-memory
      #  mountPath: /dev/shm
...

After

...
    # Uncomment when using the telemetry features with DTS - Part #1
    - name: ipc-sockets-volume
      hostPath:
        path: /opt/mellanox/doca/services/telemetry/ipc_sockets
        type: DirectoryOrCreate
    - name: shared-memory
      hostPath:
        path: /dev/shm/telemetry
        type: DirectoryOrCreate
  containers:
    - name: doca-firefly
...
      # Uncomment when debugging the finalized configuration files used - Part #2
      #- name: debug-firefly-volume
      #  mountPath: /tmp
      # Uncomment when using the telemetry features with DTS - Part #2
      - name: ipc-sockets-volume
        mountPath: /opt/mellanox/doca/services/telemetry/ipc_sockets
      - name: shared-memory
        mountPath: /dev/shm
...

Pass a configuration value of 1 to telemetry_export through the configuration. This could easily be done directly through the YAML file:

Before

...
        # Example #7 - Activate the monitor telemetry export feature (export through DTS).
        #- name: CONF_MONITOR_global_telemetry_export
        #  value: "1"
...

After

...
        # Example #7 - Activate the monitor telemetry export feature (export through DTS).
        - name: CONF_MONITOR_global_telemetry_export
          value: "1"
...

Once active, a log message should indicate the availability of the export feature (which uses the DOCA Telemetry Exporter library):

...
2024-09-05 06:22:08 - Firefly - MONITOR - INFO     - Monitor records will also be exported via DOCA Telemetry Exporter
...

The following is a sample output of the telemetry export:

    "ptp_time_str": "Tue Apr 22 17:26:16 2025",
    "adjusted_ptp_time_str": "Tue Apr 22 17:25:39 2025",
    "sys_time_str": "Tue Apr 22 17:25:39 2025",
    "last_error_time_str": "Tue Apr 22 12:40:47 2025",
    "gm_identity": "18:08:31:FF:FE:5A:F1:EE (180831.fffe.5af1ee)",
    "port_identity": "90:D7:6B:FF:FE:96:92:F8 (90d76b.fffe.9692f8-1)",
    "ptp_ports": "90:D7:6B:FF:FE:96:92:F8 (90d76b.fffe.9692f8-1) - Slave",
    "ptp_stability": 2,
    "ptp_time_raw": 1745342776089236500,
    "adjusted_ptp_time_raw": 1745342739089236500,
    "sys_time_raw": 1745342739,
    "error_count": 1,
    "last_error_time_raw": 1745325647,
    "master_offset_max": 1,
    "master_offset_avg": 1,
    "master_offset_rms": 0,
    "utc_offset": 37,
    "gm_priority1": 127,
    "gm_clock_class": 248,
    "gm_clock_accuracy": 254,
    "gm_priority2": 127,
    "gm_scaled_offset": 65535,
    "domain_number": 0,
    "port_state": 2,
    "source_id": "c-237-153-80-p88-00-0-bf2.mtl.labs.mlnx",
    "timestamp": 1745342739263351,
    "data_type": "Firefly_monitor"

- ptp_ports - a list of all existing PTP ports and their states according to the PTP protocol states.
- ptp_stability - represents the stability state of PTP. It has 3 options:
  1) 0 - PTP is in a stable state
  2) 1 - PTP is currently out of sync
  3) 2 - PTP managed to recover from a sync error
- error_count - represents the number of errors encountered thus far.
- port_state - represents the effective PTP port state of the most active port. It has 3 options:
  1) 0 - PTP port is in an inactive state
  2) 1 - PTP port is active, but uncalibrated
  3) 2 - PTP port is active and calibrated
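
A consumer of the exported records can translate the numeric fields back into the states listed above. The following is a minimal sketch under stated assumptions: the helper and dictionary names are hypothetical, the record is a subset of the sample output, and the raw PTP times are assumed to be in nanoseconds (inferred from the sample values, not stated by the guide).

```python
PTP_STABILITY = {0: "stable", 1: "out of sync", 2: "recovered from a sync error"}
PORT_STATE = {0: "inactive", 1: "active, uncalibrated", 2: "active, calibrated"}
NS_PER_SECOND = 1_000_000_000  # assumption: *_ptp_time_raw fields are nanoseconds


def summarize(record):
    """Decode the enumerated fields and cross-check the UTC-adjusted PTP time."""
    expected_adjusted = record["ptp_time_raw"] - record["utc_offset"] * NS_PER_SECOND
    return {
        "ptp_stability": PTP_STABILITY[record["ptp_stability"]],
        "port_state": PORT_STATE[record["port_state"]],
        "adjusted_time_consistent": expected_adjusted == record["adjusted_ptp_time_raw"],
    }


# Subset of the sample telemetry record shown above.
record = {
    "ptp_stability": 2,
    "port_state": 2,
    "utc_offset": 37,
    "ptp_time_raw": 1745342776089236500,
    "adjusted_ptp_time_raw": 1745342739089236500,
}
print(summarize(record))
```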

Note

When locally debugging the telemetry information through DTS, it is important to remember to activate the data writer (storage output), as mentioned in the DOCA Telemetry Service Guide.

Info

For more information about visualizing the information once exported through DTS, please refer to the example about using Grafana alongside DTS.

Firefly Servo

Firefly's Servo module can be seen as an extension to the built-in set of servos offered by linuxptp. When active, linuxptp is automatically set to "free running" and the control over the physical hardware clock (PHC) is handed over to Firefly's own servo.

The following is a sample output of this tool when using the l2-telco profile (16 messages per second):

2024-03-18 07:46:45 - Firefly - SERVO  - INFO     - Detected new master clock: 48b02d.fffe.5c4d24-1
2024-03-18 07:46:45 - Firefly - SERVO  - INFO     - Transition from servo state IDLE to FREE_RUNNING
2024-03-18 07:46:47 - Firefly - SERVO  - INFO     - Estimated a logSyncInterval of: -4
2024-03-18 07:46:47 - Firefly - SERVO  - INFO     - Measured offset      18691      delay -47
2024-03-18 07:46:48 - Firefly - SERVO  - INFO     - Transition from servo state FREE_RUNNING to LOCKED
2024-03-18 07:46:50 - Firefly - SERVO  - INFO     - offset +164 +/- 164 freq   -1.50 +/- 0.00  delay -48 +/- 1
2024-03-18 07:46:52 - Firefly - SERVO  - INFO     - Transition from servo state LOCKED to LOCKED_STABLE
2024-03-18 07:46:52 - Firefly - SERVO  - INFO     - offset   +0 +/- 1  freq   -1.41 +/- 0.47  delay -48 +/- 1
2024-03-18 07:46:54 - Firefly - SERVO  - INFO     - offset   -8 +/- 4  freq   -4.21 +/- 1.40  delay -47 +/- 1
2024-03-18 07:46:57 - Firefly - SERVO  - INFO     - offset  -12 +/- 2  freq   -5.46 +/- 0.73  delay -47 +/- 1
2024-03-18 07:46:59 - Firefly - SERVO  - INFO     - offset  -13 +/- 2  freq   -6.13 +/- 0.65  delay -47 +/- 1
2024-03-18 07:47:01 - Firefly - SERVO  - INFO     - offset  -13 +/- 3  freq   -6.19 +/- 1.23  delay -47 +/- 2
2024-03-18 07:47:03 - Firefly - SERVO  - INFO     - offset  -19 +/- 2  freq   -8.04 +/- 0.96  delay -47 +/- 1
2024-03-18 07:47:06 - Firefly - SERVO  - INFO     - offset  -14 +/- 3  freq   -6.46 +/- 1.11  delay -47 +/- 1
2024-03-18 07:47:08 - Firefly - SERVO  - INFO     - offset  -16 +/- 2  freq   -7.32 +/- 0.78  delay -48 +/- 2
2024-03-18 07:47:10 - Firefly - SERVO  - INFO     - offset  -15 +/- 2  freq   -7.11 +/- 0.87  delay -47 +/- 2
2024-03-18 07:47:12 - Firefly - SERVO  - INFO     - offset  -14 +/- 1  freq   -6.74 +/- 0.57  delay -47 +/- 2
2024-03-18 07:47:15 - Firefly - SERVO  - INFO     - offset  -12 +/- 3  freq   -6.20 +/- 1.01  delay -48 +/- 1
2024-03-18 07:47:17 - Firefly - SERVO  - INFO     - offset  -13 +/- 2  freq   -6.40 +/- 0.89  delay -47 +/- 1
2024-03-18 07:47:19 - Firefly - SERVO  - INFO     - offset  -11 +/- 2  freq   -5.98 +/- 0.86  delay -48 +/- 1
2024-03-18 07:47:21 - Firefly - SERVO  - INFO     - offset  -10 +/- 2  freq   -5.75 +/- 0.87  delay -46 +/- 1
2024-03-18 07:47:24 - Firefly - SERVO  - INFO     - offset   -8 +/- 1  freq   -5.15 +/- 0.42  delay -47 +/- 1

As can be seen, the servo's behavior is similar to that of linuxptp's ptp4l and consists of a state machine that tracks the state of the active PTP port (FREE_RUNNING, LOCKED, LOCKED_STABLE, etc).
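
When tracking the servo's convergence over time, the periodic statistics lines can be extracted from the log with a small parser. The following is a minimal sketch that assumes the line layout shown in the sample output above; the function name and dictionary keys are hypothetical, and the "+/-" values are stored as-is because this guide does not define their exact meaning.

```python
import re

SERVO_STATS = re.compile(
    r"offset\s+([+-]?\d+) \+/- (\d+)\s+"
    r"freq\s+([+-]?\d+\.\d+) \+/- (\d+\.\d+)\s+"
    r"delay\s+([+-]?\d+) \+/- (\d+)"
)


def parse_servo_stats(line):
    """Return the offset/freq/delay values of a statistics line, or None."""
    match = SERVO_STATS.search(line)
    if match is None:
        return None  # state transitions and other messages are skipped
    offset, offset_pm, freq, freq_pm, delay, delay_pm = match.groups()
    return {
        "offset": int(offset), "offset_pm": int(offset_pm),
        "freq": float(freq), "freq_pm": float(freq_pm),
        "delay": int(delay), "delay_pm": int(delay_pm),
    }


line = ("2024-03-18 07:46:54 - Firefly - SERVO  - INFO     - "
        "offset   -8 +/- 4  freq   -4.21 +/- 1.40  delay -47 +/- 1")
print(parse_servo_stats(line))
```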

Firefly's Servo is disabled by default (in all profiles) and can be activated by replacing the defined_by_profile value with enable:

# Activation status
- name: SERVO_STATE
  # Options are "enable"/"disable"/"defined_by_profile"
  value: "enable"

Once activated, the information can be viewed from the module's log file /var/log/doca/firefly/servo.log.

Firefly Servo Configuration

Firefly's Servo is currently aimed at telco-related deployments that use the l2-telco profile, including the use of SyncE. As such, the default values in the built-in configuration file are optimized for those scenarios.

The servo supports configuration options which are passed through a dedicated configuration file like the rest of DOCA Firefly's modules. The built-in servo configuration file can be found in the section "Firefly Servo". For ease of use, the file is also provided as part of DOCA's container resource as downloaded from NGC.

"Firefly Modules Configuration Options" contains a complete explanation of each of the configuration options alongside their default values.

To set a custom config file, users should place their config file in the directory /etc/firefly and set the config file name in DOCA Firefly's YAML file.

- name: SERVO_CONFIG_FILE
  value: my_custom_servo.conf

In this example, my_custom_servo.conf should be placed at /etc/firefly/my_custom_servo.conf.

Dynamic Packet Rate Support

The servo can dynamically detect the packet rate used by the PTP grandmaster clock and calibrate itself accordingly in case the rate differs from the recommended 16 packets per second.

2024-03-18 07:46:45 - Firefly - SERVO  - INFO     - Transition from servo state IDLE to FREE_RUNNING
2024-03-18 07:46:47 - Firefly - SERVO  - INFO     - Estimated a logSyncInterval of: -4
2024-03-18 07:46:47 - Firefly - SERVO  - INFO     - Measured offset      18691      delay -47

If the message rate is constant and known in advance, the dynamic estimation can be disabled in favor of a provided message rate:

- name: CONF_SERVO_global_servo_const_log_sync_interval
  value: "-2"

In the above example, a fixed message rate of 4 packets per second is used (logSyncInterval of "-2").
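
In PTP, logSyncInterval is the base-2 logarithm of the interval between Sync messages in seconds, so the packet rate is 2 raised to the negated value. The following minimal sketch (the helper names are hypothetical) converts in both directions, which helps when choosing the value for a known packet rate.

```python
import math


def packets_per_second(log_sync_interval):
    """Packet rate implied by a logSyncInterval value: 2 ** (-logSyncInterval)."""
    return 2.0 ** (-log_sync_interval)


def log_sync_interval_for(rate):
    """logSyncInterval for a power-of-two packet rate given in packets per second."""
    return -int(math.log2(rate))


print(packets_per_second(-4))      # 16.0, the rate estimated in the log above
print(packets_per_second(-2))      # 4.0, the fixed rate configured above
print(log_sync_interval_for(128))  # -7
```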

Note

While the servo was tested to produce stable results with various packet rates (2, 4, 8, 16, 32, 64, 128), it is only officially recommended for use in deployments using a packet rate of 16 packets per second.

VLAN Tagging

DOCA Firefly natively supports VLAN-tagging-enabled network interfaces.

Separated Mode

The name of the VLAN-enabled network interface should be the one passed through the YAML file in the PTP_INTERFACE field.

Embedded Mode

In addition to passing the VLAN-enabled interface through the YAML file as described in the previous section, the user must also configure the network routing within the DPU to support the VLAN tagging.

The following example configures a VLAN tag of 10 to the enp3s0f0s4 interface:

$ sudo ip link add link enp3s0f0s4 name enp3s0f0s4.10 type vlan id 10
$ sudo ip link set up enp3s0f0s4.10
$ sudo ifconfig enp3s0f0s4.10 192.168.104.1 up

In this example, enp3s0f0s4.10 is the interface to be passed to DOCA Firefly.

Additional commands to route the traffic within the DPU:

$ sudo ovs-ofctl add-flow uplink in_port=en3f0pf0sf4,dl_vlan=10,actions=output:p0
$ sudo ovs-ofctl add-flow uplink in_port=p0,dl_vlan=10,actions=output:en3f0pf0sf4

Multiple Interfaces

DOCA Firefly can support multiple network interfaces through the following YAML file syntax:

- name: PTP_INTERFACE
  value: "<space (' ') separated list of interface names>"

For example:

- name: PTP_INTERFACE
  value: "p0 p1"

Note

The monitoring feature is supported for multiple interfaces only when the clientOnly configuration is enabled.

Note

For Firefly versions lower than 1.7.0, automatic mode (-a) for phc2sys is not supported when working with multiple interfaces. It is recommended to disable phc2sys in this mode.

Troubleshooting

When troubleshooting container deployment issues, it is highly recommended to follow the deployment steps and tips in the "Review Container Deployment" section of the NVIDIA DOCA Container Deployment Guide.

To debug the finalized configuration file used by Firefly, users can connect to the container as follows:

Open a shell session on the running container using the container ID:

sudo crictl exec -it <container-id> /bin/bash

Once connected to the container, the finalized configuration file can be found under the /tmp directory using the same filename as the original configuration file.

Info

More information regarding the configuration files can be found under section "Ensuring and Debugging Correctness of Config File".

Pod is Marked as "Ready" and No Container is Listed

Error

When deploying the container, the pod's STATE is marked as Ready, an image is listed, however no container can be seen running:

$ sudo crictl pods
POD ID              CREATED             STATE               NAME                                     NAMESPACE           ATTEMPT             RUNTIME
06bd84c07537e       4 seconds ago       Ready               doca-firefly-my-dpu                      default             0                   (default)
 
$ sudo crictl images
IMAGE                              TAG                 IMAGE ID            SIZE
k8s.gcr.io/pause                   3.2                 2a060e2e7101d       251kB
nvcr.io/nvidia/doca/doca_firefly   1.1.0-doca2.0.2     134cb22f34611       87.4MB
 
$ sudo crictl ps
CONTAINER           IMAGE               CREATED             STATE               NAME                     ATTEMPT             POD ID              POD

Solution

In most cases, the container did start, but immediately exited. This can be checked using the following command:

$ sudo crictl ps -a
CONTAINER           IMAGE               CREATED             STATE               NAME                     ATTEMPT             POD ID              POD
556bb78281e1d       134cb22f34611       7 seconds ago       Exited              doca-firefly             1                   06bd84c07537e       doca-firefly-my-dpu

Should the container fail (i.e., state of Exited), it is recommended to examine Firefly's main log at /var/log/doca/firefly/firefly.log.

In addition, for a short period of time after termination, the container logs can also be viewed using the container's ID:

$ sudo crictl logs 556bb78281e1d
Starting DOCA Firefly - Version 1.1.0
...
Requested the following PTP interface: p10
Failed to find interface "p10". Aborting

Custom Config File is Not Found

Error

When DOCA Firefly is deployed using a custom configuration file, a deployment error occurs and the following log message appears:

...
2023-09-07 14:04:23 - Firefly - Init    - ERROR    - Custom config file not found: my_file.conf. Aborting
...

Solution

Check the custom file name written in the YAML file and make sure that you properly placed the file with that name under the /etc/firefly/ directory of the DPU.

Profile is Not Supported

Error

When DOCA Firefly is deployed, a deployment error occurs and the following log message appears:

...
2023-09-07 14:04:23 - Firefly - Init    - ERROR    - profile <name> is not supported. Aborting
...

Solution

Verify that the profile selected in the YAML file matches one of the supported profiles as listed in the profiles table.

Note

The profile name is case sensitive. The name must be specified in lower-case letters.

PPS Capability is Missing

Error

When DOCA Firefly is deployed and configured to use the PPS module, a deployment error occurs and the following log message appears:

...
2023-09-07 14:04:23 - Firefly - Init    - INFO     - Starting PPS configuration
2023-09-07 14:04:23 - Firefly - Init    - WARNING  - [-] PPS capability is missing, seems that the card doesn't support PPS
2023-09-07 14:04:23 - Firefly - Init    - INFO     - capabilities:
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   50000000 maximum frequency adjustment (ppb)
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 programmable alarms
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 external time stamp channels
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 programmable periodic signals
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 pulse per second
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 programmable pins
2023-09-07 14:04:23 - Firefly - Init    - INFO     -   0 cross timestamping
...

Solution

This log indicates that the DPU hardware does not support PPS. However, PTP can still run on this hardware, and you should see the line Running ptp4l in the container log, indicating that PTP is running successfully.

Timed Out While Polling for Tx Timestamp

Error

When the BlueField is operating in DPU mode, DOCA Firefly gets stuck in a fault loop while waiting to receive the Tx timestamp events:

ptp4l[2912.797]: timed out while polling for tx timestamp
ptp4l[2912.797]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[2912.797]: port 1 (enp3s0f0s4): send sync failed
ptp4l[2923.528]: timed out while polling for tx timestamp
ptp4l[2923.528]: increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
ptp4l[2923.528]: port 1 (enp3s0f0s4): send sync failed

Info

DOCA Firefly has a known gap leading to this error appearing once, after which ptp4l recovers from it. This section only covers the case in which there is a fault loop and no recovery occurs.

Solution

DOCA Firefly's configurations were already adjusted to accommodate Tx port timestamping. For more information about the reason for this error and the recovery mechanism designed for it, refer to section "Tx Timestamping Support on DPU Mode".

Warning – Time Jumped Backwards

Error

When using Firefly's Servo module, the following warning message may appear at startup:

2024-01-01 14:04:23 - Firefly - SERVO   - WARNING  - Clock is going to jump backwards in time - this might have a system-wide impact

Solution

This warning message indicates a backwards jump of at least one minute in the system's time. Firefly logs this event because such jumps might have system-wide implications. For more information, refer to section "Failed to Reserve Sandbox Name" in the DOCA Troubleshooting.

Such jumps can only happen during Firefly's boot, before the Servo achieves initial time synchronization with the reference clock.
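The condition behind this warning can be sketched as a simple comparison. This is a minimal illustration that assumes the one-minute threshold stated above; the function name `is_backward_jump` is hypothetical, and the way Firefly performs this check internally is not described in this guide.

```python
from datetime import datetime, timedelta

# Threshold taken from the text above: jumps of at least one minute are logged.
BACKWARD_JUMP_THRESHOLD = timedelta(minutes=1)

def is_backward_jump(current, target):
    """Return True if stepping from `current` to `target` moves time back by >= 1 minute."""
    return current - target >= BACKWARD_JUMP_THRESHOLD

now = datetime(2024, 1, 1, 14, 4, 23)
print(is_backward_jump(now, now - timedelta(minutes=5)))   # True
print(is_backward_jump(now, now - timedelta(seconds=10)))  # False
print(is_backward_jump(now, now + timedelta(hours=1)))     # False (forward step)
```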

PTP Profile Default Config Files

Media Profile

#
# This config file contains configurations for media & entertainment alongside
# DOCA Firefly specific adjustments.
#
 
[global]
domainNumber            127
priority1               128
priority2               127
use_syslog                1
logging_level             6
tx_timestamp_timeout     30
hybrid_e2e                1
dscp_event               46
dscp_general             46
logAnnounceInterval      -2
announceReceiptTimeout    3
logSyncInterval          -3
logMinDelayReqInterval   -3
delay_mechanism         E2E
network_transport     UDPv4
# Value lesser or equal to 1 is required for Embedded Mode
fault_reset_interval      1
# Required for multiple interfaces support
boundary_clock_jbod       1
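PTP expresses message intervals as the base-2 logarithm of the period in seconds, so the negative values in this profile correspond to sub-second periods. The following sketch converts the media profile's values into seconds; the helper name `interval_seconds` is an assumption made for this example.

```python
def interval_seconds(log_interval):
    """Convert a PTP log2 message interval into a period in seconds."""
    return 2.0 ** log_interval

# Values copied from the media profile above.
media = {
    "logAnnounceInterval": -2,
    "announceReceiptTimeout": 3,
    "logSyncInterval": -3,
    "logMinDelayReqInterval": -3,
}

sync_period = interval_seconds(media["logSyncInterval"])
announce_period = interval_seconds(media["logAnnounceInterval"])
# The announce receipt timeout is counted in announce intervals.
announce_timeout = media["announceReceiptTimeout"] * announce_period

print(sync_period)       # 0.125 (8 Sync messages per second)
print(announce_period)   # 0.25 (4 Announce messages per second)
print(announce_timeout)  # 0.75
```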

Default Profile

#
# This config file extends linuxptp default.cfg config file with DOCA Firefly
# specific adjustments.
#
 
[global]
# Value lesser or equal to 1 is required for Embedded Mode
fault_reset_interval                 1
# Required for multiple interfaces support
boundary_clock_jbod                  1
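Because this profile only lists Firefly's adjustments on top of the linuxptp defaults, the effective configuration is the base file with these values overriding it. The sketch below illustrates that overlay with a minimal parser for this file format. The parser, the name `parse_ptp4l_config`, and the base values shown are assumptions for illustration; check the linuxptp `default.cfg` shipped with your version for the actual base values.

```python
def parse_ptp4l_config(text):
    """Parse '[section]' headers, '#' comments, and whitespace-separated 'key value' lines."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
            continue
        if current is None:
            raise ValueError(f"option outside of a section: {line!r}")
        parts = line.split(None, 1)
        current[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return sections

firefly_default = """
[global]
# Value lesser or equal to 1 is required for Embedded Mode
fault_reset_interval                 1
# Required for multiple interfaces support
boundary_clock_jbod                  1
"""

# Illustrative base values only (assumed here, not taken from this guide).
base_global = {"fault_reset_interval": "4", "boundary_clock_jbod": "0"}

overrides = parse_ptp4l_config(firefly_default)["global"]
effective = {**base_global, **overrides}
print(effective)  # {'fault_reset_interval': '1', 'boundary_clock_jbod': '1'}
```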

Telco (L2) Profile

#
# This config file extends linuxptp G.8275.1.cfg config file with DOCA Firefly
# specific adjustments.
#
 
[global]
dataset_comparison                       G.8275.x
G.8275.defaultDS.localPriority                128
maxStepsRemoved                               255
logAnnounceInterval                            -3
logSyncInterval                                -4
logMinDelayReqInterval                         -4
G.8275.portDS.localPriority                   128
ptp_dst_mac                     01:80:C2:00:00:0E
network_transport                              L2
domainNumber                                   24
# Value lesser or equal to 1 is required for Embedded Mode
fault_reset_interval                            1
# Required for multiple interfaces support
boundary_clock_jbod                             1
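With the L2 transport, PTP messages are sent to the multicast MAC address given by ptp_dst_mac. As a quick sanity check when customizing this value, the sketch below verifies that an address is a group (multicast) address by testing the least significant bit of its first octet. The helper name `is_multicast_mac` is an assumption made for this example.

```python
def is_multicast_mac(mac):
    """Return True if the least significant bit of the first octet is set (group address)."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)

print(is_multicast_mac("01:80:C2:00:00:0E"))  # True (value used by this profile)
print(is_multicast_mac("02:00:00:00:00:01"))  # False (unicast address)
```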

Firefly PTP Monitor

#
# Default values for all of Firefly's PTP monitor configuration values.
#
 
[global]
# General
report_interval                     1000
# Debugging & Logging
doca_logging_level                    50
telemetry_export                       0

Configuration Options

report_interval – Interval (in msecs) at which the monitor publishes a report to all defined output providers (standard output, gRPC clients, etc.). Default is 1000 (1 second).
doca_logging_level – Logging level for the module, based on DOCA's logging levels. Default is 50 (INFO). Valid options:
- 10=DISABLE
- 20=CRITICAL
- 30=ERROR
- 40=WARNING
- 50=INFO
- 60=DEBUG
telemetry_export – Indicates whether monitor information should be exported through DOCA Telemetry Service. Valid options:
- 0=Disabled (default)
- 1=Enabled
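The valid values listed above can be checked before deploying a custom monitor configuration. The following sketch is one possible way to do so. The function name `validate_monitor_config`, the returned dictionary layout, and the requirement that report_interval be positive are assumptions made for this example and are not part of Firefly.

```python
# Logging levels as listed in the options above.
DOCA_LOG_LEVELS = {10: "DISABLE", 20: "CRITICAL", 30: "ERROR",
                   40: "WARNING", 50: "INFO", 60: "DEBUG"}

def validate_monitor_config(cfg):
    """Validate monitor options (missing keys fall back to the documented defaults)."""
    level = int(cfg.get("doca_logging_level", 50))
    if level not in DOCA_LOG_LEVELS:
        raise ValueError(f"invalid doca_logging_level: {level}")
    export = int(cfg.get("telemetry_export", 0))
    if export not in (0, 1):
        raise ValueError(f"invalid telemetry_export: {export}")
    interval_ms = int(cfg.get("report_interval", 1000))
    if interval_ms <= 0:  # assumption: a report interval must be positive
        raise ValueError(f"invalid report_interval: {interval_ms}")
    return {
        "report_interval_s": interval_ms / 1000,
        "log_level": DOCA_LOG_LEVELS[level],
        "telemetry_export": bool(export),
    }

print(validate_monitor_config({}))
# {'report_interval_s': 1.0, 'log_level': 'INFO', 'telemetry_export': False}
```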

Firefly Servo

servo-default.conf

#
# Default values for all of Firefly's servo configuration values
#
 
[global]
# Time thresholds
init_offset_from_master_threshold   10000000
offset_from_master_min_threshold       -1500
offset_from_master_max_threshold        1500
init_max_time_adjustment                   0
max_time_adjustment                     1500
step_adjustment_threshold                  0
hold_over_timer                            0
# Sampling Window & servo logic
warmup_period                           1500
sync_filter_length                         6
delay_request_filter_length                6
servo_adjustment_interval                  4
servo_init_adjustment_interval            24
servo_const_log_sync_interval           0xFF
servo_window_min_samples                   2
servo_num_offset_values                    5
servo_pi_cutoff_frequency             0.0159
servo_pi_dumping_factor                 7.85
 
# Debugging & Logging
summary_interval                        2000
doca_logging_level                        50
free_running                               0

Configuration Options

init_offset_from_master_threshold – Minimum threshold (in nsecs) for switching to PI servo on init. Default is 10000000 (10 msecs).
offset_from_master_min_threshold – Minimum threshold (in nsecs) for declaring time offset from the master clock as "stable". Default is -1500 (-1.5 µsecs).
offset_from_master_max_threshold – Maximum threshold (in nsecs) for declaring time offset from the master clock as "stable". Default is +1500 (+1.5 µsecs).
init_max_time_adjustment – When active, defines the maximum allowed time (step) adjustment (in nsecs) before the servo reaches the "locked" state. Default is 0 (disabled).
max_time_adjustment – When active, defines the maximum allowed reference time adjustment (in nsecs) after the servo has reached the "locked" state. Default is 1500 (1.5 µsecs).
step_adjustment_threshold – When active, defines the threshold above which a time (step) adjustment (in nsecs) is allowed, even after the servo has reached the "locked" state. Default is 0 (disabled).
hold_over_timer – When active, defines how long (in seconds) the servo stays in "hold over" mode before reverting to "free running". Default is 0 ("hold over" state is disabled).
warmup_period – Time span (in msecs) during which samples are collected to estimate the logSyncInterval value (packet rate). Default is 1500 (1.5 seconds).
sync_filter_length – Number of SYNC messages in the servo's history buffer. Default is 6.
delay_request_filter_length – Number of DELAY_REQUEST messages in the servo's history buffer. Default is 6 messages.
servo_adjustment_interval – Number of SYNC messages after which the PHC is updated once the servo has reached the "locked" state at least once. Default is 4 messages.
servo_init_adjustment_interval – Number of SYNC messages after which the PHC is updated before the servo has ever reached the "locked" state. Default is 24 messages.
servo_const_log_sync_interval – Known fixed value to be used as the logSyncInterval instead of trying to estimate it at runtime. Default is 0xFF (disabled).
servo_window_min_samples – Minimal number of samples needed for a servo calculation. Default is 2 messages.
servo_num_offset_values – Number of consecutive timestamps within the "offset from master" threshold that are required to transition from the "locked" state to the "locked stable" state. Default is 5 offset values.
servo_pi_cutoff_frequency – The PI servo's cutoff frequency value. Default is 0.0159.
servo_pi_dumping_factor – The PI servo's damping factor value. Default is 7.85.
summary_interval – Interval (in msecs) at which the servo publishes a report log event. Default is 2000 (2 seconds).
doca_logging_level – Logging level for the module, based on DOCA's logging levels. Default is 50 (INFO). Valid options:
- 10=DISABLE
- 20=CRITICAL
- 30=ERROR
- 40=WARNING
- 50=INFO
- 60=DEBUG
free_running – Instructs the servo to only log its operations, without actually adjusting the PHC. Default is 0 (disabled).
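Two of the relationships described in these options can be illustrated with a short sketch: how consecutive in-threshold offsets lead to the "locked stable" state, and how often the PHC is updated for a given SYNC rate. This is a simplified model under stated assumptions, not Firefly's servo implementation: the function names are hypothetical, the thresholds are treated as inclusive, and the SYNC rate is assumed to follow logSyncInterval.

```python
def reaches_locked_stable(offsets_ns, min_ns=-1500, max_ns=1500, required=5):
    """Return True once `required` consecutive offsets fall within [min_ns, max_ns]."""
    streak = 0
    for offset in offsets_ns:
        # An out-of-range offset resets the count of consecutive stable samples.
        streak = streak + 1 if min_ns <= offset <= max_ns else 0
        if streak >= required:
            return True
    return False

def phc_update_period_s(log_sync_interval, sync_messages):
    """Seconds between PHC updates when one update occurs every `sync_messages` SYNCs."""
    return sync_messages * 2.0 ** log_sync_interval

print(reaches_locked_stable([900, -200, 2100, 40, -35, 12, 8, -3]))  # True
print(reaches_locked_stable([900, -200, 2100, 40, -35]))              # False
print(phc_update_period_s(-3, 4))   # 0.5 (servo_adjustment_interval default)
print(phc_update_period_s(-3, 24))  # 3.0 (servo_init_adjustment_interval default)
```

With the media profile's logSyncInterval of -3 (8 SYNC messages per second), the defaults would correspond to a PHC update roughly every 3 seconds before the first lock and every 0.5 seconds afterwards.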
