DOCA Pipeline Language Services Guide
This page outlines the DOCA Pipeline Language (DPL) approach to packet processing programmability for NVIDIA BlueField. DPL introduces a software development solution based on a domain-specific programming language (DSL), supported by a set of DOCA services.
For more details, refer to the DPL Developer Container.
DPL is derived from the P4-16 language specification.
P4 is an open-source, domain-specific programming language (DSL) designed for programming and customizing network data planes. It provides a high-level abstraction for programmable packet processing, allowing developers to add, modify, and extend networking functionalities.
For fixed-function devices, P4 serves as a documentation tool, offering a structured description of the data plane's functional blocks.
Key Features of P4
High-level abstraction: Simplifies the programming of complex network data planes with clear and concise syntax
Programmable packet processing: Enables customization of packet processing and traffic management
Documentation of fixed functions: Offers a standardized method for documenting the fixed functional blocks of network devices
P4 Compiler
A P4 compiler (p4c) is a critical component in the P4 ecosystem. It automatically generates the data plane program and corresponding control plane interface, ensuring seamless coordination between the data plane and control plane.
Key benefits of a compiler:
Automatic generation: Streamlines development by automatically generating essential components and optimizing resource usage
Custom pipeline behavior: Allows developers to extend data plane functionality with customized pipeline behaviors
Dynamically loadable pipelines: Supports hot-swappable pipelines, enabling updates without rebuilding or redeploying an entire application
Control plane integration: Facilitates communication between the data plane and control plane via an open-source API, ensuring effective management of customized pipelines
Focus on NVIDIA's DOCA Pipeline Language
The remainder of this document focuses on NVIDIA's implementation of the DOCA Pipeline Language (DPL). While DPL's syntax is derived from P4-16, its pipeline semantics align with NVIDIA's DPU pipeline architecture rather than standard P4 execution models.
For example, while P4 semantics imply a staged pipeline based on a feed-forward RMT (Reconfigurable Match Tables) architecture, NVIDIA's DPU architecture follows a run-to-completion dRMT (disaggregated RMT) model, offering greater flexibility and enhanced capabilities.
DPL Highlights
DPL introduces a unique programming paradigm distinct from traditional SDKs, APIs, libraries, drivers, or utilities. It is a specialized programming language with a runtime system, designed for rapid development, testing, and deployment of packet-processing pipelines. DPL is provided as a ready-to-use, customizable solution within DOCA Services.
Key Features of DPL
DPL Services: A system-level solution that includes a compiler, runtime agent, and debugging tools, enabling rapid programming of the DPU pipeline
Optimized for NVIDIA devices: Specifically designed and fine-tuned for programming network data planes on NVIDIA hardware
Advanced networking functionality: Leverages DPL's capabilities to enhance and extend networking features on NVIDIA DPUs
Comprehensive documentation: Provides detailed descriptions of BlueField's fixed functional blocks within the DPU data plane
Developer Resources
The DPL programming guide serves as a comprehensive resource for developers who want to use DPL for programming network data planes. By utilizing the DPL p4c compiler and the P4-16 specification, developers can:
Enhance network device functionality and efficiency
Meet the evolving demands of modern network infrastructures
Ensure seamless integration and optimization within NVIDIA's DPU ecosystem
The DPL compiler can run on any Linux OS that supports Docker. Language specifications, runtime APIs, and tutorials are available at the P4 GitHub Repository.
Development Environment Requirements
To set up the development environment, the following components are required:
Host computer with Ubuntu 22.04 or later with Docker installed (required for the DOCA development container)
Server with root/hypervisor access to install the DPL Runtime Service package
One or more BlueField-3 devices, installed in the target server for DPL execution
Suggested Workflow
The suggested workflow is as follows:
Coding
Use the DPL programming guide and sample applications to create a DPL program remotely.
The program is compiled using dplp4c, iterating until it successfully produces a binary.
Loading
The compiled binary is transferred to the BlueField system.
Using the P4Runtime API (via an open-source or proprietary P4Runtime controller), the pipeline is sent from the remote machine to the DPL Service running on BlueField.
The user checks for P4Runtime error messages.
Running
Inspect the logs for any DPL Service error messages.
Use the dpl_nspect tool to verify that P4 tables and entries are present in the hardware.
Use the dpl_debugger tool, which shows the state of packets and their metadata, to understand the packet processing pipeline.
This process is repeated until the DPL application is fully verified.

P4, and by extension DPL, is a domain-specific language (DSL) designed for programming network data planes. It enables customized packet processing, allowing developers to define how packets are handled at different pipeline stages.
However, P4 programs are not universally portable across different architectures. Instead, they are typically compatible within the same target architecture family.
The BlueField programmable pipeline follows a hybrid model that leverages both hardware and software processing capabilities. It consists of three main stages:
Parsing
Match-action processing (Steering)
Forwarding database (FDB)
Parsing
The BlueField native parser is the first stage of the packet processing pipeline. It is responsible for identifying and extracting packet headers, progressing through the protocol stack until the entire frame is parsed.
Key features:
Pre-defined protocol headers and standard transitions based on IETF specifications
On-demand reparsing at any stage, which eliminates the need for reinjection or a final deparser stage
Flex Parsing
Flex parsing allows developers to integrate custom protocol headers into BlueField’s hardware parsing engine. It consists of four components:
Flex Arc In: Defines the transition from a native header to a Flex header
Flex Header: Specifies header characteristics such as length and next protocol location
Flex Sampler: Extracts specific bytes from the hardware, enabling their use in control blocks or table keys
Flex Arc Out: Defines the transition from a Flex header back to a native header (or another Flex header)
The DPL compiler automatically generates Flex parsing components based on the developer's defined parse nodes and transitions.
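The four Flex parsing components can be pictured as a small parse graph. The following Python sketch is a conceptual model only, not DPL code and not the BlueField implementation. The names Node, parse, and my_flex, the UDP destination port 0x1234, and the 4-byte custom header layout are all hypothetical, chosen purely for illustration; IPv4 is assumed to carry no options.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One parse-graph node (native or flex), with a fixed header length."""
    name: str
    length: int                                   # header length in bytes
    selector: Optional[Tuple[int, int]] = None    # (offset, size) of the next-protocol field
    arcs: Dict[int, str] = field(default_factory=dict)                   # selector value -> next node
    samplers: Dict[str, Tuple[int, int]] = field(default_factory=dict)   # sample name -> (offset, size)


def parse(graph: Dict[str, Node], start: str, data: bytes):
    """Walk the graph over the packet bytes; return visited nodes and sampled values."""
    offset = 0
    path: List[str] = []
    samples: Dict[str, int] = {}
    current: Optional[str] = start
    while current is not None:
        node = graph[current]
        header = data[offset:offset + node.length]
        if len(header) < node.length:
            break  # truncated packet: stop parsing
        path.append(node.name)
        # Samplers copy selected bytes out of the header for later use as keys.
        for sample_name, (s_off, s_size) in node.samplers.items():
            samples[sample_name] = int.from_bytes(header[s_off:s_off + s_size], "big")
        next_node = None
        if node.selector is not None:
            s_off, s_size = node.selector
            value = int.from_bytes(header[s_off:s_off + s_size], "big")
            next_node = node.arcs.get(value)  # no matching arc ends the walk
        offset += node.length
        current = next_node
    return path, samples


def build_example_graph() -> Dict[str, Node]:
    return {
        # Native nodes (standard protocols).
        "ethernet": Node("ethernet", 14, selector=(12, 2), arcs={0x0800: "ipv4"}),
        "ipv4": Node("ipv4", 20, selector=(9, 1), arcs={17: "udp"}),
        # Arc in: UDP destination port 0x1234 (hypothetical) leads to the flex header.
        "udp": Node("udp", 8, selector=(2, 2), arcs={0x1234: "my_flex"}),
        # Flex header: 4 bytes, next-protocol byte at offset 0, one sampler.
        # Arc out: next-protocol value 0x01 returns to the native IPv4 node.
        "my_flex": Node("my_flex", 4, selector=(0, 1), arcs={0x01: "ipv4"},
                        samplers={"my_flex_id": (2, 2)}),
    }


def build_example_packet() -> bytes:
    eth = bytes(12) + (0x0800).to_bytes(2, "big")
    outer_ip = bytearray(20)
    outer_ip[0], outer_ip[9] = 0x45, 17            # IPv4, protocol = UDP
    udp = bytes(2) + (0x1234).to_bytes(2, "big") + bytes(4)
    flex = bytes([0x01, 0x00]) + (0xBEEF).to_bytes(2, "big")
    inner_ip = bytearray(20)
    inner_ip[0], inner_ip[9] = 0x45, 6             # inner IPv4, protocol = TCP (no arc defined)
    return eth + bytes(outer_ip) + udp + flex + bytes(inner_ip)
```

Calling parse(build_example_graph(), "ethernet", build_example_packet()) walks the native nodes, enters the flex header through the arc in, samples two bytes from it, and returns to a native node through the arc out.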
Operational Mode
The DPL parser operates in a hybrid mode with a default native parser
The compiler automatically integrates native headers and fields into DPL constructs
The Flex parse graph consists of:
Nodes (either native or flex)
Arcs (transitions)
Samplers for custom parsing operations
This design eliminates the need to redefine and reimplement standard IETF protocols and headers.
Match-Action Processing (Steering)
After parsing, packet processing decisions are made based on match-action tables, commonly referred to as "Steering".
Key features:
Match fields: Define packet attributes for classification (for example, source/destination MAC, VLAN, IP, and protocol headers)
Tables: Store rules for packet handling and decision-making
Actions: Define processing rules (for example, forwarding, header modification, dropping packets)
Programmability: Allows dynamic updates to match-action rules based on network conditions
Efficient processing: Packet handling occurs directly in hardware, reducing latency
P4Runtime integration: DPL tables are populated via the P4Runtime API, supporting SDN controllers
In this document, the terms "flow tables" and "P4 tables" are interchangeable.
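To make the relationship between match fields, tables, and actions concrete, here is a minimal Python sketch of an exact-match table. It is a conceptual model, not DPL syntax and not the P4Runtime API. The Table class, the forward, drop, and set_vlan actions, and the packet dictionary layout are hypothetical, and only exact matching is modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Packet = Dict[str, object]
Action = Callable[..., None]


def forward(pkt: Packet, port: int) -> None:
    pkt["egress_port"] = port


def drop(pkt: Packet) -> None:
    pkt["dropped"] = True


def set_vlan(pkt: Packet, vid: int) -> None:
    pkt["vlan"] = vid


@dataclass
class Table:
    """Exact-match table: a key built from match fields selects an action."""
    name: str
    key_fields: Tuple[str, ...]
    default_action: Tuple[Action, tuple]           # executed on a miss
    entries: Dict[tuple, Tuple[Action, tuple]] = field(default_factory=dict)

    def insert(self, key, action: Action, params=()) -> None:
        """Control-plane write (in a real deployment this arrives via P4Runtime)."""
        self.entries[tuple(key)] = (action, tuple(params))

    def apply(self, pkt: Packet) -> bool:
        """Data-plane lookup: build the key, run the matching action, report hit or miss."""
        key = tuple(pkt.get(name) for name in self.key_fields)
        hit = key in self.entries
        action, params = self.entries[key] if hit else self.default_action
        action(pkt, *params)
        return hit
```

For example, a table keyed on ("dst_mac", "vlan") with drop as its default action forwards a packet only after the controller has inserted a matching entry; any other packet falls through to the default.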

Forwarding Database
The Forwarding Database (FDB) is the final stage within the embedded switch (eSwitch). It enables accurate and efficient packet routing within the network infrastructure. It is responsible for:
Storing and managing MAC addresses
Ensuring efficient packet forwarding based on network topology
Maintaining records of port locations for destination-based forwarding
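The responsibilities listed above amount to maintaining a mapping from MAC address to port. The following Python sketch illustrates that idea only; the ForwardingDatabase class and its methods are hypothetical and do not correspond to any BlueField or DOCA interface. What happens on a lookup miss is left to the caller, since the miss policy is not described here.

```python
from typing import Dict, Optional


def normalize_mac(mac: str) -> str:
    """Use one canonical spelling so lookups do not depend on case or separator."""
    return mac.replace("-", ":").lower()


class ForwardingDatabase:
    """Maps destination MAC addresses to the port where each one is located."""

    def __init__(self) -> None:
        self._ports: Dict[str, int] = {}

    def learn(self, mac: str, port: int) -> None:
        """Record (or update) the port location of a MAC address."""
        self._ports[normalize_mac(mac)] = port

    def forget(self, mac: str) -> None:
        """Remove a MAC address; unknown addresses are ignored."""
        self._ports.pop(normalize_mac(mac), None)

    def lookup(self, dst_mac: str) -> Optional[int]:
        """Return the port for a destination MAC, or None when it is unknown."""
        return self._ports.get(normalize_mac(dst_mac))
```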
BlueField DPU Pipeline Behavior
The BlueField pipeline is designed for flexibility, allowing developers to customize packet processing to meet specific application needs.
Key characteristics:
Extended parser support: Developers can expand the native parser using Flex Parsing
Immediate execution model: No deferred actions; all modifications take effect immediately
Mid-pipeline reparsing: Packet headers are reparsed immediately after modification, ensuring correct metadata updates
No deparser control in the target architecture (TA): Unlike traditional architectures, BlueField does not require a separate deparser step
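The immediate execution model and mid-pipeline reparsing can be illustrated with a small Python sketch. It is a conceptual model, not the hardware behavior or DPL code: parse_l2, PacketState, and pop_vlan are hypothetical names, and only Ethernet with a single optional 802.1Q tag is handled. The point is that a header modification changes the packet bytes at once and the metadata is re-derived right away, so later steps see the updated view without a final deparser stage.

```python
from typing import Dict


def parse_l2(data: bytes) -> Dict[str, object]:
    """Derive L2 metadata from the current packet bytes."""
    ethertype = int.from_bytes(data[12:14], "big")
    meta: Dict[str, object] = {"has_vlan": False, "vlan_id": None, "ethertype": ethertype}
    if ethertype == 0x8100:                        # 802.1Q tag present
        tci = int.from_bytes(data[14:16], "big")
        meta["has_vlan"] = True
        meta["vlan_id"] = tci & 0x0FFF             # low 12 bits of the TCI
        meta["ethertype"] = int.from_bytes(data[16:18], "big")
    return meta


class PacketState:
    """Packet bytes plus the metadata derived from them."""

    def __init__(self, data: bytes) -> None:
        self.data = bytes(data)
        self.meta = parse_l2(self.data)

    def pop_vlan(self) -> None:
        """Remove the 4-byte VLAN tag and reparse immediately."""
        if self.meta["has_vlan"]:
            self.data = self.data[:12] + self.data[16:]
            self.meta = parse_l2(self.data)        # metadata reflects the change at once


def build_tagged_frame() -> bytes:
    """Ethernet frame with VLAN 100 carrying an IPv4 ethertype and a dummy payload."""
    return (bytes(12) + (0x8100).to_bytes(2, "big") + (100).to_bytes(2, "big")
            + (0x0800).to_bytes(2, "big") + bytes(20))
```

After pop_vlan() returns, a subsequent lookup keyed on has_vlan or ethertype already operates on the untagged frame.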
DPL Services
Rather than providing a traditional SDK or driver-level APIs, DPL offers a high-level, services-based approach to programming the DPU pipeline.
The DPL Services consist of two packages that form the DPL solution. The services are provided as containers and are deployed separately.
See the following sections on each service:
DPL System Overview for a high-level overview of the components that make up the DPL Services
DPL Runtime Service for instructions on deploying and configuring the backend DPL Runtime Service, which interacts with the hardware
DPL Development Container for information about the DPL language, the compiler tools, and the methodology for building DPL programs
See also:
DOCA Pipeline Language Developer Tool to learn about the various tools and methods for debugging DPL programs