Data Direct (DirectNIC)
To enable direct GPU read/write and avoid CPU PCIe bandwidth bottlenecks, a direct NIC–GPU datapath is required.
To support this, the HCA exposes an additional PCIe function—a side DMA engine—called Data Direct. This DMA engine lets a vHCA access buffers via an MKEY, providing multiple PCIe datapath interfaces. This is useful when different memory regions require different PCIe paths (for example, in NUMA systems).
A vHCA can use a Data Direct function only if HCA_CAP.data_direct is set. To use the Data Direct interface, the vHCA must create an MKEY with the data_direct bit set; the returned MKEY enables access through the side DMA engine. The MKEY access mode must be PA.
It supports only the following fields: a, rw, rr, lw, lr, relaxed_ordering_write, relaxed_ordering_read, mkey[7:0], length64, pd, start_addr, and len. All other fields are reserved.
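The constraints above can be summarized as a small pre-flight check. This is a minimal sketch, not driver or firmware code: the function name check_data_direct_mkey, the dictionary representation of the MKEY context, and the assumption that a reserved field must be left at zero are all illustrative choices that do not come from the specification text. Only the field names, the HCA_CAP.data_direct capability, the data_direct bit, and the PA access mode are taken from the description above.

```python
from typing import Dict, List

# Field names copied from the supported-field list in the text.
# Anything outside this set is treated as reserved.
SUPPORTED_FIELDS = frozenset({
    "a", "rw", "rr", "lw", "lr",
    "relaxed_ordering_write", "relaxed_ordering_read",
    "mkey[7:0]", "length64", "pd", "start_addr", "len",
})


def check_data_direct_mkey(hca_cap_data_direct: bool,
                           data_direct: bool,
                           access_mode: str,
                           fields: Dict[str, int]) -> List[str]:
    """Return a list of problems; an empty list means the request looks valid.

    Hypothetical helper: 'fields' maps MKEY context field names to values.
    """
    problems = []
    if not hca_cap_data_direct:
        problems.append("HCA_CAP.data_direct is not set")
    if not data_direct:
        problems.append("data_direct bit is not set on the MKEY")
    if access_mode != "PA":
        problems.append("access mode must be PA, got " + access_mode)
    # Assumption: a reserved field is acceptable only when it is zero.
    reserved = sorted(name for name, value in fields.items()
                      if name not in SUPPORTED_FIELDS and value)
    if reserved:
        problems.append("reserved fields in use: " + ", ".join(reserved))
    return problems
```

For example, a request that sets only rw, lw, pd, start_addr, and len with access mode PA on a device reporting HCA_CAP.data_direct would pass this check, while a request that sets any other field to a non-zero value would be reported.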
The following table lists the patches required for bare-metal Data Direct (DirectNIC) support.
LKML Discussion | Git Commit | Git Description | Minimum Linux Kernel Release |
https://lore.kernel.org/all/20240625153150.159310-1-vidyas@nvidia.com/ | 47c8846a49ba | PCI: Extend ACS configurability | 6.11 |
https://lore.kernel.org/all/274c4f6f1ac0b1078243dd296695a49dbe58e7d1.1725907637.git.leonro@nvidia.com/ | c77aec65e828 | RDMA/mlx5: Consider the query_vuid cap for data_direct | 6.12 |
https://lore.kernel.org/all/403745463e0ef52adbef681ff09aa6a29a756352.1722512548.git.leon@kernel.org/ | ec7ad6530909 | RDMA/mlx5: Introduce GET_DATA_DIRECT_SYSFS_PATH ioctl | 6.12 |
https://lore.kernel.org/all/1f99d8020ed540d9702b9e2252a145a439609ba6.1722512548.git.leon@kernel.org/ | de8f847a5114 | RDMA/mlx5: Add support for DMABUF MR registrations with Data-direct | 6.12 |
https://lore.kernel.org/all/9a25b2fc02443f7c36c2d93499ae25252b6afd40.1722512548.git.leon@kernel.org/ | 3aa73c6b795b | RDMA: Pass uverbs_attr_bundle as part of '.reg_user_mr_dmabuf' API | 6.12 |
https://lore.kernel.org/all/a38270f2fe4a194868ca2312f4c1c760e51bcbff.1722512548.git.leon@kernel.org/ | 253c61dc256b | RDMA/umem: Introduce an option to revoke DMABUF umem | 6.12 |
https://lore.kernel.org/all/038aad36a43797e5591b20ba81051fc5758124f9.1722512548.git.leon@kernel.org/ | 682358fd35de | RDMA/umem: Add support for creating pinned DMABUF umem with a given dma device | 6.12 |
https://lore.kernel.org/all/b11fa87b2a65bce4db8d40341bb6cee490fa4d06.1722512548.git.leon@kernel.org/ | 2e8e631d7a41 | RDMA/mlx5: Add the initialization flow to utilize the 'data direct' device | 6.12 |
https://lore.kernel.org/all/b77edecfd476c3f445da96ab6aef499ae47b2829.1722512548.git.leon@kernel.org/ | 6910e3660d86 | RDMA/mlx5: Introduce the 'data direct' driver | 6.12 |
https://lore.kernel.org/all/82da7f578a567909bb5858a64ba844fe4cc298fa.1722512548.git.leon@kernel.org/ | c772a2c69018 | net/mlx5: Add IFC related stuff for data direct | 6.12 |
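Since the highest minimum release in the table is 6.12, a quick heuristic is to compare the running kernel release against that version. The sketch below is illustrative only: the helper names are hypothetical, and a version comparison cannot detect distribution kernels that backport these patches to an older release or that omit them from a newer one.

```python
import platform
import re

# Highest "Minimum Linux Kernel Release" value listed in the table.
MIN_KERNEL = (6, 12)


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '6.12.0-rc3'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: " + release)
    return (int(match.group(1)), int(match.group(2)))


def meets_minimum(release, minimum=MIN_KERNEL):
    """Return True if the release is at or above the minimum (major, minor)."""
    return parse_kernel_release(release) >= minimum


def running_kernel_meets_minimum():
    """Check the kernel this interpreter is running on (heuristic only)."""
    return meets_minimum(platform.release())
```

Calling running_kernel_meets_minimum() on the target host gives a first indication; confirming that the individual commits are present in the kernel source tree remains the authoritative check.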