PCIe Endpoint Mode

Applies to the Jetson Orin series.

Jetson Linux contains the following software support for PCIe endpoint mode:

  • A Linux kernel device driver for the PCIe endpoint controller.

    This driver configures the PCIe controller as an endpoint, and provides an interface for higher-level software to select the configuration that is exposed to the PCIe bus.

    The source code for this driver is available at the following paths in the Jetson Linux kernel source package:

    kernel-jammy-src/drivers/pci/controller/dwc/pcie-tegra194.c
    kernel-jammy-src/drivers/pci/controller/dwc/
    
  • A sample Linux kernel PCIe endpoint function driver.

    This driver configures properties of the PCIe endpoint, such as BAR count and size, and IRQ count. It also implements any runtime functionality of the endpoint.

    The sample driver is a trivial implementation, useful for demonstration purposes only. It just exposes some endpoint RAM to the PCIe bus. It does not interact with the host or with other parts of the endpoint software. You must implement your own PCIe endpoint function driver according to the needs of your application.

    The source code for this driver is available at the following path in the Jetson Linux kernel source package:

    nvidia-oot/drivers/pci/endpoint/functions/pci-epf-dma-test.c
    
  • A Linux kernel PCIe endpoint subsystem.

    The source code for this subsystem is available at the following path in the Jetson Linux kernel source package:

    kernel-jammy-src/drivers/pci/endpoint
    

Hardware Requirements

To use PCIe endpoint support, you need the following hardware:

  • Any Jetson Orin series device running Jetson Linux to act as the PCIe endpoint.

  • Another computer system to act as the PCIe root port.

    The following instructions assume that a second Jetson Orin series device running Jetson Linux is used, but you can use any standard x86-64 PC that is running Linux.

  • Cables to connect the two devices.

Important

The commands in the following procedure must be run as the root user. Some commands use shell I/O redirection and will not operate correctly if run by using sudo.

Flashing PCIe as Endpoint on a Jetson AGX Orin Series System

  1. In the extracted Jetson Linux release directory, edit the jetson-agx-orin-devkit.conf file.

  2. To set nvhs-uphy-config-1, add the following line to override ODMDATA:

    ODMDATA="gbe-uphy-config-22,nvhs-uphy-config-1,hsio-uphy-config-0,gbe0-enable-10g,hsstp-lane-map-3";
    
  3. To reflash the device, run this command:

    sudo ./flash.sh jetson-agx-orin-devkit mmcblk0p1
    

    This step completely erases data that was previously stored on the Jetson device.

  4. Delete ODMDATA from jetson-agx-orin-devkit.conf to restore the property’s original value.

    This step ensures that devices flashed in the future will operate in PCIe root port mode.
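The edit/flash/restore cycle in steps 1-4 can be sketched as a small helper script. The function name and the ``.orig`` backup suffix are illustrative, not part of the Jetson Linux tooling, and the flash invocation is left commented out because it erases the device:

```shell
# Hypothetical helper: temporarily append the ODMDATA override to a board
# config, flash with the override in place, then restore the original file.
flash_with_odmdata_override() {
  conf="$1"
  cp "$conf" "$conf.orig"    # keep the unmodified config (before step 2)
  printf '%s\n' 'ODMDATA="gbe-uphy-config-22,nvhs-uphy-config-1,hsio-uphy-config-0,gbe0-enable-10g,hsstp-lane-map-3";' >> "$conf"
  # ./flash.sh jetson-agx-orin-devkit mmcblk0p1   # step 3: run as root; erases the device
  mv "$conf.orig" "$conf"    # step 4: restore so future flashes use root port mode
}
```

Restoring the file immediately after flashing makes it harder to forget step 4.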

Flashing PCIe as Endpoint on a Jetson Orin NX/Nano Series System

  1. In the extracted Jetson Linux release directory, edit the jetson-orin-nano-devkit.conf file.

  2. To set hsio-uphy-config-41, add the following line to override ODMDATA:

    ODMDATA="gbe-uphy-config-8,hsstp-lane-map-3,hsio-uphy-config-41";
    
  3. To reflash the device, run this command:

    sudo ./tools/kernel_flash/l4t_initrd_flash.sh --external-device nvme0n1p1 \
    -c tools/kernel_flash/flash_l4t_t234_nvme.xml -p "-c bootloader/generic/cfg/flash_t234_qspi.xml" \
    --showlogs --network usb0 jetson-orin-nano-devkit internal
    
  4. Delete ODMDATA from jetson-orin-nano-devkit.conf to restore the property’s original value.

    This step ensures that devices flashed in the future will operate in PCIe root port mode.

Connecting and Configuring the Devices

  1. Connect the devices using the appropriate PCIe cable.

  2. Boot the endpoint device.

  3. Run these commands to configure and enable PCIe endpoint mode:

    # modprobe pci-epf-dma-test
    # cd /sys/kernel/config/pci_ep/
    # mkdir functions/tegra_pcie_dma_epf/func1
    # echo 0x10de > functions/tegra_pcie_dma_epf/func1/vendorid
    # echo 0x229a > functions/tegra_pcie_dma_epf/func1/deviceid
    # echo 16 > functions/tegra_pcie_dma_epf/func1/msi_interrupts
    # ln -s functions/tegra_pcie_dma_epf/func1 controllers/141a0000.pcie-ep/
    # echo 1 > controllers/141a0000.pcie-ep/start
    

    For additional details, read the following file in the Jetson Linux kernel source package:

    kernel-jammy-src/Documentation/PCI/endpoint/pci-endpoint-cfs.rst
    
  4. Boot the root port system.
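Step 3 can be wrapped in a reusable function, sketched below under the assumption that configfs behaves as documented in pci-endpoint-cfs.rst (creating the function directory instantiates it, and symlinking it into a controller directory binds it). The ``setup_pcie_ep`` name and the ``EP_ROOT`` override are illustrative, not part of Jetson Linux; the override lets the sequence be exercised outside of configfs:

```shell
# Sketch of step 3 as a function. On the target, EP_ROOT defaults to the
# real configfs mount point.
setup_pcie_ep() {
  root="${EP_ROOT:-/sys/kernel/config/pci_ep}"
  fn="$root/functions/tegra_pcie_dma_epf/func1"
  ep="$root/controllers/141a0000.pcie-ep"

  modprobe pci-epf-dma-test 2>/dev/null || true  # harmless off-target
  mkdir -p "$fn"                # on configfs, mkdir instantiates func1
  echo 0x10de > "$fn/vendorid"
  echo 0x229a > "$fn/deviceid"
  echo 16 > "$fn/msi_interrupts"
  mkdir -p "$ep"                # already present on the target
  ln -sfn "$fn" "$ep/func1"     # bind the function to the controller
  echo 1 > "$ep/start"
}
```

Run the function as root on the endpoint device before booting the root port system.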

Testing Procedures

Use the following procedures to test PCIe endpoint support.

Prepare for Testing

The sample PCIe endpoint driver provides an example of completing an EDMA transfer between the endpoint (EP) and the root port (RP), and reports the measured performance.

After the driver is enabled and the platform is booted, each driver creates its own debugfs directory.

For EP, the directory is /sys/kernel/debug/<controller_addr>.pcie-ep_epf_dma_test/.

  • The PCIe C5 EP controller is in the /sys/kernel/debug/141a0000.pcie-ep_epf_dma_test/ directory.

For RP, the directory is /sys/kernel/debug/<controller_id>:01:00.0_pcie_dma_test/.

  • The PCIe C5 RP controller is in the /sys/kernel/debug/0005\:01\:00.0_pcie_dma_test/ directory.

The configurable parameters are referenced using the following files:

``edma_ch``: Configures the number of EDMA channels and their modes. The bit definitions are as follows:

- Bits [0:3]: Set the mode of the RD/WR channels (0 = sync, 1 = async).

- Bits [4:7]: Enable the RD/WR channels (0 = disable, 1 = enable).

- Bit 31: Enables the remote EDMA mode.

- Bit 30: Triggers the ABORT use-case validations.

For example, the value 0xF1 means that all channels are enabled for the WR mode, with channel 0 in async mode and the rest of the channels in sync mode.

``nents``: The number of descriptors to be populated in each DMA submission (a ``tegra_pcie_edma_submit_xfer`` API call). When more than one DMA channel is enabled, the ``nents`` are split across the channels. For example, if nents = 2 and edma_ch = 0x3, each DMA channel gets one ``nent``.

``dma_size``: Specifies the size in bytes to be transferred in each software transaction.

``stress_count``: Specifies how many software transactions are scheduled in one execution.

Note

  • During testing, if an async channel is selected first, the chance increases that bandwidth is shared between the channels. For accurate bandwidth calculations, you must ensure that all channels are enabled in the same mode.

  • nents * dma_size must be less than 127 MB.
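As a sketch of the ``edma_ch`` bit layout described above, a hypothetical helper (not part of the driver or of Jetson Linux) can decode a value into per-channel settings:

```shell
# Decode an edma_ch value: bits [4:7] enable channels 0-3, bits [0:3]
# select sync (0) or async (1) mode, bit 31 is remote EDMA, bit 30 is abort.
decode_edma_ch() {
  v=$(( $1 ))
  for ch in 0 1 2 3; do
    if [ $(( (v >> (4 + ch)) & 1 )) -eq 1 ]; then
      if [ $(( (v >> ch) & 1 )) -eq 1 ]; then
        echo "channel $ch: enabled, async"
      else
        echo "channel $ch: enabled, sync"
      fi
    else
      echo "channel $ch: disabled"
    fi
  done
  if [ $(( (v >> 31) & 1 )) -eq 1 ]; then echo "remote EDMA mode"; fi
  if [ $(( (v >> 30) & 1 )) -eq 1 ]; then echo "abort validation"; fi
}

decode_edma_ch 0xF1   # all channels enabled; channel 0 async, rest sync
```

For instance, ``decode_edma_ch 0x11`` reports only channel 0 as enabled, in async mode, matching the maximum-performance configuration used below.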

Execution

  1. For maximum performance, configure a 16 MB transfer size with one channel in async mode, for 1000 iterations of four nents, in the debugfs directory:

    echo 16777216 > dma_size
    echo 4 > nents
    echo 1000 > stress_count
    echo 0x11 > edma_ch
    
  2. To start the test, run the following command:

    cat edmalib_test

  3. If the test runs correctly, logs similar to the following are printed in the kernel log:

    [  438.647567] pcie_dma_epf tegra_pcie_dma_epf.0: edmalib_common_test: re-init edma lib prev_ch(0) != current chans(11)
    [  438.648038] tegra194-pcie 141a0000.pcie-ep: tegra_pcie_edma_initialize: success
    [  438.875999] pcie_dma_epf tegra_pcie_dma_epf.0: edmalib_common_test: EDMA LIB WR started for 1 chans, size 16777216 Bytes, iterations: 1000 of descriptors 4
    [  438.878092] pcie_dma_epf tegra_pcie_dma_epf.0: edmalib_common_test: EDMA LIB submit done
    [  443.963685] pcie_dma_epf tegra_pcie_dma_epf.0: edma_final_complete: WR-local-Async complete for chan 0 with status 0. Total desc 4000 of Sz 16777216 Bytes done in time 5087716356 nsec. Perf is 105522 Mbps
    [  443.963694] pcie_dma_epf tegra_pcie_dma_epf.0: edma_final_complete: All Async channels. Cumulative Perf 105522 Mbps, time 5087716388 nsec
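The reported figure can be cross-checked from the values in the log: 4000 descriptors of 16777216 bytes each, completed in 5087716356 ns, work out to the printed 105522 Mbps:

```shell
# Sanity-check of the reported throughput: total bits transferred,
# divided by the elapsed time, expressed in Mbps (1 Mbps = 10^6 bit/s).
desc_total=4000          # 1000 iterations x 4 nents
dma_size=16777216        # bytes per descriptor
elapsed_ns=5087716356    # from the log line above
bits=$(( desc_total * dma_size * 8 ))
perf_mbps=$(( bits * 1000 / elapsed_ns ))
echo "$perf_mbps Mbps"   # prints 105522 Mbps
```

This also confirms that the configured nents * dma_size (4 * 16 MB = 64 MB) stays under the 127 MB limit noted earlier.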