PCIe Multi-GPU systems
On multi-GPU systems, the Triton server uses peer-to-peer memory copy to transfer data between GPUs whenever this feature is available; that is, when cudaDeviceCanAccessPeer() returns true.
However, on bare-metal Linux systems with a PCIe topology, peer-to-peer memory copy is not supported while IOMMU is enabled. For more information, refer to IOMMU on Linux. When IOMMU is enabled, we recommend setting it to passthrough (by setting the Linux kernel parameter iommu=pt) to ensure optimal performance.
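As a quick illustration of the recommendation above, the parameters the running kernel was booted with are exposed in /proc/cmdline, so a small helper can report whether iommu=pt is among them. This is a minimal sketch: the function name has_iommu_pt is hypothetical, and it checks only the boot parameter, not whether the platform has an active IOMMU.

```shell
#!/bin/sh
# Sketch: report whether a kernel command line contains iommu=pt.
# has_iommu_pt is a hypothetical helper name, not part of Triton or GRUB.
has_iommu_pt() {
    # $1: a kernel command line string, e.g. the contents of /proc/cmdline.
    # Padding with spaces lets the pattern match only the whole parameter.
    case " $1 " in
        *" iommu=pt "*) return 0 ;;
        *)              return 1 ;;
    esac
}

# Usage: inspect the command line of the running kernel, if it is readable.
if [ -r /proc/cmdline ]; then
    if has_iommu_pt "$(cat /proc/cmdline)"; then
        echo "iommu=pt is set"
    else
        echo "iommu=pt is not set"
    fi
fi
```

Matching against the space-padded string keeps unrelated parameters such as intel_iommu=on from being mistaken for iommu=pt.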
The following are the steps to set IOMMU to passthrough in GRUB:
Determine whether IOMMU is enabled by running the following command:
dmesg | grep -e DMAR -e IOMMU
If the command produces no output, IOMMU is not enabled, and no further action is required.
If the command produces output, IOMMU is enabled. Continue with the next step.
Determine whether IOMMU is set to passthrough by running the following command:
dmesg | grep -i -e iommu=pt -e 'iommu.*passthrough'
If the command produces output, IOMMU is already set to passthrough, and no further action is required.
If the command produces no output, continue with the following steps.
Open the /etc/default/grub file for editing and add iommu=pt to the GRUB_CMDLINE_LINUX option. For example:
.....
GRUB_CMDLINE_LINUX="crashkernel=auto quiet iommu=pt"
.....
Based on the system's OS, use grub-mkconfig or grub2-mkconfig to generate the configuration file:
On systems with BIOS:
grub-mkconfig -o /boot/grub2/grub.cfg   # on Ubuntu, Debian
grub2-mkconfig -o /boot/grub2/grub.cfg  # on CentOS, Rocky Linux
On systems with UEFI:
grub-mkconfig -o /boot/efi/EFI/<os_name>/grub.cfg   # on Ubuntu, Debian
grub2-mkconfig -o /boot/efi/EFI/<os_name>/grub.cfg  # on CentOS, Rocky Linux
# replace <os_name> with ubuntu, centos, debian, or rocky
On Ubuntu and Debian, you might need to install grub-mkconfig:
apt install grub-common
Reboot the system:
systemctl reboot
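Verify that IOMMU is set to passthrough by repeating the passthrough check from step 2 (the second dmesg command above). If that command now produces output, the change has taken effect.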
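The GRUB edit in the steps above can be sketched as a pair of small shell helpers. This is an illustration under stated assumptions, not a supported tool: the names add_iommu_pt and pick_mkconfig are hypothetical, the sketch assumes GRUB_CMDLINE_LINUX is written on a single line with a double-quoted value, and it leaves the file unchanged if no such line exists. The output path for the generated configuration still has to be chosen as described in the steps.

```shell
#!/bin/sh
# Sketch: append iommu=pt to GRUB_CMDLINE_LINUX in a GRUB defaults file.
# add_iommu_pt and pick_mkconfig are hypothetical helper names.

add_iommu_pt() {
    # $1: path to the defaults file (normally /etc/default/grub).
    file=$1
    # Do nothing if the option is already on the GRUB_CMDLINE_LINUX line.
    if grep -Eq '^GRUB_CMDLINE_LINUX=.*[" ]iommu=pt([" ]|$)' "$file"; then
        return 0
    fi
    # Keep a backup, then insert the option before the closing quote.
    cp "$file" "$file.bak" &&
    sed 's/^\(GRUB_CMDLINE_LINUX="[^"]*\)"/\1 iommu=pt"/' "$file.bak" > "$file"
}

pick_mkconfig() {
    # Print whichever configuration generator is installed; fail if neither is.
    if command -v grub2-mkconfig >/dev/null 2>&1; then
        echo grub2-mkconfig
    elif command -v grub-mkconfig >/dev/null 2>&1; then
        echo grub-mkconfig
    else
        return 1
    fi
}

# Example use (as root), followed by a reboot:
#   add_iommu_pt /etc/default/grub
#   "$(pick_mkconfig)" -o <path to grub.cfg for this system>
```

Running add_iommu_pt a second time is a no-op, so the helper can be repeated safely, and the .bak copy allows the original file to be restored if the edit is not what was intended.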