GPUDirect RDMA
v12.4
Contents
1. Overview
1.1. How GPUDirect RDMA Works
1.2. Standard DMA Transfer
1.3. GPUDirect RDMA Transfers
1.4. Changes in CUDA 6.0
1.5. Changes in CUDA 7.0
1.6. Changes in CUDA 8.0
1.7. Changes in CUDA 10.1
1.8. Changes in CUDA 11.2
1.9. Changes in CUDA 11.4
1.10. Changes in CUDA 12.2
2. Design Considerations
2.1. Lazy Unpinning Optimization
2.2. Registration Cache
2.3. Unpin Callback
2.4. Supported Systems
2.5. PCI BAR sizes
2.6. Tokens Usage
2.7. Synchronization and Memory Ordering
3. How to Perform Specific Tasks
3.1. Displaying GPU BAR space
3.2. Pinning GPU memory
3.3. Unpinning GPU memory
3.4. Handling the free callback
3.5. Buffer ID Tag Check for A Registration Cache
3.6. Linking a Kernel Module against nvidia.ko
3.7. Using nvidia-peermem
4. References
4.1. Basics of UVA CUDA Memory Management
4.2. Userspace API
4.3. Kernel API
4.4. Porting to Tegra
4.4.1. Changing the allocator
4.4.2. Modification to Kernel API
4.4.3. Other highlights
5. Notices
5.1. Notice
5.2. OpenCL
5.3. Trademarks