NVBandwidth Plugin
Overview
The NVBandwidth plugin is part of the level 3 and higher tests. NVBandwidth performs bandwidth measurements on NVIDIA GPUs on a single host.
Test Description
NVBandwidth measures bandwidth for various memcpy patterns across different links using copy engine or kernel copy methods. It reports the bandwidth currently measured on your system; additional system-specific tuning may be required to achieve peak bandwidth. Tests are performed on GPUs on a single host only. For more information, see https://github.com/NVIDIA/nvbandwidth.
Supported Products
DCGM will run the NVBandwidth test on the following GPU products:
NVIDIA A800 (20bd)
NVIDIA B100 (197f)
NVIDIA B200 (1999, 199b, 20da)
NVIDIA B300 (20e6)
NVIDIA GH100-88K-A1
NVIDIA GH100-888K (2342, 237f)
NVIDIA H100 144GB HBM3
NVIDIA H100 80GB HBM3
NVIDIA H100NVL (2321, 233a)
NVIDIA H200 (2335, 233b)
NVIDIA H20A
NVIDIA H20B
NVIDIA H20 HBM3e
NVIDIA H20 NVL16
NVIDIA L2
NVIDIA L20
NVIDIA L20A
NVIDIA L30
NVIDIA L40S
NVIDIA P2021
NVIDIA PG153 SKU 210
NVIDIA RTX 2000 Ada Generation
NVIDIA RTX6000D
Supported Parameters
The following table lists the global parameters for this plugin:
| Parameter Name | Type | Default | Description |
|---|---|---|---|
| testcases | String | | The list of specific testcases to run, separated by commas. |
| is_allowed | Bool | False | Specifies whether or not this test is allowed to run. |
Sample Commands
Run the test with default parameters:
$ dcgmi diag -r nvbandwidth
Run the test, specifying only testcase 1:
$ dcgmi diag -r nvbandwidth -p nvbandwidth.testcases=1
Run the test, specifying multiple testcases:
$ dcgmi diag -r nvbandwidth -p nvbandwidth.testcases=1,2,3
Run the level 3 test, indicating the nvbandwidth test should be allowed to run:
$ dcgmi diag -r 3 -p nvbandwidth.is_allowed=true
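When the testcase list is assembled by a script, the comma-joined value passed to nvbandwidth.testcases can be built from separate arguments. The following is a minimal sketch, not part of DCGM: build_nvbandwidth_cmd is a hypothetical helper name, and it only prints the command line rather than running it, using the same dcgmi options shown in the samples above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DCGM): print a dcgmi command line
# for the given nvbandwidth testcase numbers without executing it.
# The function body runs in a subshell so the IFS change does not leak.
build_nvbandwidth_cmd() (
    if [ "$#" -eq 0 ]; then
        # No testcases given: fall back to the default invocation.
        echo "dcgmi diag -r nvbandwidth"
        exit 0
    fi
    # "$*" joins the arguments with the first character of IFS (a comma).
    IFS=,
    echo "dcgmi diag -r nvbandwidth -p nvbandwidth.testcases=$*"
)

build_nvbandwidth_cmd 1 2 3
```

Printing the command first makes it possible to review it before running it on a host with GPUs attached.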
Failure Conditions
The test will fail if the nvbandwidth executable cannot be found.
The test will fail if current memory copy utilization (MCUTIL) is over 10% or cannot be retrieved.
The test will fail if an error is encountered during nvbandwidth execution.
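The MCUTIL condition above can be mirrored in a pre-flight script. The following is a rough sketch only: mcutil_within_limit is a hypothetical helper, and how the utilization value is obtained is deliberately left out because this page does not specify it. The function takes the value as an integer percentage, and it treats an empty or non-numeric value as "cannot be retrieved".

```shell
#!/bin/sh
# Hypothetical helper (not part of DCGM): succeed only when the supplied
# memory copy utilization is a whole number no greater than 10 (percent).
mcutil_within_limit() {
    case "$1" in
        # Empty or non-integer input is treated as "cannot be retrieved".
        ''|*[!0-9]*) return 1 ;;
    esac
    # "Over 10%" fails, so exactly 10 is still acceptable.
    [ "$1" -le 10 ]
}

if mcutil_within_limit 7; then
    echo "MCUTIL ok"
else
    echo "MCUTIL too high or unavailable"
fi
```

A check like this only approximates the plugin's behavior; the diagnostic itself remains the authority on whether the test passes.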