Introduction to NVDebug
The NVIDIA® NVDebug tool, nvdebug, runs on server platforms or from remote client machines.
This binary tool, which is available for x86_64 or arm64-SBSA architecture systems, collects the
following information:
Out-of-band (OOB) BMC logs and information for troubleshooting server issues
Logs from the host
For detailed instructions on installing, configuring, and using the NVDebug tool, refer to the NVDebug User Guide.
Requirements
| Requirements | Client Host | Server Host |
|---|---|---|
| Linux-based operating system: Linux kernel 4.4 or later supported (4.15 or later recommended)<br>GNU C Library (glibc) 2.17 or later | X | X |
| Ubuntu 20.04 or later (supported)<br>Ubuntu 24.04 (recommended) | X | X |
| RHEL/CentOS 8+<br>SLES 15+ | X | X |
| Minimum 4 GB RAM<br>2 GB free disk space<br>Network connectivity to target systems | X | X |
| Python 3.12<br>ipmitool 1.8.18 or later<br>The sshpass command | X | X |
| Access to the server device under test (DUT) via BMC using Redfish and IPMI-over-LAN<br>Out-of-band (OOB) access to BMC<br>SSH access to host systems (for remote mode)<br>Firewall rules for BMC and host communication<br>Segmented network access is supported through split log collection<br>Stable network connectivity to target systems | X | X |
| Required packages: | X | |
Note
NVIDIA Graphics Drivers: Must be installed on the target system to run the following collectors:
nvidia-bug-report collector (H11)
nvidia-smi collector (H6)
nvidia-fabricmanager: Required for the nvidia-fabricmanager collector (H12). This collector is specific to InfiniBand network hardware.
nvlsm: Required for the subnet manager collector (H10) and a section of the nvidia-bug-report collector (H11).
doca-sosreport: Required for sos-report (H20). The recommended version is v4.8.0 or later. Do not use the Ubuntu system sosreport as it lacks required plugins for NVIDIA hardware.
Network Configuration: BMC Management and Server Host Management networks must reside in the same subnet.
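Several of the requirements above can be checked from a shell before NVDebug is run. The following is a minimal sketch, not part of NVDebug: the function names (`version_ge`, `check_command`, `check_min_version`, `check_requirements`) are hypothetical, it assumes a `sort` that supports `-V` (GNU coreutils), and it only tests that `python3`, `ipmitool`, and `sshpass` are present without verifying their versions.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical and not part of NVDebug.

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Reports whether a command is available on PATH.
check_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "OK: $1 found"
    else
        echo "MISSING: $1"
    fi
}

# Compares a detected version against a documented minimum.
# $1 = label, $2 = detected version (may be empty), $3 = minimum version
check_min_version() {
    if [ -z "$2" ]; then
        echo "UNKNOWN: $1 version could not be determined"
    elif version_ge "$2" "$3"; then
        echo "OK: $1 $2 (minimum $3)"
    else
        echo "TOO OLD: $1 $2 (minimum $3)"
    fi
}

check_requirements() {
    # Strip the distribution suffix, e.g. 6.8.0-45-generic -> 6.8.0
    check_min_version "kernel" "$(uname -r | cut -d- -f1)" "4.4"
    # getconf prints e.g. "glibc 2.35" on glibc-based systems
    check_min_version "glibc" \
        "$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')" "2.17"
    check_command python3
    check_command ipmitool
    check_command sshpass
}
```

Calling `check_requirements` on both the client and the server host prints one status line per item, which makes a missing prerequisite easy to spot before a collection is attempted.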
Command Structure for Basic Usage
The basic NVDebug command follows this pattern:
./nvdebug [OPTIONS] COMMAND [ARGS]
Top-level Commands
./nvdebug collect # Collect system logs and debug information
./nvdebug preflight # Run preflight checks only
./nvdebug list-collectors # List all available collectors
./nvdebug list-baseboards # List available baseboards from the spreadsheet
Note
Run ./nvdebug COMMAND --help to see command-specific flags and examples.
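As a convenience, the help text for all four top-level commands can be printed in one pass. This is a small sketch: the `NVDEBUG_BIN` variable and the `show_all_help` function are hypothetical names introduced here, and the binary is assumed to sit in the current directory unless `NVDEBUG_BIN` says otherwise.

```shell
# Path to the nvdebug binary; NVDEBUG_BIN is a hypothetical override variable.
NVDEBUG_BIN="${NVDEBUG_BIN:-./nvdebug}"

# Print the --help output of each documented top-level command.
show_all_help() {
    for cmd in collect preflight list-collectors list-baseboards; do
        echo "===== $cmd ====="
        "$NVDEBUG_BIN" "$cmd" --help
    done
}
```

Running `show_all_help | less` gives a single scrollable reference for the command-specific flags.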
Required Options:
-i, --bmc-ip - BMC IP address
-u, --bmc-user - BMC username with administrative privileges
-p, --bmc-pass - BMC administrator password
-b, --baseboard - Baseboard type (use quotes for names with spaces)
Common Options:
-v, --verbose - Detailed output
-o, --output - Output directory
-c, --config - Configuration file
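One way to avoid hard-coding credentials in scripts is to read the option values from environment variables. The sketch below is an illustration rather than an NVDebug feature: the variable names (`BMC_IP`, `BMC_USER`, `BMC_PASS`, `BASEBOARD`, `OUTPUT_DIR`, `NVDEBUG_BIN`) and the `collect_from_env` function are hypothetical, and it assumes the long options accept a space-separated value in the same way the short options do.

```shell
# Path to the nvdebug binary; NVDEBUG_BIN is a hypothetical override variable.
NVDEBUG_BIN="${NVDEBUG_BIN:-./nvdebug}"

# Run `nvdebug collect` with options taken from environment variables.
# BMC_IP, BMC_USER, and BMC_PASS are mandatory here; BASEBOARD and
# OUTPUT_DIR are passed only when they are set.
collect_from_env() {
    if [ -z "${BMC_IP:-}" ] || [ -z "${BMC_USER:-}" ] || [ -z "${BMC_PASS:-}" ]; then
        echo "error: BMC_IP, BMC_USER, and BMC_PASS must be set" >&2
        return 1
    fi

    # Build the argument list step by step so that values containing
    # spaces (for example a baseboard name) stay a single argument.
    set -- collect --bmc-ip "$BMC_IP" --bmc-user "$BMC_USER" --bmc-pass "$BMC_PASS"
    if [ -n "${BASEBOARD:-}" ]; then
        set -- "$@" --baseboard "$BASEBOARD"
    fi
    if [ -n "${OUTPUT_DIR:-}" ]; then
        set -- "$@" --output "$OUTPUT_DIR"
    fi

    "$NVDEBUG_BIN" "$@"
}
```

After exporting the variables in the current shell, a collection is started with `collect_from_env`. The password is still passed to `nvdebug` as a command-line argument, so it can remain visible in the process list while the tool runs.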
Basic Examples
The examples below specify the baseboard manually. If the baseboard is not specified, NVDebug detects the baseboard and platform automatically. If detection fails, NVDebug prompts you to select the baseboard and platform.
Simple Collection:
# ARM64 system collection
./nvdebug collect -i <bmc-ip-address> -u admin -p <password> \
-b "HGX B300"
# With output directory
./nvdebug collect -i <bmc-ip-address> -u admin -p <password> \
-b "HGX B300" -o /path/to/output
# With verbose output
./nvdebug collect -i <bmc-ip-address> -u admin -p <password> \
-b "HGX B300" -v
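The same `collect` invocation can be repeated over several systems by looping over a list of BMC addresses and giving each system its own output directory. The following is a sketch under stated assumptions: the `collect_batch` function, the `NVDEBUG_BIN` variable, and the hosts-file format (one BMC IP per line, `#` for comments) are all hypothetical, every system is assumed to share the same credentials and baseboard, and `nvdebug` is assumed to return a non-zero exit status when a collection fails.

```shell
# Path to the nvdebug binary; NVDEBUG_BIN is a hypothetical override variable.
NVDEBUG_BIN="${NVDEBUG_BIN:-./nvdebug}"

# Usage: collect_batch HOSTS_FILE BMC_USER BMC_PASS BASEBOARD OUTPUT_ROOT
# HOSTS_FILE holds one BMC IP address per line; blank lines and lines
# starting with '#' are skipped.
collect_batch() {
    hosts_file=$1
    bmc_user=$2
    bmc_pass=$3
    baseboard=$4
    output_root=$5
    failed=0

    while IFS= read -r bmc_ip || [ -n "$bmc_ip" ]; do
        case "$bmc_ip" in
            ''|'#'*) continue ;;
        esac

        # One output directory per system, named after its BMC address.
        out_dir="$output_root/$bmc_ip"
        mkdir -p "$out_dir"

        # stdin is redirected so the tool cannot consume the hosts file.
        if ! "$NVDEBUG_BIN" collect -i "$bmc_ip" -u "$bmc_user" -p "$bmc_pass" \
                -b "$baseboard" -o "$out_dir" </dev/null; then
            echo "collection failed for $bmc_ip" >&2
            failed=$((failed + 1))
        fi
    done < "$hosts_file"

    # Succeed only if every collection succeeded.
    [ "$failed" -eq 0 ]
}
```

For example, `collect_batch hosts.txt admin <password> "HGX B300" /path/to/output` would write each system's logs under `/path/to/output/<bmc-ip-address>`.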