0.1 |
September 13, 2023 |
Initial draft. |
0.2 |
September 26, 2023 |
Support for Out of Band (OOB) log collection and Host-based log collection for NVIDIA Grace™ Hopper™ platforms including NVIDIA MGX systems. |
0.3 |
October 24, 2023 |
Support for OOB log collection and Host-based log collection for NVIDIA® HGX™-H100x8 and NVIDIA DGX™ H100. |
0.4 |
November 27, 2023 |
|
0.5 |
January 19, 2024 |
|
1.0 |
March 6, 2024 |
|
1.1 |
April 12, 2024 |
Host log collection improvement for local and remote hosts.
Initial support for NVIDIA NVLink®-related logs.
Integrated nvidia-bug-report.sh.
|
1.2 |
June 28, 2024 |
|
1.2.1 |
July 22, 2024 |
Added a parse option to decode binary dumps in NVDebug log outputs. |
1.3 |
October 11, 2024 |
Added support for the log collection of GB200 NVL (NVIDIA NVLink™ Switch tray and Multi-DUT for Rack level).
Updated the output log dump format.
Added new CIDs for software inventory, NVOS fabric manager, and subnet manager dumps, and so on.
Changed Platform naming, and the old names will be deprecated in v1.5.
Updated the Platforms section of the tables in Appendix A to reflect newer supported products.
|
1.4 |
December 16, 2024 |
Added support for the log collection of GB200 NVL (NVIDIA PowerShelf tray).
Added new CIDs for Subnet Manager, NVMe, HMC FDR, Post Codes, PowerShelf, OneDiag.
Added support for CID tag overriding.
Added support for passwordless ssh connectivity, DUT level archive zipping.
Updated CID R4 to allow capture on GB200 NVL
Updated CID S15 to enable configurable I2CTRANSFER_WRITE_TO_ADDRESS
|
1.5 |
February 21, 2025 |
Added automated archive splitting with the 200MB threshold.
Added configurable ports for BMC SSH (Default 22) and HMC TCP (Default 80).
Added a default Collector flag.
Added new CIDs for Network Devices (R36), GPU Devices (R37), and Host OS Information (H18).
Added support for HTML reports.
Enhanced collector H9 to better handle NVSwitch.
Enhanced parser error handling for .zip and .tar files.
Modified collector R29 to handle changes for the RoT Naming string.
Removed collectors H16, H17, and H18 from PowerShelf.
Split collector R35 for the network device debug dump into the R35 network switch, R36 network devices, and R37 GPU devices.
Updated collector H11 to capture both safe mode and standard nvidia-bug-report runs.
Updated collector R21 with improved CPER URL handling.
|
1.6 |
April 18, 2025 |
Added support for parallelized collection groups (PARALLEL_GROUP_COLLECTIONS, default True).
Added new Host collector H19 (Node Memory information).
Added user-configurable timeouts for collectors.
Added support for multiple collector groups under the -g argument.
Added ability for users to skip collectors (COLLECTOR_TO_SKIP, default None).
Enhanced the One-Click OOB Script to support GB200 and later platforms. Added support for Virtual EEPROM, authenticated HMC, and HTTPS support.
Enhanced all IPMI collectors with retry logic on non-zero return code (retry with the -v flag for more information).
Enhanced HTML report logic to not display N/A rows, which results in more condensed reports.
Enhanced security: hidden password entry on the CLI.
Modified Host H1 (node dmesg) to support more Linux distros with dmesg and journalctl if available.
Modified Host H2 (lspci) to support new flags full config space dumps and user defined commands.
Updated the baseboard parameter to be required because of API deltas on platforms.
|
1.6.1 |
April 23, 2025 |
Optimized R35, R36, R37, and R38 collection times.
Updated the user guide with usage information.
|
1.7.0 |
June 10, 2025 |
Added support for NVIDIA Open Source Review Board compliance.
Added new Host Collector H20 (SOS report for network devices).
Added new Redfish collector for data on thermal subsystem leak detector (R41).
Added new SSH collectors for device data off management controller buses: VirtualEEPROM (S16), SMBus (S17), and SMBPBI (S18).
Added new NV switch collector to obtain switch platform and FAE logs (H21) and the enabled SEL (R1) BMC dump.
Enhanced the OneClick script with minor fixes and additions to reach parity with NVDebug on the BMC SSH group.
Enhanced the HTML report with dependency check status, pre-flight check status, and improved readability.
Enhanced the built-in FPGA and EROT parsers for GB200 platforms.
Updated the user guide with usage information.
|