NVDebug Runtime Examples#

When running the tool, you can pass parameters through the CLI or use config and dut_config files.

Note

CLI parameters take precedence over values in the config and dut_config files when applicable.
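The precedence rule can be pictured as a dictionary merge in which CLI values, when actually provided, override file values. A minimal Python sketch (the option names and values are illustrative, not nvdebug internals):

```python
# Hypothetical illustration of CLI-over-config precedence; not nvdebug internals.
file_config = {"PLATFORM": "arm64", "LogSanitization": True}

# CLI options that were not passed are None.
cli_options = {"PLATFORM": "DGX", "LogSanitization": None}

# CLI values override file values only when actually provided.
effective = {**file_config,
             **{k: v for k, v in cli_options.items() if v is not None}}

print(effective)  # {'PLATFORM': 'DGX', 'LogSanitization': True}
```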

Grace Hopper Single Node Example#

When executing the tool, you can use either the CLI alone or the config file.

CLI Only#

To run the tool without using the config file (passing parameters through the CLI):

$ nvdebug -i $BMC_IP -u $BMC_USER -p $BMC_PASS -I $HOST_IP -U $HOST_USER -H $HOST_PASS -t arm64 -b mgx-gh200

Results from the CLI#

Log directory created at /tmp/nvdebug_logs_22_04_2025_16_36_36
Using 32 resources for parallel collection
Starting collection for DUT dut-1
dut-1: User provided platform type: arm64
dut-1: BMC IP: XXXX
dut-1: Identified system as Model: P2312-A03, Partno:699-22312-0003-400, Serialno=1320125001253
Log collection has started for dut-1
DUT dut-1 completed.
Generating HTML report(s)
Logs collected in /tmp/nvdebug_logs_22_04_2025_16_36_36
Creating zip archive(s)
Log zip created at /tmp/nvdebug_logs_22_04_2025_16_36_36.zip

YAML Configuration File#

To run the tool using the config file:

Execution CLI#

$ nvdebug --config config.yaml --dutconfig dut_config.yaml

Config File (config.yaml)#

PLATFORM: "arm64"
TargetBaseboard: "GH200"
LogSanitization: true
SKIP_BMC_SSH_LOGS: true

DUT Config File (dut_config.yaml)#

DUT_Defaults: &dut_defaults
  NodeType: "Compute"
  ipmi_cipher: "-C17"
  RF_DEFAULT_PREFIX: "/redfish/v1"
  RF_AUTH: true
  IP_NETWORK: 'ipv4'

gh200-node1:
  <<: *dut_defaults
  BMC_IP: "192.168.1.100"
  BMC_USERNAME: "bmc_user"
  BMC_PASSWORD: "bmc_password"
  HOST_IP: "192.168.2.100"
  HOST_USERNAME: "host_user"
  HOST_PASSWORD: "host_password"
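The `<<: *dut_defaults` merge key copies the anchored `DUT_Defaults` mapping into each DUT entry, with per-DUT keys layered on top. The effect is equivalent to a Python dict merge (values are the placeholders from the listing above):

```python
# Equivalent of the YAML merge key `<<: *dut_defaults` in dut_config.yaml.
dut_defaults = {
    "NodeType": "Compute",
    "ipmi_cipher": "-C17",
    "RF_DEFAULT_PREFIX": "/redfish/v1",
    "RF_AUTH": True,
    "IP_NETWORK": "ipv4",
}

# Per-DUT keys are layered on top of the shared defaults.
gh200_node1 = {
    **dut_defaults,
    "BMC_IP": "192.168.1.100",
    "BMC_USERNAME": "bmc_user",
    "HOST_IP": "192.168.2.100",
}

print(gh200_node1["NodeType"])   # Compute (inherited from the defaults)
print(gh200_node1["BMC_IP"])     # 192.168.1.100 (DUT-specific)
```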

Results from the CLI and YAML Configuration File#

Log directory created at /tmp/nvdebug_logs_22_04_2025_16_36_36
Using 32 resources for parallel collection
Starting collection for DUT gh200-node1
gh200-node1: User provided platform type: arm64
gh200-node1: BMC IP: XXXX
gh200-node1: Identified system as Model: P2312-A03, Partno:699-22312-0003-400, Serialno=1320125001253
Log collection has started for gh200-node1
DUT gh200-node1 completed.
Generating HTML report(s)
Logs collected in /tmp/nvdebug_logs_22_04_2025_16_36_36
Creating zip archive(s)
Log zip created at /tmp/nvdebug_logs_22_04_2025_16_36_36.zip

Verify the Contents of the Zipped Log File#

To verify the contents of the zipped log file in the /tmp directory, run the following commands.

$ cd /tmp
$ unzip nvdebug_logs_22_04_2025_16_36_36.zip
$ ls -l

The contents of the unzipped log are structured into two directories:

  • reports

  • dut_name

.
├── reports
│   ├── file_map.html
│   ├── index.html
│   ├── status_complete.html
│   └── gh200-node1
│       ├── config.json.html
│       ├── dut_config.json.html
│       ├── Execution_Summary_Report.txt.html
│       ├── host
...
│       ├── ipmi
...
│       ├── redfish
...
│       └── report.html
└── gh200-node1
    ├── config.json
    ├── dut_config.json
    ├── Execution_Summary_Report.txt
    ├── host
       ├── Host_H1_node_dmesg
          ├── dmesg_recent.txt
          └── journalctl_recent.txt
       ├── Host_H2_node_lspci
          ├── lspci_full.log
          ├── lspci_logical_tree.log
          ├── lspci_physical_tree.log
          ├── proc_iomem.log
          └── proc_ioports.log
    ...
       ├── Host_H11_nvidia_bug_report_op
          ├── nvidia-bug-report_help.log
          ├── nvidia_bug_report_safe.log.gz
          ├── nvidia_bug_report_safe_stdout.log
          ├── nvidia_bug_report_standard.log.gz
          ├── nvidia_bug_report_standard_stdout.log
          ├── nvidia-bug-report_versions_stdout.log
          └── nvidia-bug-report_version_stdout.log
       ├── Host_H18_node_os_info
          └── Host_H18_cat_etc_os-release.txt
       ├── Host_H19_node_memory_info
          ├── dmesg_memory.txt
    ...
    ├── ipmi
       ├── IPMI_I1_mc_info.txt
    ...
       └── IPMI_I14_sdr_elist.txt
    ├── nvdebug_runtime_output_structured.txt
    ├── nvdebug_runtime_output.txt
    └── redfish
        ├── redfish_info.log
        ├── redfish_info.log.1
        ├── Redfish_R1_system_event_log
           ├── Redfish_R1_system_event_log_HGX_Baseboard_0_additional_data.json
           └── Redfish_R1_system_event_log_HGX_Baseboard_0.json
        ├── Redfish_R2_manager_existing_log_dump
           ├── Redfish_R2_existing_dump_341.tar.xz
           ├── Redfish_R2_existing_dump_342.tar.xz
           └── Redfish_R2_log_entries_HGX_BMC_0.json
        ├── Redfish_R3_hgx_manager_on_demand_log_dump
           └── Redfish_R3_hgx_manager_dump_HGX_BMC_0_23.tar.xz
        ├── Redfish_R5_manager_fpga_register_dump
           └── Redfish_R5_fpga_dump_HGX_Baseboard_0_24.tar.xz
        ├── Redfish_R6_manager_erot_dump
           └── Redfish_R6_erot_dump_HGX_Baseboard_0_25.tar.xz
        ...
        ├── Redfish_R20_firmware_inventory_table
           ├── Redfish_R20_firmware_inventory_table.json
           └── Redfish_R20_firmware_inventory_table.txt
        ├── Redfish_R22_task_details
           ├── Redfish_R22_task_0.txt
           ├── Redfish_R22_task_1.txt
           ├── Redfish_R22_task_2.txt
           ├── Redfish_R22_task_3.txt
           └── Redfish_R22_task_service_list.json
        └── redfish_request.log

53 directories, 326 files

NVIDIA HGX H100/B200 8-GPU Example#

When working with HGX baseboards, you can use Redfish aggregation or port forwarding through SSH tunneling. To set up SSH tunneling, you need the BMC SSH credentials, which by default are the same as the BMC credentials. If they differ, use the -r and -w options to specify the SSH username and password, respectively.

Note

The SSH tunnel is automatically set up using the specified port (default: 18888). To use an existing SSH tunnel, specify in the config file that tunneling should not be set up.

If you run into issues with the SSH tunnel, you can set up the tunnel manually or use the -P option to specify a different port. The FORCE_PORT_FW option forces the tool to use the SSH tunnel. Before setting up port forwarding, the tool clears any processes on the ports defined by TUNNEL_TCP_PORT, which ensures there are no conflicts with existing processes on those ports.

The host BMC must support port forwarding. If it does not, set up the tunnel manually or use Redfish aggregation.
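Before launching a collection that relies on an existing tunnel, you can confirm that something is listening on the tunnel port (18888 by default). A small, hypothetical helper:

```python
import socket

def tunnel_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Check the default tunnel port on the local machine.
    print(tunnel_port_open("127.0.0.1", 18888))
```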

When executing the tool, you can use either the CLI alone or the config file.

CLI Only#

To run the tool without using the config file (passing parameters through the CLI):

$ nvdebug -i $BMC_IP -u $BMC_USER -p $BMC_PASS -I $HOST_IP -U $HOST_USER -H $HOST_PASS -r $SSHUSER -w $SSHPASS -t HGX-HMC -P $port_num -b Blackwell-HGX-8-GPU

Results from the CLI#

Log directory created at /tmp/nvdebug_logs_22_04_2025_16_36_36
Using 32 resources for parallel collection
Starting collection for DUT dut-1
dut-1: User provided platform type: HGX-HMC
dut-1: BMC IP: XXXX
dut-1: Identified system as Model: P2312-A03, Partno:699-22312-0003-400, Serialno=1320125001253
Log collection has started for dut-1
DUT dut-1 completed.
Generating HTML report(s)
Logs collected in /tmp/nvdebug_logs_22_04_2025_16_36_36
Creating zip archive(s)
Log zip created at /tmp/nvdebug_logs_22_04_2025_16_36_36.zip

YAML Configuration File#

To run the tool using the config file:

Execution CLI#

$ nvdebug --config config.yaml --dutconfig dut_config.yaml

Config File (config.yaml)#

PLATFORM: "HGX-HMC"
TargetBaseboard: "Blackwell-HGX-8-GPU"
LogSanitization: true
SKIP_BMC_SSH_LOGS: true

DUT Config File (dut_config.yaml)#

DUT_Defaults: &dut_defaults
  NodeType: "Compute"
  ipmi_cipher: "-C17"
  RF_DEFAULT_PREFIX: "/redfish/v1"
  RF_AUTH: true
  IP_NETWORK: 'ipv4'

blackwell-hgx-8-gpu-node1:
  <<: *dut_defaults
  BMC_IP: "192.168.1.100"
  BMC_USERNAME: "bmc_user"
  BMC_PASSWORD: "bmc_password"
  HOST_IP: "192.168.2.100"
  HOST_USERNAME: "host_user"
  HOST_PASSWORD: "host_password"
  TUNNEL_TCP_PORT: "18888"
  FORCE_PORT_FW: true
  SETUP_PORT_FORWARDING: true

Results from the CLI and the YAML Configuration File#

Log directory created at /tmp/nvdebug_logs_22_04_2025_16_36_36
Using 32 resources for parallel collection
Starting collection for DUT blackwell-hgx-8-gpu-node1
blackwell-hgx-8-gpu-node1: User provided platform type: HGX-HMC
blackwell-hgx-8-gpu-node1: BMC IP: XXXX
blackwell-hgx-8-gpu-node1: Identified system as Model: P2312-A03, Partno:699-22312-0003-400, Serialno=1320125001253
Log collection has started for blackwell-hgx-8-gpu-node1
DUT blackwell-hgx-8-gpu-node1 completed.
Generating HTML report(s)
Logs collected in /tmp/nvdebug_logs_22_04_2025_16_36_36
Creating zip archive(s)
Log zip created at /tmp/nvdebug_logs_22_04_2025_16_36_36.zip

Verify the Contents of the Zipped Log File#

To verify the contents of the zipped log file in the /tmp directory, run the following commands.

$ cd /tmp
$ unzip nvdebug_logs_22_04_2025_16_36_36.zip
$ ls -l

The contents of the unzipped log are structured into the following directories:

  • reports

  • dut_name

.
├── reports
│   ├── file_map.html
│   ├── index.html
│   ├── status_complete.html
│   └── blackwell-hgx-8-gpu-node1
│       ├── config.json.html
│       ├── dut_config.json.html
│       ├── Execution_Summary_Report.txt.html
│       ├── host
...
│       ├── ipmi
...
│       ├── redfish
...
│       └── report.html
└── blackwell-hgx-8-gpu-node1
    ├── config.json
    ├── dut_config.json
    ├── Execution_Summary_Report.txt
    ├── host
       ├── Host_H1_node_dmesg
          ├── dmesg_recent.txt
          └── journalctl_recent.txt
       ├── Host_H2_node_lspci
          ├── lspci_full.log
          ├── lspci_logical_tree.log
          ├── lspci_physical_tree.log
          ├── proc_iomem.log
          └── proc_ioports.log
    ...
       ├── Host_H11_nvidia_bug_report_op
          ├── nvidia-bug-report_help.log
          ├── nvidia_bug_report_safe.log.gz
          ├── nvidia_bug_report_safe_stdout.log
          ├── nvidia_bug_report_standard.log.gz
          ├── nvidia_bug_report_standard_stdout.log
          ├── nvidia-bug-report_versions_stdout.log
          └── nvidia-bug-report_version_stdout.log
       ├── Host_H18_node_os_info
          └── Host_H18_cat_etc_os-release.txt
       ├── Host_H19_node_memory_info
          ├── dmesg_memory.txt
    ...
    ├── ipmi
       ├── IPMI_I1_mc_info.txt
    ...
       └── IPMI_I14_sdr_elist.txt
    ├── nvdebug_runtime_output_structured.txt
    ├── nvdebug_runtime_output.txt
    └── redfish
        ├── redfish_info.log
        ├── redfish_info.log.1
        ├── Redfish_R1_system_event_log
           ├── Redfish_R1_system_event_log_HGX_Baseboard_0_additional_data.json
           └── Redfish_R1_system_event_log_HGX_Baseboard_0.json
        ├── Redfish_R2_manager_existing_log_dump
           ├── Redfish_R2_existing_dump_341.tar.xz
           ├── Redfish_R2_existing_dump_342.tar.xz
           └── Redfish_R2_log_entries_HGX_BMC_0.json
        ├── Redfish_R3_hgx_manager_on_demand_log_dump
           └── Redfish_R3_hgx_manager_dump_HGX_BMC_0_23.tar.xz
        ├── Redfish_R5_manager_fpga_register_dump
           └── Redfish_R5_fpga_dump_HGX_Baseboard_0_24.tar.xz
        ├── Redfish_R6_manager_erot_dump
           └── Redfish_R6_erot_dump_HGX_Baseboard_0_25.tar.xz
        ...
        ├── Redfish_R20_firmware_inventory_table
           ├── Redfish_R20_firmware_inventory_table.json
           └── Redfish_R20_firmware_inventory_table.txt
        ├── Redfish_R22_task_details
           ├── Redfish_R22_task_0.txt
           ├── Redfish_R22_task_1.txt
           ├── Redfish_R22_task_2.txt
           ├── Redfish_R22_task_3.txt
           └── Redfish_R22_task_service_list.json
        └── redfish_request.log

53 directories, 326 files

DGX-H100 Example#

When executing the tool, you can use either the CLI alone or the config file.

CLI Only#

To run the tool without using the config file (passing parameters through the CLI):

$ nvdebug -i $BMC_IP -u $BMC_USER -p $BMC_PASS -I $HOST_IP -U $HOST_USER -H $HOST_PASS -t DGX -b Hopper-HGX-8-GPU

Results from the CLI#

Log directory created at /tmp/nvdebug_logs_22_04_2025_16_36_36
Using 32 resources for parallel collection
Starting collection for DUT dut-1
dut-1: User provided platform type: DGX
dut-1: BMC IP: XXXX
dut-1: Identified system as Model: DGXH100, Partno: 965-24387-0002-003, Serialno:1660224000069
Log collection has started for dut-1
DUT dut-1 completed.
Generating HTML report(s)
Logs collected in /tmp/nvdebug_logs_22_04_2025_16_36_36
Creating zip archive(s)
Log zip created at /tmp/nvdebug_logs_22_04_2025_16_36_36.zip

YAML Configuration File#

To run the tool using the config file:

Execution CLI#

$ nvdebug --config config.yaml --dutconfig dut_config.yaml

Config File (config.yaml)#

PLATFORM: "DGX"
TargetBaseboard: "Hopper-HGX-8-GPU"
LogSanitization: true
SKIP_BMC_SSH_LOGS: true

DUT Config File (dut_config.yaml)#

DUT_Defaults: &dut_defaults
  NodeType: "Compute"
  ipmi_cipher: "-C17"
  RF_DEFAULT_PREFIX: "/redfish/v1"
  RF_AUTH: true
  IP_NETWORK: 'ipv4'

dgx-h100-node1:
  <<: *dut_defaults
  BMC_IP: "192.168.1.100"
  BMC_USERNAME: "bmc_user"
  BMC_PASSWORD: "bmc_password"
  HOST_IP: "192.168.2.100"
  HOST_USERNAME: "host_user"
  HOST_PASSWORD: "host_password"

Results from the CLI and YAML Configuration File#

Log directory created at /tmp/nvdebug_logs_22_04_2025_16_36_36
Using 32 resources for parallel collection
Starting collection for DUT dgx-h100-node1
dgx-h100-node1: User provided platform type: DGX
dgx-h100-node1: BMC IP: XXXX
dgx-h100-node1: Identified system as Model: DGXH100, Partno: 965-24387-0002-003, Serialno:1660224000069
Log collection has started for dgx-h100-node1
DUT dgx-h100-node1 completed.
Generating HTML report(s)
Logs collected in /tmp/nvdebug_logs_22_04_2025_16_36_36
Creating zip archive(s)
Log zip created at /tmp/nvdebug_logs_22_04_2025_16_36_36.zip

Verify the Contents of the Zipped Log File#

To verify the contents of the zipped log file in the /tmp directory, run the following commands.

$ cd /tmp
$ unzip nvdebug_logs_22_04_2025_16_36_36.zip
$ ls -l

The contents of the unzipped log are structured into the following directories:

  • reports

  • dut_name

.
├── reports
│   ├── file_map.html
│   ├── index.html
│   ├── status_complete.html
│   └── dgx-h100-node1
│       ├── config.json.html
│       ├── dut_config.json.html
│       ├── Execution_Summary_Report.txt.html
│       ├── host
...
│       ├── ipmi
...
│       ├── redfish
...
│       └── report.html
└── dgx-h100-node1
    ├── config.json
    ├── dut_config.json
    ├── Execution_Summary_Report.txt
    ├── host
       ├── Host_H1_node_dmesg
          ├── dmesg_recent.txt
          └── journalctl_recent.txt
       ├── Host_H2_node_lspci
          ├── lspci_full.log
          ├── lspci_logical_tree.log
          ├── lspci_physical_tree.log
          ├── proc_iomem.log
          └── proc_ioports.log
    ...
       ├── Host_H11_nvidia_bug_report_op
          ├── nvidia-bug-report_help.log
          ├── nvidia_bug_report_safe.log.gz
          ├── nvidia_bug_report_safe_stdout.log
          ├── nvidia_bug_report_standard.log.gz
          ├── nvidia_bug_report_standard_stdout.log
          ├── nvidia-bug-report_versions_stdout.log
          └── nvidia-bug-report_version_stdout.log
       ├── Host_H18_node_os_info
          └── Host_H18_cat_etc_os-release.txt
       ├── Host_H19_node_memory_info
          ├── dmesg_memory.txt
    ...
    ├── ipmi
       ├── IPMI_I1_mc_info.txt
    ...
       └── IPMI_I14_sdr_elist.txt
    ├── nvdebug_runtime_output_structured.txt
    ├── nvdebug_runtime_output.txt
    └── redfish
        ├── redfish_info.log
        ├── redfish_info.log.1
        ├── Redfish_R1_system_event_log
           ├── Redfish_R1_system_event_log_HGX_Baseboard_0_additional_data.json
           └── Redfish_R1_system_event_log_HGX_Baseboard_0.json
        ├── Redfish_R2_manager_existing_log_dump
           ├── Redfish_R2_existing_dump_341.tar.xz
           ├── Redfish_R2_existing_dump_342.tar.xz
           └── Redfish_R2_log_entries_HGX_BMC_0.json
        ├── Redfish_R3_hgx_manager_on_demand_log_dump
           └── Redfish_R3_hgx_manager_dump_HGX_BMC_0_23.tar.xz
        ├── Redfish_R5_manager_fpga_register_dump
           └── Redfish_R5_fpga_dump_HGX_Baseboard_0_24.tar.xz
        ├── Redfish_R6_manager_erot_dump
           └── Redfish_R6_erot_dump_HGX_Baseboard_0_25.tar.xz
        ...
        ├── Redfish_R20_firmware_inventory_table
           ├── Redfish_R20_firmware_inventory_table.json
           └── Redfish_R20_firmware_inventory_table.txt
        ├── Redfish_R22_task_details
           ├── Redfish_R22_task_0.txt
           ├── Redfish_R22_task_1.txt
           ├── Redfish_R22_task_2.txt
           ├── Redfish_R22_task_3.txt
           └── Redfish_R22_task_service_list.json
        └── redfish_request.log

53 directories, 326 files

GB200 NVL Full Rack Example#

When collecting logs from a full rack, we recommend that you use individual config files for each node type. A GB200 NVL 72x1 rack consists of 18 compute nodes and nine switch nodes. Due to this configuration, separate config_compute.yaml and config_switch.yaml files are required to specify the platform type for each node type.

The dut_config.yaml file contains every node in the rack.

If you have access to the Power Shelf, you can use the config_ps.yaml file to specify the platform type for the Power Shelf. The following example shows you how to collect logs from the compute nodes and switch nodes.

YAML Configuration File#

To run the tool using the config file:

Execution CLI#

$ nvdebug --config config_compute.yaml --dutconfig dut_config.yaml

Note

The --config flag specifies an initial config file; it does not matter which config file is passed first. The dut_config.yaml file contains pointers to the correct config file based on the node type.
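The per-node-type dispatch described above can be sketched as grouping DUT entries by their ConfigFileToUse key (a simplified illustration, not nvdebug internals):

```python
# Simplified sketch of per-node-type config dispatch via ConfigFileToUse.
dut_config = {
    "nvl-compute-1": {"NodeType": "Compute", "ConfigFileToUse": "config_compute.yaml"},
    "nvl-compute-2": {"NodeType": "Compute", "ConfigFileToUse": "config_compute.yaml"},
    "nvl-switch-1": {"NodeType": "SwitchTray", "ConfigFileToUse": "config_switch.yaml"},
}

# Group DUT names by the config file each entry points at.
groups: dict[str, list[str]] = {}
for dut, settings in dut_config.items():
    groups.setdefault(settings["ConfigFileToUse"], []).append(dut)

print(groups)
# {'config_compute.yaml': ['nvl-compute-1', 'nvl-compute-2'],
#  'config_switch.yaml': ['nvl-switch-1']}
```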

Config File (config_compute.yaml)#

PLATFORM: "arm64"
TargetBaseboard: "GB200 NVL"
LogSanitization: true
SKIP_BMC_SSH_LOGS: true

Config File (config_switch.yaml)#

PLATFORM: "NVSwitch"
TargetBaseboard: "GB200 NVL NVSwitchTray"
LogSanitization: true
SKIP_BMC_SSH_LOGS: true

DUT Config File (dut_config.yaml)#

DUT_Defaults: &dut_defaults
  ipmi_cipher: "-C17"
  RF_DEFAULT_PREFIX: "/redfish/v1"
  RF_AUTH: true
  IP_NETWORK: 'ipv4'

nvl-compute-1: &compute_defaults
  <<: *dut_defaults
  NodeType: "Compute"
  ConfigFileToUse: "config_compute.yaml"
  HOST_USERNAME: "host_user"
  HOST_PASSWORD: "host_password"
  BMC_USERNAME: "bmc_user"
  BMC_PASSWORD: "bmc_password"
  BMC_SSH_USERNAME: "bmc_ssh_user"
  BMC_SSH_PASSWORD: "bmc_ssh_password"
  HOST_IP: 192.168.1.6
  BMC_IP: 192.168.1.134
nvl-compute-2:
  <<: *compute_defaults
  HOST_IP: 192.168.1.7
  BMC_IP: 192.168.1.135

...

nvl-compute-18:
  <<: *compute_defaults
  HOST_IP: 192.168.1.23
  BMC_IP: 192.168.1.151

nvl-switch-1: &switch_defaults
  <<: *dut_defaults
  NodeType: "SwitchTray"
  ConfigFileToUse: "config_switch.yaml"
  HOST_USERNAME: "host_user"
  HOST_PASSWORD: "host_password"
  BMC_USERNAME: "bmc_user"
  BMC_PASSWORD: "bmc_password"
  BMC_SSH_USERNAME: "bmc_ssh_user"
  BMC_SSH_PASSWORD: "bmc_ssh_password"
  HOST_IP: 192.168.1.101
  BMC_IP: 192.168.1.229
nvl-switch-2:
  <<: *switch_defaults
  HOST_IP: 192.168.1.102
  BMC_IP: 192.168.1.230

...

nvl-switch-9:
  <<: *switch_defaults
  HOST_IP: 192.168.1.109
  BMC_IP: 192.168.1.237

Results from the CLI and YAML Configuration File#

Multiple DUTs found. Ignoring CLI DUT details
Log directory created at /tmp/nvdebug_logs_22_04_2025_17_22_44
Using 32 resources for parallel collection
Starting collection for DUT nvl-switch-1
Starting collection for DUT nvl-switch-2
Starting collection for DUT nvl-switch-3
Starting collection for DUT nvl-switch-4
Starting collection for DUT nvl-switch-5
Starting collection for DUT nvl-switch-6
Starting collection for DUT nvl-switch-7
Starting collection for DUT nvl-switch-8
Starting collection for DUT nvl-switch-9
Starting collection for DUT nvl-compute-1
Starting collection for DUT nvl-compute-2
Starting collection for DUT nvl-compute-3
Starting collection for DUT nvl-compute-4
Starting collection for DUT nvl-compute-6
Starting collection for DUT nvl-compute-5
Starting collection for DUT nvl-compute-7
Starting collection for DUT nvl-compute-8
Starting collection for DUT nvl-compute-9
Starting collection for DUT nvl-compute-10
Starting collection for DUT nvl-compute-11
Starting collection for DUT nvl-compute-12
Starting collection for DUT nvl-compute-13
Starting collection for DUT nvl-compute-14
Starting collection for DUT nvl-compute-15
Starting collection for DUT nvl-compute-16
Starting collection for DUT nvl-compute-17
Starting collection for DUT nvl-compute-18
nvl-switch-2: User provided platform type: NVSwitch
nvl-switch-2: BMC IP: XXXX
nvl-switch-6: User provided platform type: NVSwitch
nvl-switch-6: BMC IP: XXXX
nvl-switch-1: User provided platform type: NVSwitch
nvl-switch-1: BMC IP: XXXX
nvl-switch-4: User provided platform type: NVSwitch
nvl-switch-4: BMC IP: XXXX
nvl-switch-9: User provided platform type: NVSwitch
nvl-switch-9: BMC IP: XXXX
nvl-switch-8: User provided platform type: NVSwitch
nvl-switch-8: BMC IP: XXXX
nvl-switch-3: User provided platform type: NVSwitch
nvl-switch-3: BMC IP: XXXX
nvl-switch-7: User provided platform type: NVSwitch
nvl-switch-7: BMC IP: XXXX
nvl-switch-5: User provided platform type: NVSwitch
nvl-switch-5: BMC IP: XXXX
nvl-switch-1: Identified system as Model: N5300_LD, Partno:N/A, Serialno=N/A
Log collection has started for nvl-switch-1
nvl-switch-2: Identified system as Model: N5300_LD, Partno:N/A, Serialno=N/A
Log collection has started for nvl-switch-2
nvl-switch-9: Identified system as Model: N5300_LD, Partno:N/A, Serialno=N/A
Log collection has started for nvl-switch-9
nvl-switch-6: Identified system as Model: N5300_LD, Partno:N/A, Serialno=N/A
Log collection has started for nvl-switch-6
nvl-switch-8: Identified system as Model: N5300_LD, Partno:N/A, Serialno=N/A
Log collection has started for nvl-switch-8
nvl-switch-3: Identified system as Model: N5300_LD, Partno:N/A, Serialno=N/A
Log collection has started for nvl-switch-3
nvl-switch-5: Identified system as Model: N5300_LD, Partno:N/A, Serialno=N/A
Log collection has started for nvl-switch-5
nvl-switch-4: Identified system as Model: N5300_LD, Partno:N/A, Serialno=N/A
Log collection has started for nvl-switch-4
nvl-switch-7: Identified system as Model: N5300_LD, Partno:N/A, Serialno=N/A
Log collection has started for nvl-switch-7
DUT nvl-switch-1 completed.
DUT nvl-switch-2 completed.
DUT nvl-switch-3 completed.
DUT nvl-switch-4 completed.
DUT nvl-switch-5 completed.
DUT nvl-switch-6 completed.
DUT nvl-switch-7 completed.
DUT nvl-switch-8 completed.
DUT nvl-switch-9 completed.
nvl-compute-8: User provided platform type: arm64
nvl-compute-13: User provided platform type: arm64
nvl-compute-8: BMC IP: 192.168.1.141
nvl-compute-13: BMC IP: 192.168.1.146
nvl-compute-5: User provided platform type: arm64
nvl-compute-5: BMC IP: 192.168.1.138
nvl-compute-18: User provided platform type: arm64
nvl-compute-18: BMC IP: 192.168.1.151
nvl-compute-7: User provided platform type: arm64
nvl-compute-7: BMC IP: 192.168.1.140
nvl-compute-3: User provided platform type: arm64
nvl-compute-3: BMC IP: 192.168.1.136
nvl-compute-2: User provided platform type: arm64
nvl-compute-2: BMC IP: 192.168.1.135
nvl-compute-11: User provided platform type: arm64
nvl-compute-11: BMC IP: 192.168.1.144
nvl-compute-1: User provided platform type: arm64
nvl-compute-1: BMC IP: 192.168.1.134
nvl-compute-15: User provided platform type: arm64
nvl-compute-15: BMC IP: 192.168.1.148
nvl-compute-9: User provided platform type: arm64
nvl-compute-9: BMC IP: 192.168.1.142
nvl-compute-4: User provided platform type: arm64
nvl-compute-4: BMC IP: 192.168.1.137
nvl-compute-10: User provided platform type: arm64
nvl-compute-10: BMC IP: 192.168.1.143
nvl-compute-6: User provided platform type: arm64
nvl-compute-12: User provided platform type: arm64
nvl-compute-12: BMC IP: 192.168.1.145
nvl-compute-6: BMC IP: 192.168.1.139
nvl-compute-16: User provided platform type: arm64
nvl-compute-16: BMC IP: 192.168.1.149
nvl-compute-17: User provided platform type: arm64
nvl-compute-17: BMC IP: 192.168.1.150
nvl-compute-14: User provided platform type: arm64
nvl-compute-14: BMC IP: 192.168.1.147
nvl-compute-18: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010050
Log collection has started for nvl-compute-18
nvl-compute-3: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010206
Log collection has started for nvl-compute-3
nvl-compute-7: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010645
Log collection has started for nvl-compute-7
nvl-compute-8: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010391
Log collection has started for nvl-compute-8
nvl-compute-5: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010452
Log collection has started for nvl-compute-5
nvl-compute-13: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010108
Log collection has started for nvl-compute-13
nvl-compute-11: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010155
Log collection has started for nvl-compute-11
nvl-compute-1: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010392
Log collection has started for nvl-compute-1
nvl-compute-2: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010555
Log collection has started for nvl-compute-2
nvl-compute-15: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010222
Log collection has started for nvl-compute-15
nvl-compute-4: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010033
Log collection has started for nvl-compute-4
nvl-compute-6: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010453
Log collection has started for nvl-compute-6
nvl-compute-9: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010183
Log collection has started for nvl-compute-9
nvl-compute-10: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010451
Log collection has started for nvl-compute-10
nvl-compute-12: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010068
Log collection has started for nvl-compute-12
nvl-compute-14: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010695
Log collection has started for nvl-compute-14
nvl-compute-17: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010172
Log collection has started for nvl-compute-17
nvl-compute-16: Identified system as Model: GB200 NVL, Partno:699-24764-0001-TS3, Serialno=1333124010462
Log collection has started for nvl-compute-16
DUT nvl-compute-1 completed.
DUT nvl-compute-2 completed.
DUT nvl-compute-3 completed.
DUT nvl-compute-4 completed.
DUT nvl-compute-5 completed.
DUT nvl-compute-6 completed.
DUT nvl-compute-7 completed.
DUT nvl-compute-8 completed.
DUT nvl-compute-9 completed.
DUT nvl-compute-10 completed.
DUT nvl-compute-11 completed.
DUT nvl-compute-12 completed.
DUT nvl-compute-13 completed.
DUT nvl-compute-14 completed.
DUT nvl-compute-15 completed.
DUT nvl-compute-16 completed.
DUT nvl-compute-17 completed.
DUT nvl-compute-18 completed.
Generating HTML report(s)
Logs collected in /tmp/nvdebug_logs_22_04_2025_17_22_44
Creating zip archive(s)

Split files created:
  nvdebug_logs_22_04_2025_17_22_44.z01
  nvdebug_logs_22_04_2025_17_22_44.z02
  nvdebug_logs_22_04_2025_17_22_44.z03
  nvdebug_logs_22_04_2025_17_22_44.z04
  nvdebug_logs_22_04_2025_17_22_44.z05
  nvdebug_logs_22_04_2025_17_22_44.zip

To recombine the files:
On Linux/macOS:
  cat nvdebug_logs_22_04_2025_17_22_44.z* > nvdebug_logs_22_04_2025_17_22_44_combined.zip

On Windows (Command Prompt):
  copy /b nvdebug_logs_22_04_2025_17_22_44.z* nvdebug_logs_22_04_2025_17_22_44_combined.zip

On Windows (PowerShell):
  Get-Content nvdebug_logs_22_04_2025_17_22_44.z* -Raw -Encoding Byte | Set-Content nvdebug_logs_22_04_2025_17_22_44_combined.zip -Encoding Byte

Verify the Contents of the Zipped Log File#

To verify the contents of the zipped log file in the /tmp directory, run the following commands.

$ cd /tmp
$ unzip nvdebug_logs_22_04_2025_17_22_44.zip
$ ls -l

The contents of the unzipped log are structured into the following directories:

  • reports

  • dut_name

.
├── nvl-compute-1
│   ├── config.json
│   ├── dut_config.json
│   ├── Execution_Summary_Report.txt
│   ├── host
│   ├── ipmi
│   ├── redfish
│   └── ssh
...
├── nvl-compute-18
│   ├── config.json
│   ├── dut_config.json
│   ├── Execution_Summary_Report.txt
│   ├── host
│   ├── ipmi
│   ├── redfish
│   └── ssh
├── nvl-switch-1
│   ├── config.json
│   ├── dut_config.json
│   ├── Execution_Summary_Report.txt
│   ├── host
│   ├── ipmi
│   ├── redfish
│   └── ssh
...
├── nvl-switch-9
│   ├── config.json
│   ├── dut_config.json
│   ├── Execution_Summary_Report.txt
│   ├── host
│   ├── ipmi
│   ├── redfish
│   └── ssh
├── log_signature.txt
└── reports
    ├── nvl-compute-1
       ├── config.json.html
       ├── dut_config.json.html
       ├── Execution_Summary_Report.txt.html
       ├── ipmi
       ├── host
       ├── redfish
       ├── report.html
    ...
    ├── nvl-compute-18
       ├── config.json.html
       ├── dut_config.json.html
       ├── Execution_Summary_Report.txt.html
       ├── ipmi
       ├── host
       ├── redfish
       ├── report.html
    ...
    └── nvl-switch-9
        ├── host
        ├── ipmi
        ├── redfish
        └── ssh

439 directories, 2537 files