Dump Me Now (DMN)

DMN generates dumps and traces from various components, including hardware, firmware and software. Dumps can be triggered by user requests, by internally detected issues (through the resiliency sensors), and by ND application requests via the extended NVIDIA® ND API.

DMN dumps are crucial for offline debugging. Once an issue is hit, the dumps can provide useful information about the NIC's state at the time of the failure. This includes hardware state dumps, firmware traces and various driver component state and resource dumps.

For information on the relevant registry keys for this feature, please refer to Dump Me Now (DMN) Registry Keys.

DMN Triggers and APIs

DMN supports three triggering APIs:

  1. mlx5Cmd.exe can be used to trigger DMN by running the -Dmn subcommand:

    Mlx5Cmd -Dmn -hh | -Name <adapter name>
    Submit dump-me-now request

    -hh                      Show this help screen
    -Name <adapter name>     Network adapter name
    -NoMstDump               Run DMN without mst dump
    -CoreDumpQP<QP number>   Run DMN with QP Core Dump
  2. ND SPI NVIDIA® extension (defined in ndspi_ext_mlx.h):
    1. API function to generate a general DMN dump from an ND application:

      __in IND2AdapterControl* pCtrl,
      __in HANDLE hOverlappedFile,
      __inout OVERLAPPED* pOverlapped
    2. API function to generate a QP based DMN dump from an ND application. The function generates a dump that might include more information about the queue pair specified by its number.

      __in IND2AdapterControl* pCtrl,
      __in HANDLE hOverlappedFile,
      __in ULONG Qpn,
      __inout OVERLAPPED* pOverlapped
  3. An internal API between driver components, which supports generating DMN upon self-detected errors and failures (by the resiliency feature).
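
The command-line trigger listed first above can be scripted. The following is a minimal sketch, assuming Mlx5Cmd is on the PATH of the host; the helper name build_dmn_command and the adapter name "Ethernet 2" are hypothetical. Only the -Dmn, -Name and -NoMstDump flags from the help text are used; -CoreDumpQP is left out because its exact argument form is not clear from the help text.

```python
def build_dmn_command(adapter_name, no_mst_dump=False, exe="Mlx5Cmd"):
    """Return the argument list for a DMN request on one adapter.

    build_dmn_command is a hypothetical helper, not part of the driver package.
    """
    if not adapter_name:
        raise ValueError("adapter_name must be a non-empty string")
    args = [exe, "-Dmn", "-Name", adapter_name]
    if no_mst_dump:
        # Run DMN without the mst dump, as described by the help text.
        args.append("-NoMstDump")
    return args


# "Ethernet 2" is a placeholder adapter name for illustration only.
print(build_dmn_command("Ethernet 2", no_mst_dump=True))
```

On a host with the driver installed, the returned list could be passed to subprocess.run. Keeping the arguments as a list avoids quoting problems with adapter names that contain spaces.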

Dumps and Incident Folders

DMN generates a directory per incident, where it places all of the needed NIC dump files. There is a mechanism to limit the number of created Incident Directories. For further information, see Cyclic DMN Mechanism.

The DMN incident directory name includes a timestamp, dump type, DMN event source and reason. It uses the following directory naming scheme: dmn-<type of DMN>-<source of DMN trigger>-<reason>-<timestamp>



In this example:

  • GN: The dump type is "General"
  • USR: The DMN was triggered by mlx5Cmd (user)
  • NA: In this version of the driver, the cause of the dump is not available when it is triggered by mlx5Cmd
  • The dump was created on April 13th, 2017, at 747 milliseconds after 7:49:02 AM
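
The naming scheme above can be split back into its fields when sorting or filtering incident folders. This is a minimal sketch, assuming the type, source and reason fields never contain a hyphen. The timestamp is kept as an opaque string because its exact layout is not specified here, and the directory name used in the example is hypothetical.

```python
from typing import NamedTuple


class DmnIncident(NamedTuple):
    dump_type: str   # e.g. "GN" for General
    source: str      # e.g. "USR" for a user (mlx5Cmd) trigger
    reason: str      # e.g. "NA" when the cause is not available
    timestamp: str   # left unparsed; the layout is an unknown here


def parse_incident_dir(name):
    """Split dmn-<type>-<source>-<reason>-<timestamp> into its fields."""
    # maxsplit=4 keeps any hyphens inside the timestamp intact.
    parts = name.split("-", 4)
    if len(parts) != 5 or parts[0] != "dmn":
        raise ValueError(f"not a DMN incident directory name: {name!r}")
    return DmnIncident(*parts[1:])


# The timestamp text below is a made-up placeholder, not the driver's format.
print(parse_incident_dir("dmn-GN-USR-NA-EXAMPLE_TIMESTAMP"))
```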

In this version of the driver, the DMN generates the following dump files upon a DMN event:

  • IPoIB: The adapter’s IPoIB state
  • PDDR: The port diagnostics database
  • General
  • mst files
  • Registry

DMN incident dumps are created under the DMN root directory, which can be controlled via the registry. The root directory will include the port identification in its name.

The default is:

  • Host: "\Systemroot\temp\Mlx5_Dump_Me_Now-<b>-<d>-<f>"

State Dumping (via Dump Me Now)

Upon several types of events, the drivers can produce a set of files reflecting the current state of the adapter.

Automatic state dumps via DMN are done upon the following events:

Event Type | Description | Provider | Default | Tag
 | Command failure | | |
CMD_TIMEOUT | Timeout reached on a command | Mlx5 | On | TOUT
RESILIENCY | Resiliency sensor was activated | Mlx5 | Off | RES
EQ_STUCK | Driver decided that an event queue is stuck | | |
TXCQ_STUCK | Driver decided that a transmit completion queue is stuck | | |
RXCQ_STUCK | Driver decided that a receive completion queue is stuck | | |
PORT_STATE | Adapter passed to “port up” state, “port down” state or “port unknown” state | | |
 | User application asked to generate dump files | | |

  • Provider: The driver creating the set of files.
  • Default: Whether or not the state dumps are created by default upon this event.
  • Tag: Part of the file name, used to identify the event that has triggered the state dump.

Dump events can be enabled/disabled by adding DWORD32 parameters into HKLM\SYSTEM\CurrentControlSet\Control\Class\{4d36e972-e325-11ce-bfc1-08002be10318}\<nn> as follows:

  • Dump events can be disabled by adding MstDumpMode parameter as follows:

    MstDumpMode 0
  • PORT_STATE events can be disabled by adding the EnableDumpOnUnknownLink, EnableDumpOnPortDown and EnableDumpOnPortUp parameters as follows:

    EnableDumpOnUnknownLink 0
    EnableDumpOnPortDown 0
    EnableDumpOnPortUp 0

    As of WinOF-2 v2.10, the registry keys above can be changed dynamically. In any case of an illegal input, the value will fall back to the default value and not to the last value used.

  • EQ_STUCK, TXCQ_STUCK and RXCQ_STUCK events can be disabled by adding DisableDumpOnEqStuck, DisableDumpOnTxCqStuck and DisableDumpOnRxCqStuck parameters as follows:

    DisableDumpOnEqStuck 1
    DisableDumpOnTxCqStuck 1
    DisableDumpOnRxCqStuck 1
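
The registry values above can be written with the standard Windows reg add command. This is a minimal sketch that only builds the command lines rather than running them. The helper names are hypothetical, and "0001" stands in for the adapter's <nn> instance, which differs per system.

```python
CLASS_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Control\Class"
    r"\{4d36e972-e325-11ce-bfc1-08002be10318}"
)


def reg_add_dword(instance, name, value):
    """Build a 'reg add' command line that sets one DWORD parameter."""
    key = CLASS_KEY + "\\" + instance
    return f'reg add "{key}" /v {name} /t REG_DWORD /d {value} /f'


def disable_stuck_queue_dumps(instance):
    """Commands that disable the EQ_STUCK, TXCQ_STUCK and RXCQ_STUCK dumps."""
    names = (
        "DisableDumpOnEqStuck",
        "DisableDumpOnTxCqStuck",
        "DisableDumpOnRxCqStuck",
    )
    return [reg_add_dword(instance, name, 1) for name in names]


# "0001" is a placeholder for the <nn> subkey of the target adapter.
for command in disable_stuck_queue_dumps("0001"):
    print(command)
```

Generating the commands as text makes it possible to review them before running them from an elevated prompt.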

The set consists of two consecutive mstdump files. These files are created in the same directory as the DMN, and should be sent to NVIDIA® Support for analysis when debugging WinOF-2 driver problems.

Their names have the following format: <event_name>-<dump_mode>_<file_index>.txt


Event name | Description
 | Timeout reached on command with polling mode. OPCODE is the command opcode in the driver.
 | Timeout reached on command while waiting. OPCODE is the command opcode in the driver.
 | Command with polling mode failed. OPCODE is the command opcode in the driver.
 | Command failed. OPCODE is the command opcode in the driver.
eth-eq-<EQN>-<EQ_IDX> | EQ stuck. EQN is the EQ number and EQ_IDX is the EQ index.
 | TXCQ is stuck. CQN is the CQ number.
 | RXCQ is stuck. CQN is the CQ number.
 | PORT change event. STATE: [“up”, “down”, “none”]
 | User application asked for the dump
 | Bug check event
 | When resiliency flow is triggered

  • dump_mode: The mode of collecting the mstdump: “crspace” or “fast-crspace”
  • file_index: The file number of this type in the set

Example name: wait-failed-936-fast-crspace_1.txt
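
The file name format can be parsed with a regular expression when collecting dumps for analysis. This is a minimal sketch, assuming the dump mode is always one of the two values listed above; the function name parse_mstdump_name is hypothetical.

```python
import re

# The event name is matched non-greedily so that "fast-crspace" is not
# split into an event name ending in "-fast" plus the mode "crspace".
_MSTDUMP_NAME = re.compile(
    r"^(?P<event>.+?)-(?P<mode>fast-crspace|crspace)_(?P<index>\d+)\.txt$"
)


def parse_mstdump_name(filename):
    """Return (event_name, dump_mode, file_index) for an mstdump file name."""
    match = _MSTDUMP_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected mstdump file name: {filename!r}")
    return match["event"], match["mode"], int(match["index"])


print(parse_mstdump_name("wait-failed-936-fast-crspace_1.txt"))
```

For the example name above, the event name is wait-failed-936, the dump mode is fast-crspace and the file index is 1.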

The default number of sets of files for each event is 20. The other dump files are named <DumpType>.log

DumpType can be: PDDR, Registry, General, IPoIB, MiniportProfiling

Cyclic DMN Mechanism

The driver manages the DMN incident dumps in a cyclic fashion, in order to limit the amount of disk space used for saving DMN dumps and to avoid low disk space conditions that can be caused by creating the dumps.

Rather than using a simple cyclic override scheme by replacing the oldest DMN incident folder every time it generates a new one, the driver allows the user to determine whether the first N incident folders should be preserved or not. This means that the driver will maintain a cyclic overriding scheme starting from a given index.

The two registry keys used to control this behavior are DumpMeNowTotalCount, which specifies the maximum number of allowed dumps under the DMN root folder, and DumpMeNowPreservedCount, which specifies the number of reserved incident folders that will not be overridden by the cyclic algorithm.

The following diagram illustrates how the cyclic scheme works, assuming DumpMeNowPreservedCount=2 and DumpMeNowTotalCount=16:
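
The scheme can also be modelled in a few lines of code. This is a minimal sketch of the behaviour described above, not the driver's actual implementation: incident folders are treated as numbered slots, the first DumpMeNowPreservedCount slots are never reused, and later incidents cycle through the remaining slots. The function name and the zero-based numbering are assumptions made for illustration.

```python
def incident_slot(incident_index, total_count, preserved_count):
    """Return the zero-based folder slot used by the given incident.

    incident_index counts incidents from 0 in the order they occur.
    """
    if not 0 <= preserved_count < total_count:
        raise ValueError("preserved_count must be in [0, total_count)")
    if incident_index < preserved_count:
        # Preserved folders are written once and never overridden.
        return incident_index
    cyclic_slots = total_count - preserved_count
    return preserved_count + (incident_index - preserved_count) % cyclic_slots


# With 2 preserved folders out of 16, the 17th incident (index 16)
# overrides slot 2, the oldest folder that is not preserved.
for index in (0, 1, 15, 16, 17):
    print(index, incident_slot(index, total_count=16, preserved_count=2))
```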

Configuring DMN-IOV

The DMN-IOV detail level can be configured by the "DmnIovMode" value located in the device parameters registry key. The default value is 2. The acceptable values are 0-4:

  • The feature is disabled
  • Major IOV objects and their state will be listed
  • All VF hardware resources and their state will be listed in the dump (QPs, CQs, MTTs, etc.)
  • All QP-to-Ring mapping will be added (the huge dump)
  • All IOV objects and their state will be listed

Dump PDDR Information

The DMN-PDDR can be configured by the "EnableDumpOnPortUp" and "EnableDumpOnPortDown" values located in the device parameters registry keys.

The default values of the keys are as follows:

  • EnableDumpOnPortUp = 0 [capability disabled]
  • EnableDumpOnPortDown = 1 [capability enabled]

Event Logs

DMN generates an event to the system event log upon the success or failure of the dump file generation. 

Reported Driver Event Severity: Error
Event ID | Message
 | <device name>: Failed to create a full dump me now. Dump me now root directory: <path to root DMN folder> Failure: <Failure description> Status: <status code>

Reported Driver Event Severity: Warning

For a list of the DMN Warning events, see Reported Driver Events.

FwTrace


The FwTrace feature allows firmware traces to be logged online into WPP tracing without requiring any NVIDIA®-specific tools. It provides an easy way to debug and diagnose issues in production without the need to reproduce them. Both the firmware and the driver traces are displayed in the same file. FwTrace is also used as a platform for core_dump.

System Requirements
Firmware versions:
  • NVIDIA® ConnectX®-4 v12.22.1002
  • NVIDIA® ConnectX®-4 Lx v14.22.1002
  • NVIDIA® ConnectX®-5 v16.22.4020

Configuring FwTrace

FwTrace uses registry keys for its configuration. For more information, see section FwTrace Registry Keys.

The FwTrace feature can be enabled/disabled dynamically (without requiring an adapter restart) using the FwTracerEnabled registry key.

FwTrace uses a cyclic buffer. The size of the buffer can be configured using the dynamic registry key FwTracerBufferSize. To change the buffer size, set FwTracerBufferSize to the desired value and then restart FwTrace, either by using the FwTracerEnabled registry key or by restarting the adapter.
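
The resize procedure above is order-sensitive, so it can help to write it down as an explicit sequence. This is a minimal sketch, assuming FwTracerEnabled takes 0 for disabled and 1 for enabled. That encoding, the function name and the sample size are assumptions; the valid values are listed in FwTrace Registry Keys.

```python
def fwtrace_resize_steps(buffer_size):
    """Return the ordered (registry value, data) writes for a buffer resize.

    Assumes FwTracerEnabled uses 0 = disabled and 1 = enabled.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    return [
        ("FwTracerBufferSize", buffer_size),  # 1. set the new size
        ("FwTracerEnabled", 0),               # 2. stop FwTrace
        ("FwTracerEnabled", 1),               # 3. start it with the new size
    ]


# 4096 is an arbitrary illustrative value, not a recommended size.
for name, data in fwtrace_resize_steps(4096):
    print(name, data)
```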

Resource Dump

Resource Dump is a debuggability utility that extracts and prints data segments generated by the firmware/hardware. The driver registers for all supported resource types (segments) and listens for events sent by the firmware. When such an event arrives, the driver collects the resource dump and exports it to the filesystem using the Dump-Me-Now mechanism.

For further information, see ResourceDump Registry Keys and Resource Dump Utility.

As Resource Dump depends on DMN, its enablement is coupled with the DMN enablement.