Using the Nsight Aftermath API

Usage Examples

This section provides code snippets that show how to use the Nsight Aftermath API to collect and decode crash dumps for a D3D12 or Vulkan application.

The code samples cover the following commonly required tasks:

  1. Enable GPU crash dump collection in an application

  2. Configure what data to include in GPU crash dumps

  3. Instrument an application with Aftermath event markers

  4. Handle GPU crash dump callback events

  5. Handle device removed/lost

  6. Disable GPU crash dump collection

  7. Use the GPU crash dump decoding API

Enabling GPU Crash Dumps

An application enables GPU crash dump creation by calling GFSDK_Aftermath_EnableGpuCrashDumps. To use the Nsight Aftermath API functions related to GPU crash dump collection, include the GFSDK_Aftermath_GpuCrashDump.h header file.

GPU crash dump collection should be enabled before the application creates any D3D12, D3D11, or Vulkan device. No GPU crash dumps will be generated for GPU crashes or hangs related to devices that were created before the GFSDK_Aftermath_EnableGpuCrashDumps call.

Besides enabling the GPU crash dump feature, this call allows the application to register several callback functions. The first, which is required, is invoked once a GPU crash is detected and crash dump data is available. The application can also provide optional callback functions that are invoked when debug information for shaders is available, when the application can provide additional description or context for the exception (such as the current state of the application at the time of the crash) to be included with the GPU crash dump, or when application-managed markers need to be resolved.

Note

The shader debug information callback will only be invoked if the shader debug information feature is enabled; see the description of GFSDK_Aftermath_FeatureFlags_GenerateShaderDebugInfo and VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_SHADER_DEBUG_INFO_BIT_NV in the Configuring GPU Crash Dumps section of this document.

The following code snippet shows an example of how to enable GPU crash dumps and how to set up the callbacks for crash dump notifications, for shader debug information notifications, for providing additional crash dump description data, and for resolving application-managed markers. Only the crash dump callback is mandatory. The other three callbacks are optional and can be omitted by passing a NULL pointer if the corresponding functionality is not needed. In this example, GPU crash dumps are only enabled for D3D12 and D3D11 devices. To watch Vulkan devices, the GFSDK_Aftermath_EnableGpuCrashDumps function must be called with GFSDK_Aftermath_GpuCrashDumpWatchedApiFlags_Vulkan. It is also possible to combine both flags if an application uses both the D3D and the Vulkan API.

void MyApp::InitDevice()
{
    [...]

    // Enable GPU crash dumps and register callbacks.
    AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_EnableGpuCrashDumps(
        GFSDK_Aftermath_Version_API,
        GFSDK_Aftermath_GpuCrashDumpWatchedApiFlags_DX,
        GFSDK_Aftermath_GpuCrashDumpFeatureFlags_Default,   // Default behavior.
        GpuCrashDumpCallback,                               // Register callback for GPU crash dumps.
        ShaderDebugInfoCallback,                            // Register callback for shader debug information.
        CrashDumpDescriptionCallback,                       // Register callback for GPU crash dump description.
        ResolveMarkerCallback,                              // Register callback for marker resolution (R495 or later NVIDIA graphics driver).
        &m_gpuCrashDumpTracker));                           // Set the GpuCrashTracker object as user data passed back by the above callbacks.

    [...]
}

// Static wrapper for the GPU crash dump handler. See the 'Handling GPU Crash Dump Callbacks' section for details.
void MyApp::GpuCrashDumpCallback(const void* pGpuCrashDump, const uint32_t gpuCrashDumpSize, void* pUserData)
{
    GpuCrashTracker* pGpuCrashTracker = reinterpret_cast<GpuCrashTracker*>(pUserData);
    pGpuCrashTracker->OnCrashDump(pGpuCrashDump, gpuCrashDumpSize);
}

// Static wrapper for the shader debug information handler. See the 'Handling Shader Debug Information Callbacks' section for details.
void MyApp::ShaderDebugInfoCallback(const void* pShaderDebugInfo, const uint32_t shaderDebugInfoSize, void* pUserData)
{
    GpuCrashTracker* pGpuCrashTracker = reinterpret_cast<GpuCrashTracker*>(pUserData);
    pGpuCrashTracker->OnShaderDebugInfo(pShaderDebugInfo, shaderDebugInfoSize);
}

// Static wrapper for the GPU crash dump description handler. See the 'Handling GPU Crash Dump Description Callbacks' section for details.
void MyApp::CrashDumpDescriptionCallback(PFN_GFSDK_Aftermath_AddGpuCrashDumpDescription addDescription, void* pUserData)
{
    GpuCrashTracker* pGpuCrashTracker = reinterpret_cast<GpuCrashTracker*>(pUserData);
    pGpuCrashTracker->OnDescription(addDescription);
}

// Static wrapper for the resolve marker handler. See the 'Handling Marker Resolve Callbacks' section for details.
void MyApp::ResolveMarkerCallback(const void* pMarkerData, const uint32_t markerDataSize, void* pUserData, void** ppResolvedMarkerData, uint32_t* pResolvedMarkerDataSize)
{
    GpuCrashTracker* pGpuCrashTracker = reinterpret_cast<GpuCrashTracker*>(pUserData);
    pGpuCrashTracker->OnResolveMarker(pMarkerData, markerDataSize, ppResolvedMarkerData, pResolvedMarkerDataSize);
}

Enabling GPU crash dumps in an application with GFSDK_Aftermath_EnableGpuCrashDumps will override any settings from an already active Nsight Aftermath GPU crash dump monitor for this application. That means the GPU crash dump monitor will not be notified of any GPU crash related to this process nor will it create any GPU crash dumps or shader debug information files for D3D or Vulkan devices that are created after the function was called. Also, all configuration settings made in the GPU crash dump monitor will be ignored.

Configuring GPU Crash Dumps

Which data will be included in GPU crash dumps is configured by the Aftermath per-device feature flags. How to configure Aftermath feature flags differs between D3D and Vulkan.

For D3D, the application must call the appropriate GFSDK_Aftermath_DX*_Initialize functions to initialize the desired Aftermath feature flags.

The features supported for DX Aftermath by this release of the Aftermath SDK are the following:

  • GFSDK_Aftermath_FeatureFlags_EnableMarkers - this enables support for DX Aftermath event markers, including both the support for user markers that are explicitly added by the application via GFSDK_Aftermath_SetEventMarker and automatic call stack markers described in more detail below.

    Overhead Note: Using event markers (checkpoints) should be considered carefully. Injecting markers in high-frequency code paths can introduce high CPU overhead. Therefore, on some versions of NVIDIA graphics drivers, the DX event marker feature is only available if the Nsight Aftermath GPU Crash Dump Monitor is running on the system. This requirement applies to R495 to R530 drivers for D3D12 and R495+ drivers for D3D11. No Aftermath configuration needs to be made in the Monitor. It serves only as a dongle to ensure Aftermath event markers do not impact application performance on end user systems.

  • GFSDK_Aftermath_FeatureFlags_CallStackCapturing - this will instruct the NVIDIA graphics driver to automatically generate Aftermath event markers for all draw calls, compute and ray tracing dispatches, ray tracing acceleration structure build operations, or resource copies initiated by the application. The automatic event markers are added into the command lists right after the corresponding commands with the CPU call stacks of the functions recording the commands as their data payload. This may help narrow down the commands that were issued nearest to the one that caused the crash. This feature is only available if the event marker feature is also enabled.

    Overhead Note: Enabling this feature will cause very high CPU overhead during command list recording. Due to the inherent overhead, call stack capturing should only be used for debugging purposes on development or QA systems and should not be enabled in applications shipped to customers. Therefore, on R495+ NVIDIA graphics drivers, the DX call stack capturing feature is only available if the Nsight Aftermath GPU Crash Dump Monitor is running on the system. No Aftermath configuration needs to be made in the Monitor. It serves only as a dongle to ensure Aftermath call stack capturing does not impact application performance on end user systems.

Note

When enabling this feature, Aftermath GPU crash dumps will include file paths to the crashing application’s executable as well as all DLLs it has loaded.

  • GFSDK_Aftermath_FeatureFlags_EnableResourceTracking - this feature will enable graphics driver-level tracking of live and recently destroyed resources. This helps the user to identify resources related to a GPU virtual address seen in the case of a crash due to a GPU page fault. Since the tracking happens on the driver-level, it will provide only basic resource information, such as the size of the resource, the format, and the current deletion status of the resource object.

    Developers may also want to register the D3D12 resources created by their application using the GFSDK_Aftermath_DX12_RegisterResource function, which will allow Aftermath to track additional information, such as the resource’s debug name.

  • GFSDK_Aftermath_FeatureFlags_GenerateShaderDebugInfo - this instructs the shader compiler to generate debug information (line tables for mapping from the shader DXIL passed to the NVIDIA graphics driver to the shader microcode) for all shaders.

    Overhead Note: Using this option should be considered carefully. It may cause considerable shader compilation overhead and additional overhead for handling the corresponding shader debug information callbacks.

  • GFSDK_Aftermath_FeatureFlags_EnableShaderErrorReporting - this puts the GPU in a special mode that allows it to report more runtime shader errors. The additional information may help in debugging rendering corruption or crashes due to errors in shader execution.

Note

Enabling this feature does not cause any performance overhead, but it may result in additional crashes being reported for shaders that exhibit undefined behavior, which so far went unnoticed. Examples of situations that would be silently ignored without enabling this feature are:

  • Accessing memory using misaligned addresses, such as reading or writing a byte address that is not a multiple of the access size.

  • Accessing memory out-of-bounds, such as reading or writing beyond the declared bounds of group shared or thread local memory or reading from an out-of-bounds constant buffer address.

  • Hitting call stack limits.

Note

This feature is only supported on R515 and later NVIDIA graphics drivers. The feature flag will be ignored if the application is running on a system using an earlier driver version.

The following code snippet shows how to use those flags with GFSDK_Aftermath_DX12_Initialize to enable all available Aftermath features for a D3D12 device. Note, enabling all features, as in this simple example, may not always be the right choice. Which subset of features to enable should be considered based on requirements and acceptable overhead.

void MyApp::InitDevice()
{
    [...]

    D3D12CreateDevice(hardwareAdapter.Get(), D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&m_device));

    // Initialize Nsight Aftermath for this device.
    const uint32_t aftermathFlags =
        GFSDK_Aftermath_FeatureFlags_EnableMarkers |             // Enable event marker tracking.
        GFSDK_Aftermath_FeatureFlags_CallStackCapturing |        // Enable automatic call stack event markers.
        GFSDK_Aftermath_FeatureFlags_EnableResourceTracking |    // Enable tracking of resources.
        GFSDK_Aftermath_FeatureFlags_GenerateShaderDebugInfo |   // Generate debug information for shaders.
        GFSDK_Aftermath_FeatureFlags_EnableShaderErrorReporting; // Enable additional runtime shader error reporting.

    AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_DX12_Initialize(
        GFSDK_Aftermath_Version_API,
        aftermathFlags,
        m_device.Get()));

    [...]
}

For Vulkan, Aftermath feature flags are configured at logical device creation time via the VK_NV_device_diagnostics_config extension.

The supported configuration flag bits are:

  • VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_AUTOMATIC_CHECKPOINTS_BIT_NV - this flag will instruct the NVIDIA graphics driver to automatically generate diagnostic checkpoints for all draw calls, compute and ray tracing dispatches, ray tracing acceleration structure build operations, or resource copies initiated by the application. The automatic checkpoints are added into the command buffers right after the corresponding commands with the CPU call stacks of the functions recording the commands as their data payload. This may help narrow down the commands that were issued nearest to the one that caused the crash. This configuration flag is only effective if the application has also enabled the NV_device_diagnostic_checkpoints extension.

    Overhead Note: Using this flag should be considered carefully. Enabling call stack capturing can cause considerable CPU overhead during command buffer recording.

Note

When enabling this feature, Aftermath GPU crash dumps will include file paths to the crashing application’s executable as well as all DLLs or DSOs it has loaded.

  • VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_RESOURCE_TRACKING_BIT_NV - this flag will enable graphics driver-level tracking of live and recently destroyed resources. This helps the user to identify resources related to a GPU virtual address seen in the case of a crash due to a GPU page fault.

    The information being tracked includes the size of the resource, the format, and the current deletion status of the resource object, as well as the resource’s debug name set via vkSetDebugUtilsObjectNameEXT.

  • VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_SHADER_DEBUG_INFO_BIT_NV - this instructs the shader compiler to generate debug information (line tables). Whether a mapping from the SPIR-V code to the shader microcode or a mapping from the shader’s source code to the shader microcode is generated depends on whether the SPIR-V sent to the NVIDIA graphics driver was compiled with debug information or not.

    Overhead Note: Using this option should be considered carefully. It may cause considerable shader compilation overhead and additional overhead for handling the corresponding shader debug information callbacks.

  • VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_SHADER_ERROR_REPORTING_BIT_NV - this flag puts the GPU in a special mode that allows it to report more runtime shader errors. The additional information may help in debugging rendering corruption or crashes due to errors in shader execution.

Note

Enabling this feature does not cause any performance overhead, but it may result in additional crashes being reported for shaders that exhibit undefined behavior, which so far went unnoticed. Examples of situations that would be silently ignored without enabling this feature are:

  • Accessing memory using misaligned addresses, such as reading or writing a byte address that is not a multiple of the access size.

  • Accessing memory out-of-bounds, such as reading or writing beyond the bounds of shared or thread local memory or reading from an out-of-bounds constant buffer address.

  • Accessing a texture with incompatible format or memory layout.

  • Hitting call stack limits.

Note

This feature is only supported on R515 and later NVIDIA graphics drivers. The feature flag will be ignored if the application is running on a system using an earlier driver version.

The Aftermath feature selection for a Vulkan device could look like the code below. Note, enabling all features, as in this simple example, may not always be the right choice. Which subset of features to enable should be considered based on requirements and acceptable overhead.

void MyApp::InitDevice()
{
    std::vector<char const*> extensionNames;

    [...]

    // Enable NV_device_diagnostic_checkpoints extension to be able to
    // use Aftermath event markers.
    extensionNames.push_back(VK_NV_DEVICE_DIAGNOSTIC_CHECKPOINTS_EXTENSION_NAME);

    // Enable NV_device_diagnostics_config extension to configure Aftermath
    // features.
    extensionNames.push_back(VK_NV_DEVICE_DIAGNOSTICS_CONFIG_EXTENSION_NAME);

    // Set up device creation info for Aftermath feature flag configuration.
    VkDeviceDiagnosticsConfigFlagsNV aftermathFlags =
        VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_AUTOMATIC_CHECKPOINTS_BIT_NV |  // Enable automatic call stack checkpoints.
        VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_RESOURCE_TRACKING_BIT_NV |      // Enable tracking of resources.
        VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_SHADER_DEBUG_INFO_BIT_NV |      // Generate debug information for shaders.
        VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_SHADER_ERROR_REPORTING_BIT_NV;  // Enable additional runtime shader error reporting.

    VkDeviceDiagnosticsConfigCreateInfoNV aftermathInfo = {};
    aftermathInfo.sType = VK_STRUCTURE_TYPE_DEVICE_DIAGNOSTICS_CONFIG_CREATE_INFO_NV;
    aftermathInfo.flags = aftermathFlags;

    // Set up device creation info.
    VkDeviceCreateInfo deviceInfo = {};
    deviceInfo.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
    deviceInfo.pNext = &aftermathInfo;
    deviceInfo.queueCreateInfoCount = 1;
    deviceInfo.pQueueCreateInfos = &queueInfo;
    deviceInfo.enabledExtensionCount = static_cast<uint32_t>(extensionNames.size());
    deviceInfo.ppEnabledExtensionNames = extensionNames.data();

    // Create the logical device.
    VkDevice device;
    vkCreateDevice(physicalDevice, &deviceInfo, NULL, &device);

    [...]
}

Inserting Event Markers

The Aftermath API provides a simple and light-weight solution for inserting event markers on the GPU timeline.

Here is some D3D12 example code that shows how to create the necessary command context handle with GFSDK_Aftermath_DX12_CreateContextHandle and how to call GFSDK_Aftermath_SetEventMarker to set a simple event marker with a character string as payload.

Note

The command list passed into GFSDK_Aftermath_DX12_CreateContextHandle must be in the recording state for the returned context handle to be valid. The context handle will remain valid even if the command list is subsequently closed and reset, but the command list also must be in the recording state when its context handle is passed into GFSDK_Aftermath_SetEventMarker.

Note

Calls of GFSDK_Aftermath_SetEventMarker are only effective if the GFSDK_Aftermath_FeatureFlags_EnableMarkers option was provided to GFSDK_Aftermath_DX12_Initialize and the Nsight Aftermath GPU Crash Dump Monitor is running on the system. This Monitor requirement applies to R495 to R530 drivers for D3D12 and R495+ drivers for D3D11.

void MyApp::PopulateCommandList()
{
    // Create the command list.
    m_device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, m_commandAllocator.Get(), m_pipelineState.Get(), IID_PPV_ARGS(&m_commandList));

    // Create an Nsight Aftermath context handle for setting Aftermath event markers in this command list.
    // Note that the command list must be in the recording state for this function, so if it is closed it must be reset first
    // (e.g. if it was created above with ID3D12Device4::CreateCommandList1 instead of ID3D12Device::CreateCommandList).
    AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_DX12_CreateContextHandle(m_commandList.Get(), &m_hAftermathCommandListContext));

    [...]

    // Add an Aftermath event marker with a 0-terminated string as payload.
    std::string eventMarker = "Draw Triangle";
    AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_SetEventMarker(m_hAftermathCommandListContext, (void*)eventMarker.c_str(), (unsigned int)eventMarker.size() + 1));
    m_commandList->DrawInstanced(3, 1, 0, 0);

    [...]
}

Overhead Note: For reduced CPU overhead, use GFSDK_Aftermath_SetEventMarker with dataSize=0. This instructs Aftermath not to allocate memory internally and copy the marker data into it, relying instead on the application to manage marker pointers itself. Markers managed this way may later be resolved using the Aftermath SDK resolve marker callback PFN_GFSDK_Aftermath_ResolveMarkerCb.

For Vulkan, similar functionality is provided via the NV_device_diagnostic_checkpoints extension. When this extension is enabled for a Vulkan device, event markers can be inserted in a command buffer with the vkCmdSetCheckpointNV function. The application is responsible for managing the marker pointers/tokens, which may later be resolved using the Aftermath SDK resolve marker callback PFN_GFSDK_Aftermath_ResolveMarkerCb.

std::map<uint64_t, std::string> appManagedMarkers;

void MyApp::RecordCommandBuffer()
{
    [...]
    size_t markerId = appManagedMarkers.size() + 1;
    appManagedMarkers[markerId] = "Draw Cube";
    // Add an Aftermath event marker ID, which requires resolution later via callback.
    vkCmdSetCheckpointNV(commandBuffer, (const void*)markerId);
    [...]
}

Handling GPU Crash Dump Callbacks

When Nsight Aftermath GPU crash dumps are enabled and a GPU crash or hang is detected, the necessary data is gathered and a GPU crash dump is created from it. Then the GPU crash dump callback function that was registered with GFSDK_Aftermath_EnableGpuCrashDumps is invoked to notify the application. In the callback the application can either decode the GPU crash dump data using the GPU crash dump decoding API or forward it to the crash handling infrastructure.

In the simple example implementation of a GPU crash dump callback handler shown below, the GPU crash dump data is simply stored to a file. This file could then be opened for analysis with Nsight Graphics.

// Handler for GPU crash dump callbacks (called by GpuCrashDumpCallback).
void GpuCrashTracker::OnCrashDump(const void* pGpuCrashDump, const uint32_t gpuCrashDumpSize)
{
    // Make sure only one thread at a time...
    std::lock_guard<std::mutex> lock(m_mutex);

    // Write to file for later in-depth analysis.
    WriteGpuCrashDumpToFile(pGpuCrashDump, gpuCrashDumpSize);
}
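
The WriteGpuCrashDumpToFile helper called above is not defined in this document. A minimal sketch of such a helper could look like the following. The extra path parameter and the boolean return value are assumptions made for illustration; the crash dump is treated as an opaque byte blob and written unchanged.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper (not part of the Aftermath SDK): writes the opaque
// crash dump blob to 'path' without interpreting it.
// Returns true if the file was opened and all bytes were written.
bool WriteGpuCrashDumpToFile(const void* pGpuCrashDump, uint32_t gpuCrashDumpSize, const std::string& path)
{
    assert(pGpuCrashDump != nullptr || gpuCrashDumpSize == 0);

    // Binary mode avoids any newline translation of the dump contents.
    std::ofstream file(path, std::ios::out | std::ios::binary | std::ios::trunc);
    if (!file)
    {
        return false;
    }

    if (gpuCrashDumpSize > 0)
    {
        file.write(static_cast<const char*>(pGpuCrashDump), gpuCrashDumpSize);
    }
    file.flush();
    return file.good();
}
```

Because the callback may run while the process is about to terminate, flushing before returning keeps the file complete even if the application exits shortly afterwards.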

Note

All callback functions are free-threaded; the application is responsible for providing thread-safe callback handlers.

It is important to note that the device removed/lost notification of the graphics API used by the application is asynchronous to the NVIDIA graphics driver’s GPU crash handling. That means applications should check the return value of IDXGISwapChain::Present for DXGI_ERROR_DEVICE_REMOVED or DXGI_ERROR_DEVICE_RESET and the return codes of their Vulkan calls for VK_ERROR_DEVICE_LOST, and give the Nsight Aftermath GPU crash dump processing thread some time to do its work before releasing the D3D or Vulkan device or terminating the process. See the ‘Handling Device Removed/Lost’ section of this document for further details.

Handling GPU Crash Dump Description Callbacks

An application can register an optional callback function that allows it to provide supplemental information about a crash. This callback is called after the GPU crash happened, but before the actual GPU crash dump callback. This presents the opportunity for the application to provide information such as application name, application version, or user defined data, for example, current engine state. The data provided will be stored in the GPU crash dump.

Here is an example of a basic GPU crash dump description handler. Data is added to the crash dump by calling the addDescription function provided by the callback.

// Handler for GPU crash dump description callbacks (called by CrashDumpDescriptionCallback).
void GpuCrashTracker::OnDescription(PFN_GFSDK_Aftermath_AddGpuCrashDumpDescription addDescription)
{
    // Add some basic description about the crash.
    addDescription(GFSDK_Aftermath_GpuCrashDumpDescriptionKey_ApplicationName, "Hello Nsight Aftermath");
    addDescription(GFSDK_Aftermath_GpuCrashDumpDescriptionKey_ApplicationVersion, "v1.0");
    addDescription(GFSDK_Aftermath_GpuCrashDumpDescriptionKey_UserDefined, "This is a GPU crash dump example");
    addDescription(GFSDK_Aftermath_GpuCrashDumpDescriptionKey_UserDefined + 1, "Engine State: Rendering");
}

Note

All callback functions are free-threaded; the application is responsible for providing thread-safe callback handlers.

Handling Shader Debug Information Callbacks

If the device was configured with the GenerateShaderDebugInfo feature flag, the generated shader debug information will be communicated to the application through the (optional) shader debug information callback function that was registered when GFSDK_Aftermath_EnableGpuCrashDumps was called. This debug information will be required to map from shader instruction addresses to intermediate assembly language (IL) instructions or high-level source lines when analyzing a crash dump in Nsight Graphics or when using the GPU crash dump to JSON decoding functions of the Nsight Aftermath API. If this functionality is not required, an application can omit the GenerateShaderDebugInfo flag when configuring the device and pass nullptr for the shader debug information callback. This might be desirable, because generating shader debug information incurs overhead in shader compilation and for handling the callback.

Here is a simple example implementation of a callback handler that writes the data to disk using the unique shader debug info identifier queried from the opaque shader debug information blob.

// Handler for shader debug information callbacks (called by ShaderDebugInfoCallback)
void GpuCrashTracker::OnShaderDebugInfo(const void* pShaderDebugInfo, const uint32_t shaderDebugInfoSize)
{
    // Make sure only one thread at a time...
    std::lock_guard<std::mutex> lock(m_mutex);

    // Get shader debug information identifier.
    GFSDK_Aftermath_ShaderDebugInfoIdentifier identifier = {};
    AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_GetShaderDebugInfoIdentifier(GFSDK_Aftermath_Version_API, pShaderDebugInfo, shaderDebugInfoSize, &identifier));

    // Write to file for later in-depth analysis of crash dumps with Nsight Graphics.
    WriteShaderDebugInformationToFile(identifier, pShaderDebugInfo, shaderDebugInfoSize);
}
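
The handler above relies on a WriteShaderDebugInformationToFile helper that is not shown. One part of such a helper is deriving a unique file name from the identifier. The sketch below assumes the identifier can be treated as two 64-bit values; DebugInfoId is a stand-in type so the example compiles without the SDK headers, and both the function name and the shader-<id>.nvdbg naming scheme are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Stand-in for the SDK's shader debug info identifier type
// (assumed here to hold two 64-bit values).
struct DebugInfoId
{
    uint64_t id[2];
};

// Hypothetical helper: builds "shader-<id0>-<id1>.nvdbg", where each id is
// printed as a zero-padded, 16-digit lowercase hexadecimal number.
std::string MakeShaderDebugInfoFileName(const DebugInfoId& identifier)
{
    std::ostringstream name;
    name << "shader-" << std::hex << std::setfill('0')
         << std::setw(16) << identifier.id[0] << '-'
         << std::setw(16) << identifier.id[1] << ".nvdbg";

    std::string result = name.str();
    // 7 ("shader-") + 16 + 1 ("-") + 16 + 6 (".nvdbg") characters.
    assert(result.size() == 46);
    return result;
}
```

Fixed-width hexadecimal fields keep the file names uniform in length and easy to match against identifiers reported elsewhere.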

By default, the shader debug information callback will be invoked for every shader that is compiled by the NVIDIA graphics driver. It is the responsibility of the implementation to handle those callbacks and store the data in case a GPU crash occurs. To simplify the process the Nsight Aftermath library can handle the caching of the debug information and only invoke the callback in case of a GPU crash and only for the shaders referenced in the corresponding GPU crash dump. This behavior is enabled by passing the GFSDK_Aftermath_GpuCrashDumpFeatureFlags_DeferDebugInfoCallbacks flag to GFSDK_Aftermath_EnableGpuCrashDumps when enabling GPU crash dumps.
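
If the deferred mode is not used, the application has to keep the debug information itself until it is needed. A minimal sketch of an in-memory cache for that purpose follows. The class name, its methods, and the DebugInfoKey stand-in for the SDK identifier type are all hypothetical; the mutex reflects the fact that callbacks may arrive on any thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

// Stand-in for the SDK's shader debug info identifier (assumption).
using DebugInfoKey = std::pair<uint64_t, uint64_t>;

// Hypothetical cache: keeps a private copy of each debug info blob.
class ShaderDebugInfoCache
{
public:
    // Copies the blob, because the callback's buffer is not owned by the application.
    void Store(const DebugInfoKey& key, const void* pData, uint32_t size)
    {
        assert(pData != nullptr || size == 0);
        const uint8_t* bytes = static_cast<const uint8_t*>(pData);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_blobs[key].assign(bytes, bytes + size);
    }

    // Copies the cached blob into 'out'; returns false if the key is unknown.
    bool Lookup(const DebugInfoKey& key, std::vector<uint8_t>& out) const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        const auto it = m_blobs.find(key);
        if (it == m_blobs.end())
        {
            return false;
        }
        out = it->second;
        return true;
    }

private:
    mutable std::mutex m_mutex;
    std::map<DebugInfoKey, std::vector<uint8_t>> m_blobs;
};
```

With such a cache, the shader debug information handler would call Store, and the crash handling path would call Lookup only for the identifiers it actually needs to write to disk.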

Note

All callback functions are free-threaded; the application is responsible for providing thread-safe callback handlers.

Handling Marker Resolve Callbacks

Note, this Aftermath feature is only supported on R495 and later NVIDIA graphics drivers.

For D3D applications which call GFSDK_Aftermath_SetEventMarker with a markerSize of zero, or for all Vulkan applications, the application passes a unique token (such as a pointer) identifying the marker to Aftermath, and is responsible for tracking the meaning of that token internally. When generating the crash dump, if a marker of interest has a size of zero, the crash dump process will invoke this callback, passing back the marker’s token, expecting the application to resolve the token to the actual marker data (such as a string) and pass the data back.

The association of token to marker data is completely up to the application. A pointer to the application-managed marker data is usually a viable token: if the application keeps all marker data in memory, different markers will have unique pointer values by definition.
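
As an illustration of using pointers as tokens, the following sketch stores marker strings in a node-based container, so that each string's address stays stable and can serve as its token. MarkerPool and its methods are hypothetical names, not part of the Aftermath SDK, and thread synchronization is omitted for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_set>

// Hypothetical pool of application-managed marker strings.
class MarkerPool
{
public:
    // Stores 'text' (deduplicated) and returns the address of the stored
    // string, which serves as the marker token.
    const void* Intern(const std::string& text)
    {
        return &*m_strings.insert(text).first;
    }

    // Resolves a token back to its string data. Returns false for unknown tokens.
    // The linear scan is acceptable here because resolution only happens
    // while a crash dump is being generated.
    bool Resolve(const void* token, const void** ppData, uint32_t* pSize) const
    {
        assert(ppData != nullptr && pSize != nullptr);
        for (const std::string& s : m_strings)
        {
            if (&s == token)
            {
                *ppData = s.data();
                *pSize = static_cast<uint32_t>(s.size());
                return true;
            }
        }
        return false;
    }

private:
    // Node-based container: element addresses remain valid across rehashing.
    std::unordered_set<std::string> m_strings;
};
```

Validating the token against the pool before dereferencing it avoids treating an arbitrary value passed to the callback as a valid string pointer.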

The pointer to the marker data passed back to the crash dump process from the application must be valid until the next call of ResolveMarkerCallback or for the duration of crash dump generation. Crash dump generation is complete when the GPU crash dump callback is called.

If the application does not implement this callback, or does not pass the resolved marker data and size back, then the original marker payload (or only the marker’s address for zero-size/Vulkan markers) will be added to the crash dump. The same will happen if the NVIDIA graphics driver does not support the feature, i.e., if the driver release version is earlier than R495.

If the marker is still considered valid, but the application does not have an associated payload (payload’s lifetime expired, marker is the payload, etc.), you may consider returning a string representation of the marker in the callback implementation.

// Simple data structure used to store marker tokens. This is managed by the application.
// It could be stored within the callbacks pUserData or in some other scope.
std::map<uint64_t, std::string> appManagedMarkers;

// Example handler for resolving markers (called by ResolveMarkerCallback)
void GpuCrashTracker::OnResolveMarker(const void* pMarkerData, const uint32_t markerDataSize, void** ppResolvedMarkerData, uint32_t* pResolvedMarkerDataSize)
{
    // In this example, we set the (uint64_t) key of the 'appManagedMarkers' to the 'markerData'
    // and set the 'markerDataSize' to zero when we call the 'GFSDK_Aftermath_SetEventMarker'.
    // So we are only interested in zero size markers here.
    if (markerDataSize == 0)
    {
        // Important: the pointer passed back via ppResolvedMarkerData must remain valid after this function returns.
        // Using references for all of the appManagedMarkers accesses ensures that the pointers refer to the persistent data
        const auto& foundMarker = appManagedMarkers.find((uint64_t)pMarkerData);
        if (foundMarker != appManagedMarkers.end())
        {
            const std::string& markerString = foundMarker->second;
            // std::string::data() will return a valid pointer until the string is next modified
            // we don't modify the string after calling data() here, so the pointer should remain valid
            *ppResolvedMarkerData = (void*)markerString.data();
            *pResolvedMarkerDataSize = (uint32_t)markerString.length();
            return;
        }
    }
    // Marker was not found or we are not interested in it. So we return without setting any resolved
    // marker data or size. The marker's original payload (or the marker's pointer/token for zero-size
    // marker) will be stored in the Aftermath crash dump.
    // When capturing lots of markers, it may be infeasible to maintain the lifetime of all markers.
    // It may be desirable to maintain only markers recorded in the most recent render calls.
}
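
The closing comment in the handler above suggests keeping only recently recorded markers. One way to sketch this is a small ring of per-frame maps, where starting a new frame discards the oldest frame's markers. The class name, the method names, the number of retained frames, and the token scheme are all assumptions for illustration; a real implementation would also need to synchronize access between the recording threads and the resolve callback.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical store that retains markers for the last few frames only.
class RecentMarkers
{
public:
    static constexpr std::size_t kFramesToKeep = 4;

    // Call once at the start of each frame; reuses (and clears) the oldest slot.
    void BeginFrame()
    {
        ++m_frame;
        m_frames[m_frame % kFramesToKeep].clear();
    }

    // Stores the marker text for the current frame and returns a unique,
    // non-zero token that can be passed as the marker pointer.
    uint64_t Add(const std::string& text)
    {
        auto& frameMarkers = m_frames[m_frame % kFramesToKeep];
        assert(frameMarkers.size() < kMaxMarkersPerFrame);
        const uint64_t token = m_frame * kMaxMarkersPerFrame + frameMarkers.size() + 1;
        frameMarkers[token] = text;
        return token;
    }

    // Returns the stored string, or nullptr if the token is unknown or expired.
    // The pointer stays valid until the owning frame slot is cleared.
    const std::string* Find(uint64_t token) const
    {
        for (const auto& frameMarkers : m_frames)
        {
            const auto it = frameMarkers.find(token);
            if (it != frameMarkers.end())
            {
                return &it->second;
            }
        }
        return nullptr;
    }

private:
    static constexpr uint64_t kMaxMarkersPerFrame = 1u << 20;
    std::array<std::map<uint64_t, std::string>, kFramesToKeep> m_frames;
    uint64_t m_frame = 0;
};
```

A resolve handler built on this would call Find with the token and return without setting the output parameters when the result is nullptr, so that the raw token ends up in the crash dump for expired markers.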

Handling Device Removed/Lost

After a GPU crash happens (e.g., after seeing device removed/lost in the application), call GFSDK_Aftermath_GetCrashDumpStatus to check the GPU crash dump status. The application should wait until Aftermath has finished processing the crash dump before exiting or terminating. Here is an example of how an application may handle the device removed event if it is set up for collecting Aftermath GPU crash dumps.

void Graphics::Present(void)
{
    [...]

    HRESULT hr = s_SwapChain1->Present(PresentInterval, 0);

    if (hr == DXGI_ERROR_DEVICE_REMOVED || hr == DXGI_ERROR_DEVICE_RESET)
    {
        [...]

        GFSDK_Aftermath_CrashDump_Status status = GFSDK_Aftermath_CrashDump_Status_Unknown;
        AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_GetCrashDumpStatus(&status));

        auto tStart = std::chrono::steady_clock::now();
        auto tElapsed = std::chrono::milliseconds::zero();

        // Loop while Aftermath crash dump data collection has not finished or
        // the application is still processing the crash dump data.
        while (status != GFSDK_Aftermath_CrashDump_Status_CollectingDataFailed &&
               status != GFSDK_Aftermath_CrashDump_Status_Finished &&
               tElapsed.count() < deviceLostTimeout)
        {
            // Sleep a couple of milliseconds and poll the status again.
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_GetCrashDumpStatus(&status));

            tElapsed = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - tStart);
        }

        if (status == GFSDK_Aftermath_CrashDump_Status_Finished)
        {
            Utility::Print("Aftermath finished processing the crash dump.\n");
        }
        else
        {
            Utility::Printf("Unexpected crash dump status after timeout: %d\n", status);
        }

        exit(-1);
    }

    [...]
}
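
The wait loop above is tied to the DXGI present path, but the same logic applies when a Vulkan call returns VK_ERROR_DEVICE_LOST. A possible way to share it is an API-independent helper that takes the status query as a callable. In this sketch, DumpStatus is a stand-in enum (not the SDK's status type) and WaitForCrashDump is a hypothetical name; in a real application the callable would wrap GFSDK_Aftermath_GetCrashDumpStatus.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Stand-in for the SDK's crash dump status values (assumption, not the SDK enum).
enum class DumpStatus
{
    Unknown,
    InProgress,
    CollectingDataFailed,
    Finished
};

// Polls 'queryStatus' until it reports a terminal state (failed or finished)
// or until 'timeout' has elapsed. Returns the last status observed.
DumpStatus WaitForCrashDump(const std::function<DumpStatus()>& queryStatus,
                            std::chrono::milliseconds timeout,
                            std::chrono::milliseconds pollInterval = std::chrono::milliseconds(50))
{
    assert(queryStatus != nullptr);

    const auto deadline = std::chrono::steady_clock::now() + timeout;
    DumpStatus status = queryStatus();

    while (status != DumpStatus::CollectingDataFailed &&
           status != DumpStatus::Finished &&
           std::chrono::steady_clock::now() < deadline)
    {
        // Sleep briefly and poll the status again.
        std::this_thread::sleep_for(pollInterval);
        status = queryStatus();
    }
    return status;
}
```

Both the D3D and the Vulkan error paths could then call this helper before releasing the device or terminating, and decide what to log based on the returned status.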

Disabling GPU Crash Dumps

To disable GPU crash dumps, simply call GFSDK_Aftermath_DisableGpuCrashDumps.

MyApp::~MyApp()
{
    [...]

    // Disable GPU crash dump creation.
    GFSDK_Aftermath_DisableGpuCrashDumps();

    [...]
}

Disabling GPU crash dumps in an application with GFSDK_Aftermath_DisableGpuCrashDumps will re-enable any Nsight Aftermath GPU crash dump monitor settings for this application if it is running on the system. After this, the GPU crash dump monitor will be notified of any GPU crash related to this process and will create GPU crash dumps and shader debug information files for all active devices.