How to Read Crash Dumps
The Nsight Aftermath library provides several functions for decoding GPU crash dumps and for querying data from them. To use these functions, include the GFSDK_Aftermath_GpuCrashDumpDecoding.h header file.
The first step in decoding a GPU crash dump is to create a decoder object for it by calling GFSDK_Aftermath_GpuCrashDump_CreateDecoder:
// Create a GPU crash dump decoder object for the GPU crash dump.
GFSDK_Aftermath_GpuCrashDump_Decoder decoder = {};
AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_GpuCrashDump_CreateDecoder(
GFSDK_Aftermath_Version_API,
pGpuCrashDump,
gpuCrashDumpSize,
&decoder));
After that, one or more decoder functions can be used to read data from the GPU crash dump. The data available in a GPU crash dump depends on the type of GPU crash and on the feature flags used when initializing Aftermath. For example, if the application does not use Aftermath event markers, querying Aftermath event marker information will fail. The decoder returns GFSDK_Aftermath_Result_NotAvailable if the requested data is not available. An implementation should check for this result explicitly and treat it as missing data rather than as a decoding error.
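The distinction between "no data recorded" and a real decoder failure can be kept in one place. The following is a minimal sketch, not part of the Aftermath API: `Result`, `QueryOutcome`, `Classify`, and `Describe` are hypothetical names, and `Result` is only a stand-in for GFSDK_Aftermath_Result so that the sketch compiles without the SDK headers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for GFSDK_Aftermath_Result so the sketch compiles
// without the Aftermath SDK. The enumerator values are illustrative only.
enum class Result : uint32_t { Success, NotAvailable, Fail };

// Three-way outcome of a decoder query.
enum class QueryOutcome { HasData, NoData, Error };

// Map a decoder result to an outcome. With the real SDK, the two tests
// would be GFSDK_Aftermath_SUCCEED(result) and a comparison against
// GFSDK_Aftermath_Result_NotAvailable, as in the examples in this section.
inline QueryOutcome Classify(Result result)
{
    if (result == Result::Fail)
        return QueryOutcome::Error;
    if (result == Result::NotAvailable)
        return QueryOutcome::NoData;
    return QueryOutcome::HasData;
}

// Human-readable description, e.g. for log output.
inline std::string Describe(QueryOutcome outcome)
{
    switch (outcome)
    {
    case QueryOutcome::HasData: return "data available";
    case QueryOutcome::NoData:  return "not recorded in this crash dump";
    case QueryOutcome::Error:   return "decoder error";
    }
    return "unknown";
}
```

A caller could then print the requested data only for `QueryOutcome::HasData`, log a short note for `QueryOutcome::NoData`, and report `QueryOutcome::Error` as a genuine problem.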
Here is an example of how to query GPU page fault information from a GPU crash dump using a previously created decoder object:
// Query GPU page fault information.
GFSDK_Aftermath_GpuCrashDump_PageFaultInfo pageFaultInfo = {};
GFSDK_Aftermath_Result result = GFSDK_Aftermath_GpuCrashDump_GetPageFaultInfo(decoder, &pageFaultInfo);
if (GFSDK_Aftermath_SUCCEED(result) && result != GFSDK_Aftermath_Result_NotAvailable)
{
// Print information about the GPU page fault.
Utility::Printf("GPU page fault at 0x%016llx", pageFaultInfo.faultingGpuVA);
Utility::Printf("Fault Type: %u", pageFaultInfo.faultType);
Utility::Printf("Access Type: %u", pageFaultInfo.accessType);
Utility::Printf("Engine: %u", pageFaultInfo.engine);
Utility::Printf("Client: %u", pageFaultInfo.client);
if (pageFaultInfo.resourceInfoCount > 0)
{
std::vector<GFSDK_Aftermath_GpuCrashDump_ResourceInfo> resourceInfos(pageFaultInfo.resourceInfoCount);
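// Check the result so that the loop below never prints zero-initialized entries after a failed query.
AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_GpuCrashDump_GetPageFaultResourceInfo(decoder, pageFaultInfo.resourceInfoCount, resourceInfos.data()));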
int index = 0;
for (const auto& resourceInfo : resourceInfos)
{
Utility::Printf("Resource[%d]", index);
Utility::Printf("\tFault in resource starting at 0x%016llx", resourceInfo.gpuVa);
Utility::Printf("\tSize of resource: (w x h x d x ml) = {%u, %u, %u, %u} = %llu bytes",
resourceInfo.width,
resourceInfo.height,
resourceInfo.depth,
resourceInfo.mipLevels,
resourceInfo.size);
Utility::Printf("\tFormat of resource: %u", resourceInfo.format);
Utility::Printf("\tResource was destroyed: %d", resourceInfo.bWasDestroyed);
index++;
}
}
}
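One way to interpret the addresses printed above is to compare the faulting address with the address range of each reported resource. The sketch below is an illustration only: `ResourceRange`, `ContainsAddress`, and `OffsetFromStart` are hypothetical helpers, and `ResourceRange` merely mirrors the `gpuVa` and `size` fields used in the example so that the code compiles without the SDK headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in holding only the two fields of
// GFSDK_Aftermath_GpuCrashDump_ResourceInfo that are used here.
struct ResourceRange
{
    uint64_t gpuVa; // Start address of the resource.
    uint64_t size;  // Size of the resource in bytes.
};

// Returns true if faultingGpuVA lies in [gpuVa, gpuVa + size).
// Written as a subtraction so that gpuVa + size cannot overflow.
inline bool ContainsAddress(const ResourceRange& r, uint64_t faultingGpuVA)
{
    return faultingGpuVA >= r.gpuVa && (faultingGpuVA - r.gpuVa) < r.size;
}

// Signed distance in bytes from the start of the resource to the fault.
// Negative values mean the fault lies before the start of the resource.
inline int64_t OffsetFromStart(const ResourceRange& r, uint64_t faultingGpuVA)
{
    return static_cast<int64_t>(faultingGpuVA - r.gpuVa);
}
```

An offset at or just past the resource size may point to an out-of-bounds access, while an address inside a resource that was already destroyed may point to a use-after-free. Both readings are heuristics and need to be confirmed against the application's own resource tracking.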
Some decoding functions require the caller to provide an appropriately sized buffer for the data they return. Each of these is paired with an additional function that queries the required buffer size. For example, querying all the active shaders at the time of the GPU crash or hang could look like this:
// First query active shaders count.
uint32_t shaderCount = 0;
GFSDK_Aftermath_Result result = GFSDK_Aftermath_GpuCrashDump_GetActiveShadersInfoCount(decoder, &shaderCount);
if (GFSDK_Aftermath_SUCCEED(result) && result != GFSDK_Aftermath_Result_NotAvailable)
{
// Allocate buffer for results.
std::vector<GFSDK_Aftermath_GpuCrashDump_ShaderInfo> shaderInfos(shaderCount);
// Query active shaders information.
result = GFSDK_Aftermath_GpuCrashDump_GetActiveShadersInfo(decoder, shaderCount, shaderInfos.data());
if (GFSDK_Aftermath_SUCCEED(result))
{
// Print information for each active shader
for (const GFSDK_Aftermath_GpuCrashDump_ShaderInfo& shaderInfo : shaderInfos)
{
Utility::Printf("Active shader: ShaderHash = 0x%016llx ShaderInstance = 0x%016llx ShaderType = %u",
shaderInfo.shaderHash,
shaderInfo.shaderInstance,
shaderInfo.shaderType);
}
}
}
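The "query the count, allocate, query the data" sequence shown above repeats for every array-style query, so it can be factored into a small helper. This is a sketch under assumptions: `QueryArray` and the two callables it takes are hypothetical adapters that a caller would wrap around the real decoder functions; they are not part of the Aftermath API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Generic helper for the "query count, allocate, query data" pattern.
//   queryCount: bool(uint32_t* pCount)           returns true on success
//   queryData:  bool(uint32_t count, T* pBuffer) returns true on success
// On failure the output vector is left empty.
template <typename T, typename CountFn, typename FillFn>
bool QueryArray(CountFn&& queryCount, FillFn&& queryData, std::vector<T>& out)
{
    out.clear();

    // First call: ask how many elements are available.
    uint32_t count = 0;
    if (!queryCount(&count))
        return false;

    // Nothing to fetch, so skip the second call.
    if (count == 0)
        return true;

    // Second call: fill a buffer of exactly the reported size.
    out.resize(count);
    if (!queryData(count, out.data()))
    {
        out.clear();
        return false;
    }
    return true;
}
```

For the active-shader query, the first callable would wrap GFSDK_Aftermath_GpuCrashDump_GetActiveShadersInfoCount and the second GFSDK_Aftermath_GpuCrashDump_GetActiveShadersInfo, each returning true only when the result passes the same checks as in the example above.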
Finally, the GPU crash dump decoder also provides functions for converting a GPU crash dump into JSON format. The code example below creates JSON from a GPU crash dump, including information about all the shaders active at the time of the GPU crash or hang. The information also includes the corresponding active shader warps, including their mapping back to intermediate assembly language (IL) instructions or source, if available. The latter requires the caller to also provide callback functions that the decoder invokes to query shader debug information and shader binaries (dxc shader object outputs or SPIR-V shader files). These callbacks are optional; implementations not interested in mapping shader instruction addresses to IL or source lines can simply pass nullptr. However, if shader instruction mapping is desired, the implementation must ensure that it can provide the necessary information to the decoder.
// Flags controlling what to include in the JSON data
const uint32_t jsonDecoderFlags =
GFSDK_Aftermath_GpuCrashDumpDecoderFlags_SHADER_INFO | // Include information about active shaders.
GFSDK_Aftermath_GpuCrashDumpDecoderFlags_WARP_STATE_INFO | // Include information about active shader warps.
GFSDK_Aftermath_GpuCrashDumpDecoderFlags_SHADER_MAPPING_INFO; // Try to map shader instruction addresses to shader lines.
// Query the size of the required results buffer
uint32_t jsonSize = 0;
GFSDK_Aftermath_Result result = GFSDK_Aftermath_GpuCrashDump_GenerateJSON(
decoder,
jsonDecoderFlags, // The flags controlling what information to include in the JSON.
GFSDK_Aftermath_GpuCrashDumpFormatterFlags_CONDENSED_OUTPUT, // Generate condensed output, i.e., omit all unnecessary whitespace.
ShaderDebugInfoLookupCallback, // Callback function invoked to find shader debug information data.
ShaderLookupCallback, // Callback function invoked to find shader binary data by shader hash.
ShaderSourceDebugInfoLookupCallback, // Callback function invoked to find shader source debug data by shader DebugName.
&m_gpuCrashDumpTracker, // User data that will be provided to the above callback functions.
&jsonSize); // Result of the call: size in bytes of the generated JSON data.
if (GFSDK_Aftermath_SUCCEED(result) && result != GFSDK_Aftermath_Result_NotAvailable)
{
// Allocate buffer for results.
std::vector<char> json(jsonSize);
// Query the generated JSON data that is cached inside the decoder object.
result = GFSDK_Aftermath_GpuCrashDump_GetJSON(
decoder,
static_cast<uint32_t>(json.size()), // The size parameter is a uint32_t; make the narrowing explicit.
json.data());
if (GFSDK_Aftermath_SUCCEED(result))
{
Utility::Printf("JSON: %s", json.data());
}
}
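Printing the buffer with `%s`, as above, relies on the JSON data being null-terminated. When the JSON is stored or passed on instead, it can help to convert the buffer into a `std::string` first. The following sketch assumes that the buffer may end with a terminating `'\0'`; `JsonBufferToString` and `WriteJson` are hypothetical helper names, not Aftermath functions.

```cpp
#include <algorithm>
#include <cassert>
#include <ostream>
#include <sstream> // std::ostringstream, handy for exercising WriteJson
#include <string>
#include <vector>

// Convert a JSON byte buffer, as filled in by a call like
// GFSDK_Aftermath_GpuCrashDump_GetJSON, into a std::string.
// Assumption: the buffer may end with a terminating '\0'. Everything from
// the first '\0' onward is dropped; buffers without a terminator are
// copied in full.
inline std::string JsonBufferToString(const std::vector<char>& json)
{
    const auto end = std::find(json.begin(), json.end(), '\0');
    return std::string(json.begin(), end);
}

// Write the JSON text to any output stream (file stream, string stream, ...).
// Returns false if the stream reports an error after the write.
inline bool WriteJson(std::ostream& os, const std::vector<char>& json)
{
    const std::string text = JsonBufferToString(json);
    os.write(text.data(), static_cast<std::streamsize>(text.size()));
    return static_cast<bool>(os);
}
```

Passing a `std::ofstream` opened in binary mode to `WriteJson` stores the JSON next to the crash dump file without a trailing terminator byte.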
Possible implementations for the shader debug information, shader binary, and shader source debug information lookup callbacks:
// Static callback wrapper for OnShaderDebugInfoLookup
void MyApp::ShaderDebugInfoLookupCallback(
const GFSDK_Aftermath_ShaderDebugInfoIdentifier* pIdentifier,
PFN_GFSDK_Aftermath_SetData setShaderDebugInfo,
void* pUserData)
{
GpuCrashTracker* pGpuCrashTracker = reinterpret_cast<GpuCrashTracker*>(pUserData);
pGpuCrashTracker->OnShaderDebugInfoLookup(*pIdentifier, setShaderDebugInfo);
}
// Static callback wrapper for OnShaderLookup
void MyApp::ShaderLookupCallback(
const GFSDK_Aftermath_ShaderBinaryHash* pShaderHash,
PFN_GFSDK_Aftermath_SetData setShaderBinary,
void* pUserData)
{
GpuCrashTracker* pGpuCrashTracker = reinterpret_cast<GpuCrashTracker*>(pUserData);
pGpuCrashTracker->OnShaderLookup(*pShaderHash, setShaderBinary);
}
// Static callback wrapper for OnShaderSourceDebugInfoLookup
void MyApp::ShaderSourceDebugInfoLookupCallback(
const GFSDK_Aftermath_ShaderDebugName* pShaderDebugName,
PFN_GFSDK_Aftermath_SetData setShaderBinary,
void* pUserData)
{
GpuCrashTracker* pGpuCrashTracker = reinterpret_cast<GpuCrashTracker*>(pUserData);
pGpuCrashTracker->OnShaderSourceDebugInfoLookup(*pShaderDebugName, setShaderBinary);
}
// Handler for shader debug information lookup callbacks.
// This is used by the JSON decoder for mapping shader instruction
// addresses to IL lines or source lines.
void GpuCrashTracker::OnShaderDebugInfoLookup(
const GFSDK_Aftermath_ShaderDebugInfoIdentifier& identifier,
PFN_GFSDK_Aftermath_SetData setShaderDebugInfo) const
{
// Search the list of shader debug information blobs received earlier.
auto i_debugInfo = m_shaderDebugInfo.find(identifier);
if (i_debugInfo == m_shaderDebugInfo.end())
{
// Early exit, nothing found. No need to call setShaderDebugInfo.
return;
}
// Let the GPU crash dump decoder know about the shader debug information
// that was found.
setShaderDebugInfo(i_debugInfo->second.data(), i_debugInfo->second.size());
}
// Handler for shader lookup callbacks.
// This is used by the JSON decoder for mapping shader instruction
// addresses to IL lines or source lines.
// NOTE: If the application loads stripped shader binaries, Aftermath
// will require access to both the stripped and the non-stripped
// shader binaries.
void GpuCrashTracker::OnShaderLookup(
const GFSDK_Aftermath_ShaderBinaryHash& shaderHash,
PFN_GFSDK_Aftermath_SetData setShaderBinary) const
{
// Find shader binary data for the shader hash in the shader database.
std::vector<uint8_t> shaderBinary;
if (!m_shaderDatabase.FindShaderBinary(shaderHash, shaderBinary))
{
// Early exit, nothing found. No need to call setShaderBinary.
return;
}
// Let the GPU crash dump decoder know about the shader data
// that was found.
setShaderBinary(shaderBinary.data(), shaderBinary.size());
}
// Handler for shader source debug info lookup callbacks.
// This is used by the JSON decoder for mapping shader instruction addresses to
// source lines if the shaders used by the application were compiled with
// separate debug info data files.
void GpuCrashTracker::OnShaderSourceDebugInfoLookup(
const GFSDK_Aftermath_ShaderDebugName& shaderDebugName,
PFN_GFSDK_Aftermath_SetData setShaderBinary) const
{
// Find source debug info for the shader DebugName in the shader database.
std::vector<uint8_t> sourceDebugInfo;
if (!m_shaderDatabase.FindSourceShaderDebugData(shaderDebugName, sourceDebugInfo))
{
// Early exit, nothing found. No need to call setShaderBinary.
return;
}
// Let the GPU crash dump decoder know about the shader debug data that
// was found.
setShaderBinary(sourceDebugInfo.data(), sourceDebugInfo.size());
}
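The handlers above assume a container such as `m_shaderDebugInfo` that can be searched by identifier. One possible way to set this up is sketched below, together with the static-wrapper pattern that recovers the object from the user-data pointer. All names here are hypothetical stand-ins so that the sketch compiles without the SDK headers: `DebugInfoId` stands in for the identifier type, `SetDataFn` for PFN_GFSDK_Aftermath_SetData, and `DebugInfoStore` for the tracker class.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real types come from the Aftermath headers.
struct DebugInfoId { uint64_t id[2]; };
using SetDataFn = void (*)(const void* pData, uint32_t size);

// Strict weak ordering so the identifier can be used as a std::map key.
struct DebugInfoIdLess
{
    bool operator()(const DebugInfoId& a, const DebugInfoId& b) const
    {
        if (a.id[0] != b.id[0])
            return a.id[0] < b.id[0];
        return a.id[1] < b.id[1];
    }
};

class DebugInfoStore
{
public:
    // Remember a debug information blob under its identifier.
    void Add(const DebugInfoId& id, std::vector<uint8_t> blob)
    {
        m_blobs[id] = std::move(blob);
    }

    // Calls setData only if a blob is stored for the identifier.
    void OnLookup(const DebugInfoId& id, SetDataFn setData) const
    {
        const auto it = m_blobs.find(id);
        if (it == m_blobs.end())
            return; // Nothing found; the setter is not called.
        setData(it->second.data(), static_cast<uint32_t>(it->second.size()));
    }

    // Static wrapper: recovers the object from the user-data pointer and
    // forwards the call to the member function.
    static void LookupCallback(const DebugInfoId* pId, SetDataFn setData, void* pUserData)
    {
        static_cast<const DebugInfoStore*>(pUserData)->OnLookup(*pId, setData);
    }

private:
    std::map<DebugInfoId, std::vector<uint8_t>, DebugInfoIdLess> m_blobs;
};
```

A hash map with a custom hash would work equally well; the ordered map is used here only because it needs nothing beyond a comparison function.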
Last, destroy the decoder object once it is no longer needed to free all memory allocated for it:
// Destroy the GPU crash dump decoder object.
AFTERMATH_CHECK_ERROR(GFSDK_Aftermath_GpuCrashDump_DestroyDecoder(decoder));
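If the decoding code has several early exits, the destroy call is easy to miss. A scope guard is one way to make the cleanup automatic. The sketch below is generic C++ and not part of the Aftermath API; `ScopeExit` and `MakeScopeExit` are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Minimal scope guard: runs the stored callable exactly once when the guard
// goes out of scope, including on early returns and exceptions.
template <typename Fn>
class ScopeExit
{
public:
    explicit ScopeExit(Fn fn) : m_fn(std::move(fn)), m_active(true) {}

    // Moving transfers responsibility for running the callable.
    ScopeExit(ScopeExit&& other)
        : m_fn(std::move(other.m_fn)), m_active(other.m_active)
    {
        other.m_active = false;
    }

    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;

    ~ScopeExit()
    {
        if (m_active)
            m_fn();
    }

private:
    Fn m_fn;
    bool m_active;
};

// Helper that deduces the callable type, so lambdas can be used directly.
template <typename Fn>
ScopeExit<Fn> MakeScopeExit(Fn fn)
{
    return ScopeExit<Fn>(std::move(fn));
}

// Possible use right after the decoder has been created:
//
//   auto destroyDecoder = MakeScopeExit([&] {
//       GFSDK_Aftermath_GpuCrashDump_DestroyDecoder(decoder);
//   });
```

The guard ignores the result of the destroy call, which is usually acceptable during cleanup; code that needs to report a failed destroy should keep the explicit call shown above.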