Expected Workflow
When debugging or profiling, it is important to narrow your investigation to the path that yields the most impactful and actionable data, so that you can draw conclusions and solve problems. Nsight Graphics provides tools to fit each of these workflow scenarios.
When debugging a rendering problem, Nsight Graphics’s Graphics Capture is the place to start for Vulkan and D3D12 applications. This tool enables the inspection of events, API state, resource values, and dependencies to understand where your application might have issues. For OpenGL applications, use the OpenGL Frame Debugger.
When profiling a graphical application, the first step is to determine whether you are CPU bound or GPU bound. If you are CPU bound, the CPU cannot issue work fast enough to take full advantage of the GPU’s processing power. If you are GPU bound, the GPU cannot process the work it is issued fast enough, and your engine may stall. One tool you can use to determine the limiting factor is Nsight Systems™. Nsight Systems is a system-wide performance analysis tool designed to visualize an application’s algorithms, help you select the largest opportunities to optimize, and tune to scale efficiently across any quantity of CPUs and GPUs in your computer.
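As a rough illustration of the CPU-bound versus GPU-bound distinction, the following sketch compares per-frame CPU and GPU times that you have already measured by some other means (for example, with timers in your engine). It is not part of Nsight Graphics or Nsight Systems; the function name `classify_bottleneck`, its parameters, and the 10% margin are hypothetical choices made for this example.

```python
from statistics import median


def classify_bottleneck(cpu_ms, gpu_ms, margin=0.10):
    """Classify a run as 'cpu-bound', 'gpu-bound', or 'balanced'.

    cpu_ms: per-frame CPU time spent preparing and submitting work (hypothetical input).
    gpu_ms: per-frame GPU time spent executing that work (hypothetical input).
    margin: relative difference required before declaring a clear bottleneck
            (an arbitrary threshold chosen for this sketch).
    """
    if not cpu_ms or len(cpu_ms) != len(gpu_ms):
        raise ValueError("cpu_ms and gpu_ms must be non-empty and the same length")

    # Use the median so that a few outlier frames do not dominate the verdict.
    cpu = median(cpu_ms)
    gpu = median(gpu_ms)

    if gpu > cpu * (1 + margin):
        return "gpu-bound"   # the GPU takes clearly longer than the CPU per frame
    if cpu > gpu * (1 + margin):
        return "cpu-bound"   # the CPU cannot feed the GPU fast enough
    return "balanced"        # neither side is clearly the limiting factor


if __name__ == "__main__":
    print(classify_bottleneck([4.0, 4.2, 4.1], [9.8, 10.1, 10.0]))
```

A simple comparison like this only gives a first hint; a system-wide timeline is still needed to see where the stalls actually occur.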
If you have determined that you are GPU bound, the GPU Trace Profiler within Nsight Graphics offers a deep analysis of your application’s performance by reporting GPU performance counters, as well as tracing the execution of your shaders on the SM across a series of frames. Another key technique in optimizing performance is to take advantage of the GPU’s ability to process work in parallel by running compute and graphics simultaneously (SCG), also known as async compute. GPU Trace allows you both to see opportunities for async compute and to confirm and measure the impact of async compute on your frame.
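To make the idea of measuring async compute overlap concrete, here is a minimal sketch that sums the time during which graphics work and compute work are both active within a frame. It assumes you have exported or recorded busy intervals for each queue yourself; the function `overlap_ms` and the `(start_ms, end_ms)` tuple format are hypothetical and do not correspond to any GPU Trace data format or API.

```python
def overlap_ms(graphics, compute):
    """Return the total time (ms) during which both queues are busy.

    graphics, compute: lists of (start_ms, end_ms) tuples, each assumed to be
    sorted by start time and non-overlapping within its own list.
    """
    i = j = 0
    total = 0.0
    while i < len(graphics) and j < len(compute):
        # The intersection of the two current intervals, if any.
        start = max(graphics[i][0], compute[j][0])
        end = min(graphics[i][1], compute[j][1])
        if end > start:
            total += end - start

        # Advance whichever interval finishes first.
        if graphics[i][1] < compute[j][1]:
            i += 1
        else:
            j += 1
    return total


if __name__ == "__main__":
    graphics_busy = [(0.0, 6.0), (8.0, 12.0)]  # made-up sample intervals
    compute_busy = [(4.0, 9.0)]                # made-up sample intervals
    print(overlap_ms(graphics_busy, compute_busy))
```

Comparing this overlap figure before and after enabling async compute, alongside the total frame time, gives a simple check of whether the two workloads are actually running concurrently.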
If you have determined that you are CPU bound, use a CPU profiling tool to discover how you can eliminate inefficiencies and issue work to the GPU faster. You may also want to look into the overhead of the graphics API constructs you are using and determine whether lighter-weight constructs can offer the same effect at lower cost. The Graphics Capture tool is an excellent resource for inspecting API calls while you make these adjustments to your engine.