enum CUpti_ProfilerRange
A metric enabled in the session's configuration is collected separately for each unique range-stack in a pass. This attribute selects whether metrics are collected around each kernel in a profiling session or around a user-defined range.
Metrics that require multipass collection need the GPU kernel(s) to be replayed. This attribute specifies how the kernel(s) being measured are replayed.
CUptiResult cuptiProfilerBeginPass(CUpti_Profiler_BeginPass_Params *pParams)
These APIs (cuptiProfilerBeginPass and cuptiProfilerEndPass) are used when the user performs the replay itself, with CUPTI_UserReplay or CUPTI_ApplicationReplay, for multipass collection of the configured metrics. The call is a no-op under CUPTI_KernelReplay.
CUptiResult cuptiProfilerBeginSession(CUpti_Profiler_BeginSession_Params *pParams)
This call does not start profiling, but it allocates the GPU resources needed for profiling. Outside of a session, the GPU returns to its normal operating state.
CUptiResult cuptiProfilerCounterDataImageCalculateScratchBufferSize(CUpti_Profiler_CounterDataImage_CalculateScratchBufferSize_Params *pParams)
Use these APIs to calculate the allocation size of the counterData image scratch buffer and to initialize it.
CUptiResult cuptiProfilerCounterDataImageCalculateSize(CUpti_Profiler_CounterDataImage_CalculateSize_Params *pParams)
The user bears the responsibility for managing the counterDataImage allocations. CounterDataPrefix contains metadata about the metrics that will be stored in counterDataImage. Use these APIs to calculate the allocation size of the counterData image and to initialize it.
CUptiResult cuptiProfilerDisableProfiling(CUpti_Profiler_DisableProfiling_Params *pParams)
In CUPTI_AutoRange, these APIs (cuptiProfilerEnableProfiling and cuptiProfilerDisableProfiling) enable or disable profiling for the kernels executed in a profiling session.
CUptiResult cuptiProfilerEnableProfiling(CUpti_Profiler_EnableProfiling_Params *pParams)
In CUPTI_AutoRange, these APIs (cuptiProfilerEnableProfiling and cuptiProfilerDisableProfiling) enable or disable profiling for the kernels executed in a profiling session.
CUptiResult cuptiProfilerEndPass(CUpti_Profiler_EndPass_Params *pParams)
These APIs (cuptiProfilerBeginPass and cuptiProfilerEndPass) are used when the user performs the replay itself, with CUPTI_UserReplay or CUPTI_ApplicationReplay, for multipass collection of the configured metrics. The call is a no-op under CUPTI_KernelReplay. Returns information for the next pass.
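The begin-pass/end-pass pairing for user-driven replay can be sketched with a small stand-alone model. This is not CUPTI code: `MockSession`, `mock_begin_pass`, `mock_end_pass`, `replay_until_done`, and `count_launch` are hypothetical names, and the `all_passes_submitted` flag is only an assumption about what the "information for the next pass" might include. The sketch shows the control flow alone: the same workload is rerun once per pass until the end-pass step reports that no further pass is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a profiling session; NOT a CUPTI type. */
typedef struct {
    size_t passes_needed;   /* passes the metric configuration requires */
    size_t pass_index;      /* index of the next pass to collect */
    int    in_pass;         /* nonzero between begin-pass and end-pass */
} MockSession;

/* Models the begin-pass call: opens the next pass. */
static int mock_begin_pass(MockSession *s) {
    if (s->in_pass || s->pass_index >= s->passes_needed) return -1;
    s->in_pass = 1;
    return 0;
}

/* Models the end-pass call: closes the pass and reports whether
 * every required pass has now been submitted (assumed flag). */
static int mock_end_pass(MockSession *s, int *all_passes_submitted) {
    if (!s->in_pass) return -1;
    s->in_pass = 0;
    s->pass_index++;
    *all_passes_submitted = (s->pass_index >= s->passes_needed);
    return 0;
}

/* User-driven replay: rerun the same workload once per pass.
 * Returns the number of times the workload ran, or -1 on error. */
static int replay_until_done(MockSession *s, void (*workload)(void *), void *arg) {
    int runs = 0;
    int done = 0;
    assert(s != NULL && workload != NULL);
    while (!done) {
        if (mock_begin_pass(s) != 0) return -1;
        workload(arg);                 /* identical work in every pass */
        runs++;
        if (mock_end_pass(s, &done) != 0) return -1;
    }
    return runs;
}

/* Example workload: counts how many times it was invoked. */
static void count_launch(void *arg) {
    (*(int *)arg)++;
}
```

In this model, a session whose configuration needs three passes runs the workload three times, so the workload must produce the same GPU work on every run for the per-pass counters to be comparable.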
CUptiResult cuptiProfilerEndSession(CUpti_Profiler_EndSession_Params *pParams)
Frees the GPU resources acquired for profiling. Outside of a session, the GPU returns to its normal operating state.
CUptiResult cuptiProfilerFlushCounterData(CUpti_Profiler_FlushCounterData_Params *pParams)
Flushes counter data to ensure that every pass is decoded into the counterDataImage passed to cuptiProfilerBeginSession. This causes a CPU/GPU synchronization to collect all undecoded passes.
CUptiResult cuptiProfilerInitialize(CUpti_Profiler_Initialize_Params *pParams)
Loads the required libraries into the process address space and sets up the hooks with the CUDA driver.
CUptiResult cuptiProfilerPopRange(CUpti_Profiler_PopRange_Params *pParams)
Counter data is collected per unique range-stack, and each range is identified by a string label passed by the user. This is an invalid operation under CUPTI_AutoRange.
CUptiResult cuptiProfilerPushRange(CUpti_Profiler_PushRange_Params *pParams)
Counter data is collected per unique range-stack, and each range is identified by a string label passed by the user. This is an invalid operation under CUPTI_AutoRange.
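To illustrate what "per unique range-stack" means, the following stand-alone sketch models nested range labels as a stack and derives one key per distinct stack. It is not CUPTI code: `RangeStack`, `push_range`, `pop_range`, and `range_key` are hypothetical names, and joining labels with `/` is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 8
#define MAX_KEY   128

/* Hypothetical model of nested user ranges; NOT a CUPTI type. */
typedef struct {
    const char *labels[MAX_DEPTH];  /* labels from outermost to innermost */
    int depth;                      /* number of currently open ranges */
} RangeStack;

/* Opens a nested range identified by a user-supplied string label. */
static int push_range(RangeStack *s, const char *label) {
    if (s->depth >= MAX_DEPTH) return -1;
    s->labels[s->depth++] = label;
    return 0;
}

/* Closes the innermost open range; fails if none is open. */
static int pop_range(RangeStack *s) {
    if (s->depth == 0) return -1;
    s->depth--;
    return 0;
}

/* Builds a key for the current range-stack by joining the labels with '/'.
 * Two different stacks of labels produce two different keys, so counters
 * stored under these keys stay separate. Output is truncated to fit cap. */
static void range_key(const RangeStack *s, char *out, size_t cap) {
    assert(cap > 0);
    out[0] = '\0';
    for (int i = 0; i < s->depth; i++) {
        if (i > 0) strncat(out, "/", cap - strlen(out) - 1);
        strncat(out, s->labels[i], cap - strlen(out) - 1);
    }
}
```

In this model, pushing "frame" then "shadow" yields the key `frame/shadow`, while popping and pushing "lighting" yields `frame/lighting`; data recorded under the two keys would be kept apart even though both share the outer "frame" range.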
CUptiResult cuptiProfilerSetConfig(CUpti_Profiler_SetConfig_Params *pParams)
Use these APIs to set the configuration to profile in a session. They can be used for advanced cases, such as collecting multiple configurations into a single counterData image as needed, without restarting the session.