NVIDIA Performance Primitives (NPP)
Version 10.0
Basic functions for library management, in particular library version and device property query functions.
Functions

const NppLibraryVersion * nppGetLibVersion (void)
Get the NPP library version.

NppGpuComputeCapability nppGetGpuComputeCapability (void)
What CUDA compute model is supported by the active CUDA device?

int nppGetGpuNumSMs (void)
Get the number of Streaming Multiprocessors (SM) on the active CUDA device.

int nppGetMaxThreadsPerBlock (void)
Get the maximum number of threads per block on the active CUDA device.

int nppGetMaxThreadsPerSM (void)
Get the maximum number of threads per SM for the active GPU.

int nppGetGpuDeviceProperties (int *pMaxThreadsPerSM, int *pMaxThreadsPerBlock, int *pNumberOfSMs)
Get the maximum number of threads per SM, maximum threads per block, and number of SMs for the active GPU.

const char * nppGetGpuName (void)
Get the name of the active CUDA device.

cudaStream_t nppGetStream (void)
Get the NPP CUDA stream.

unsigned int nppGetStreamNumSMs (void)
Get the number of SMs on the device associated with the current NPP CUDA stream.

unsigned int nppGetStreamMaxThreadsPerSM (void)
Get the maximum number of threads per SM on the device associated with the current NPP CUDA stream.

NppStatus nppSetStream (cudaStream_t hStream)
Set the NPP CUDA stream.

Basic functions for library management, in particular library version and device property query functions.
NppGpuComputeCapability nppGetGpuComputeCapability (void)
What CUDA compute model is supported by the active CUDA device?
Before calling any other NPP function, call this function to ensure that the current machine has a CUDA-capable device.
int nppGetGpuDeviceProperties (int *pMaxThreadsPerSM, int *pMaxThreadsPerBlock, int *pNumberOfSMs)
Get the maximum number of threads per SM, maximum threads per block, and number of SMs for the active GPU.
const char * nppGetGpuName (void)
Get the name of the active CUDA device.
int nppGetGpuNumSMs (void)
Get the number of Streaming Multiprocessors (SM) on the active CUDA device.
const NppLibraryVersion * nppGetLibVersion (void)
Get the NPP library version.
int nppGetMaxThreadsPerBlock (void)
Get the maximum number of threads per block on the active CUDA device.
int nppGetMaxThreadsPerSM (void)
Get the maximum number of threads per SM for the active GPU.
cudaStream_t nppGetStream (void)
Get the NPP CUDA stream.
NPP enables concurrent device tasks via a global stream state variable. By default the NPP stream is set to stream 0, i.e. non-concurrent mode. A user can set the NPP stream to any valid CUDA stream. All CUDA commands issued by NPP (e.g. kernels launched by the NPP library) are then issued to that NPP stream.
unsigned int nppGetStreamMaxThreadsPerSM (void)
Get the maximum number of threads per SM on the device associated with the current NPP CUDA stream.
NPP enables concurrent device tasks via a global stream state variable. By default the NPP stream is set to stream 0, i.e. non-concurrent mode. A user can set the NPP stream to any valid CUDA stream. All CUDA commands issued by NPP (e.g. kernels launched by the NPP library) are then issued to that NPP stream. Using this function avoids a cudaGetDeviceProperties() call.
unsigned int nppGetStreamNumSMs (void)
Get the number of SMs on the device associated with the current NPP CUDA stream.
NPP enables concurrent device tasks via a global stream state variable. By default the NPP stream is set to stream 0, i.e. non-concurrent mode. A user can set the NPP stream to any valid CUDA stream. All CUDA commands issued by NPP (e.g. kernels launched by the NPP library) are then issued to that NPP stream. Using this function avoids a cudaGetDeviceProperties() call.
NppStatus nppSetStream (cudaStream_t hStream)
Set the NPP CUDA stream.
This function now returns an error if a problem occurs with CUDA stream management. Call it only when nppGetStream() returns a stream different from the desired stream, since unnecessarily flushing the current stream can significantly affect performance.