NVIDIA Performance Primitives (NPP)  Version 10.1
NPP Core

Basic functions for library management, in particular library version and device property query functions.

Functions

const NppLibraryVersion * nppGetLibVersion (void)
 Get the NPP library version.

NppGpuComputeCapability nppGetGpuComputeCapability (void)
 What CUDA compute model is supported by the active CUDA device?

int nppGetGpuNumSMs (void)
 Get the number of Streaming Multiprocessors (SM) on the active CUDA device.

int nppGetMaxThreadsPerBlock (void)
 Get the maximum number of threads per block on the active CUDA device.

int nppGetMaxThreadsPerSM (void)
 Get the maximum number of threads per SM for the active GPU.

int nppGetGpuDeviceProperties (int *pMaxThreadsPerSM, int *pMaxThreadsPerBlock, int *pNumberOfSMs)
 Get the maximum number of threads per SM, maximum threads per block, and number of SMs for the active GPU.

const char * nppGetGpuName (void)
 Get the name of the active CUDA device.

cudaStream_t nppGetStream (void)
 Get the NPP CUDA stream.

NppStatus nppGetStreamContext (NppStreamContext *pNppStreamContext)
 Get the current NPP managed CUDA stream context as set by calls to nppSetStream().

unsigned int nppGetStreamNumSMs (void)
 Get the number of SMs on the device associated with the current NPP CUDA stream.

unsigned int nppGetStreamMaxThreadsPerSM (void)
 Get the maximum number of threads per SM on the device associated with the current NPP CUDA stream.

NppStatus nppSetStream (cudaStream_t hStream)
 Set the NPP CUDA stream.
 

Detailed Description

Basic functions for library management, in particular library version and device property query functions.

Function Documentation

NppGpuComputeCapability nppGetGpuComputeCapability ( void  )

What CUDA compute model is supported by the active CUDA device?

Before calling any NPP functions, the user should call this function to ensure that the current machine has a CUDA-capable device.

NOTE THAT THIS FUNCTION WILL BE DEPRECATED IN THE NEXT NPP RELEASE. INSTEAD CALL cudaGetDevice() TO GET THE GPU DEVICE ID THEN cudaDeviceGetAttribute() TWICE, ONCE WITH THE cudaDevAttrComputeCapabilityMajor PARAMETER AND ONCE WITH THE cudaDevAttrComputeCapabilityMinor PARAMETER.

Returns
An enum value indicating whether a CUDA-capable device was found and which level of compute capability it supports.
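
The replacement recommended in the note above can be sketched as follows. This is a minimal sketch rather than NPP code: `pack_compute_capability`, `query_compute_capability`, and the `HAVE_CUDA_TOOLKIT` build flag are hypothetical names introduced for illustration, and the packed encoding (7.5 becomes 75) is an assumption chosen for easy comparison, not the NppGpuComputeCapability enum. The CUDA calls are compiled only when the flag is defined, so the helper can be checked on a machine without a GPU.

```c
/* Pack a compute capability into one comparable integer, e.g. 7.5 -> 75.
   Illustrative encoding only; it is not the NppGpuComputeCapability enum. */
int pack_compute_capability(int ccMajor, int ccMinor)
{
    return ccMajor * 10 + ccMinor;
}

#ifdef HAVE_CUDA_TOOLKIT /* hypothetical flag: define it when building against CUDA */
#include <cuda_runtime_api.h>

/* Returns the packed compute capability of the active device, or -1 on error. */
int query_compute_capability(void)
{
    int device = 0, ccMajor = 0, ccMinor = 0;

    /* Step 1: get the ID of the active device. */
    if (cudaGetDevice(&device) != cudaSuccess)
        return -1;
    /* Step 2: query the major and minor parts separately. */
    if (cudaDeviceGetAttribute(&ccMajor, cudaDevAttrComputeCapabilityMajor, device) != cudaSuccess)
        return -1;
    if (cudaDeviceGetAttribute(&ccMinor, cudaDevAttrComputeCapabilityMinor, device) != cudaSuccess)
        return -1;
    return pack_compute_capability(ccMajor, ccMinor);
}
#endif
```

A caller could then compare the result against a minimum, for example `query_compute_capability() >= 60`, and treat -1 as "no usable device".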

int nppGetGpuDeviceProperties ( int * pMaxThreadsPerSM, int * pMaxThreadsPerBlock, int * pNumberOfSMs )

Get the maximum number of threads per SM, maximum threads per block, and number of SMs for the active GPU.

Returns
cudaSuccess for success, -1 for failure
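
As a hedged illustration of how the three output parameters might be used together, the sketch below queries them in one call and derives an upper bound on resident threads. `resident_thread_capacity`, `print_device_summary`, and the `HAVE_CUDA_TOOLKIT` build flag are hypothetical names, not part of NPP; the NPP call is compiled only when the flag is defined.

```c
/* Upper bound on the number of threads resident on the whole device at once. */
long resident_thread_capacity(int maxThreadsPerSM, int numSMs)
{
    return (long)maxThreadsPerSM * (long)numSMs;
}

#ifdef HAVE_CUDA_TOOLKIT /* hypothetical flag: define it when building against NPP */
#include <stdio.h>
#include <npp.h>

/* Queries all three properties with one call; returns 0 on success, -1 on failure. */
int print_device_summary(void)
{
    int maxThreadsPerSM = 0, maxThreadsPerBlock = 0, numSMs = 0;

    /* The function reports cudaSuccess on success and -1 on failure. */
    if (nppGetGpuDeviceProperties(&maxThreadsPerSM, &maxThreadsPerBlock, &numSMs) != cudaSuccess)
        return -1;

    printf("SMs: %d, threads/SM: %d, threads/block: %d, resident threads: %ld\n",
           numSMs, maxThreadsPerSM, maxThreadsPerBlock,
           resident_thread_capacity(maxThreadsPerSM, numSMs));
    return 0;
}
#endif
```

A single call avoids querying nppGetMaxThreadsPerSM(), nppGetMaxThreadsPerBlock(), and nppGetGpuNumSMs() separately.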

const char* nppGetGpuName ( void  )

Get the name of the active CUDA device.

Returns
Name string of the active graphics-card/compute device in a system.

int nppGetGpuNumSMs ( void  )

Get the number of Streaming Multiprocessors (SM) on the active CUDA device.

Returns
Number of SMs on the active CUDA device.

const NppLibraryVersion* nppGetLibVersion ( void  )

Get the NPP library version.

Returns
A struct containing separate values for major and minor revision and build number.
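
One common use of the returned struct is a minimum-version check at startup. The sketch below assumes the struct exposes `major`, `minor`, and `build` integer fields, as described above; `version_at_least`, `npp_is_at_least`, and the `HAVE_CUDA_TOOLKIT` build flag are hypothetical names introduced for illustration.

```c
/* Returns 1 if verMajor.verMinor is at least reqMajor.reqMinor, else 0. */
int version_at_least(int verMajor, int verMinor, int reqMajor, int reqMinor)
{
    if (verMajor != reqMajor)
        return verMajor > reqMajor;
    return verMinor >= reqMinor;
}

#ifdef HAVE_CUDA_TOOLKIT /* hypothetical flag: define it when building against NPP */
#include <stdio.h>
#include <npp.h>

/* Prints the linked NPP version and checks it against a required minimum. */
int npp_is_at_least(int reqMajor, int reqMinor)
{
    const NppLibraryVersion *v = nppGetLibVersion();

    if (v == NULL)
        return 0;
    printf("NPP %d.%d (build %d)\n", v->major, v->minor, v->build);
    return version_at_least(v->major, v->minor, reqMajor, reqMinor);
}
#endif
```

For example, `npp_is_at_least(10, 1)` would report whether the linked library is at least the version this page documents.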

int nppGetMaxThreadsPerBlock ( void  )

Get the maximum number of threads per block on the active CUDA device.

Returns
Maximum number of threads per block on the active CUDA device.

int nppGetMaxThreadsPerSM ( void  )

Get the maximum number of threads per SM for the active GPU.

Returns
Maximum number of threads per SM for the active GPU.

cudaStream_t nppGetStream ( void  )

Get the NPP CUDA stream.

NPP enables concurrent device tasks via a global stream state variable. The NPP stream is set to stream 0 by default, i.e. non-concurrent mode. A user can set the NPP stream to any valid CUDA stream. All CUDA commands issued by NPP (e.g. kernels launched by the NPP library) are then issued to that NPP stream.

NppStatus nppGetStreamContext ( NppStreamContext * pNppStreamContext )

Get the current NPP managed CUDA stream context as set by calls to nppSetStream().

NPP enables concurrent device tasks via an NPP-maintained global stream state context. The NPP stream is set to stream 0 by default, i.e. non-concurrent mode. A user can either set the NPP stream to any valid CUDA stream, which updates the current NPP-managed stream state context, or supply application-initialized stream contexts to NPP calls. All CUDA commands issued by NPP (e.g. kernels launched by the NPP library) are then issued to the current NPP-managed stream or to the application-supplied stream context, depending on whether a stream context is passed to the NPP function. NPP-managed stream context calls (those without stream context parameters) can be intermixed with application-managed stream context calls, but NPP-managed calls always use the most recent stream set by nppSetStream(), or the NULL stream if nppSetStream() has never been called.

unsigned int nppGetStreamMaxThreadsPerSM ( void  )

Get the maximum number of threads per SM on the device associated with the current NPP CUDA stream.

NPP enables concurrent device tasks via a global stream state variable. The NPP stream is set to stream 0 by default, i.e. non-concurrent mode. A user can set the NPP stream to any valid CUDA stream. All CUDA commands issued by NPP (e.g. kernels launched by the NPP library) are then issued to that NPP stream. This call avoids a cudaGetDeviceProperties() call.

unsigned int nppGetStreamNumSMs ( void  )

Get the number of SMs on the device associated with the current NPP CUDA stream.

NPP enables concurrent device tasks via a global stream state variable. The NPP stream is set to stream 0 by default, i.e. non-concurrent mode. A user can set the NPP stream to any valid CUDA stream. All CUDA commands issued by NPP (e.g. kernels launched by the NPP library) are then issued to that NPP stream. This call avoids a cudaGetDeviceProperties() call.

NppStatus nppSetStream ( cudaStream_t  hStream)

Set the NPP CUDA stream.

This function now returns an error if a problem occurs with CUDA stream management. It should only be called if nppGetStream() returns a stream different from the desired stream, because unnecessarily flushing the current stream can significantly affect performance.

See Also
nppGetStream()
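
The "check before setting" advice for nppSetStream() can be sketched as a small guard. This is an illustrative sketch under stated assumptions: `stream_switch_needed`, `set_npp_stream_if_needed`, and the `HAVE_CUDA_TOOLKIT` build flag are hypothetical names, and stream handles are compared as opaque pointers on the assumption that cudaStream_t is a pointer type.

```c
/* Returns 1 when the desired stream differs from the current one, else 0.
   Handles are treated as opaque pointers; nothing is dereferenced. */
int stream_switch_needed(const void *current, const void *desired)
{
    return current != desired;
}

#ifdef HAVE_CUDA_TOOLKIT /* hypothetical flag: define it when building against NPP */
#include <npp.h>

/* Switches the NPP stream only when it actually changes, so the current
   stream is not flushed unnecessarily. Returns the nppSetStream() status,
   or NPP_SUCCESS when no switch was needed. */
NppStatus set_npp_stream_if_needed(cudaStream_t desired)
{
    if (!stream_switch_needed((const void *)nppGetStream(), (const void *)desired))
        return NPP_SUCCESS;
    return nppSetStream(desired);
}
#endif
```

Callers should still check the returned status, since nppSetStream() reports stream management errors.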

Copyright © 2009-2018 NVIDIA Corporation