NVIDIA NvNeural SDK  2021.2
GPU inference framework for NVIDIA Nsight Deep Learning Designer
nvneural::ICudaCompiledFunction Class Reference (abstract)

Represents a runtime-compiled function object from ICudaRuntimeCompiler.

#include <nvneural/CudaTypes.h>

Inherits nvneural::IRefObject.

Public Member Functions

virtual const void * compiledBinary () const noexcept=0
 Provides a pointer to the compiled binary representation (PTX or cubin) of this function.
 
virtual std::size_t compiledBinarySize () const noexcept=0
 Provides the size of the compiled binary (PTX or cubin) representation of this function in bytes.
 
virtual CUfunction function () const noexcept=0
 Returns the CUfunction represented by this function object.
 
virtual NeuralResult launch (INetworkBackendCuda *pBackend, std::size_t gridSizeX, std::size_t gridSizeY, std::size_t gridSizeZ, std::size_t blockSizeX, std::size_t blockSizeY, std::size_t blockSizeZ, void **ppArguments, std::uint32_t smem) const noexcept=0
 Launches the function on the specified CUDA backend's stream.
 
virtual CUmodule module () const noexcept=0
 Returns the CUmodule containing this function object.
 
- Public Member Functions inherited from nvneural::IRefObject
virtual RefCount addRef () const noexcept=0
 Increments the object's reference count.
 
virtual const void * queryInterface (TypeId interface) const noexcept=0
 This is an overloaded member function, provided for convenience. It differs from the non-const overload only in its const qualification and return type.
 
virtual void * queryInterface (TypeId interface) noexcept=0
 Retrieves a new object interface pointer.
 
virtual RefCount release () const noexcept=0
 Decrements the object's reference count and destroys the object if the reference count reaches zero.
 

Static Public Attributes

static const IRefObject::TypeId typeID = 0x467d6d0e91bcc332ul
 Interface TypeId for InterfaceOf purposes.
 
- Static Public Attributes inherited from nvneural::IRefObject
static const TypeId typeID = 0x14ecc3f9de638e1dul
 Interface TypeId for InterfaceOf purposes.
 

Additional Inherited Members

- Public Types inherited from nvneural::IRefObject
using RefCount = std::uint32_t
 Typedef used to track the number of active references to an object.
 
using TypeId = std::uint64_t
 Every interface must define a unique TypeId. This should be randomized.
 
- Protected Member Functions inherited from nvneural::IRefObject
virtual ~IRefObject ()=default
 A protected destructor prevents accidental stack-allocation of IRefObjects or use with other smart pointer classes like std::unique_ptr.
 

Detailed Description

Represents a runtime-compiled function object from ICudaRuntimeCompiler.

Member Function Documentation

◆ compiledBinary()

virtual const void* nvneural::ICudaCompiledFunction::compiledBinary ( ) const
pure virtual, noexcept

Provides a pointer to the compiled binary representation (PTX or cubin) of this function.

The size of the buffer is given by compiledBinarySize. If no binary representation is available, returns nullptr. The pointer returned by this function remains valid as long as this object does.
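Because the returned pointer is only valid while the function object lives, a caller that wants to keep the bytes (for example, to cache them on disk) needs its own copy. The following is a minimal sketch, assuming only the two accessors documented on this page. `copyCompiledBinary` and `FakeCompiledFunction` are hypothetical names that are not part of NvNeural; the template parameter stands in for ICudaCompiledFunction so the snippet compiles without the SDK headers.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of NvNeural): copies the compiled binary into
// an owned buffer. FunctionT is any type exposing compiledBinary() and
// compiledBinarySize() with the signatures documented above.
template <typename FunctionT>
std::vector<unsigned char> copyCompiledBinary(const FunctionT& function)
{
    const void* const pBinary = function.compiledBinary();
    const std::size_t size = function.compiledBinarySize();

    std::vector<unsigned char> copy;
    if (pBinary == nullptr || size == 0)
    {
        return copy; // no binary representation is available
    }

    copy.resize(size);
    std::memcpy(copy.data(), pBinary, size);
    return copy;
}

// Stand-in used only to exercise the helper. A real caller would pass a
// dereferenced ICudaCompiledFunction obtained from ICudaRuntimeCompiler.
struct FakeCompiledFunction
{
    const void* data;
    std::size_t size;

    const void* compiledBinary() const noexcept { return data; }
    std::size_t compiledBinarySize() const noexcept { return size; }
};
```

An empty vector signals that no binary was available, which keeps the nullptr check in one place.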

◆ compiledBinarySize()

virtual std::size_t nvneural::ICudaCompiledFunction::compiledBinarySize ( ) const
pure virtual, noexcept

Provides the size of the compiled binary (PTX or cubin) representation of this function in bytes.

If a precompiled format is not available, returns 0. If the function was compiled to a cubin, the size reported is that of the cubin; PTX is only reported when compiling to virtual (compute_##) architectures.
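The interface documented here does not report which of the two formats the buffer holds. One way a caller might tell them apart is sketched below, under the assumption that a cubin is an ELF image and therefore begins with the ELF magic bytes, while PTX is plain text. `BinaryKind` and `classifyCompiledBinary` are hypothetical names, not NvNeural APIs, and the result for non-ELF data is only a guess.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical classification of a compiled-binary buffer.
enum class BinaryKind
{
    None,        // nullptr or zero-sized buffer
    Cubin,       // buffer starts with the ELF magic number
    ProbablyPtx, // anything else; assumed to be PTX text
};

// Hypothetical heuristic (not part of NvNeural): inspects the first bytes of
// the buffer returned by compiledBinary()/compiledBinarySize().
inline BinaryKind classifyCompiledBinary(const void* pBinary, std::size_t size)
{
    if (pBinary == nullptr || size == 0)
    {
        return BinaryKind::None;
    }

    static const unsigned char elfMagic[4] = {0x7F, 'E', 'L', 'F'};
    if (size >= sizeof(elfMagic) && std::memcmp(pBinary, elfMagic, sizeof(elfMagic)) == 0)
    {
        return BinaryKind::Cubin;
    }

    return BinaryKind::ProbablyPtx;
}
```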

◆ function()

virtual CUfunction nvneural::ICudaCompiledFunction::function ( ) const
pure virtual, noexcept

Returns the CUfunction represented by this function object.

If this function was compiled to an incompatible architecture, this function returns nullptr.

◆ launch()

virtual NeuralResult nvneural::ICudaCompiledFunction::launch ( INetworkBackendCuda *  pBackend,
std::size_t  gridSizeX,
std::size_t  gridSizeY,
std::size_t  gridSizeZ,
std::size_t  blockSizeX,
std::size_t  blockSizeY,
std::size_t  blockSizeZ,
void **  ppArguments,
std::uint32_t  smem 
) const
pure virtual, noexcept

Launches the function on the specified CUDA backend's stream.

This function is conceptually equivalent to the cuLaunchKernel driver API.

Parameters
pBackend     CUDA network backend owning the context/stream
gridSizeX    X-dimension of launch grid in blocks
gridSizeY    Y-dimension of launch grid in blocks
gridSizeZ    Z-dimension of launch grid in blocks
blockSizeX   X-dimension of a block in threads
blockSizeY   Y-dimension of a block in threads
blockSizeZ   Z-dimension of a block in threads
ppArguments  Array of pointers to kernel parameters; see cuLaunchKernel 'kernelParams' for details
smem         Dynamic shared-memory size per block in bytes; see cuLaunchKernel 'sharedMemBytes' for details
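Two parts of a launch call are easy to get wrong: the grid size, which is counted in blocks rather than threads, and ppArguments, which holds one pointer per kernel parameter rather than the values themselves. The sketch below illustrates both for a one-dimensional launch. The kernel signature in the comment, `blocksFor`, and `ScaleArguments` are hypothetical and exist only for illustration; nothing here depends on the NvNeural or CUDA headers.

```cpp
#include <cassert>
#include <cstddef>

// Number of blocks needed to cover `count` elements when each block
// processes `blockSize` elements (ceiling division).
inline std::size_t blocksFor(std::size_t count, std::size_t blockSize)
{
    assert(blockSize > 0);
    return (count + blockSize - 1) / blockSize;
}

// Hypothetical argument bundle for a kernel assumed to be declared as
//   extern "C" __global__ void scale(float* pData, float factor, unsigned int count);
// Each entry of `pointers` holds the address of one parameter value, in
// declaration order, matching the cuLaunchKernel 'kernelParams' layout.
struct ScaleArguments
{
    float* pData;
    float factor;
    unsigned int count;
    void* pointers[3];

    ScaleArguments(float* data, float scaleFactor, unsigned int elementCount)
        : pData(data)
        , factor(scaleFactor)
        , count(elementCount)
        , pointers{&pData, &factor, &count}
    {
    }

    // `pointers` refers to this object's own members, so copying would leave
    // the copy pointing at the original's storage.
    ScaleArguments(const ScaleArguments&) = delete;
    ScaleArguments& operator=(const ScaleArguments&) = delete;
};
```

With a real function object and backend, the call would then take the form `pFunction->launch(pBackend, blocksFor(count, 256), 1, 1, 256, 1, 1, args.pointers, 0)`, where 256 is an arbitrary block size chosen for the example and the final 0 requests no dynamic shared memory.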

◆ module()

virtual CUmodule nvneural::ICudaCompiledFunction::module ( ) const
pure virtual, noexcept

Returns the CUmodule containing this function object.

If this function was compiled to an incompatible architecture, this function returns nullptr.
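Since both function() and module() return nullptr for an incompatible architecture, a caller that intends to use the raw driver handles may want to check them before going further. A minimal sketch of such a guard follows; `hasDeviceHandles` and `FakeHandles` are hypothetical names, and the stand-in uses `void*` in place of CUfunction and CUmodule so the snippet compiles without cuda.h.

```cpp
// Hypothetical guard (not part of NvNeural): true only when both driver
// handles are available. FunctionT is any type exposing function() and
// module() accessors that return pointer-like handles, as documented above.
template <typename FunctionT>
bool hasDeviceHandles(const FunctionT& function)
{
    return function.function() != nullptr && function.module() != nullptr;
}

// Stand-in used only to exercise the guard; void* replaces the opaque
// CUfunction and CUmodule handle types.
struct FakeHandles
{
    void* functionHandle;
    void* moduleHandle;

    void* function() const noexcept { return functionHandle; }
    void* module() const noexcept { return moduleHandle; }
};
```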

