NVIDIA NvNeural SDK  2022.2
GPU inference framework for NVIDIA Nsight Deep Learning Designer
nvneural::INetworkBackend Class Reference [abstract]

INetworkBackend is a runtime-specific interface for CUDA, DirectX, or other system-specific operations needed during inference. More...

#include <nvneural/CoreTypes.h>

Inheritance diagram for nvneural::INetworkBackend:
Inherits nvneural::IRefObject. Inherited by nvneural::INetworkBackend2.

Public Member Functions

virtual NeuralResult bindCurrentThread () noexcept=0
 Rebinds internal data structures to the current thread.
 
virtual NetworkBackendId id () const noexcept=0
 Introspection function: Returns the backend ID implemented by this interface.
 
virtual NeuralResult saveImage (const ILayer *pLayer, const INetworkRuntime *pNetwork, IImage *pImage, ImageSpace imageSpace, size_t channels) noexcept=0
 Converts a layer's output tensor to a CPU image. More...
 
virtual NeuralResult synchronize () noexcept=0
 Performs a CPU/GPU sync and completes all pending operations on the device.
 
virtual NeuralResult transformTensor (void *pDeviceDestination, TensorFormat destinationFormat, TensorDimension destinationSize, const void *pDeviceSource, TensorFormat sourceFormat, TensorDimension sourceSize) noexcept=0
 Transforms a tensor from one format to another.
 
Device management functions
virtual NeuralResult initializeFromDeviceOrdinal (std::uint32_t deviceOrdinal) noexcept=0
 Initializes the backend to point to a specific device ordinal. More...
 
virtual NeuralResult initializeFromDeviceIdentifier (const IBackendDeviceIdentifier *pDeviceIdentifier) noexcept=0
 Initializes the backend to point to a specific device identifier. More...
 
virtual const IBackendDeviceIdentifier * deviceIdentifier () const noexcept=0
 Retrieves an opaque device identifier object corresponding to the device associated with this backend. More...
 
Low-level memory allocation functions
virtual NeuralResult setDeviceMemory (void *pDeviceDestination, std::uint8_t value, std::size_t byteCount) noexcept=0
 Fills a buffer with a preset value. Equivalent to memset.
 
virtual NeuralResult copyMemoryD2D (void *pDeviceDestination, const void *pDeviceSource, std::size_t byteCount) noexcept=0
 Device-to-device memory copy.
 
virtual NeuralResult copyMemoryH2D (void *pDeviceDestination, const void *pHostSource, std::size_t byteCount) noexcept=0
 Host-to-device memory copy.
 
virtual NeuralResult copyMemoryD2H (void *pHostDestination, const void *pDeviceSource, std::size_t byteCount) noexcept=0
 Device-to-host memory copy.
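The three copy entry points above follow the familiar H2D/D2D/D2H pattern. The sketch below illustrates the call shapes with a host-memory stand-in; `HostMockBackend` and `roundTripSurvives` are illustrative names (not SDK types), and `NeuralResult` is reduced to `bool` for brevity.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Illustrative stand-in for an INetworkBackend implementation; the mock
// "device" memory is ordinary host memory, so each copy is a memcpy.
class HostMockBackend {
public:
    bool copyMemoryH2D(void* pDeviceDestination, const void* pHostSource, std::size_t byteCount) {
        std::memcpy(pDeviceDestination, pHostSource, byteCount);
        return true;
    }
    bool copyMemoryD2D(void* pDeviceDestination, const void* pDeviceSource, std::size_t byteCount) {
        std::memcpy(pDeviceDestination, pDeviceSource, byteCount);
        return true;
    }
    bool copyMemoryD2H(void* pHostDestination, const void* pDeviceSource, std::size_t byteCount) {
        std::memcpy(pHostDestination, pDeviceSource, byteCount);
        return true;
    }
};

// Round-trips a host buffer through two "device" buffers (H2D, then D2D,
// then D2H) and reports whether the data survived intact.
bool roundTripSurvives(const std::vector<float>& host) {
    const std::size_t bytes = host.size() * sizeof(float);
    std::vector<float> deviceA(host.size()), deviceB(host.size()), back(host.size());
    HostMockBackend backend;
    backend.copyMemoryH2D(deviceA.data(), host.data(), bytes);
    backend.copyMemoryD2D(deviceB.data(), deviceA.data(), bytes);
    backend.copyMemoryD2H(back.data(), deviceB.data(), bytes);
    return back == host;
}
```

With a real backend, the destination of an H2D copy would be a GPU address obtained from the backend's allocator rather than host memory.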
 
Library handle management functions

Many layer classes rely on external libraries for computation, and these libraries may be expensive to initialize repeatedly.

These functions allow layers to query the presence of preexisting library contexts, and initialize them if required.

Since many of these libraries share the same synchronization/binding behavior as the core backend, library support is implemented as a key-value store of ILibraryContext objects rather than direct handle access. A registered ILibraryContext object receives callbacks when rebinding or synchronization is required.
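The key-value store described above can be sketched as follows. `MockLibraryContext` and `MockBackendLibraryStore` are hypothetical stand-ins for `ILibraryContext` and the backend's registry, with the context reduced to a struct holding a synchronization counter in place of the real callback.

```cpp
#include <cstdint>
#include <map>

using LibraryId = std::uint64_t;

// Simplified ILibraryContext: the real interface receives callbacks when the
// backend rebinds threads or synchronizes; here we only count sync calls.
struct MockLibraryContext {
    LibraryId libraryId = 0;
    int synchronizeCalls = 0;
    void onSynchronize() { ++synchronizeCalls; }
};

class MockBackendLibraryStore {
public:
    bool registerLibraryContext(MockLibraryContext* pLibraryContext) {
        if (!pLibraryContext) { return false; }
        m_contexts[pLibraryContext->libraryId] = pLibraryContext;
        return true;
    }
    // Lookup by identifier; nullptr when no such context has been registered.
    MockLibraryContext* getLibraryContext(LibraryId libraryId) {
        auto it = m_contexts.find(libraryId);
        return it == m_contexts.end() ? nullptr : it->second;
    }
    // A backend-wide synchronize() fans out to every registered context.
    void synchronize() {
        for (auto& entry : m_contexts) { entry.second->onSynchronize(); }
    }
private:
    std::map<LibraryId, MockLibraryContext*> m_contexts;
};
```

A layer following this pattern first calls `getLibraryContext`, and only creates and registers a context on a nullptr result.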

virtual NeuralResult registerLibraryContext (ILibraryContext *pLibraryContext) noexcept=0
 Registers a new library context with the backend. More...
 
virtual ILibraryContext * getLibraryContext (ILibraryContext::LibraryId libraryId) noexcept=0
 Retrieves a library context by its identifier. More...
 
virtual const ILibraryContext * getLibraryContext (ILibraryContext::LibraryId libraryId) const noexcept=0
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
High-level memory allocation functions
virtual NeuralResult allocateMemoryBlock (MemoryHandle *pHandle, size_t byteCount) noexcept=0
 Allocates a memory block of the requested size. More...
 
virtual NeuralResult freeMemoryBlock (MemoryHandle handle) noexcept=0
 Frees a memory block that was allocated with allocateMemoryBlock. More...
 
virtual void * getAddressForMemoryBlock (MemoryHandle handle) noexcept=0
 Retrieves the raw address corresponding to a MemoryHandle. More...
 
virtual size_t getSizeForMemoryBlock (MemoryHandle handle) noexcept=0
 Retrieves the buffer size corresponding to a MemoryHandle. More...
 
virtual NeuralResult lockMemoryBlock (MemoryHandle handle) noexcept=0
 Locks a memory block to prevent reuse. More...
 
virtual NeuralResult unlockMemoryBlock (MemoryHandle handle) noexcept=0
 Unlocks a memory block. More...
 
virtual MemoryHandle updateTensor (const ILayer *pLayer, INetworkRuntime *pNetwork, TensorFormat format, MemoryHandle hOriginal, TensorDimension stepping, TensorDimension internalDimensions) noexcept=0
 Updates a memory handle.
 
Weights management
virtual NeuralResult clearLoadedWeights () noexcept=0
 Clears all loaded weights.
 
virtual NeuralResult uploadWeights (const void **ppUploadedWeightsOut, const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader, const char *pName, const void *pWeightsData, std::size_t weightsDataSize, TensorDimension weightsDim, TensorFormat format, bool memManagedExternally) noexcept=0
 Uploads weights data to an internal cache. More...
 
virtual const void * getAddressForWeightsData (const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader, const char *pName, TensorFormat format) const noexcept=0
 Retrieves loaded weights data from the internal cache. More...
 
virtual NeuralResult getDimensionsForWeightsData (TensorDimension *pDimensionOut, const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader, const char *pName, TensorFormat format) const noexcept=0
 Retrieves loaded weights dimensions from the internal cache. More...
 
virtual NeuralResult getWeightsNamesForLayer (IStringList **ppListOut, const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader) const noexcept=0
 Retrieves names of loaded weights objects from the internal cache. More...
 
- Public Member Functions inherited from nvneural::IRefObject
virtual RefCount addRef () const noexcept=0
 Increments the object's reference count. More...
 
virtual const void * queryInterface (TypeId interface) const noexcept=0
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
virtual void * queryInterface (TypeId interface) noexcept=0
 Retrieves a new object interface pointer. More...
 
virtual RefCount release () const noexcept=0
 Decrements the object's reference count and destroys the object if the reference count reaches zero. More...
 

Static Public Attributes

static const IRefObject::TypeId typeID = 0xacd7828da90108ddul
 Interface TypeId for InterfaceOf purposes.
 
- Static Public Attributes inherited from nvneural::IRefObject
static const TypeId typeID = 0x14ecc3f9de638e1dul
 Interface TypeId for InterfaceOf purposes.
 

Optimization capability checking

Not all backends are compatible with every network optimization we perform.

We therefore expose a capabilities system where backends can be queried before the Network class makes use of the optimization in question.

When creating a network backend, please "fail safe" in these queries. If you do not recognize the optimization in question (perhaps the Network implementation is newer than your backend), do not claim to support it. Unconditionally answering "no, this is unsupported" is always safe, though it may result in reduced performance.

enum class  OptimizationCapability : std::uint64_t { SkipConcatenation = 0xdd13d58fbabb2f5bul , FuseBatchNormAndConvolution = 0xeaffe6d9a4acfdc9ul }
 List of optional optimizations supported by backends. More...
 
virtual bool supportsOptimization (OptimizationCapability optimization) const noexcept=0
 Returns true if the indicated optimization is applicable to this backend. More...
 

Additional Inherited Members

- Public Types inherited from nvneural::IRefObject
using RefCount = std::uint32_t
 Typedef used to track the number of active references to an object.
 
using TypeId = std::uint64_t
 Every interface must define a unique TypeId. This should be randomized.
 
- Protected Member Functions inherited from nvneural::IRefObject
virtual ~IRefObject ()=default
 A protected destructor prevents accidental stack-allocation of IRefObjects or use with other smart pointer classes like std::unique_ptr.
 

Detailed Description

INetworkBackend is a runtime-specific interface for CUDA, DirectX, or other system- specific operations needed during inference.

Member Enumeration Documentation

◆ OptimizationCapability

List of optional optimizations supported by backends.

Enumerator
SkipConcatenation 

Backends exposing buffers as GPU virtual addresses rather than resource handles can skip concatenation layers by copying tensors directly into the relevant part of the concatenated output.

Concatenation layers are identified by the presence of IConcatenationLayer.

FuseBatchNormAndConvolution 

Batch normalization and convolution can be fused into a single launch.

Batch normalization layers are identified by the presence of the IBatchNormalizationLayer interface.

Convolution layers are identified by the presence of IConvolutionLayer.

Member Function Documentation

◆ allocateMemoryBlock()

virtual NeuralResult nvneural::INetworkBackend::allocateMemoryBlock ( MemoryHandle *  pHandle,
size_t  byteCount 
)
pure virtualnoexcept

Allocates a memory block of the requested size.

This memory must be freed with freeMemoryBlock.

If this function fails, pHandle will receive nullptr as a value.

Parameters
pHandle  [out] Pointer receiving a MemoryHandle to the new memory
byteCount  Number of bytes to allocate
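The allocation contract above (nullptr on failure, and freeing nullptr is a no-op) can be mirrored with a heap-backed mock. `MockMemoryHandle` and the `mock*` functions are illustrative stand-ins, not SDK symbols; `NeuralResult` is reduced to `bool`.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the SDK's opaque MemoryHandle: here it is simply
// a pointer to a heap block.
using MockMemoryHandle = void*;

// Mirrors the documented contract: on failure, *pHandle receives nullptr.
bool mockAllocateMemoryBlock(MockMemoryHandle* pHandle, std::size_t byteCount) {
    *pHandle = std::malloc(byteCount);
    return *pHandle != nullptr;
}

// Freeing nullptr is explicitly permitted and does nothing, like free().
void mockFreeMemoryBlock(MockMemoryHandle handle) {
    std::free(handle);
}

// Typical caller pattern: allocate, check for failure, free when done.
bool allocateAndFreeOnce(std::size_t byteCount) {
    MockMemoryHandle handle = nullptr;
    if (!mockAllocateMemoryBlock(&handle, byteCount)) {
        return false;  // handle is already nullptr here
    }
    mockFreeMemoryBlock(handle);
    return true;
}
```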

◆ deviceIdentifier()

virtual const IBackendDeviceIdentifier* nvneural::INetworkBackend::deviceIdentifier ( ) const
pure virtualnoexcept

Retrieves an opaque device identifier object corresponding to the device associated with this backend.

Returns
A device identifier, or nullptr if the backend is uninitialized.

◆ freeMemoryBlock()

virtual NeuralResult nvneural::INetworkBackend::freeMemoryBlock ( MemoryHandle  handle)
pure virtualnoexcept

Frees a memory block that was allocated with allocateMemoryBlock.

Freeing nullptr is explicitly permitted and does nothing.

Parameters
handle  MemoryHandle to the buffer to be freed

◆ getAddressForMemoryBlock()

virtual void* nvneural::INetworkBackend::getAddressForMemoryBlock ( MemoryHandle  handle)
pure virtualnoexcept

Retrieves the raw address corresponding to a MemoryHandle.

This address may correspond to a GPU pointer (such as CUDA device address space), so do not assume the return value is CPU-accessible by default. Check the documentation for the backend in question.

Parameters
handle  MemoryHandle to query

◆ getAddressForWeightsData()

virtual const void* nvneural::INetworkBackend::getAddressForWeightsData ( const ILayer *  pLayer,
const IWeightsLoader *  pOriginWeightLoader,
const char *  pName,
TensorFormat  format 
) const
pure virtualnoexcept

Retrieves loaded weights data from the internal cache.

If the weights have not been uploaded with uploadWeights in the desired format, this call is allowed to fail rather than causing a tensor transformation.

Parameters
pLayer  Layer associated with the requested weights data
pOriginWeightLoader  Weight loader associated with the weight
pName  Name associated with the requested weights data
format  Desired tensor format of the weights data
Returns
A pointer in the backend's address space, or nullptr if the weights are not available in this format.

◆ getDimensionsForWeightsData()

virtual NeuralResult nvneural::INetworkBackend::getDimensionsForWeightsData ( TensorDimension *  pDimensionOut,
const ILayer *  pLayer,
const IWeightsLoader *  pOriginWeightLoader,
const char *  pName,
TensorFormat  format 
) const
pure virtualnoexcept

Retrieves loaded weights dimensions from the internal cache.

If the weights have not been uploaded with uploadWeights in the desired format, this call is allowed to fail rather than causing a tensor transformation.

Parameters
pDimensionOut  Output pointer receiving the weights dimensions
pLayer  Layer associated with the requested weights data
pOriginWeightLoader  Weight loader associated with the weight
pName  Name associated with the requested weights data
format  Desired tensor format of the weights data

◆ getLibraryContext()

virtual ILibraryContext* nvneural::INetworkBackend::getLibraryContext ( ILibraryContext::LibraryId  libraryId)
pure virtualnoexcept

Retrieves a library context by its identifier.

Parameters
libraryId  Library context identifier to retrieve.
Returns
A pointer to the library context, or nullptr if no such context has been registered.

◆ getSizeForMemoryBlock()

virtual size_t nvneural::INetworkBackend::getSizeForMemoryBlock ( MemoryHandle  handle)
pure virtualnoexcept

Retrieves the buffer size corresponding to a MemoryHandle.

System memory allocation functions (e.g., VirtualAlloc, cuMemAlloc) may add padding or alignment bytes beyond the original buffer size. Such padding is not included in the value returned by this function.

Parameters
handle  MemoryHandle to query

◆ getWeightsNamesForLayer()

virtual NeuralResult nvneural::INetworkBackend::getWeightsNamesForLayer ( IStringList **  ppListOut,
const ILayer *  pLayer,
const IWeightsLoader *  pOriginWeightLoader 
) const
pure virtualnoexcept

Retrieves names of loaded weights objects from the internal cache.

Parameters
ppListOut  Variable receiving a reference to a new IStringList. Caller must release the reference.
pLayer  Layer associated with the requested weights data
pOriginWeightLoader  Weight loader associated with the weight

◆ initializeFromDeviceIdentifier()

virtual NeuralResult nvneural::INetworkBackend::initializeFromDeviceIdentifier ( const IBackendDeviceIdentifier *  pDeviceIdentifier)
pure virtualnoexcept

Initializes the backend to point to a specific device identifier.

This is used to ensure all backends execute on the same GPU; typically you should initialize one backend with initializeFromDeviceOrdinal, then retrieve its IBackendDeviceIdentifier and pass it to the other backends.

Backends do not support reinitialization; attempts to call this function after the backend has been initialized will fail.

Parameters
pDeviceIdentifier  System-specific device identifier
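The recommended workflow (initialize one backend by ordinal, then share its identifier) can be sketched with minimal stand-ins. `MockDeviceIdentifier` and `MockBackend` are hypothetical types that reduce the identifier to the ordinal it wraps and `NeuralResult` to `bool`.

```cpp
#include <cstdint>
#include <optional>

// Simplified IBackendDeviceIdentifier: just the ordinal it represents.
struct MockDeviceIdentifier { std::uint32_t ordinal; };

class MockBackend {
public:
    bool initializeFromDeviceOrdinal(std::uint32_t deviceOrdinal) {
        if (m_identifier) { return false; }  // reinitialization is not supported
        m_identifier = MockDeviceIdentifier{deviceOrdinal};
        return true;
    }
    bool initializeFromDeviceIdentifier(const MockDeviceIdentifier* pDeviceIdentifier) {
        if (m_identifier || !pDeviceIdentifier) { return false; }
        m_identifier = *pDeviceIdentifier;
        return true;
    }
    // Returns nullptr while the backend is uninitialized.
    const MockDeviceIdentifier* deviceIdentifier() const {
        return m_identifier ? &*m_identifier : nullptr;
    }
private:
    std::optional<MockDeviceIdentifier> m_identifier;
};

// The documented workflow: initialize one backend by ordinal, then hand its
// identifier to the others so all backends target the same GPU.
bool shareDevice(MockBackend& primary, MockBackend& secondary, std::uint32_t ordinal) {
    if (!primary.initializeFromDeviceOrdinal(ordinal)) { return false; }
    return secondary.initializeFromDeviceIdentifier(primary.deviceIdentifier());
}
```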

◆ initializeFromDeviceOrdinal()

virtual NeuralResult nvneural::INetworkBackend::initializeFromDeviceOrdinal ( std::uint32_t  deviceOrdinal)
pure virtualnoexcept

Initializes the backend to point to a specific device ordinal.

The enumeration order is backend-specific, and not guaranteed to be stable between runs or reboots. Typically ordinal zero refers to the backend's "primary" device.

Backends do not support reinitialization; attempts to call this function after the backend has been initialized will fail.

Parameters
deviceOrdinal  Index for device enumeration

◆ lockMemoryBlock()

virtual NeuralResult nvneural::INetworkBackend::lockMemoryBlock ( MemoryHandle  handle)
pure virtualnoexcept

Locks a memory block to prevent reuse.

To avoid leaks, be sure to unlock the memory block with unlockMemoryBlock.

Parameters
handle  MemoryHandle to lock.
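Because every lockMemoryBlock must be balanced by an unlockMemoryBlock, a scope guard is a natural way to avoid leaks on early returns. `ScopedMemoryLock` and `MockLockBackend` below are illustrative, not SDK classes; the handle is reduced to an int and the backend to a lock counter.

```cpp
// Minimal stand-in for the backend's lock/unlock pair, counting active locks.
struct MockLockBackend {
    int activeLocks = 0;
    bool lockMemoryBlock(int /*handle*/)   { ++activeLocks; return true; }
    bool unlockMemoryBlock(int /*handle*/) { --activeLocks; return true; }
};

// RAII guard: locks in the constructor, unlocks in the destructor, so the
// unlock runs even on early return or exception.
class ScopedMemoryLock {
public:
    ScopedMemoryLock(MockLockBackend& backend, int handle)
        : m_backend(backend), m_handle(handle) {
        m_backend.lockMemoryBlock(m_handle);
    }
    ~ScopedMemoryLock() { m_backend.unlockMemoryBlock(m_handle); }
    ScopedMemoryLock(const ScopedMemoryLock&) = delete;
    ScopedMemoryLock& operator=(const ScopedMemoryLock&) = delete;
private:
    MockLockBackend& m_backend;
    int m_handle;
};

int locksAfterScope(MockLockBackend& backend) {
    {
        ScopedMemoryLock lock(backend, /*handle=*/42);
        // the memory block is protected from reuse within this scope
    }  // unlockMemoryBlock runs automatically here
    return backend.activeLocks;
}
```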

◆ registerLibraryContext()

virtual NeuralResult nvneural::INetworkBackend::registerLibraryContext ( ILibraryContext *  pLibraryContext)
pure virtualnoexcept

Registers a new library context with the backend.

The backend takes a reference to the context. This function is in INetworkBackend rather than host-application-specific interfaces because layers might perform their own on-demand library context registration.

Parameters
pLibraryContext  Context to register

◆ saveImage()

virtual NeuralResult nvneural::INetworkBackend::saveImage ( const ILayer *  pLayer,
const INetworkRuntime *  pNetwork,
IImage *  pImage,
ImageSpace  imageSpace,
size_t  channels 
)
pure virtualnoexcept

Converts a layer's output tensor to a CPU image.

Parameters
pLayer  Layer to read.
pNetwork  Network associated with the layer.
pImage  IImage object to receive image data.
imageSpace  Conversion function to apply when mapping floats to RGB.
channels  Number of channels to copy into the image.

◆ supportsOptimization()

virtual bool nvneural::INetworkBackend::supportsOptimization ( OptimizationCapability  optimization) const
pure virtualnoexcept

Returns true if the indicated optimization is applicable to this backend.

Reminder to implementers: "return false" is always safe. Do not return true if the optimization identifier being queried is not known to your code.

See https://devblogs.microsoft.com/oldnewthing/20040211-00/?p=40663 for an example.
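A fail-safe implementation can be sketched as a switch whose default case declines. The capability IDs are the ones documented in the OptimizationCapability enum above; the free function `exampleSupportsOptimization` is a hypothetical example, not SDK code.

```cpp
#include <cstdint>

// IDs copied from the documented OptimizationCapability enum.
enum class OptimizationCapability : std::uint64_t {
    SkipConcatenation = 0xdd13d58fbabb2f5bul,
    FuseBatchNormAndConvolution = 0xeaffe6d9a4acfdc9ul,
};

// "Fail safe": claim support only for capabilities this backend explicitly
// recognizes, and answer false for anything unknown.
bool exampleSupportsOptimization(OptimizationCapability optimization) {
    switch (optimization) {
    case OptimizationCapability::SkipConcatenation:
        return true;   // this example backend exposes raw device addresses
    default:
        return false;  // unknown or unsupported: declining is always safe
    }
}
```

The default branch is what keeps the backend safe against capability IDs introduced by a newer Network implementation.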

◆ unlockMemoryBlock()

virtual NeuralResult nvneural::INetworkBackend::unlockMemoryBlock ( MemoryHandle  handle)
pure virtualnoexcept

Unlocks a memory block.

The memory block must have been locked previously by a call to lockMemoryBlock.

Parameters
handle  MemoryHandle to unlock.

◆ uploadWeights()

virtual NeuralResult nvneural::INetworkBackend::uploadWeights ( const void **  ppUploadedWeightsOut,
const ILayer *  pLayer,
const IWeightsLoader *  pOriginWeightLoader,
const char *  pName,
const void *  pWeightsData,
std::size_t  weightsDataSize,
TensorDimension  weightsDim,
TensorFormat  format,
bool  memManagedExternally 
)
pure virtualnoexcept

Uploads weights data to an internal cache.

Parameters
ppUploadedWeightsOut  Variable which will receive a pointer to the uploaded data
pLayer  Layer to associate with the weights data
pOriginWeightLoader  Weight loader associated with the weight
pName  Name to associate with the weights data
pWeightsData  Binary buffer to upload
weightsDataSize  Size of pWeightsData, in bytes
weightsDim  Size of pWeightsData, in tensor dimensions
format  Desired tensor format of the uploaded data
memManagedExternally  True if pWeightsData is owned externally (typically by the caller)
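The upload/retrieve pairing can be sketched with a small cache keyed by name and format. `MockWeightsCache` is a hypothetical stand-in: the real cache is additionally keyed by layer and weights loader, which are omitted here, and `NeuralResult` is reduced to `bool`.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for the backend's weights cache, keyed by (name, format).
class MockWeightsCache {
public:
    bool uploadWeights(const void** ppUploadedWeightsOut,
                       const std::string& name,
                       const void* pWeightsData,
                       std::size_t weightsDataSize,
                       int format) {
        auto& blob = m_cache[{name, format}];
        blob.assign(static_cast<const char*>(pWeightsData),
                    static_cast<const char*>(pWeightsData) + weightsDataSize);
        *ppUploadedWeightsOut = blob.data();  // caller receives the cached copy
        return true;
    }
    // Mirrors getAddressForWeightsData: a miss returns nullptr rather than
    // triggering a tensor transformation into the requested format.
    const void* getAddressForWeightsData(const std::string& name, int format) const {
        auto it = m_cache.find({name, format});
        return it == m_cache.end() ? nullptr : it->second.data();
    }
private:
    std::map<std::pair<std::string, int>, std::vector<char>> m_cache;
};
```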

The documentation for this class was generated from the following file: