INetworkBackend is a runtime-specific interface for CUDA, DirectX, or other system-specific operations needed during inference.

#include <nvneural/CoreTypes.h>

Public Member Functions
virtual NeuralResult	bindCurrentThread () noexcept=0
	Rebinds internal data structures to the current thread.

virtual NetworkBackendId	id () const noexcept=0
	Introspection function: Returns the backend ID implemented by this interface.

virtual NeuralResult	saveImage (const ILayer *pLayer, const INetworkRuntime *pNetwork, IImage *pImage, ImageSpace imageSpace, size_t channels) noexcept=0
	Converts a layer's output tensor to a CPU image.

virtual NeuralResult	synchronize () noexcept=0
	Performs a CPU/GPU sync and completes all pending operations on the device.

virtual NeuralResult	transformTensor (void *pDeviceDestination, TensorFormat destinationFormat, TensorDimension destinationSize, const void *pDeviceSource, TensorFormat sourceFormat, TensorDimension sourceSize) noexcept=0
	Transforms a tensor from one format to another.

Device management functions
virtual NeuralResult	initializeFromDeviceOrdinal (std::uint32_t deviceOrdinal) noexcept=0
	Initializes the backend to point to a specific device ordinal.

virtual NeuralResult	initializeFromDeviceIdentifier (const IBackendDeviceIdentifier *pDeviceIdentifier) noexcept=0
	Initializes the backend to point to a specific device identifier.

virtual const IBackendDeviceIdentifier *	deviceIdentifier () const noexcept=0
	Retrieves an opaque device identifier object corresponding to the device associated with this backend.

Low-level memory allocation functions
virtual NeuralResult	setDeviceMemory (void *pDeviceDestination, std::uint8_t value, std::size_t byteCount) noexcept=0
	Fills a buffer with a preset value. Equivalent to memset.

virtual NeuralResult	copyMemoryD2D (void *pDeviceDestination, const void *pDeviceSource, std::size_t byteCount) noexcept=0
	Device-to-device memory copy.

virtual NeuralResult	copyMemoryH2D (void *pDeviceDestination, const void *pHostSource, std::size_t byteCount) noexcept=0
	Host-to-device memory copy.

virtual NeuralResult	copyMemoryD2H (void *pHostDestination, const void *pDeviceSource, std::size_t byteCount) noexcept=0
	Device-to-host memory copy.

Library handle management functions
Many layer classes rely on external libraries for computation, and these libraries may be expensive to initialize repeatedly. These functions allow layers to query the presence of preexisting library contexts, and initialize them if required. Since many of these libraries share the same synchronization/binding behavior as the core backend, library support is implemented as a key-value store of ILibraryContext objects rather than direct handle access. A registered ILibraryContext object receives callbacks when rebinding or synchronization is required.
virtual NeuralResult	registerLibraryContext (ILibraryContext *pLibraryContext) noexcept=0
	Registers a new library context with the backend.

virtual ILibraryContext *	getLibraryContext (ILibraryContext::LibraryId libraryId) noexcept=0
	Retrieves a library context by its identifier.

virtual const ILibraryContext *	getLibraryContext (ILibraryContext::LibraryId libraryId) const noexcept=0
	This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

High-level memory allocation functions
virtual NeuralResult	allocateMemoryBlock (MemoryHandle *pHandle, size_t byteCount) noexcept=0
	Allocates a memory block of the requested size.

virtual NeuralResult	freeMemoryBlock (MemoryHandle handle) noexcept=0
	Frees a memory block that was allocated with allocateMemoryBlock.

virtual void *	getAddressForMemoryBlock (MemoryHandle handle) noexcept=0
	Retrieves the raw address corresponding to a MemoryHandle.

virtual size_t	getSizeForMemoryBlock (MemoryHandle handle) noexcept=0
	Retrieves the buffer size corresponding to a MemoryHandle.

virtual NeuralResult	lockMemoryBlock (MemoryHandle handle) noexcept=0
	Locks a memory block to prevent reuse.

virtual NeuralResult	unlockMemoryBlock (MemoryHandle handle) noexcept=0
	Unlocks a memory block.

virtual MemoryHandle	updateTensor (const ILayer *pLayer, INetworkRuntime *pNetwork, TensorFormat format, MemoryHandle hOriginal, TensorDimension stepping, TensorDimension internalDimensions) noexcept=0
	Updates a memory handle.

Weights management
virtual NeuralResult	clearLoadedWeights () noexcept=0
	Clears all loaded weights.

virtual NeuralResult	uploadWeights (const void **ppUploadedWeightsOut, const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader, const char *pName, const void *pWeightsData, std::size_t weightsDataSize, TensorDimension weightsDim, TensorFormat format, bool memManagedExternally) noexcept=0
	Uploads weights data to an internal cache.

virtual const void *	getAddressForWeightsData (const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader, const char *pName, TensorFormat format) const noexcept=0
	Retrieves loaded weights data from the internal cache.

virtual NeuralResult	getDimensionsForWeightsData (TensorDimension *pDimensionOut, const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader, const char *pName, TensorFormat format) const noexcept=0
	Retrieves loaded weights dimensions from the internal cache.

virtual NeuralResult	getWeightsNamesForLayer (IStringList **ppListOut, const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader) const noexcept=0
	Retrieves names of loaded weights objects from the internal cache.

Public Member Functions inherited from nvneural::IRefObject
virtual RefCount	addRef () const noexcept=0
	Increments the object's reference count.

virtual const void *	queryInterface (TypeId interface) const noexcept=0
	This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

virtual void *	queryInterface (TypeId interface) noexcept=0
	Retrieves a new object interface pointer.

virtual RefCount	release () const noexcept=0
	Decrements the object's reference count and destroys the object if the reference count reaches zero.

Static Public Attributes
static const IRefObject::TypeId	typeID = 0xacd7828da90108ddul
	Interface TypeId for InterfaceOf purposes.

Static Public Attributes inherited from nvneural::IRefObject
static const TypeId	typeID = 0x14ecc3f9de638e1dul
	Interface TypeId for InterfaceOf purposes.

Optimization capability checking
Not all backends are compatible with every network optimization. The interface therefore exposes a capabilities system that lets the Network class query a backend before applying an optimization. When creating a network backend, "fail safe" in these queries: if you do not recognize the optimization in question (perhaps the Network implementation is newer than your backend), do not claim to support it. Unconditionally answering "no, this is unsupported" is always safe, though it may reduce performance.
enum class	OptimizationCapability : std::uint64_t { SkipConcatenation = 0xdd13d58fbabb2f5bul , FuseBatchNormAndConvolution = 0xeaffe6d9a4acfdc9ul }
	List of optional optimizations supported by backends.

virtual bool	supportsOptimization (OptimizationCapability optimization) const noexcept=0
	Returns true if the indicated optimization is applicable to this backend.

Additional Inherited Members
Public Types inherited from nvneural::IRefObject
using	RefCount = std::uint32_t
	Typedef used to track the number of active references to an object.

using	TypeId = std::uint64_t
	Every interface must define a unique TypeId. This should be randomized.

Protected Member Functions inherited from nvneural::IRefObject
virtual	~IRefObject ()=default
	A protected destructor prevents accidental stack-allocation of IRefObjects or use with other smart pointer classes like std::unique_ptr.

Detailed Description

INetworkBackend is a runtime-specific interface for CUDA, DirectX, or other system-specific operations needed during inference.

Member Enumeration Documentation

◆ OptimizationCapability

enum class nvneural::INetworkBackend::OptimizationCapability : std::uint64_t

List of optional optimizations supported by backends.

Enumerator

SkipConcatenation

Backends exposing buffers as GPU virtual addresses rather than resource handles can skip concatenation layers by copying tensors directly into the relevant part of the concatenated output.

Concatenation layers are identified by the presence of IConcatenationLayer.

FuseBatchNormAndConvolution

Batch normalization and convolution can be fused into a single launch.

Batch normalization layers are identified by the presence of the IBatchNormalizationLayer interface.

Convolution layers are identified by the presence of IConvolutionLayer.

Member Function Documentation

◆ allocateMemoryBlock()

virtual NeuralResult nvneural::INetworkBackend::allocateMemoryBlock ( MemoryHandle * pHandle, size_t byteCount )

pure virtual, noexcept

Allocates a memory block of the requested size.

This memory must be freed with freeMemoryBlock.

If this function fails, the MemoryHandle pointed to by pHandle is set to nullptr.

Parameters

pHandle	[out] Pointer receiving a MemoryHandle to the new memory
byteCount	Number of bytes to allocate
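
The allocation contract described above (nullptr on failure, freeing nullptr is a no-op) can be illustrated with a toy host-memory backend. This is only a sketch: `ToyBackend`, `MemoryBlock`, and the local `NeuralResult` and `MemoryHandle` definitions are hypothetical stand-ins, not the real nvneural types, and a real backend would typically allocate device memory instead of calling `std::malloc`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

namespace memory_sketch {

// Stand-ins for the real nvneural types (assumptions for illustration only).
enum class NeuralResult : std::uint32_t { Success, Failure };
struct MemoryBlock { void* pData; std::size_t byteCount; };
using MemoryHandle = MemoryBlock*;

// Toy backend that hands out host memory; not a real INetworkBackend.
class ToyBackend
{
public:
    NeuralResult allocateMemoryBlock(MemoryHandle* pHandle, std::size_t byteCount) noexcept
    {
        if (!pHandle) return NeuralResult::Failure;
        *pHandle = nullptr;  // on any failure path the caller sees nullptr
        void* pData = std::malloc(byteCount);
        if (!pData) return NeuralResult::Failure;
        MemoryBlock* pBlock = new (std::nothrow) MemoryBlock{pData, byteCount};
        if (!pBlock) { std::free(pData); return NeuralResult::Failure; }
        *pHandle = pBlock;
        return NeuralResult::Success;
    }

    NeuralResult freeMemoryBlock(MemoryHandle handle) noexcept
    {
        if (!handle) return NeuralResult::Success;  // freeing nullptr does nothing
        std::free(handle->pData);
        delete handle;
        return NeuralResult::Success;
    }

    // Reports the size the caller requested, without allocator padding.
    std::size_t getSizeForMemoryBlock(MemoryHandle handle) noexcept
    {
        return handle ? handle->byteCount : 0;
    }
};

} // namespace memory_sketch
```

Callers would pair every successful allocateMemoryBlock with exactly one freeMemoryBlock on the same handle.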

◆ deviceIdentifier()

virtual const IBackendDeviceIdentifier* nvneural::INetworkBackend::deviceIdentifier ( ) const

pure virtual, noexcept

Retrieves an opaque device identifier object corresponding to the device associated with this backend.

Returns: A device identifier, or nullptr if the backend is uninitialized.

◆ freeMemoryBlock()

virtual NeuralResult nvneural::INetworkBackend::freeMemoryBlock ( MemoryHandle handle )

pure virtual, noexcept

Frees a memory block that was allocated with allocateMemoryBlock.

Freeing nullptr is explicitly permitted and does nothing.

Parameters

handle MemoryHandle to the buffer to be freed

◆ getAddressForMemoryBlock()

virtual void* nvneural::INetworkBackend::getAddressForMemoryBlock ( MemoryHandle handle )

pure virtual, noexcept

Retrieves the raw address corresponding to a MemoryHandle.

This address may correspond to a GPU pointer (such as CUDA device address space), so do not assume the return value is CPU-accessible by default. Check the documentation for the backend in question.

Parameters

handle MemoryHandle to query

◆ getAddressForWeightsData()

virtual const void* nvneural::INetworkBackend::getAddressForWeightsData ( const ILayer * pLayer, const IWeightsLoader * pOriginWeightLoader, const char * pName, TensorFormat format ) const

pure virtual, noexcept

Retrieves loaded weights data from the internal cache.

If the weights have not been uploaded with uploadWeights in the desired format, this call is allowed to fail rather than causing a tensor transformation.

Parameters

pLayer	Layer associated with the requested weights data
pOriginWeightLoader	Weight loader associated with the weight
pName	Name associated with the requested weights data
format	Desired tensor format of the weights data

Returns: A pointer in the backend's address space, or nullptr if the weights are not available in this format.

◆ getDimensionsForWeightsData()

virtual NeuralResult nvneural::INetworkBackend::getDimensionsForWeightsData ( TensorDimension * pDimensionOut, const ILayer * pLayer, const IWeightsLoader * pOriginWeightLoader, const char * pName, TensorFormat format ) const

pure virtual, noexcept

Retrieves loaded weights dimensions from the internal cache.

If the weights have not been uploaded with uploadWeights in the desired format, this call is allowed to fail rather than causing a tensor transformation.

Parameters

pDimensionOut	Output pointer receiving the weights dimensions
pLayer	Layer associated with the requested weights data
pOriginWeightLoader	Weight loader associated with the weight
pName	Name associated with the requested weights data
format	Desired tensor format of the weights data

◆ getLibraryContext()

virtual ILibraryContext* nvneural::INetworkBackend::getLibraryContext ( ILibraryContext::LibraryId libraryId )

pure virtual, noexcept

Retrieves a library context by its identifier.

Parameters

libraryId Library context identifier to retrieve.

Returns: A pointer to the library context, or nullptr if no such context has been registered.

◆ getSizeForMemoryBlock()

virtual size_t nvneural::INetworkBackend::getSizeForMemoryBlock ( MemoryHandle handle )

pure virtual, noexcept

Retrieves the buffer size corresponding to a MemoryHandle.

System memory allocation functions (e.g., VirtualAlloc, cuMemAlloc) may add padding or alignment bytes to the original buffer size. Those extra bytes are not included in the value returned by this function.

Parameters

handle MemoryHandle to query

◆ getWeightsNamesForLayer()

virtual NeuralResult nvneural::INetworkBackend::getWeightsNamesForLayer ( IStringList ** ppListOut, const ILayer * pLayer, const IWeightsLoader * pOriginWeightLoader ) const

pure virtual, noexcept

Retrieves names of loaded weights objects from the internal cache.

Parameters

ppListOut	Variable receiving a reference to a new IStringList. Caller must release the reference.
pLayer	Layer associated with the requested weights data
pOriginWeightLoader	Weight loader associated with the weight

◆ initializeFromDeviceIdentifier()

virtual NeuralResult nvneural::INetworkBackend::initializeFromDeviceIdentifier ( const IBackendDeviceIdentifier * pDeviceIdentifier )

pure virtual, noexcept

Initializes the backend to point to a specific device identifier.

This is used to ensure all backends execute on the same GPU; typically you should initialize one backend with initializeFromDeviceOrdinal, then retrieve its IBackendDeviceIdentifier and pass it to the other backends.

Backends do not support reinitialization; attempts to call this function after the backend has been initialized will fail.

Parameters

pDeviceIdentifier System-specific device identifier
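
The two-step initialization described above (one backend by ordinal, the others by its identifier) might look like the following sketch. `ToyBackend`, `DeviceIdentifier`, and the local `NeuralResult` are hypothetical stand-ins so the example compiles on its own; the real IBackendDeviceIdentifier is opaque and does not expose an ordinal field.

```cpp
#include <cstdint>

namespace device_sketch {

// Stand-ins for the real nvneural types (assumptions for illustration only).
enum class NeuralResult : std::uint32_t { Success, Failure };
struct DeviceIdentifier { std::uint32_t ordinal; };

// Toy backend that mimics the documented initialization rules.
class ToyBackend
{
public:
    NeuralResult initializeFromDeviceOrdinal(std::uint32_t deviceOrdinal) noexcept
    {
        if (m_initialized) return NeuralResult::Failure;  // reinitialization is rejected
        m_device.ordinal = deviceOrdinal;
        m_initialized = true;
        return NeuralResult::Success;
    }

    NeuralResult initializeFromDeviceIdentifier(const DeviceIdentifier* pDeviceIdentifier) noexcept
    {
        if (m_initialized || !pDeviceIdentifier) return NeuralResult::Failure;
        m_device = *pDeviceIdentifier;
        m_initialized = true;
        return NeuralResult::Success;
    }

    // Returns nullptr until the backend has been initialized.
    const DeviceIdentifier* deviceIdentifier() const noexcept
    {
        return m_initialized ? &m_device : nullptr;
    }

private:
    DeviceIdentifier m_device{0};
    bool m_initialized = false;
};

// Initializes `primary` by ordinal, then points `secondary` at the same device.
inline bool shareDevice(ToyBackend& primary, ToyBackend& secondary, std::uint32_t ordinal)
{
    if (primary.initializeFromDeviceOrdinal(ordinal) != NeuralResult::Success)
        return false;
    return secondary.initializeFromDeviceIdentifier(primary.deviceIdentifier())
        == NeuralResult::Success;
}

} // namespace device_sketch
```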

◆ initializeFromDeviceOrdinal()

virtual NeuralResult nvneural::INetworkBackend::initializeFromDeviceOrdinal ( std::uint32_t deviceOrdinal )

pure virtual, noexcept

Initializes the backend to point to a specific device ordinal.

The enumeration order is backend-specific, and not guaranteed to be stable between runs or reboots. Typically ordinal zero refers to the backend's "primary" device.

Backends do not support reinitialization; attempts to call this function after the backend has been initialized will fail.

Parameters

deviceOrdinal Index for device enumeration

◆ lockMemoryBlock()

virtual NeuralResult nvneural::INetworkBackend::lockMemoryBlock ( MemoryHandle handle )

pure virtual, noexcept

Locks a memory block to prevent reuse.

To avoid leaks, be sure to unlock the memory block with unlockMemoryBlock.

Parameters

handle MemoryHandle to lock.
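
One way to make sure every lock is matched by an unlock is a small scope guard. The sketch below is not part of the nvneural API: `MemoryBlockLock` and `ToyBackend` are hypothetical, and the local `NeuralResult` and `MemoryHandle` are stand-ins so the example is self-contained. The guard is templated so it would work with any type exposing lockMemoryBlock and unlockMemoryBlock with these shapes.

```cpp
#include <cstdint>

namespace lock_sketch {

// Stand-ins for the real nvneural types (assumptions for illustration only).
enum class NeuralResult : std::uint32_t { Success, Failure };
struct MemoryBlock;
using MemoryHandle = MemoryBlock*;

// Toy backend that only tracks how many locks are outstanding.
struct ToyBackend
{
    int lockDepth = 0;

    NeuralResult lockMemoryBlock(MemoryHandle) noexcept
    {
        ++lockDepth;
        return NeuralResult::Success;
    }

    NeuralResult unlockMemoryBlock(MemoryHandle) noexcept
    {
        if (lockDepth == 0) return NeuralResult::Failure;  // nothing was locked
        --lockDepth;
        return NeuralResult::Success;
    }
};

// Scope guard: locks in the constructor, unlocks in the destructor.
template <typename Backend>
class MemoryBlockLock
{
public:
    MemoryBlockLock(Backend& backend, MemoryHandle handle) noexcept
        : m_backend(backend)
        , m_handle(handle)
        , m_locked(backend.lockMemoryBlock(handle) == NeuralResult::Success)
    {
    }

    ~MemoryBlockLock()
    {
        if (m_locked) m_backend.unlockMemoryBlock(m_handle);
    }

    MemoryBlockLock(const MemoryBlockLock&) = delete;
    MemoryBlockLock& operator=(const MemoryBlockLock&) = delete;

    bool locked() const noexcept { return m_locked; }

private:
    Backend& m_backend;
    MemoryHandle m_handle;
    bool m_locked;
};

// Returns true if the block is locked inside the scope and unlocked after it.
inline bool lockIsReleasedOnScopeExit()
{
    ToyBackend backend;
    int depthInside = 0;
    {
        MemoryBlockLock<ToyBackend> guard(backend, nullptr);
        depthInside = backend.lockDepth;
    }
    return depthInside == 1 && backend.lockDepth == 0;
}

} // namespace lock_sketch
```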

◆ registerLibraryContext()

virtual NeuralResult nvneural::INetworkBackend::registerLibraryContext ( ILibraryContext * pLibraryContext )

pure virtual, noexcept

Registers a new library context with the backend.

The backend takes a reference to the context. This function is in INetworkBackend rather than host-application-specific interfaces because layers might perform their own on-demand library context registration.

Parameters

pLibraryContext Context to register
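
The on-demand registration mentioned above amounts to a "get or create" lookup against the backend's key-value store. The following sketch shows that pattern only. `ToyBackend` and `ToyLibraryContext` are hypothetical, `LibraryId` is assumed to be a 64-bit integer, and `std::shared_ptr` stands in for the reference the real backend takes on a registered context.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <utility>

namespace library_sketch {

// Assumed identifier type; the real one is ILibraryContext::LibraryId.
using LibraryId = std::uint64_t;

// Stand-in for an ILibraryContext implementation wrapping an external library.
struct ToyLibraryContext
{
    explicit ToyLibraryContext(LibraryId id) : libraryId(id) {}
    LibraryId libraryId;
};

// Toy backend holding contexts in a key-value store.
class ToyBackend
{
public:
    // Returns false for a null context or an identifier that is already registered.
    bool registerLibraryContext(std::shared_ptr<ToyLibraryContext> context)
    {
        if (!context) return false;
        const LibraryId id = context->libraryId;
        return m_contexts.emplace(id, std::move(context)).second;
    }

    // Returns nullptr if no context with this identifier has been registered.
    ToyLibraryContext* getLibraryContext(LibraryId id) noexcept
    {
        const auto it = m_contexts.find(id);
        return it == m_contexts.end() ? nullptr : it->second.get();
    }

private:
    std::map<LibraryId, std::shared_ptr<ToyLibraryContext>> m_contexts;
};

// Reuses an existing context, or creates and registers one on first use.
inline ToyLibraryContext* getOrCreateContext(ToyBackend& backend, LibraryId id)
{
    if (ToyLibraryContext* pExisting = backend.getLibraryContext(id))
        return pExisting;
    if (!backend.registerLibraryContext(std::make_shared<ToyLibraryContext>(id)))
        return nullptr;
    return backend.getLibraryContext(id);
}

} // namespace library_sketch
```

A layer following this pattern pays the library initialization cost once per backend, and later layers reuse the same context.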

◆ saveImage()

virtual NeuralResult nvneural::INetworkBackend::saveImage ( const ILayer * pLayer, const INetworkRuntime * pNetwork, IImage * pImage, ImageSpace imageSpace, size_t channels )

pure virtual, noexcept

Converts a layer's output tensor to a CPU image.

Parameters

pLayer	Layer to read.
pImage	IImage object to receive image data.
pNetwork	Network associated with the layer
imageSpace	Conversion function to apply when mapping floats to RGB.
channels	Number of channels to copy into the image.

◆ supportsOptimization()

virtual bool nvneural::INetworkBackend::supportsOptimization ( OptimizationCapability optimization ) const

pure virtual, noexcept

Returns true if the indicated optimization is applicable to this backend.

Reminder to implementers: "return false" is always safe. Do not return true if the optimization identifier being queried is not known to your code.

See https://devblogs.microsoft.com/oldnewthing/20040211-00/?p=40663 for an example.
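
A fail-safe implementation can be sketched as a switch whose default branch returns false. The enum below is a local copy of the enumerator values listed on this page so that the example compiles on its own; which optimizations the backend actually supports is a hypothetical choice made for illustration.

```cpp
#include <cstdint>

// Local stand-in mirroring the enumerator values listed on this page;
// real code would use nvneural::INetworkBackend::OptimizationCapability.
enum class OptimizationCapability : std::uint64_t
{
    SkipConcatenation = 0xdd13d58fbabb2f5bul,
    FuseBatchNormAndConvolution = 0xeaffe6d9a4acfdc9ul,
};

// Hypothetical policy: this backend implements concatenation skipping only.
constexpr bool supportsOptimization(OptimizationCapability optimization) noexcept
{
    switch (optimization)
    {
    case OptimizationCapability::SkipConcatenation:
        return true;   // recognized and implemented
    case OptimizationCapability::FuseBatchNormAndConvolution:
        return false;  // recognized, but not implemented by this backend
    default:
        return false;  // unknown identifier, e.g. from a newer Network: fail safe
    }
}
```

Because unknown values fall through to the default branch, a backend built against an older header keeps answering "unsupported" when a newer Network queries capabilities it has never heard of.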

◆ unlockMemoryBlock()

virtual NeuralResult nvneural::INetworkBackend::unlockMemoryBlock ( MemoryHandle handle )

pure virtual, noexcept

Unlocks a memory block.

The memory block must have been locked previously by a call to lockMemoryBlock.

Parameters

handle MemoryHandle to unlock.

◆ uploadWeights()

virtual NeuralResult nvneural::INetworkBackend::uploadWeights ( const void ** ppUploadedWeightsOut, const ILayer * pLayer, const IWeightsLoader * pOriginWeightLoader, const char * pName, const void * pWeightsData, std::size_t weightsDataSize, TensorDimension weightsDim, TensorFormat format, bool memManagedExternally )

pure virtual, noexcept

Uploads weights data to an internal cache.

Parameters

ppUploadedWeightsOut	Variable which will receive a pointer to the uploaded data
pLayer	Layer to associate with the weights data
pOriginWeightLoader	Weight loader associated with the weight
pName	Name to associate with the weights data
pWeightsData	Binary buffer to upload
weightsDataSize	Size of pWeightsData, in bytes
weightsDim	Size of pWeightsData, in tensor dimensions
format	Desired tensor format of the uploaded data
memManagedExternally	Indicates that pWeightsData is owned externally, likely by the caller

The documentation for this class was generated from the following file:

Core/Inc/nvneural/CoreTypes.h
