NVIDIA NvNeural SDK
2022.2
GPU inference framework for NVIDIA Nsight Deep Learning Designer
INetworkBackend2 is a revision of INetworkBackend. More...
#include <nvneural/CoreTypes.h>
Public Member Functions | |
virtual NeuralResult | allocateMemoryBlock (MemoryHandle *pHandle, size_t byteCount) noexcept=0 |
Allocates a memory block of the requested size. More... | |
High-level memory allocation functions | |
The new functions in INetworkBackend2 enable support for per-layer memory tracking. Allocator support for memory tracking is optional at runtime; high-performance allocators are explicitly allowed to ignore tracking keys to minimize overhead. | |
virtual NeuralResult | allocateMemoryBlock (MemoryHandle *pHandle, size_t byteCount, const char *pTrackingKey) noexcept=0 |
Allocates a memory block of the requested size and allows tracking of the memory block using a user-defined key. More... | |
virtual const MemoryTrackingData * | getMemoryTrackingData (const char *pTrackingKey, const char *pTrackingSubkey) const noexcept=0 |
Compiles and returns memory data for the given key. More... | |
virtual NeuralResult | getMemoryTrackingKeys (IStringList **ppKeysOut) noexcept=0 |
Returns an IStringList of the currently tracked keys. More... | |
virtual NeuralResult | getMemoryTrackingSubkeys (const char *pTrackingKey, IStringList **ppKeysOut) noexcept=0 |
Returns an IStringList of the subkeys of the given tracking key. More... | |
virtual NeuralResult | setMemoryTrackingKey (const char *pTrackingKey, const char *pTrackingSubkey) noexcept=0 |
Sets a potential tracking key. More... | |
virtual NeuralResult | bindCurrentThread () noexcept=0 |
Rebinds internal data structures to the current thread. | |
virtual NetworkBackendId | id () const noexcept=0 |
Introspection function: Returns the backend ID implemented by this interface. | |
virtual NeuralResult | saveImage (const ILayer *pLayer, const INetworkRuntime *pNetwork, IImage *pImage, ImageSpace imageSpace, size_t channels) noexcept=0 |
Converts a layer's output tensor to a CPU image. More... | |
virtual NeuralResult | synchronize () noexcept=0 |
Performs a CPU/GPU sync and completes all pending operations on the device. | |
virtual NeuralResult | transformTensor (void *pDeviceDestination, TensorFormat destinationFormat, TensorDimension destinationSize, const void *pDeviceSource, TensorFormat sourceFormat, TensorDimension sourceSize) noexcept=0 |
Transforms a tensor from one format to another. | |
virtual NeuralResult | initializeFromDeviceOrdinal (std::uint32_t deviceOrdinal) noexcept=0 |
Initializes the backend to point to a specific device ordinal. More... | |
virtual NeuralResult | initializeFromDeviceIdentifier (const IBackendDeviceIdentifier *pDeviceIdentifier) noexcept=0 |
Initializes the backend to point to a specific device identifier. More... | |
virtual const IBackendDeviceIdentifier * | deviceIdentifier () const noexcept=0 |
Retrieves an opaque device identifier object corresponding to the device associated with this backend. More... | |
virtual NeuralResult | setDeviceMemory (void *pDeviceDestination, std::uint8_t value, std::size_t byteCount) noexcept=0 |
Fills a buffer with a preset value. Equivalent to memset. | |
virtual NeuralResult | copyMemoryD2D (void *pDeviceDestination, const void *pDeviceSource, std::size_t byteCount) noexcept=0 |
Device-to-device memory copy. | |
virtual NeuralResult | copyMemoryH2D (void *pDeviceDestination, const void *pHostSource, std::size_t byteCount) noexcept=0 |
Host-to-device memory copy. | |
virtual NeuralResult | copyMemoryD2H (void *pHostDestination, const void *pDeviceSource, std::size_t byteCount) noexcept=0 |
Device-to-host memory copy. | |
virtual NeuralResult | registerLibraryContext (ILibraryContext *pLibraryContext) noexcept=0 |
Registers a new library context with the backend. More... | |
virtual ILibraryContext * | getLibraryContext (ILibraryContext::LibraryId libraryId) noexcept=0 |
Retrieves a library context by its identifier. More... | |
virtual const ILibraryContext * | getLibraryContext (ILibraryContext::LibraryId libraryId) const noexcept=0 |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
virtual NeuralResult | allocateMemoryBlock (MemoryHandle *pHandle, size_t byteCount) noexcept=0 |
Allocates a memory block of the requested size. More... | |
virtual NeuralResult | freeMemoryBlock (MemoryHandle handle) noexcept=0 |
Frees a memory block that was allocated with allocateMemoryBlock. More... | |
virtual void * | getAddressForMemoryBlock (MemoryHandle handle) noexcept=0 |
Retrieves the raw address corresponding to a MemoryHandle. More... | |
virtual size_t | getSizeForMemoryBlock (MemoryHandle handle) noexcept=0 |
Retrieves the buffer size corresponding to a MemoryHandle. More... | |
virtual NeuralResult | lockMemoryBlock (MemoryHandle handle) noexcept=0 |
Locks a memory block to prevent reuse. More... | |
virtual NeuralResult | unlockMemoryBlock (MemoryHandle handle) noexcept=0 |
Unlocks a memory block. More... | |
virtual MemoryHandle | updateTensor (const ILayer *pLayer, INetworkRuntime *pNetwork, TensorFormat format, MemoryHandle hOriginal, TensorDimension stepping, TensorDimension internalDimensions) noexcept=0 |
Updates a memory handle. | |
virtual NeuralResult | clearLoadedWeights () noexcept=0 |
Clears all loaded weights. | |
virtual NeuralResult | uploadWeights (const void **ppUploadedWeightsOut, const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader, const char *pName, const void *pWeightsData, std::size_t weightsDataSize, TensorDimension weightsDim, TensorFormat format, bool memManagedExternally) noexcept=0 |
Uploads weights data to an internal cache. More... | |
virtual const void * | getAddressForWeightsData (const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader, const char *pName, TensorFormat format) const noexcept=0 |
Retrieves loaded weights data from the internal cache. More... | |
virtual NeuralResult | getDimensionsForWeightsData (TensorDimension *pDimensionOut, const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader, const char *pName, TensorFormat format) const noexcept=0 |
Retrieves loaded weights dimensions from the internal cache. More... | |
virtual NeuralResult | getWeightsNamesForLayer (IStringList **ppListOut, const ILayer *pLayer, const IWeightsLoader *pOriginWeightLoader) const noexcept=0 |
Retrieves names of loaded weights objects from the internal cache. More... | |
virtual bool | supportsOptimization (OptimizationCapability optimization) const noexcept=0 |
Returns true if the indicated optimization is applicable to this backend. More... | |
virtual RefCount | addRef () const noexcept=0 |
Increments the object's reference count. More... | |
virtual const void * | queryInterface (TypeId interface) const noexcept=0 |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
virtual void * | queryInterface (TypeId interface) noexcept=0 |
Retrieves a new object interface pointer. More... | |
virtual RefCount | release () const noexcept=0 |
Decrements the object's reference count and destroys the object if the reference count reaches zero. More... | |
Static Public Attributes | |
static const IRefObject::TypeId | typeID = 0x02793dfd8bfde737ul |
Interface TypeId for InterfaceOf purposes. | |
static const IRefObject::TypeId | typeID = 0xacd7828da90108ddul |
Interface TypeId for InterfaceOf purposes. | |
static const TypeId | typeID = 0x14ecc3f9de638e1dul |
Interface TypeId for InterfaceOf purposes. | |
Additional Inherited Members | |
enum class | OptimizationCapability : std::uint64_t { SkipConcatenation = 0xdd13d58fbabb2f5bul , FuseBatchNormAndConvolution = 0xeaffe6d9a4acfdc9ul } |
List of optional optimizations supported by backends. More... | |
using | RefCount = std::uint32_t |
Typedef used to track the number of active references to an object. | |
using | TypeId = std::uint64_t |
Every interface must define a unique TypeId. This should be randomized. | |
virtual | ~IRefObject ()=default |
A protected destructor prevents accidental stack-allocation of IRefObjects or use with other smart pointer classes like std::unique_ptr. | |
INetworkBackend2 is a revision of INetworkBackend.
See INetworkBackend for more information.
virtual NeuralResult allocateMemoryBlock (MemoryHandle *pHandle, size_t byteCount)
noexcept
Allocates a memory block of the requested size.
This memory must be freed with freeMemoryBlock.
If this function fails, the handle pointed to by pHandle is set to nullptr.
pHandle | [out] Pointer receiving a MemoryHandle to the new memory |
byteCount | Number of bytes to allocate |
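virtual NeuralResult allocateMemoryBlock (MemoryHandle *pHandle, size_t byteCount, const char *pTrackingKey)
pure virtual, noexcept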
Allocates a memory block of the requested size and allows tracking of the memory block using a user-defined key.
It is best practice to use the layer's name as the key.
The tracking key has two parts: a "key", which is the higher-level tracking key (such as a layer name), and a "subkey", which is tracked under the main key (such as the semantic and weight name). The subkey is not user-defined.
This memory must be freed with freeMemoryBlock.
If this function fails, the handle pointed to by pHandle is set to nullptr.
pHandle | Pointer receiving a MemoryHandle to the new memory. |
byteCount | Number of bytes to allocate. |
pTrackingKey | A string key, usually the layer's name, which allows tracking of this allocation's information. Use nullptr to disable specific tracking. |
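A minimal sketch of the tracked allocation pattern described above, using the layer's name as the key. So that it builds without the SDK, it declares stand-in versions of NeuralResult and MemoryHandle and a hypothetical MockBackend; none of these are the SDK's definitions, and trackedBytes and allocateLayerScratch are illustrative helper names. Real code would call the same two methods on an INetworkBackend2 pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Stand-ins so the sketch builds without the SDK (assumptions, not SDK code).
enum class NeuralResult { Success, Failure };
struct MemoryBlock {
    void* pData;
    std::size_t byteCount;
    std::string key;
};
using MemoryHandle = MemoryBlock*;

// Hypothetical mock that records bytes per tracking key.
class MockBackend {
public:
    NeuralResult allocateMemoryBlock(MemoryHandle* pHandle, std::size_t byteCount,
                                     const char* pTrackingKey) {
        if (!pHandle) return NeuralResult::Failure;
        *pHandle = nullptr;  // documented value on failure
        void* pData = std::malloc(byteCount);
        if (!pData) return NeuralResult::Failure;
        *pHandle = new MemoryBlock{pData, byteCount, pTrackingKey ? pTrackingKey : ""};
        if (pTrackingKey) m_bytesByKey[pTrackingKey] += byteCount;  // nullptr disables tracking
        return NeuralResult::Success;
    }

    NeuralResult freeMemoryBlock(MemoryHandle handle) {
        if (!handle) return NeuralResult::Failure;
        if (!handle->key.empty()) m_bytesByKey[handle->key] -= handle->byteCount;
        std::free(handle->pData);
        delete handle;
        return NeuralResult::Success;
    }

    // Mock-only query; the real interface exposes getMemoryTrackingData instead.
    std::size_t trackedBytes(const std::string& key) const {
        auto it = m_bytesByKey.find(key);
        return it == m_bytesByKey.end() ? 0 : it->second;
    }

private:
    std::map<std::string, std::size_t> m_bytesByKey;
};

// Allocates a scratch buffer attributed to the given layer name.
// Returns nullptr on failure, mirroring the documented handle value.
template <typename Backend>
MemoryHandle allocateLayerScratch(Backend& backend, const char* pLayerName,
                                  std::size_t byteCount) {
    MemoryHandle hScratch = nullptr;
    const NeuralResult status = backend.allocateMemoryBlock(&hScratch, byteCount, pLayerName);
    if (status != NeuralResult::Success) {
        return nullptr;
    }
    return hScratch;
}
```

Every successful allocation is paired with a freeMemoryBlock call by the caller; the helper only centralizes the result check.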
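virtual const MemoryTrackingData * getMemoryTrackingData (const char *pTrackingKey, const char *pTrackingSubkey) const
pure virtual, noexcept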
Compiles and returns memory data for the given key.
The returned memory data is in bytes. Returns nullptr if either input is null or if the tracking data could not be found.
pTrackingKey | A string key, usually the layer's name. |
pTrackingSubkey | A string key, usually from getMemoryTrackingSubkeys. |
virtual NeuralResult getMemoryTrackingKeys (IStringList **ppKeysOut)
pure virtual, noexcept
Returns an IStringList of the currently tracked keys.
The order of the keys in the list is unspecified. This list is retrieved from the allocator, so high-performance/low-overhead allocators are allowed to ignore tracking keys and return an empty list.
Special keys:
{ network } : memory allocations of the entire network
{ untracked } : memory allocations where no tracking key was given
NeuralResult::Failure is returned for null input.
ppKeysOut | Variable receiving a reference to a new IStringList of tracked keys. Caller must release the reference. |
virtual NeuralResult getMemoryTrackingSubkeys (const char *pTrackingKey, IStringList **ppKeysOut)
pure virtual, noexcept
Returns an IStringList of the subkeys of the given tracking key.
The order of the keys in the list is unspecified. This list is retrieved from the allocator, so high-performance/low-overhead allocators are allowed to ignore tracking keys and return an empty list.
Special keys: { all } : memory allocations of the entire key (usually the layer's name)
NeuralResult::Failure is returned for null input(s).
pTrackingKey | A string key, usually the layer's name. |
ppKeysOut | Variable receiving a reference to a new IStringList of tracked subkeys. Caller must release the reference. |
virtual NeuralResult setMemoryTrackingKey (const char *pTrackingKey, const char *pTrackingSubkey)
pure virtual, noexcept
Sets a potential tracking key.
The tracking key has two parts, a key and an optional subkey, which are described under INetworkBackend2::allocateMemoryBlock. This function sets the key that is applied when memory is allocated through the INetworkBackend::allocateMemoryBlock overload that has no tracking key parameter. Call this function before any allocations that need tracking, and call it again afterward with nullptr to clear the key. Normally it is called only during reshape and evaluateForward, with the layer's name as the key.
NeuralResult::Failure is returned when a prefix could not be set.
pTrackingKey | A string key, usually the layer's name. |
pTrackingSubkey | A string key, usually the memory semantic. |
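The set-then-clear discipline described above maps naturally onto a scope guard. The following is a sketch under stated assumptions: ScopedTrackingKey, RecordingBackend, bytesFor, and allocateForLayer are hypothetical names, NeuralResult and MemoryHandle are stand-ins rather than the SDK's definitions, and clearing by passing nullptr for both key and subkey is an assumption about how the real backend expects the key to be reset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Stand-ins so the sketch builds without the SDK (assumptions, not SDK code).
enum class NeuralResult { Success, Failure };
using MemoryHandle = void*;

// Hypothetical mock that attributes untagged allocations to the current key.
class RecordingBackend {
public:
    NeuralResult setMemoryTrackingKey(const char* pKey, const char* pSubkey) {
        (void)pSubkey;  // the mock ignores subkeys
        m_currentKey = pKey ? pKey : "";
        return NeuralResult::Success;
    }

    NeuralResult allocateMemoryBlock(MemoryHandle* pHandle, std::size_t byteCount) {
        if (!pHandle) return NeuralResult::Failure;
        *pHandle = std::malloc(byteCount);
        if (!*pHandle) return NeuralResult::Failure;
        m_bytesByKey[m_currentKey.empty() ? "{ untracked }" : m_currentKey] += byteCount;
        return NeuralResult::Success;
    }

    NeuralResult freeMemoryBlock(MemoryHandle handle) {
        std::free(handle);
        return NeuralResult::Success;
    }

    // Mock-only query used to inspect the recorded totals.
    std::size_t bytesFor(const std::string& key) const {
        auto it = m_bytesByKey.find(key);
        return it == m_bytesByKey.end() ? 0 : it->second;
    }

private:
    std::string m_currentKey;
    std::map<std::string, std::size_t> m_bytesByKey;
};

// Sets the tracking key on construction and clears it on destruction,
// so the key cannot leak past the scope even on an early return.
template <typename Backend>
class ScopedTrackingKey {
public:
    ScopedTrackingKey(Backend& backend, const char* pKey, const char* pSubkey)
        : m_backend(backend), m_result(backend.setMemoryTrackingKey(pKey, pSubkey)) {}
    ~ScopedTrackingKey() { (void)m_backend.setMemoryTrackingKey(nullptr, nullptr); }

    ScopedTrackingKey(const ScopedTrackingKey&) = delete;
    ScopedTrackingKey& operator=(const ScopedTrackingKey&) = delete;

    NeuralResult result() const { return m_result; }

private:
    Backend& m_backend;
    NeuralResult m_result;
};

// Allocates through the overload without a key parameter while the
// layer's name is installed as the current tracking key.
template <typename Backend>
NeuralResult allocateForLayer(Backend& backend, const char* pLayerName,
                              std::size_t byteCount, MemoryHandle* pHandle) {
    ScopedTrackingKey<Backend> scope(backend, pLayerName, nullptr);
    if (scope.result() != NeuralResult::Success) {
        return NeuralResult::Failure;
    }
    return backend.allocateMemoryBlock(pHandle, byteCount);
}
```

Allocations made after the guard goes out of scope are no longer attributed to the layer, which matches the documented expectation that the key is cleared once the tracked work is done.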