NVIDIA NvNeural SDK
2022.2
GPU inference framework for NVIDIA Nsight Deep Learning Designer
Base class for the CUDA prototype layers shipped with NvNeural. More...
#include <BasePrototypeLayer.h>
Public Member Functions | |
ILayer inherited members | |
const char * | serializedType () const noexcept |
Retrieves the layer type. More... | |
NeuralResult | setName (const char *pName) noexcept |
Sets the layer name. More... | |
const char * | name () const noexcept |
Retrieves the layer name. More... | |
NeuralResult | setNetworkRuntime (INetworkRuntime *pNetworkRuntime) noexcept |
Informs the layer it has been attached to a new network. More... | |
NetworkBackendId | backendId () const noexcept |
Returns the backend ID associated with this layer implementation. More... | |
TensorFormat | tensorFormat () const noexcept |
Returns the tensor format consumed by this layer implementation. More... | |
TensorDimension | stepping () const noexcept |
Returns the internal storage stride consumed by this layer implementation. More... | |
TensorDimension | dimensions () const noexcept |
Retrieves the dimensions of the layer's output tensor. More... | |
TensorDimension | internalDimensions () const noexcept |
Retrieves the dimensions of the layer's output tensor as allocated internally. More... | |
size_t | tensorBufferSize () const noexcept |
Retrieves the size of the layer's output tensor buffer in bytes. More... | |
size_t | tensorInternalBufferSize () const noexcept |
Retrieves the size in bytes of the layer's output tensor buffer as allocated internally. More... | |
NeuralResult | loadFromParameters (const IParameterNode *pParameters) noexcept |
Loads layer parameters from a serialized key-value representation. More... | |
NeuralResult | getInputLayers (ILayerList **ppInputLayers) const noexcept |
Retrieves the inputs for this layer. More... | |
NeuralResult | setInputLayer (size_t index, ILayer *pLayer) noexcept |
Sets an input layer by index. More... | |
NeuralResult | setPermanent (bool permanent) noexcept |
Sets or clears the "permanent" flag on a layer's output tensor. More... | |
bool | isPermanent () const noexcept |
Returns the current status of the "permanent" flag. More... | |
NeuralResult | setAffected (bool affected) noexcept |
Sets or clears the "affected" flag on a layer's output tensor. More... | |
bool | isAffected () const noexcept |
Returns the current status of the "affected" flag. More... | |
NeuralResult | setActivationFunction (ActivationFunctionId activationFunction) noexcept |
Sets the activation function attached to the layer. More... | |
ActivationFunctionId | activationFunction () const noexcept |
Retrieves the activation function attached to this layer. More... | |
NeuralResult | setActivationCoefficient (std::size_t coefficientIndex, float value) noexcept |
Sets an activation coefficient. More... | |
float | activationCoefficient (std::size_t coefficientIndex) const noexcept |
Retrieves the activation coefficient for the specified index. More... | |
const char * | weightsName () const noexcept |
Retrieves the name used to identify this layer's weights. More... | |
NeuralResult | setWeightsName (const char *pWeightsName) noexcept |
Sets the name used to identify this layer's weights. More... | |
TensorDimension | weightsDimensions (const char *pWeightsName, WeightsQuery queryType) const noexcept |
Retrieves the tensor dimension of a layer's named weight input. More... | |
NeuralResult | reshape () noexcept |
Initializes (or reinitializes) the layer implementation with the current set of parameters. More... | |
NeuralResult | evaluateForward () noexcept |
Performs forward evaluation for this layer. More... | |
NeuralResult | getData (void **ppOut, TensorFormat format, const ILayer *pRequestingLayer) noexcept |
Retrieves device-side memory for the layer's output. More... | |
NeuralResult | getConstData (const void **ppOut, TensorFormat format, const ILayer *pRequestingLayer) const noexcept |
Retrieves read-only device-side memory for the layer's output. More... | |
NeuralResult | getCpuConstData (void *pOutBuffer, size_t bufferByteCount, size_t *pBytesCopied, TensorFormat format) const noexcept |
Retrieves read-only CPU-side memory for the layer's output. More... | |
IPrototypeLayer inherited members | |
NeuralResult | addPrototypeCode (const char *pName, const char *pCode) noexcept final |
Adds a kernel definition for a particular tensor format and backend. More... | |
NeuralResult | addForwardEvalCall (const char *pName, const char *pCall) noexcept final |
Adds a call script for a particular tensor format and backend. More... | |
NeuralResult | addParameterKey (const char *pKey) noexcept final |
Adds a named parameter to the layer. More... | |
ICppCodeGenerationLayer inherited members | |
NeuralResult | generateLayerCpp (ICppCodeGenerationLayerHost *pHost) noexcept final |
Generates C++ code to configure the layer. More... | |
NeuralResult | networkGenerationComplete () noexcept final |
Indicates the entire network has been generated. More... | |
RefObjectBase inherited members | |
IRefObject::RefCount | addRef () const noexcept |
Increment the object's reference count. More... | |
const void * | queryInterface (IRefObject::TypeId interfaceId) const noexcept |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
void * | queryInterface (IRefObject::TypeId interfaceId) noexcept |
Retrieves a new object interface pointer. More... | |
RefObjectBase () | |
Default constructor. Logs object creation. | |
IRefObject::RefCount | release () const noexcept |
Decrements the object's reference count and destroys the object if the reference count reaches zero. More... | |
Protected Member Functions | |
BasePrototypeLayer () | |
Creates a new BasePrototypeLayer object. | |
NeuralResult | loadBasePrototypeInfo (NetworkBackendId backendId, const TensorFormat &format, CompilationLevel compilationLevel, const std::string &currentImpl, const IParameterNode *pParameters) |
Loads prototype information from a parameter node. More... | |
NeuralResult | setBasePrototypeInfo (std::size_t inputsCount, const std::string &type, NetworkBackendId backend, const TensorFormat &format, const std::string &dim, const std::string &impl) noexcept |
Stores implementation-agnostic details of the prototype layer. More... | |
Methods to be implemented by derived classes | |
virtual NeuralResult | implementationReshape () noexcept=0 |
Implementation-specific logic for ILayer::reshape. | |
virtual NeuralResult | implementationForward () noexcept=0 |
Implementation-specific logic for ILayer::evaluateForward. | |
Protected Attributes | |
detail::SizeValue | m_block |
Preferred block size in threads. | |
std::string | m_entry |
Name of compiled entry point (e.g., CUDA function name) | |
detail::SizeValue | m_grid |
Preferred grid size in blocks. | |
std::vector< detail::AnyOperand > | m_ops |
Arguments to pass to the compiled entry point. | |
INetworkRuntime * | m_pNetwork |
Pointer to owning network. | |
std::string | m_selected_code |
Source code for the currently selected implementation. | |
Base class for the CUDA prototype layers shipped with NvNeural.
float activationCoefficient (std::size_t coefficientIndex) const | noexcept |
Retrieves the activation coefficient for the specified index.
The meaning of this value is activation-specific. If no setActivationCoefficient call was previously issued for this index, the layer should return a default value of zero.
coefficientIndex | Coefficient index |
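The default-to-zero rule can be sketched with a sparse map. `CoefficientStore` and its `set`/`get` members are hypothetical stand-ins for illustration only; they are not NvNeural types, and a real layer may store coefficients differently.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for a layer's coefficient storage; not an NvNeural type.
class CoefficientStore
{
public:
    // Mirrors the shape of setActivationCoefficient: remember the value for this index.
    void set(std::size_t coefficientIndex, float value)
    {
        m_values[coefficientIndex] = value;
    }

    // Mirrors the shape of activationCoefficient: indices never set read back as zero.
    float get(std::size_t coefficientIndex) const
    {
        const auto it = m_values.find(coefficientIndex);
        return (it != m_values.end()) ? it->second : 0.0f;
    }

private:
    std::unordered_map<std::size_t, float> m_values;
};
```

A map keeps the storage independent of how many coefficients a given activation uses.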
ActivationFunctionId activationFunction () const | noexcept |
Retrieves the activation function attached to this layer.
Should return the most recent value passed to setActivationFunction.
NeuralResult addForwardEvalCall (const char *pName, const char *pCall) | final noexcept |
Adds a call script for a particular tensor format and backend.
Replaces previous definitions of that kernel.
pName | Name of the prototype (e.g., "cuda_fp32_nchw") |
pCall | Call script for the prototype |
NeuralResult addParameterKey (const char *pKey) | final noexcept |
Adds a named parameter to the layer.
pKey | Parameter name that should be included in kernel calls |
NeuralResult addPrototypeCode (const char *pName, const char *pCode) | final noexcept |
Adds a kernel definition for a particular tensor format and backend.
Replaces previous definitions of that kernel.
pName | Name of the prototype provided (e.g., "cuda_fp32_nchw") |
pCode | Kernel code for the prototype |
NetworkBackendId backendId () const | noexcept |
Returns the backend ID associated with this layer implementation.
Networks will reshape tensors as necessary in order to match the return value.
TensorDimension dimensions () const | noexcept |
Retrieves the dimensions of the layer's output tensor.
Should not change except in response to a reshape event.
NeuralResult evaluateForward () | noexcept |
Performs forward evaluation for this layer.
If the layer does not provide a pre-fused implementation of the selected activation function, it MUST call INetworkRuntime::defaultForwardActivation(this).
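The activation fallback rule can be sketched as follows. `MockRuntime`, `MockForwardLayer`, `Result`, and the `hasFusedActivation` flag are hypothetical stand-ins; the real `ILayer` and `INetworkRuntime` interfaces are richer and return `NeuralResult`.

```cpp
#include <cassert>

// Hypothetical result type standing in for NeuralResult.
enum class Result { Success, Failure };

struct MockForwardLayer;

// Hypothetical runtime; counts how often the default activation path is used.
struct MockRuntime
{
    int defaultActivationCalls = 0;

    // Stand-in for INetworkRuntime::defaultForwardActivation(pLayer).
    Result defaultForwardActivation(MockForwardLayer* /*pLayer*/)
    {
        ++defaultActivationCalls;
        return Result::Success;
    }
};

struct MockForwardLayer
{
    MockRuntime* pRuntime = nullptr;
    bool hasFusedActivation = false; // true if the kernel applies the activation itself

    Result evaluateForward()
    {
        // ... the layer's own kernel would be launched here ...
        if (hasFusedActivation)
        {
            return Result::Success; // activation already applied by the kernel
        }
        // No pre-fused activation: defer to the network's default implementation.
        return pRuntime->defaultForwardActivation(this);
    }
};
```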
NeuralResult generateLayerCpp (ICppCodeGenerationLayerHost *pHost) | final noexcept |
Generates C++ code to configure the layer.
See the ICppCodeGenerationLayerHost interface for details of the generated code.
A failure result from this function will cause the tool to report failure to the user.
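NeuralResult getConstData (const void **ppOut, TensorFormat format, const ILayer *pRequestingLayer) const | noexcept |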
Retrieves read-only device-side memory for the layer's output.
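NeuralResult getCpuConstData (void *pOutBuffer, size_t bufferByteCount, size_t *pBytesCopied, TensorFormat format) const | noexcept |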
Retrieves read-only CPU-side memory for the layer's output.
This function is intended for debugging purposes. The size of the buffer in bytes should be dimensions().elementCount() multiplied by the size of the desired element type.
To check buffer sizes without preallocating memory, pass nullptr in pOutBuffer and zero in bufferByteCount. pBytesCopied will then receive the optimal size for pOutBuffer. Other uses of nullptr are considered errors.
pBytesCopied is not modified if this function fails.
pOutBuffer | Memory region receiving the layer output data |
bufferByteCount | Size of pOutBuffer in bytes |
pBytesCopied | Optional output variable receiving the number of bytes copied into pOutBuffer |
format | Format for the output buffer |
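The size-query convention leads to a two-call pattern: ask for the size, allocate, then copy. The sketch below uses a hypothetical `MockOutputLayer` that follows the documented rules for `pOutBuffer`, `bufferByteCount`, and `pBytesCopied`; it drops the `format` argument, returns `bool` instead of `NeuralResult`, and its rejection of undersized buffers is an assumption rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical mock of the getCpuConstData calling convention; not NvNeural code.
struct MockOutputLayer
{
    std::vector<float> output{1.0f, 2.0f, 3.0f, 4.0f};

    // Returns true on success. pBytesCopied is left untouched on failure.
    bool getCpuConstData(void* pOutBuffer, std::size_t bufferByteCount, std::size_t* pBytesCopied) const
    {
        const std::size_t required = output.size() * sizeof(float);
        if (pOutBuffer == nullptr)
        {
            // Size query: valid only with a zero byte count and somewhere to store the answer.
            if (bufferByteCount != 0 || pBytesCopied == nullptr)
            {
                return false;
            }
            *pBytesCopied = required;
            return true;
        }
        if (bufferByteCount < required)
        {
            return false; // assumption: undersized buffers are rejected
        }
        std::memcpy(pOutBuffer, output.data(), required);
        if (pBytesCopied != nullptr)
        {
            *pBytesCopied = required;
        }
        return true;
    }
};

// Two-call pattern: query the size, allocate host memory, then copy the data.
inline std::vector<float> readOutput(const MockOutputLayer& layer)
{
    std::size_t byteCount = 0;
    if (!layer.getCpuConstData(nullptr, 0, &byteCount))
    {
        return {};
    }
    std::vector<float> host(byteCount / sizeof(float));
    if (!layer.getCpuConstData(host.data(), byteCount, nullptr))
    {
        return {};
    }
    return host;
}
```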
NeuralResult getData (void **ppOut, TensorFormat format, const ILayer *pRequestingLayer) | noexcept |
Retrieves device-side memory for the layer's output.
NeuralResult getInputLayers (ILayerList **ppInputLayers) const | noexcept |
Retrieves the inputs for this layer.
ppInputLayers | Pointer receiving a reference to an ILayerList object. |
Layer may create a new list or return an incremented reference to a cached object; this is an implementation detail.
In either case, caller is responsible for releasing its reference when done with *ppInputLayers.
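The ownership rule can be sketched with a minimal intrusive reference count. `MockLayerList`, `MockCachingLayer`, and `visitInputs` are hypothetical; the real `ILayerList` derives from `IRefObject`, whose `release` also destroys the object when the count reaches zero.

```cpp
#include <cassert>

// Hypothetical intrusive reference-counted list; not the real ILayerList.
class MockLayerList
{
public:
    unsigned addRef() const noexcept { return ++m_refCount; }
    // A real IRefObject would destroy itself when the count reaches zero.
    unsigned release() const noexcept { return --m_refCount; }
    unsigned refCount() const noexcept { return m_refCount; }

private:
    mutable unsigned m_refCount = 1; // the owning layer's own reference
};

// Hypothetical layer that hands out an incremented reference to a cached list.
class MockCachingLayer
{
public:
    bool getInputLayers(MockLayerList** ppInputLayers) const noexcept
    {
        if (ppInputLayers == nullptr)
        {
            return false;
        }
        m_cachedInputs.addRef(); // this reference now belongs to the caller
        *ppInputLayers = &m_cachedInputs;
        return true;
    }

    unsigned cachedRefCount() const noexcept { return m_cachedInputs.refCount(); }

private:
    mutable MockLayerList m_cachedInputs;
};

// Caller side: every successful getInputLayers is paired with one release.
inline bool visitInputs(const MockCachingLayer& layer)
{
    MockLayerList* pInputs = nullptr;
    if (!layer.getInputLayers(&pInputs))
    {
        return false;
    }
    // ... inspect the list here ...
    pInputs->release();
    return true;
}
```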
TensorDimension internalDimensions () const | noexcept |
Retrieves the dimensions of the layer's output tensor as allocated internally.
Internal allocations may have additional padding or alignment restrictions.
bool isAffected () const | noexcept |
Returns the current status of the "affected" flag.
By default, layers should be "affected" after initialization.
bool isPermanent () const | noexcept |
Returns the current status of the "permanent" flag.
By default, layers should not be "permanent."
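NeuralResult loadBasePrototypeInfo (NetworkBackendId backendId, const TensorFormat &format, CompilationLevel compilationLevel, const std::string &currentImpl, const IParameterNode *pParameters) | protected |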
Loads prototype information from a parameter node.
Replaces calls to setBasePrototypeInfo; this path is used by network builder classes instead.
The parameter node used for this call does not typically map to a <Parameters> block in XML. Instead, it must provide the following key-value pairs:
Implementation-specific blocks must also be provided for at least the currentImpl parameter:
backendId | Backend to prefer in ILayer::backendId |
format | Format to prefer in ILayer::tensorFormat |
compilationLevel | Determines whether to decode currentImpl's code as source or binary |
currentImpl | Implementation type string like "cuda_fp32_nchw" |
pParameters | IParameterNode containing additional type details |
NeuralResult loadFromParameters (const IParameterNode *pParameters) | noexcept |
Loads layer parameters from a serialized key-value representation.
Tools such as ConverenceNG call this during network construction. Deployed applications may use this function themselves (often with a container class such as StringMapParameterNode) or may use layer-specific object interfaces; either approach is valid.
const char * name () const | noexcept |
Retrieves the layer name.
NeuralResult networkGenerationComplete () | final noexcept |
Indicates the entire network has been generated.
Layers that rely on out-of-band synchronization to avoid multiple definitions of key types may use this callback to know that all layers have been visited, and can then reset any static variables they used to track this procedure.
This function will be called whether network generation succeeded or failed, but any failure result reported should be treated by the caller as an overall generation error.
NeuralResult reshape () | noexcept |
Initializes (or reinitializes) the layer implementation with the current set of parameters.
It is safe to call this multiple times, and may be necessary as parameters change.
const char * serializedType () const | noexcept |
Retrieves the layer type.
NeuralResult setActivationCoefficient (std::size_t coefficientIndex, float value) | noexcept |
Sets an activation coefficient.
The meaning of this value is activation-specific.
coefficientIndex | Coefficient index |
value | Value to store |
NeuralResult setActivationFunction (ActivationFunctionId activationFunction) | noexcept |
Sets the activation function attached to the layer.
Repeated calls to this function replace the activation rather than stacking them.
activationFunction | New activation function ID |
NeuralResult setAffected (bool affected) | noexcept |
Sets or clears the "affected" flag on a layer's output tensor.
Layers that are "affected" have pending computation work and will be reevaluated as necessary during inference. Networks will clear the "affected" flag after successful inference.
Layers should mark themselves as "affected" in response to parameter changes that would be visible during inference. Examples of such changes include loadFromParameters and the IRuntimeOptionsLayer::setRuntimeOptionValue callback.
affected | New value for the "affected" flag |
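The lifecycle of the "affected" flag can be sketched as a simplified inference pass. `MockFlaggedLayer` and `runInference` are hypothetical, and the sketch ignores how a real network propagates pending work to downstream layers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical layer carrying only the state needed for this sketch.
struct MockFlaggedLayer
{
    bool affected = true; // layers start out "affected" after initialization
    int evaluations = 0;

    bool evaluateForward()
    {
        ++evaluations;
        return true;
    }
};

// Hypothetical inference pass: evaluate layers with pending work,
// then clear the flag once the whole pass has succeeded.
inline bool runInference(std::vector<MockFlaggedLayer>& layers)
{
    for (MockFlaggedLayer& layer : layers)
    {
        if (!layer.affected)
        {
            continue; // output is still valid, nothing to recompute
        }
        if (!layer.evaluateForward())
        {
            return false; // flags stay set so the work is retried later
        }
    }
    for (MockFlaggedLayer& layer : layers)
    {
        layer.affected = false;
    }
    return true;
}
```

Marking a layer as "affected" again, for example after a parameter change, causes only that layer to be reevaluated on the next pass in this simplified model.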
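NeuralResult setBasePrototypeInfo (std::size_t inputsCount, const std::string &type, NetworkBackendId backend, const TensorFormat &format, const std::string &dim, const std::string &impl) | protected noexcept |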
Stores implementation-agnostic details of the prototype layer.
inputsCount | Number of inputs accepted by the layer |
type | Type to report in ILayer::serializedType |
backend | Backend to prefer in ILayer::backendId |
format | Format to prefer in ILayer::tensorFormat |
dim | Script code that determines the size reported in ILayer::dimensions during reshape |
impl | Implementation type string like "cuda_fp32_nchw" |
NeuralResult setInputLayer (size_t index, ILayer *pLayer) | noexcept |
Sets an input layer by index.
Input indices are explicitly allowed to be sparse in order to support optional inputs; usually such inputs are used to provide weights from other parts of the graph instead of going out to IWeightsLoader.
index | Index of the input to assign |
pLayer | Layer to assign as an input |
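Sparse input indices can be sketched with a vector that grows on demand and leaves gaps as null. `MockInput` and `MockInputSlots` are hypothetical; how a real layer stores its inputs is an implementation detail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical placeholder for an input layer.
struct MockInput {};

// Hypothetical input table that tolerates gaps between assigned indices.
class MockInputSlots
{
public:
    // Mirrors the shape of setInputLayer: grow the table if needed, then assign.
    void setInputLayer(std::size_t index, MockInput* pLayer)
    {
        if (index >= m_inputs.size())
        {
            m_inputs.resize(index + 1, nullptr); // skipped slots stay empty
        }
        m_inputs[index] = pLayer;
    }

    // Unassigned or out-of-range slots read back as nullptr.
    MockInput* input(std::size_t index) const
    {
        return (index < m_inputs.size()) ? m_inputs[index] : nullptr;
    }

private:
    std::vector<MockInput*> m_inputs;
};
```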
NeuralResult setName (const char *pName) | noexcept |
Sets the layer name.
Layer names are unique within a network.
pName | New layer name; layer must copy this data. |
NeuralResult setNetworkRuntime (INetworkRuntime *pNetworkRuntime) | noexcept |
Informs the layer it has been attached to a new network.
INetwork::pushLayer implementations will call this function for you; host applications do not normally need to call this function.
pNetworkRuntime | New owning network |
NeuralResult setPermanent (bool permanent) | noexcept |
Sets or clears the "permanent" flag on a layer's output tensor.
Layers that are "permanent" are kept resident in memory and their tensor buffers are not reused. Setting too many layers as "permanent" will have negative effects on memory utilization.
permanent | New value for the "permanent" flag |
NeuralResult setWeightsName (const char *pWeightsName) | noexcept |
Sets the name used to identify this layer's weights.
pWeightsName | New layer name to use for weights instead of name() |
TensorDimension stepping () const | noexcept |
Returns the internal storage stride consumed by this layer implementation.
Output tensors are padded to be multiples of this value.
size_t tensorBufferSize () const | noexcept |
Retrieves the size of the layer's output tensor buffer in bytes.
This result is defined as the number of elements in dimensions() multiplied by the size of an individual element (e.g., 4 for FP32).
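As a worked instance of this definition, the sketch below computes the byte size of an FP32 tensor. `MockDimension` and its field names are assumptions for illustration and do not reflect the real `TensorDimension` layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical four-axis dimension struct; field names are assumptions.
struct MockDimension
{
    std::size_t n;
    std::size_t c;
    std::size_t h;
    std::size_t w;

    std::size_t elementCount() const { return n * c * h * w; }
};

// Buffer size in bytes: element count multiplied by the size of one element.
inline std::size_t bufferSizeBytes(const MockDimension& dims, std::size_t elementSize)
{
    return dims.elementCount() * elementSize;
}
```

For a 1 x 3 x 224 x 224 tensor of FP32 values this gives 150528 elements, or 602112 bytes.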
TensorFormat tensorFormat () const | noexcept |
Returns the tensor format consumed by this layer implementation.
Networks will reshape tensors as necessary in order to match the return value.
size_t tensorInternalBufferSize () const | noexcept |
Retrieves the size in bytes of the layer's output tensor buffer as allocated internally.
Internal allocations may have additional padding or alignment restrictions. The result is defined as the number of elements in internalDimensions() multiplied by the size of an individual element (e.g., 4 for FP32).
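The relationship between padding and the internal buffer size can be sketched as follows. The sketch assumes each axis is rounded up to a multiple of the matching stepping value, which is one plausible reading of the padding rule; `MockDimension`, `roundUp`, `padToStepping`, and `internalBufferSizeBytes` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical four-axis dimension struct; field names are assumptions.
struct MockDimension
{
    std::size_t n;
    std::size_t c;
    std::size_t h;
    std::size_t w;

    std::size_t elementCount() const { return n * c * h * w; }
};

// Rounds value up to the next multiple of step; a step of zero means no padding.
inline std::size_t roundUp(std::size_t value, std::size_t step)
{
    return (step == 0) ? value : ((value + step - 1) / step) * step;
}

// Assumption: padding is applied independently to each axis.
inline MockDimension padToStepping(const MockDimension& dims, const MockDimension& stepping)
{
    return MockDimension{
        roundUp(dims.n, stepping.n),
        roundUp(dims.c, stepping.c),
        roundUp(dims.h, stepping.h),
        roundUp(dims.w, stepping.w)};
}

// Internal size in bytes: padded element count multiplied by the element size.
inline std::size_t internalBufferSizeBytes(const MockDimension& dims,
                                           const MockDimension& stepping,
                                           std::size_t elementSize)
{
    return padToStepping(dims, stepping).elementCount() * elementSize;
}
```

With dimensions 1 x 3 x 5 x 5, a stepping of 1 x 4 x 1 x 1, and 2-byte elements, the padded tensor has 100 elements and occupies 200 bytes, compared with 150 bytes unpadded.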
TensorDimension weightsDimensions (const char *pWeightsName, WeightsQuery queryType) const | noexcept |
Retrieves the tensor dimension of a layer's named weight input.
pWeightsName | Name of a weights object being queried |
queryType | Type of dimensions being queried; see WeightsQuery for details |
const char * weightsName () const | noexcept |
Retrieves the name used to identify this layer's weights.
By default, this should be the name assigned with setName. Alternate weights names allow layer aliasing and namespacing within subnetworks.