NVIDIA NvNeural SDK  2022.2
GPU inference framework for NVIDIA Nsight Deep Learning Designer
nvneural::BasePrototypeLayer Class Reference (abstract)

Base class for the CUDA prototype layers shipped with NvNeural. More...

#include <BasePrototypeLayer.h>

Inheritance diagram for nvneural::BasePrototypeLayer:
nvneural::refobj::RefObjectBase< refobj::Implements< ILayer >, refobj::Implements< IPrototypeLayer >, refobj::Implements< ICppCodeGenerationLayer > >

Public Member Functions

ILayer inherited members
const char * serializedType () const noexcept
 Retrieves the layer type. More...
 
NeuralResult setName (const char *pName) noexcept
 Sets the layer name. More...
 
const char * name () const noexcept
 Retrieves the layer name. More...
 
NeuralResult setNetworkRuntime (INetworkRuntime *pNetworkRuntime) noexcept
 Informs the layer it has been attached to a new network. More...
 
NetworkBackendId backendId () const noexcept
 Returns the backend ID associated with this layer implementation. More...
 
TensorFormat tensorFormat () const noexcept
 Returns the tensor format consumed by this layer implementation. More...
 
TensorDimension stepping () const noexcept
 Returns the internal storage stride consumed by this layer implementation. More...
 
TensorDimension dimensions () const noexcept
 Retrieves the dimensions of the layer's output tensor. More...
 
TensorDimension internalDimensions () const noexcept
 Retrieves the dimensions of the layer's output tensor as allocated internally. More...
 
size_t tensorBufferSize () const noexcept
 Retrieves the size of the layer's output tensor buffer in bytes. More...
 
size_t tensorInternalBufferSize () const noexcept
 Retrieves the size of the layer's output tensor buffer in bytes, as allocated internally. More...
 
NeuralResult loadFromParameters (const IParameterNode *pParameters) noexcept
 Loads layer parameters from a serialized key-value representation. More...
 
NeuralResult getInputLayers (ILayerList **ppInputLayers) const noexcept
 Retrieves the inputs for this layer. More...
 
NeuralResult setInputLayer (size_t index, ILayer *pLayer) noexcept
 Sets an input layer by index. More...
 
NeuralResult setPermanent (bool permanent) noexcept
 Sets or clears the "permanent" flag on a layer's output tensor. More...
 
bool isPermanent () const noexcept
 Returns the current status of the "permanent" flag. More...
 
NeuralResult setAffected (bool affected) noexcept
 Sets or clears the "affected" flag on a layer's output tensor. More...
 
bool isAffected () const noexcept
 Returns the current status of the "affected" flag. More...
 
NeuralResult setActivationFunction (ActivationFunctionId activationFunction) noexcept
 Sets the activation function attached to the layer. More...
 
ActivationFunctionId activationFunction () const noexcept
 Retrieves the activation function attached to this layer. More...
 
NeuralResult setActivationCoefficient (std::size_t coefficientIndex, float value) noexcept
 Sets an activation coefficient. More...
 
float activationCoefficient (std::size_t coefficientIndex) const noexcept
 Retrieves the activation coefficient for the specified index. More...
 
const char * weightsName () const noexcept
 Retrieves the name used to identify this layer's weights. More...
 
NeuralResult setWeightsName (const char *pWeightsName) noexcept
 Sets the name used to identify this layer's weights. More...
 
TensorDimension weightsDimensions (const char *pWeightsName, WeightsQuery queryType) const noexcept
 Retrieves the tensor dimension of a layer's named weight input. More...
 
NeuralResult reshape () noexcept
 Initializes (or reinitializes) the layer implementation with the current set of parameters. More...
 
NeuralResult evaluateForward () noexcept
 Performs forward evaluation for this layer. More...
 
NeuralResult getData (void **ppOut, TensorFormat format, const ILayer *pRequestingLayer) noexcept
 Retrieves device-side memory for the layer's output. More...
 
NeuralResult getConstData (const void **ppOut, TensorFormat format, const ILayer *pRequestingLayer) const noexcept
 Retrieves read-only device-side memory for the layer's output. More...
 
NeuralResult getCpuConstData (void *pOutBuffer, size_t bufferByteCount, size_t *pBytesCopied, TensorFormat format) const noexcept
 Retrieves read-only CPU-side memory for the layer's output. More...
 
IPrototypeLayer inherited members
NeuralResult addPrototypeCode (const char *pName, const char *pCode) noexcept final
 Adds a kernel definition for a particular tensor format and backend. More...
 
NeuralResult addForwardEvalCall (const char *pName, const char *pCall) noexcept final
 Adds a call script for a particular tensor format and backend. More...
 
NeuralResult addParameterKey (const char *pKey) noexcept final
 Adds a named parameter to the layer. More...
 
ICppCodeGenerationLayer inherited members
NeuralResult generateLayerCpp (ICppCodeGenerationLayerHost *pHost) noexcept final
 Generates C++ code to configure the layer. More...
 
NeuralResult networkGenerationComplete () noexcept final
 Indicates the entire network has been generated. More...
 
- Public Member Functions inherited from nvneural::refobj::RefObjectBase< refobj::Implements< ILayer >, refobj::Implements< IPrototypeLayer >, refobj::Implements< ICppCodeGenerationLayer > >
IRefObject::RefCount addRef () const noexcept
 Increment the object's reference count. More...
 
const void * queryInterface (IRefObject::TypeId interfaceId) const noexcept
 This is an overloaded member function, provided for convenience. It differs from the non-const overload below only in the constness of the interface pointer it returns.
 
void * queryInterface (IRefObject::TypeId interfaceId) noexcept
 Retrieves a new object interface pointer. More...
 
 RefObjectBase ()
 Default constructor. Logs object creation.
 
IRefObject::RefCount release () const noexcept
 Decrements the object's reference count and destroys the object if the reference count reaches zero. More...
 

Protected Member Functions

 BasePrototypeLayer ()
 Creates a new BasePrototypeLayer object.
 
NeuralResult loadBasePrototypeInfo (NetworkBackendId backendId, const TensorFormat &format, CompilationLevel compilationLevel, const std::string &currentImpl, const IParameterNode *pParameters)
 Loads prototype information from a parameter node. More...
 
NeuralResult setBasePrototypeInfo (std::size_t inputsCount, const std::string &type, NetworkBackendId backend, const TensorFormat &format, const std::string &dim, const std::string &impl) noexcept
 Stores implementation-agnostic details of the prototype layer. More...
 
Methods to be implemented by derived classes
virtual NeuralResult implementationReshape () noexcept=0
 Implementation-specific logic for ILayer::reshape.
 
virtual NeuralResult implementationForward () noexcept=0
 Implementation-specific logic for ILayer::evaluateForward.
 

Protected Attributes

detail::SizeValue m_block
 Preferred block size in threads.
 
std::string m_entry
 Name of compiled entry point (e.g., CUDA function name)
 
detail::SizeValue m_grid
 Preferred grid size in blocks.
 
std::vector< detail::AnyOperand > m_ops
 Arguments to pass to the compiled entry point.
 
INetworkRuntime * m_pNetwork
 Pointer to owning network.
 
std::string m_selected_code
 Source code for the currently selected implementation.
 

Detailed Description

Base class for the CUDA prototype layers shipped with NvNeural.

Member Function Documentation

◆ activationCoefficient()

float BasePrototypeLayer::activationCoefficient ( std::size_t  coefficientIndex) const
noexcept

Retrieves the activation coefficient for the specified index.

The meaning of this value is activation-specific. If no setActivationCoefficient call was previously issued for this index, the layer should return a default value of zero.

Parameters
coefficientIndex - Coefficient index

◆ activationFunction()

ActivationFunctionId BasePrototypeLayer::activationFunction ( ) const
noexcept

Retrieves the activation function attached to this layer.

Should return the most recent value passed to setActivationFunction.

◆ addForwardEvalCall()

NeuralResult BasePrototypeLayer::addForwardEvalCall ( const char *  pName,
const char *  pCall 
)
finalnoexcept

Adds a call script for a particular tensor format and backend.

Replaces previous definitions of that kernel.

Parameters
pName - Name of the prototype (e.g., "cuda_fp32_nchw")
pCall - Call script for the prototype

◆ addParameterKey()

NeuralResult BasePrototypeLayer::addParameterKey ( const char *  pKey)
finalnoexcept

Adds a named parameter to the layer.

Parameters
pKey - Parameter name that should be included in kernel calls

◆ addPrototypeCode()

NeuralResult BasePrototypeLayer::addPrototypeCode ( const char *  pName,
const char *  pCode 
)
finalnoexcept

Adds a kernel definition for a particular tensor format and backend.

Replaces previous definitions of that kernel.

Parameters
pName - Name of the prototype provided (e.g., "cuda_fp32_nchw")
pCode - Kernel code for the prototype
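
A minimal sketch of how addPrototypeCode, addForwardEvalCall, and addParameterKey might be combined when registering a single implementation on a prototype layer. The kernel body, call script text, and the scale_factor parameter key are hypothetical placeholders, not code shipped with NvNeural, and NeuralResult::Success is assumed to be the success code.

    // Hypothetical illustration only; NvNeural interface headers are assumed to be included.
    nvneural::NeuralResult registerExampleImplementation(nvneural::IPrototypeLayer* pProto)
    {
        // Placeholder CUDA kernel source for the "cuda_fp32_nchw" prototype.
        const char* kKernelSource =
            "extern \"C\" __global__ void scale_kernel(float* out, const float* in, "
            "float scale_factor, unsigned int n) "
            "{ unsigned int i = blockIdx.x * blockDim.x + threadIdx.x; "
            "  if (i < n) out[i] = in[i] * scale_factor; }";

        // Kernel definition for this tensor format and backend; replaces any previous definition.
        nvneural::NeuralResult status = pProto->addPrototypeCode("cuda_fp32_nchw", kKernelSource);
        if (status != nvneural::NeuralResult::Success)
            return status;

        // Call script for the same prototype (contents hypothetical).
        status = pProto->addForwardEvalCall("cuda_fp32_nchw",
                                            "scale_kernel(out, in0, scale_factor, n)");
        if (status != nvneural::NeuralResult::Success)
            return status;

        // Named parameter that should be included in kernel calls.
        return pProto->addParameterKey("scale_factor");
    }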

◆ backendId()

NetworkBackendId BasePrototypeLayer::backendId ( ) const
noexcept

Returns the backend ID associated with this layer implementation.

Networks will reshape tensors as necessary in order to match the return value.

◆ dimensions()

TensorDimension BasePrototypeLayer::dimensions ( ) const
noexcept

Retrieves the dimensions of the layer's output tensor.

Should not change except in response to a reshape event.

◆ evaluateForward()

NeuralResult BasePrototypeLayer::evaluateForward ( )
noexcept

Performs forward evaluation for this layer.

If the layer does not provide a pre-fused implementation of the selected activation function, it MUST call INetworkRuntime::defaultForwardActivation(this).
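
A sketch of how a layer implementation might honor this requirement; it is illustrative only and does not reproduce BasePrototypeLayer's internal logic. MyCustomLayer, launchMyKernel, and m_fusedActivationSupported are hypothetical, and m_pNetwork is assumed to hold the INetworkRuntime pointer received through setNetworkRuntime.

    // Illustrative only: fall back to the network's default activation pass when
    // no pre-fused implementation of the selected activation function exists.
    nvneural::NeuralResult MyCustomLayer::evaluateForward() noexcept
    {
        // launchMyKernel is a hypothetical helper that runs this layer's forward kernel.
        const nvneural::NeuralResult status = launchMyKernel();
        if (status != nvneural::NeuralResult::Success)
            return status;

        if (!m_fusedActivationSupported)
        {
            // Required by the ILayer contract described above; assumed to return NeuralResult.
            return m_pNetwork->defaultForwardActivation(this);
        }
        return status;
    }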

◆ generateLayerCpp()

NeuralResult BasePrototypeLayer::generateLayerCpp ( ICppCodeGenerationLayerHost *  pHost)
finalnoexcept

Generates C++ code to configure the layer.

See the ICppCodeGenerationLayerHost interface for details of the generated code.

A failure result from this function will cause the tool to report failure to the user.

◆ getConstData()

NeuralResult BasePrototypeLayer::getConstData ( const void **  ppOut,
TensorFormat  format,
const ILayer *  pRequestingLayer 
) const
noexcept

Retrieves read-only device-side memory for the layer's output.

◆ getCpuConstData()

NeuralResult BasePrototypeLayer::getCpuConstData ( void *  pOutBuffer,
size_t  bufferByteCount,
size_t *  pBytesCopied,
TensorFormat  format 
) const
noexcept

Retrieves read-only CPU-side memory for the layer's output.

This function is intended for debugging purposes. The required buffer size is dimensions().elementCount() multiplied by the size of the desired element type.

To check buffer sizes without preallocating memory, pass nullptr in pOutBuffer and zero in bufferByteCount. pBytesCopied will then receive the optimal size for pOutBuffer. Other uses of nullptr are considered errors.

pBytesCopied is not modified if this function fails.

Parameters
pOutBuffer - Memory region receiving the layer output data
bufferByteCount - Size of pOutBuffer in bytes
pBytesCopied - Optional output variable receiving the number of bytes copied into pOutBuffer
format - Format for the output buffer
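
A minimal sketch of the two-call size-query pattern described above, assuming pLayer is a valid layer pointer and NeuralResult::Success is the success code; the surrounding helper function is hypothetical.

    #include <cstddef>
    #include <vector>

    // Debug helper: copy a layer's output to host memory.
    std::vector<std::byte> readLayerOutput(const nvneural::ILayer* pLayer)
    {
        const nvneural::TensorFormat format = pLayer->tensorFormat();

        // First call: nullptr/0 asks for the optimal buffer size in bytes.
        std::size_t requiredBytes = 0;
        if (pLayer->getCpuConstData(nullptr, 0, &requiredBytes, format) != nvneural::NeuralResult::Success)
            return {};

        // Second call: copy into an appropriately sized host buffer.
        std::vector<std::byte> buffer(requiredBytes);
        std::size_t bytesCopied = 0;
        if (pLayer->getCpuConstData(buffer.data(), buffer.size(), &bytesCopied, format) != nvneural::NeuralResult::Success)
            return {};

        buffer.resize(bytesCopied);
        return buffer;
    }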

◆ getData()

NeuralResult BasePrototypeLayer::getData ( void **  ppOut,
TensorFormat  format,
const ILayer *  pRequestingLayer 
)
noexcept

Retrieves device-side memory for the layer's output.

◆ getInputLayers()

NeuralResult BasePrototypeLayer::getInputLayers ( ILayerList **  ppInputLayers) const
noexcept

Retrieves the inputs for this layer.

Parameters
ppInputLayers - Pointer receiving a reference to an ILayerList object.

The layer may create a new list or return an incremented reference to a cached object; this is an implementation detail.

In either case, the caller is responsible for releasing its reference when done with *ppInputLayers.
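
A short sketch of this ownership rule; only the acquire/release pattern is shown, and the inspection step is elided because the ILayerList iteration API is not documented here.

    // Illustrative only: the caller owns one reference to the returned list.
    void inspectInputs(const nvneural::ILayer* pLayer)
    {
        nvneural::ILayerList* pInputs = nullptr;
        if (pLayer->getInputLayers(&pInputs) != nvneural::NeuralResult::Success || !pInputs)
            return;

        // ... examine the input layers here ...

        // Release the reference received through getInputLayers.
        pInputs->release();
    }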

◆ internalDimensions()

TensorDimension BasePrototypeLayer::internalDimensions ( ) const
noexcept

Retrieves the dimensions of the layer's output tensor as allocated internally.

Internal allocations may have additional padding or alignment restrictions.

See also
stepping()

◆ isAffected()

bool BasePrototypeLayer::isAffected ( ) const
noexcept

Returns the current status of the "affected" flag.

By default, layers should be "affected" after initialization.

◆ isPermanent()

bool BasePrototypeLayer::isPermanent ( ) const
noexcept

Returns the current status of the "permanent" flag.

By default, layers should not be "permanent."

◆ loadBasePrototypeInfo()

NeuralResult BasePrototypeLayer::loadBasePrototypeInfo ( NetworkBackendId  backendId,
const TensorFormat &  format,
CompilationLevel  compilationLevel,
const std::string &  currentImpl,
const IParameterNode *  pParameters 
)
protected

Loads prototype information from a parameter node.

Replaces calls to setBasePrototypeInfo; this path is used by network builder classes instead.

The parameter node used for this call does not typically map to a <Parameters> block in XML. Instead, it must provide the following key-value pairs:

  • inputCount
  • dim
  • parameterKeys

Implementation-specific blocks must also be provided for at least the currentImpl parameter:

  • [currentImpl]_call (call script)
  • [currentImpl] (inference code)
Parameters
backendId - Backend to prefer in ILayer::backendId
format - Format to prefer in ILayer::tensorFormat
compilationLevel - Determines whether to decode currentImpl's code as source or binary
currentImpl - Implementation type string like "cuda_fp32_nchw"
pParameters - IParameterNode containing additional type details
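
The required layout can be pictured as a flat key-value map; the values below are hypothetical placeholders meant only to show which keys the parameter node must expose when currentImpl is "cuda_fp32_nchw" (this map is an illustration, not an NvNeural API).

    #include <map>
    #include <string>

    // Hypothetical contents of the parameter node consumed by loadBasePrototypeInfo.
    const std::map<std::string, std::string> kPrototypeParameters = {
        {"inputCount",          "1"},
        {"dim",                 "/* script that determines the output dimensions */"},
        {"parameterKeys",       "scale_factor"},

        // Implementation-specific blocks for the selected currentImpl:
        {"cuda_fp32_nchw_call", "/* call script for the kernel */"},
        {"cuda_fp32_nchw",      "/* kernel source or binary, per compilationLevel */"},
    };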

◆ loadFromParameters()

NeuralResult BasePrototypeLayer::loadFromParameters ( const IParameterNode *  pParameters)
noexcept

Loads layer parameters from a serialized key-value representation.

Tools such as ConverenceNG call this during network construction. Deployed applications may use this function themselves (often with a container class such as StringMapParameterNode) or may use layer-specific object interfaces; either approach is valid.

◆ name()

const char * BasePrototypeLayer::name ( ) const
noexcept

Retrieves the layer name.

Returns
Layer name; pointer must be valid until object is destroyed or setName is called.

◆ networkGenerationComplete()

NeuralResult BasePrototypeLayer::networkGenerationComplete ( )
finalnoexcept

Indicates the entire network has been generated.

Layers that rely on out-of-band synchronization to avoid multiple definitions of key types may use this callback to know that all layers have been visited, and can then reset any static variables they used to track this procedure.

This function will be called whether network generation succeeded or failed, but any failure result reported should be treated by the caller as an overall generation error.

◆ reshape()

NeuralResult BasePrototypeLayer::reshape ( )
noexcept

Initializes (or reinitializes) the layer implementation with the current set of parameters.

It is safe to call this multiple times, and may be necessary as parameters change.

◆ serializedType()

const char * BasePrototypeLayer::serializedType ( ) const
noexcept

Retrieves the layer type.

Returns
Layer type; pointer must be a non-null string.

◆ setActivationCoefficient()

NeuralResult BasePrototypeLayer::setActivationCoefficient ( std::size_t  coefficientIndex,
float  value 
)
noexcept

Sets an activation coefficient.

The meaning of this value is activation-specific.

Parameters
coefficientIndex - Coefficient index
value - Value to store

◆ setActivationFunction()

NeuralResult BasePrototypeLayer::setActivationFunction ( ActivationFunctionId  activationFunction)
noexcept

Sets the activation function attached to the layer.

Repeated calls to this function replace the activation rather than stacking them.

Parameters
activationFunction - New activation function ID

◆ setAffected()

NeuralResult BasePrototypeLayer::setAffected ( bool  affected)
noexcept

Sets or clears the "affected" flag on a layer's output tensor.

Layers that are "affected" have pending computation work and will be reevaluated as necessary during inference. Networks will clear the "affected" flag after successful inference.

Layers should mark themselves as "affected" in response to parameter changes that would be visible during inference. Examples of such changes include loadFromParameters and the IRuntimeOptionsLayer::setRuntimeOptionValue callback.

Parameters
affected - New value for the "affected" flag
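
A sketch of this convention: a hypothetical layer re-marks itself as affected after accepting new parameters, so the next inference pass reevaluates it. MyCustomLayer and applyParameters are placeholders, and NeuralResult::Success is assumed to be the success code.

    // Illustrative only: mark the layer "affected" after a visible parameter change.
    nvneural::NeuralResult MyCustomLayer::loadFromParameters(const nvneural::IParameterNode* pParameters) noexcept
    {
        // applyParameters is a hypothetical helper that copies values out of pParameters.
        const nvneural::NeuralResult status = applyParameters(pParameters);
        if (status != nvneural::NeuralResult::Success)
            return status;

        // The change is visible during inference, so request reevaluation.
        return setAffected(true);
    }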

◆ setBasePrototypeInfo()

NeuralResult BasePrototypeLayer::setBasePrototypeInfo ( std::size_t  inputsCount,
const std::string &  type,
NetworkBackendId  backend,
const TensorFormat &  format,
const std::string &  dim,
const std::string &  impl 
)
protectednoexcept

Stores implementation-agnostic details of the prototype layer.

Parameters
inputsCount - Number of inputs accepted by the layer
type - Type to report in ILayer::serializedType
backend - Backend to prefer in ILayer::backendId
format - Format to prefer in ILayer::tensorFormat
dim - Script code that determines the size reported in ILayer::dimensions during reshape
impl - Implementation type string like "cuda_fp32_nchw"
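
A sketch of how a derived prototype layer might record these details; the type string, dimension script, and implementation string are hypothetical placeholders, and the backend and format are taken as arguments to avoid assuming specific enumerator names.

    // Illustrative only: store the prototype's implementation-agnostic metadata.
    nvneural::NeuralResult MyPrototypeLayer::configurePrototype(
        nvneural::NetworkBackendId backend,
        const nvneural::TensorFormat& format) noexcept
    {
        return setBasePrototypeInfo(
            1,                 // inputsCount: one input layer
            "my_prototype",    // type reported by serializedType() (hypothetical)
            backend,           // backend preferred by backendId()
            format,            // format preferred by tensorFormat()
            "/* script that computes the output size during reshape */", // dim
            "cuda_fp32_nchw"); // impl: selected implementation string
    }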

◆ setInputLayer()

NeuralResult BasePrototypeLayer::setInputLayer ( size_t  index,
ILayer *  pLayer 
)
noexcept

Sets an input layer by index.

Input indices are explicitly allowed to be sparse in order to support optional inputs; usually such inputs are used to provide weights from other parts of the graph instead of going out to IWeightsLoader.

Parameters
index - Index of the input to assign
pLayer - Layer to assign as an input

◆ setName()

NeuralResult BasePrototypeLayer::setName ( const char *  pName)
noexcept

Sets the layer name.

Layer names are unique within a network.

Parameters
pName - New layer name; the layer must copy this data.

◆ setNetworkRuntime()

NeuralResult BasePrototypeLayer::setNetworkRuntime ( INetworkRuntime *  pNetworkRuntime)
noexcept

Informs the layer it has been attached to a new network.

INetwork::pushLayer implementations will call this function for you; host applications do not normally need to call this function.

Parameters
pNetworkRuntime - New owning network

◆ setPermanent()

NeuralResult BasePrototypeLayer::setPermanent ( bool  permanent)
noexcept

Sets or clears the "permanent" flag on a layer's output tensor.

Layers that are "permanent" are kept resident in memory and their tensor buffers are not reused. Setting too many layers as "permanent" will have negative effects on memory utilization.

Parameters
permanent - New value for the "permanent" flag

◆ setWeightsName()

NeuralResult BasePrototypeLayer::setWeightsName ( const char *  pWeightsName)
noexcept

Sets the name used to identify this layer's weights.

Parameters
pWeightsName - New layer name to use for weights instead of name()

◆ stepping()

TensorDimension BasePrototypeLayer::stepping ( ) const
noexcept

Returns the internal storage stride consumed by this layer implementation.

Output tensors are padded to be multiples of this value.

◆ tensorBufferSize()

size_t BasePrototypeLayer::tensorBufferSize ( ) const
noexcept

Retrieves the size of the layer's output tensor buffer in bytes.

This result is defined as the number of elements in dimensions() multiplied by the size of an individual element (e.g., 4 for FP32).
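
For example (illustrative values only): an FP32 output tensor whose dimensions() are 1x64x128x128 holds 1,048,576 elements, so tensorBufferSize() would report 1,048,576 * 4 = 4,194,304 bytes, i.e. dimensions().elementCount() multiplied by the element size.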

◆ tensorFormat()

TensorFormat BasePrototypeLayer::tensorFormat ( ) const
noexcept

Returns the tensor format consumed by this layer implementation.

Networks will reshape tensors as necessary in order to match the return value.

◆ tensorInternalBufferSize()

size_t BasePrototypeLayer::tensorInternalBufferSize ( ) const
noexcept

Retrieves the size of the layer's output tensor buffer in bytes, as allocated internally.

Internal allocations may have additional padding or alignment restrictions. The result is defined as the number of elements in internalDimensions() multiplied by the size of an individual element (e.g., 4 for FP32).

◆ weightsDimensions()

TensorDimension BasePrototypeLayer::weightsDimensions ( const char *  pWeightsName,
WeightsQuery  queryType 
) const
noexcept

Retrieves the tensor dimension of a layer's named weight input.

Parameters
pWeightsName - Name of a weights object being queried
queryType - Type of dimensions being queried; see WeightsQuery for details

◆ weightsName()

const char * BasePrototypeLayer::weightsName ( ) const
noexcept

Retrieves the name used to identify this layer's weights.

By default, this should be the name assigned with setName. Alternate weights names allow layer aliasing and namespacing within subnetworks.

