NVIDIA NvNeural SDK  2022.2
GPU inference framework for NVIDIA Nsight Deep Learning Designer
nvneural::IOnnxGenerationHost Class Reference [abstract]

Tool-provided interface for creating ONNX model graphs in an ABI-stable fashion. More...

#include <nvneural/OnnxTypes.h>

Inheritance diagram for nvneural::IOnnxGenerationHost:
nvneural::IRefObject

Public Member Functions

virtual NeuralResult addGraphInitializer (const char *pInitializerName, OnnxTensorDataType dataType, bool fp16Eligible, const std::int64_t *pTensorDims, std::size_t numTensorDims, const void *pTensorData, std::size_t tensorDataBytes) noexcept=0
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. More...
 
virtual NeuralResult addGraphInitializer (const char *pInitializerName, OnnxTensorDataType dataType, bool fp16Eligible, TensorDimension tensorSize, const void *pTensorData, std::size_t tensorDataBytes) noexcept=0
 Adds a tensor as a named graph initializer. More...
 
virtual NeuralResult addGraphInput (const char *pInputName, bool fp16Eligible) noexcept=0
 Adds a new graph input. More...
 
virtual NeuralResult addGraphInput (const char *pInputName, bool fp16Eligible, TensorDimension tensorDim) noexcept=0
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. More...
 
virtual NeuralResult addGraphOutput (const char *pOutputName, bool fp16Eligible) noexcept=0
 Adds a new graph output. More...
 
virtual NeuralResult addGraphOutput (const char *pOutputName, bool fp16Eligible, TensorDimension tensorDim) noexcept=0
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. More...
 
virtual NeuralResult addWeightsConstant (const char *pInitializerName, const IWeightsData *pWeightsData, bool fp16Eligible) noexcept=0
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. The tensor dimensions of the initializer are taken from pWeightsData->dimension().
 
virtual NeuralResult addWeightsConstant (const char *pInitializerName, const IWeightsData *pWeightsData, bool fp16Eligible, const std::int64_t *pTensorDims, std::size_t numTensorDims) noexcept=0
 Adds weights data as a named graph initializer. More...
 
virtual bool automaticFp16ConversionEnabled () const noexcept=0
 Returns true if automatic fp16 conversion is enabled. More...
 
virtual NeuralResult createGraphNode (IOnnxGenerationGraphNode **ppNodeOut) noexcept=0
 Creates a new graph node (GraphProto.node) and returns a pointer to it. More...
 
virtual const char * getLayerOutputName (const ILayer *pLayer) const noexcept=0
 Returns the "output name" associated with a layer. More...
 
virtual NeuralResult importOperatorSet (const char *pDomain, std::int64_t version) noexcept=0
 Adds a versioned operator set to the graph's import list. More...
 
virtual bool placeholdersEnabled () const noexcept=0
 Returns true if placeholder operators are permitted. More...
 
virtual NeuralResult setLayerOutputName (const ILayer *pLayer, const char *pOutputName) noexcept=0
 Sets the "output name" associated with a layer. More...
 
virtual bool shouldUseExplicitTensorSizes () const noexcept=0
 Returns true if graph inputs and outputs should be sized explicitly. More...
 
- Public Member Functions inherited from nvneural::IRefObject
virtual RefCount addRef () const noexcept=0
 Increments the object's reference count. More...
 
virtual const void * queryInterface (TypeId interface) const noexcept=0
 This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
 
virtual void * queryInterface (TypeId interface) noexcept=0
 Retrieves a new object interface pointer. More...
 
virtual RefCount release () const noexcept=0
 Decrements the object's reference count and destroys the object if the reference count reaches zero. More...
 

Static Public Attributes

static const IRefObject::TypeId typeID = 0xde33951dcbbaafc7ul
 Interface TypeId for InterfaceOf purposes.
 
- Static Public Attributes inherited from nvneural::IRefObject
static const TypeId typeID = 0x14ecc3f9de638e1dul
 Interface TypeId for InterfaceOf purposes.
 

Additional Inherited Members

- Public Types inherited from nvneural::IRefObject
using RefCount = std::uint32_t
 Typedef used to track the number of active references to an object.
 
using TypeId = std::uint64_t
 Every interface must define a unique TypeId. This should be randomized.
 
- Protected Member Functions inherited from nvneural::IRefObject
virtual ~IRefObject ()=default
 A protected destructor prevents accidental stack-allocation of IRefObjects or use with other smart pointer classes like std::unique_ptr.
 

Detailed Description

Tool-provided interface for creating ONNX model graphs in an ABI-stable fashion.

Member Function Documentation

◆ addGraphInitializer() [1/2]

virtual NeuralResult nvneural::IOnnxGenerationHost::addGraphInitializer ( const char *  pInitializerName,
OnnxTensorDataType  dataType,
bool  fp16Eligible,
const std::int64_t *  pTensorDims,
std::size_t  numTensorDims,
const void *  pTensorData,
std::size_t  tensorDataBytes 
)
pure virtual, noexcept

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Parameters
pInitializerName - Name to assign to the initializer (must be unique within the graph).
dataType - Tensor data type to embed.
fp16Eligible - If true, a tensor(float) may be converted to a tensor(float16) to match the graph.
pTensorDims - Explicit shape to assign to the initializer.
numTensorDims - Number of elements in the pTensorDims array.
pTensorData - Pointer to the first element in a packed array of tensor data.
tensorDataBytes - Number of bytes in the pTensorData array.

◆ addGraphInitializer() [2/2]

virtual NeuralResult nvneural::IOnnxGenerationHost::addGraphInitializer ( const char *  pInitializerName,
OnnxTensorDataType  dataType,
bool  fp16Eligible,
TensorDimension  tensorSize,
const void *  pTensorData,
std::size_t  tensorDataBytes 
)
pure virtual, noexcept

Adds a tensor as a named graph initializer.

Note: Strings are not supported here due to their packing requirements. This is a graph-level operation; the initializer also needs to be added to a node's input list. To make the initializer replaceable at inference time, add the same name as a graph input.

Parameters
pInitializerName - Name to assign to the initializer (must be unique within the graph).
dataType - Tensor data type to embed.
fp16Eligible - If true, a tensor(float) may be converted to a tensor(float16) to match the graph.
tensorSize - Shape to assign to the tensor.
pTensorData - Pointer to the first element in a packed array of tensor data.
tensorDataBytes - Number of bytes in the pTensorData array.
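A minimal sketch of embedding a small fp32 bias vector as an initializer. The enumerator OnnxTensorDataType::Float, the brace-initialized TensorDimension{n, c, h, w}, and the name "conv1_bias" are assumptions for illustration; consult OnnxTypes.h for the exact spellings.

```cpp
// Sketch: embed a four-element fp32 bias as a named graph initializer.
const float biasData[4] = {0.1f, 0.2f, 0.3f, 0.4f};
const NeuralResult status = pHost->addGraphInitializer(
    "conv1_bias",                 // must be unique within the graph
    OnnxTensorDataType::Float,    // embedded as tensor(float)...
    true,                         // ...but eligible for automatic fp16 conversion
    TensorDimension{1, 4, 1, 1},  // shape assigned to the initializer
    biasData,                     // packed element data
    sizeof(biasData));            // size of the data in bytes
// This is a graph-level operation: the same name must also be added
// to the consuming node's input list.
```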

◆ addGraphInput() [1/2]

virtual NeuralResult nvneural::IOnnxGenerationHost::addGraphInput ( const char *  pInputName,
bool  fp16Eligible 
)
pure virtual, noexcept

Adds a new graph input.

Typically only input layers will need to use this function. If no explicit size is provided, the graph input will use an implicit size derived from the input name.

Parameters
pInputName - Name of the input to introduce.
fp16Eligible - Can this input be changed to FLOAT16?

◆ addGraphInput() [2/2]

virtual NeuralResult nvneural::IOnnxGenerationHost::addGraphInput ( const char *  pInputName,
bool  fp16Eligible,
TensorDimension  tensorDim 
)
pure virtual, noexcept

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Parameters
pInputName - Name of the input to introduce.
fp16Eligible - Can this input be changed to FLOAT16?
tensorDim - Explicit dimension for the input tensor.
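A sketch of how an input layer might combine the two overloads with shouldUseExplicitTensorSizes() to honor the user's sizing preference. The input name and the TensorDimension{n, c, h, w} initialization are illustrative assumptions.

```cpp
// Sketch: register a graph input, sized explicitly only when requested.
NeuralResult status;
if (pHost->shouldUseExplicitTensorSizes())
{
    // Fixed size for compatibility with third-party inference frameworks.
    status = pHost->addGraphInput("input_0", /*fp16Eligible=*/true,
                                  TensorDimension{1, 3, 720, 1280});
}
else
{
    // Implicit size derived from the input name; allows arbitrary tensor sizes.
    status = pHost->addGraphInput("input_0", /*fp16Eligible=*/true);
}
```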

◆ addGraphOutput() [1/2]

virtual NeuralResult nvneural::IOnnxGenerationHost::addGraphOutput ( const char *  pOutputName,
bool  fp16Eligible 
)
pure virtual, noexcept

Adds a new graph output.

The exporter tool will automatically add graph outputs for all layers whose outputs are not used as inputs by other layers. Exporters do not typically need to call this function directly.

If no explicit size is provided for the output, the tensor will use an implicit size derived from the output name.

Parameters
pOutputName - Name of the output to introduce.
fp16Eligible - Can this output be changed to FLOAT16?

◆ addGraphOutput() [2/2]

virtual NeuralResult nvneural::IOnnxGenerationHost::addGraphOutput ( const char *  pOutputName,
bool  fp16Eligible,
TensorDimension  tensorDim 
)
pure virtual, noexcept

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Parameters
pOutputName - Name of the output to introduce.
fp16Eligible - Can this output be changed to FLOAT16?
tensorDim - Explicit dimension for the output tensor.

◆ addWeightsConstant()

virtual NeuralResult nvneural::IOnnxGenerationHost::addWeightsConstant ( const char *  pInitializerName,
const IWeightsData *  pWeightsData,
bool  fp16Eligible,
const std::int64_t *  pTensorDims,
std::size_t  numTensorDims 
)
pure virtual, noexcept

Adds weights data as a named graph initializer.

This is a graph-level operation; the initializer also needs to be added to a node's input list.

Parameters
pInitializerName - Name to assign to the initializer (must be unique within the graph).
pWeightsData - Weights data to embed.
fp16Eligible - If true, the weights may be converted to tensor(float16) format to match the graph.
pTensorDims - Explicit shape to assign to the initializer (useful for operators requiring 1D weights).
numTensorDims - Number of elements in the pTensorDims array.
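A sketch of the explicit-shape overload, reshaping bias weights to 1D as some ONNX operators require (for example, Conv's B input). The initializer name, the pWeights pointer, and the channel count are illustrative assumptions.

```cpp
// Sketch: embed bias weights flattened to a 1D tensor of 64 elements.
const std::int64_t biasShape[] = {64};
const NeuralResult status = pHost->addWeightsConstant(
    "conv1_b",              // unique name within the graph
    pWeights,               // IWeightsData* obtained elsewhere
    /*fp16Eligible=*/true,  // may be converted to tensor(float16)
    biasShape,
    /*numTensorDims=*/1);
// As with addGraphInitializer, the name must still be added to a node's inputs.
```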

◆ automaticFp16ConversionEnabled()

virtual bool nvneural::IOnnxGenerationHost::automaticFp16ConversionEnabled ( ) const
pure virtual, noexcept

Returns true if automatic fp16 conversion is enabled.

Some ONNX inference frameworks do not consider a model to be eligible for fp16 inference unless it uses tensor(float16) inputs/outputs/initializers. As ONNX operator type constraints typically discourage mixing fp16/fp32 tensors inside an operator, the exporter will automatically convert suitable tensor(float) graph initializers to tensor(float16) upon insertion. This function allows you to test whether that feature is enabled, but calling it should rarely be necessary.

Returns
True if "fp16-eligible" weights and initializers will be converted from float to float16

◆ createGraphNode()

virtual NeuralResult nvneural::IOnnxGenerationHost::createGraphNode ( IOnnxGenerationGraphNode **  ppNodeOut)
pure virtual, noexcept

Creates a new graph node (GraphProto.node) and returns a pointer to it.

Parameters
ppNodeOut - Pointer receiving a reference to a new IOnnxGenerationGraphNode object.
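A sketch of the creation and release pattern, assuming the returned reference follows the SDK's IRefObject ownership convention and must be released by the caller. Configuration of the node itself is documented on IOnnxGenerationGraphNode and is elided here.

```cpp
// Sketch: create a GraphProto.node and release the reference when done.
IOnnxGenerationGraphNode* pNode = nullptr;
const NeuralResult status = pHost->createGraphNode(&pNode);
if (pNode)
{
    // ... configure the node via the IOnnxGenerationGraphNode interface ...
    pNode->release();  // balance the reference returned by createGraphNode
}
```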

◆ getLayerOutputName()

virtual const char* nvneural::IOnnxGenerationHost::getLayerOutputName ( const ILayer *  pLayer) const
pure virtual, noexcept

Returns the "output name" associated with a layer.

This name should be used to represent the output tensor from a given layer. If a layer requires multiple operators for export, or is fused into another layer, update its name with setLayerOutputName.

Parameters
pLayer - Layer entry to query. Null pointers are permitted here and return "".

◆ importOperatorSet()

virtual NeuralResult nvneural::IOnnxGenerationHost::importOperatorSet ( const char *  pDomain,
std::int64_t  version 
)
pure virtual, noexcept

Adds a versioned operator set to the graph's import list.

If the operator set is already imported, this increases the imported version if necessary. While the tool imports some recent version of the standard ONNX operator set for you, we recommend importing the minimum version required by your operators as a hygiene measure.

Parameters
pDomain - Operator set domain. Pass an empty string or nullptr for the default ONNX operator set.
version - Operator set version to import.
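A sketch of declaring the operator sets a layer's export depends on. The version numbers and the custom domain are illustrative; import whatever minimum versions your emitted operators actually require.

```cpp
// Sketch: declare the minimum operator-set versions this export requires.
pHost->importOperatorSet(nullptr, 11);         // default ONNX domain, opset >= 11
pHost->importOperatorSet("com.microsoft", 1);  // example custom operator domain
```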

◆ placeholdersEnabled()

virtual bool nvneural::IOnnxGenerationHost::placeholdersEnabled ( ) const
pure virtual, noexcept

Returns true if placeholder operators are permitted.

The user may choose to disallow placeholders for operations that cannot be represented in ONNX. When this happens, layers should fail export rather than making a "best-effort" placeholder node.

If placeholders are disabled, NeuralResult::Unsupported results returned from generateThingOnnx are treated as failures by the tool. You do not have to test this flag before returning Unsupported.

◆ setLayerOutputName()

virtual NeuralResult nvneural::IOnnxGenerationHost::setLayerOutputName ( const ILayer *  pLayer,
const char *  pOutputName 
)
pure virtual, noexcept

Sets the "output name" associated with a layer.

The tool assigns default names before export, so this is typically needed only when inserting activations or otherwise fusing operations into layers.

Warning
Forward references are not permitted in ONNX, and names must be unique within that namespace. This interface does not retroactively alter references to the layer's old output name.
Parameters
pLayer - Layer entry to adjust.
pOutputName - New output name for the layer.
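A sketch of the fused-activation case described above: the layer emits a second operator and renames its output so downstream layers reference the activation's result. The "_relu" suffix convention is an illustrative assumption, not an SDK requirement.

```cpp
// Sketch: rename a layer's output after fusing an activation into it.
const std::string fusedName =
    std::string(pHost->getLayerOutputName(pLayer)) + "_relu";
// ... emit the ReLU node whose output tensor is named fusedName ...
const NeuralResult status = pHost->setLayerOutputName(pLayer, fusedName.c_str());
// Rename before exporting downstream layers: existing references to the
// old output name are not updated retroactively.
```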

◆ shouldUseExplicitTensorSizes()

virtual bool nvneural::IOnnxGenerationHost::shouldUseExplicitTensorSizes ( ) const
pure virtual, noexcept

Returns true if graph inputs and outputs should be sized explicitly.

Implicit sizes allow inference with arbitrary tensor inputs (e.g., when games run in varying window sizes), but are less compatible with third-party inference frameworks. Exporters should therefore allow the user to decide whether fixed tensor sizes are necessary.

Typically only input layers need to call this function; weights data is always explicitly sized, as are most other graph initializers.

Returns
True if sizes should always be explicit, false if implicit sizes are permitted.

The documentation for this class was generated from the following file: nvneural/OnnxTypes.h