NVIDIA NvNeural SDK
2022.2
GPU inference framework for NVIDIA Nsight Deep Learning Designer
Defines an input layer that accepts data from CUDA device memory. More...
#include <nvneural/layers/ICudaInputLayer.h>
Public Member Functions

virtual NeuralResult copyCudaTensorAsync (const void *pBuffer, TensorDimension bufferSize) noexcept=0
    Loads a raw tensor from a CUDA device pointer. More...

Public Member Functions inherited from IRefObject

virtual RefCount addRef () const noexcept=0
    Increments the object's reference count. More...
virtual const void * queryInterface (TypeId interface) const noexcept=0
    This is an overloaded member function, provided for convenience. It differs from the above function only in the arguments it accepts.
virtual void * queryInterface (TypeId interface) noexcept=0
    Retrieves a new object interface pointer. More...
virtual RefCount release () const noexcept=0
    Decrements the object's reference count and destroys the object if the reference count reaches zero. More...
Static Public Attributes

static const IRefObject::TypeId typeID = 0x66f6b04c4edaf310ul
    Interface TypeId for InterfaceOf purposes.

Static Public Attributes inherited from IRefObject

static const TypeId typeID = 0x14ecc3f9de638e1dul
    Interface TypeId for InterfaceOf purposes.
Additional Inherited Members

Public Types inherited from IRefObject

using RefCount = std::uint32_t
    Typedef used to track the number of active references to an object.
using TypeId = std::uint64_t
    Every interface must define a unique TypeId. This should be randomized.

Protected Member Functions inherited from IRefObject

virtual ~IRefObject ()=default
    A protected destructor prevents accidental stack allocation of IRefObjects or use with other smart pointer classes like std::unique_ptr.
Defines an input layer that accepts data from CUDA device memory.
Note that IInputLayer is a type trait that should accompany this interface in almost all cases.
Member Function Documentation

copyCudaTensorAsync()

virtual NeuralResult copyCudaTensorAsync (const void *pBuffer, TensorDimension bufferSize) noexcept [pure virtual]
Loads a raw tensor from a CUDA device pointer.
The tensor must be in the layer's native format as described by ILayer::tensorFormat. Layers with additional tiling requirements (as in IStandardInputLayer::tilingFactor or ILayer::stepping) should reject unaligned data instead of inserting extra padding.
The copy is performed asynchronously on the CUDA backend's primary stream. Do not modify the source buffer until the copy completes. Use cuStreamSynchronize or INetworkBackend::synchronize to ensure all pending operations complete.
Parameters

    pBuffer     Pointer to the first element of the input tensor.
    bufferSize  Size of the buffer in elements.
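A minimal usage sketch of the call sequence described above. Only copyCudaTensorAsync and INetworkBackend::synchronize come from this reference; the nvneural namespace, the INetworkBackend header path, the NeuralResult::Success enumerator, and the helper's name and parameters are assumptions for illustration, and obtaining the layer and backend pointers is out of scope here.

```cpp
#include <nvneural/layers/ICudaInputLayer.h>
#include <nvneural/CoreTypes.h>  // assumed location of INetworkBackend / NeuralResult

// Hypothetical helper: pLayer and pBackend are assumed to have been
// obtained elsewhere (e.g. while building the network).
NeuralResult uploadInputTensor(nvneural::ICudaInputLayer* pLayer,
                               nvneural::INetworkBackend* pBackend,
                               const void* pDeviceBuffer,
                               nvneural::TensorDimension bufferSize)
{
    // Enqueue the asynchronous copy on the CUDA backend's primary stream.
    // pDeviceBuffer must already be in the layer's native format
    // (ILayer::tensorFormat) and satisfy any tiling requirements.
    const NeuralResult status =
        pLayer->copyCudaTensorAsync(pDeviceBuffer, bufferSize);
    if (status != NeuralResult::Success)  // assumed success enumerator
        return status;

    // Wait for all pending backend work, including this copy, before the
    // caller modifies or frees pDeviceBuffer.
    return pBackend->synchronize();
}
```

Synchronizing immediately is the simplest safe pattern; callers that keep the source buffer alive can instead defer the synchronize until just before the buffer is reused.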