TensorRT 10.16.1
Layer that represents a KVCacheUpdate operation.
#include <NvInfer.h>
Public Member Functions | |
| bool | setCacheMode (KVCacheMode cacheMode) noexcept |
| Set the mode of the KVCacheUpdate layer. | |
| KVCacheMode | getCacheMode () const noexcept |
| Get the mode of the KVCacheUpdate layer. | |
| void | setInput (int32_t index, ITensor &tensor) noexcept |
| Append or replace an input of this layer with a specific tensor. | |
Public Member Functions inherited from nvinfer1::ILayer | |
| LayerType | getType () const noexcept |
| Return the type of a layer. | |
| void | setName (char const *name) noexcept |
| Set the name of a layer. | |
| char const * | getName () const noexcept |
| Return the name of a layer. | |
| int32_t | getNbInputs () const noexcept |
| Get the number of inputs of a layer. | |
| ITensor * | getInput (int32_t index) const noexcept |
| Get the layer input corresponding to the given index. | |
| int32_t | getNbOutputs () const noexcept |
| Get the number of outputs of a layer. | |
| ITensor * | getOutput (int32_t index) const noexcept |
| Get the layer output corresponding to the given index. | |
| void | setInput (int32_t index, ITensor &tensor) noexcept |
| Replace an input of this layer with a specific tensor. | |
| TRT_DEPRECATED void | setPrecision (DataType dataType) noexcept |
| Set the preferred or required computational precision of this layer in a weakly-typed network. | |
| DataType | getPrecision () const noexcept |
| Get the computational precision of this layer. | |
| TRT_DEPRECATED bool | precisionIsSet () const noexcept |
| Whether the computational precision has been set for this layer. | |
| TRT_DEPRECATED void | resetPrecision () noexcept |
| Reset the computational precision for this layer. | |
| TRT_DEPRECATED void | setOutputType (int32_t index, DataType dataType) noexcept |
| Set the output type of this layer in a weakly-typed network. | |
| DataType | getOutputType (int32_t index) const noexcept |
| Get the output type of this layer. | |
| TRT_DEPRECATED bool | outputTypeIsSet (int32_t index) const noexcept |
| Whether the output type has been set for this layer. | |
| TRT_DEPRECATED void | resetOutputType (int32_t index) noexcept |
| Reset the output type for this layer. | |
| void | setMetadata (char const *metadata) noexcept |
| Set the metadata for this layer. | |
| char const * | getMetadata () const noexcept |
| Get the metadata of the layer. | |
| bool | setNbRanks (int32_t nbRanks) noexcept |
| Set the number of ranks for multi-device execution. | |
| int32_t | getNbRanks () const noexcept |
| Get the number of ranks for multi-device execution. | |
Protected Member Functions | |
| virtual | ~IKVCacheUpdateLayer () noexcept=default |
Protected Member Functions inherited from nvinfer1::ILayer | |
| virtual | ~ILayer () noexcept=default |
Protected Member Functions inherited from nvinfer1::INoCopy | |
| INoCopy ()=default | |
| virtual | ~INoCopy ()=default |
| INoCopy (INoCopy const &other)=delete | |
| INoCopy & | operator= (INoCopy const &other)=delete |
| INoCopy (INoCopy &&other)=delete | |
| INoCopy & | operator= (INoCopy &&other)=delete |
Protected Attributes | |
| apiv::VKVCacheUpdateLayer * | mImpl |
Protected Attributes inherited from nvinfer1::ILayer | |
| apiv::VLayer * | mLayer |
Detailed Description
Layer that represents a KVCacheUpdate operation.
The KVCacheUpdate layer is used to cache the key or value tensors for the attention mechanism. K and V use separate KVCacheUpdate layers.
An IKVCacheUpdateLayer has three inputs (cache, update, writeIndices) and one output. In kLINEAR mode, for each batch element i, the layer copies the update tensor into the cache starting at position writeIndices[i]. Assuming no out-of-bounds writes occur, for each sequence position s in [0, sequenceLength) of the update tensor, the update entry at position s is written to cache position writeIndices[i] + s.
The layer updates the cache tensor in place, so the output tensor and the cache input tensor must share the same device memory address.
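The linear-mode copy described above can be sketched as a CPU reference loop. This is only an illustration, not TensorRT code: the 4-D row-major layout `[batch, heads, sequence, headDim]`, the `Shape4` struct, the `float` element type, and the function name `kvCacheUpdateLinear` are all assumptions made for the example. The source only states the behavior when no out-of-bounds write occurs, so this sketch simply rejects such inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical dense layout (an assumption for this sketch):
// [batch, heads, sequence, headDim], row-major.
struct Shape4
{
    std::size_t batch, heads, seq, dim;

    std::size_t offset(std::size_t b, std::size_t h, std::size_t s, std::size_t d) const
    {
        return ((b * heads + h) * seq + s) * dim + d;
    }

    std::size_t size() const
    {
        return batch * heads * seq * dim;
    }
};

// CPU reference for a linear-mode update under the assumed layout:
//   cache[i, :, writeIndices[i] + s, :] = update[i, :, s, :]
// The cache is modified in place. Returns false and leaves the cache
// untouched if shapes disagree or any write would fall out of bounds
// (a choice of this sketch, not documented layer behavior).
bool kvCacheUpdateLinear(std::vector<float>& cache, Shape4 cacheShape,
    std::vector<float> const& update, Shape4 updateShape,
    std::vector<int32_t> const& writeIndices)
{
    if (cache.size() != cacheShape.size() || update.size() != updateShape.size())
    {
        return false;
    }
    if (cacheShape.batch != updateShape.batch || cacheShape.heads != updateShape.heads
        || cacheShape.dim != updateShape.dim || writeIndices.size() != cacheShape.batch)
    {
        return false;
    }
    // Validate every batch element first so a failure never leaves a partial update.
    for (std::size_t i = 0; i < cacheShape.batch; ++i)
    {
        if (writeIndices[i] < 0
            || static_cast<std::size_t>(writeIndices[i]) + updateShape.seq > cacheShape.seq)
        {
            return false;
        }
    }
    for (std::size_t i = 0; i < cacheShape.batch; ++i)
    {
        std::size_t const start = static_cast<std::size_t>(writeIndices[i]);
        for (std::size_t h = 0; h < cacheShape.heads; ++h)
        {
            for (std::size_t s = 0; s < updateShape.seq; ++s)
            {
                assert(start + s < cacheShape.seq);
                for (std::size_t d = 0; d < cacheShape.dim; ++d)
                {
                    cache[cacheShape.offset(i, h, start + s, d)]
                        = update[updateShape.offset(i, h, s, d)];
                }
            }
        }
    }
    return true;
}
```

Each batch element carries its own write position, which is why `writeIndices` holds one entry per batch element rather than a single scalar.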
Constructor & Destructor Documentation
virtual ~IKVCacheUpdateLayer () noexcept=default [protected]
Member Function Documentation
KVCacheMode getCacheMode () const noexcept [inline]
Get the mode of the KVCacheUpdate layer.
bool setCacheMode (KVCacheMode cacheMode) noexcept [inline]
Set the mode of the KVCacheUpdate layer.
| cacheMode | The mode of the KVCacheUpdate layer. For TensorRT 10.15, only kLINEAR mode is supported. |
void setInput (int32_t index, ITensor &tensor) noexcept [inline]
Append or replace an input of this layer with a specific tensor.
| index | the index of the input to modify. |
| tensor | the new input tensor. |
The indices are as follows:
Input 0 is the input cache tensor.
Input 1 is the input update tensor.
Input 2 is the input writeIndices tensor.
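The input ordering and mode setter above can be exercised with a small helper. The sketch below is hypothetical and not part of TensorRT: `configureKVCacheUpdate` is a made-up name, and it is written as a template so it compiles without the TensorRT headers. With TensorRT, `LayerT` would be `nvinfer1::IKVCacheUpdateLayer`, `TensorT` would be `nvinfer1::ITensor`, and `ModeT` would be `KVCacheMode` (with `kLINEAR` as the mode value). It also assumes that the `bool` returned by `setCacheMode` signals success, which this page does not state.

```cpp
#include <cstdint>

// Hypothetical helper (not a TensorRT API). It relies only on the three
// member functions listed on this page: setInput, setCacheMode, getCacheMode.
template <typename LayerT, typename TensorT, typename ModeT>
bool configureKVCacheUpdate(
    LayerT& layer, TensorT& cache, TensorT& update, TensorT& writeIndices, ModeT mode)
{
    // Input indices follow the documented order.
    layer.setInput(0, cache);        // input 0: cache tensor
    layer.setInput(1, update);       // input 1: update tensor
    layer.setInput(2, writeIndices); // input 2: writeIndices tensor

    // Assumption: a false return means the mode was rejected.
    if (!layer.setCacheMode(mode))
    {
        return false;
    }
    // Read the mode back to confirm it took effect.
    return layer.getCacheMode() == mode;
}
```

Keeping the index-to-tensor mapping in one place avoids swapping the cache and update tensors by accident, since both are passed to the same `setInput` overload.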
Member Data Documentation
apiv::VKVCacheUpdateLayer * mImpl [protected]
Copyright © 2024 NVIDIA Corporation