softmax.h
Functions
-
void nvte_scaled_softmax_forward(const NVTETensor input, NVTETensor softmax_results, float scale_factor, cudaStream_t stream)
Compute scaled softmax activation on the input.
- Parameters:
input – [in] Input tensor for softmax.
softmax_results – [out] Output tensor.
scale_factor – [in] Scaling factor applied to the input tensor before the softmax.
stream – [in] CUDA stream used for the operation.
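As an illustration of the operation described above, the following is a minimal CPU sketch of a scaled softmax, assuming the softmax is taken along the last dimension and that the input is multiplied by scale_factor before exponentiation. The function name scaled_softmax_ref and the flat row-major layout are hypothetical and are not part of the library; the real function operates on NVTETensor objects on a CUDA stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a library function): computes
// softmax(scale_factor * x) along each row of a row-major [rows x cols] matrix.
std::vector<float> scaled_softmax_ref(const std::vector<float>& input,
                                      std::size_t rows, std::size_t cols,
                                      float scale_factor) {
  assert(cols > 0 && input.size() == rows * cols);
  std::vector<float> out(input.size());
  for (std::size_t r = 0; r < rows; ++r) {
    const float* x = input.data() + r * cols;
    float* y = out.data() + r * cols;
    // Row maximum of the scaled values, subtracted below for numerical stability.
    float max_val = scale_factor * x[0];
    for (std::size_t c = 1; c < cols; ++c) {
      max_val = std::max(max_val, scale_factor * x[c]);
    }
    // Exponentiate the shifted, scaled values and accumulate the row sum.
    float sum = 0.0f;
    for (std::size_t c = 0; c < cols; ++c) {
      y[c] = std::exp(scale_factor * x[c] - max_val);
      sum += y[c];
    }
    // Normalize so that each row sums to one.
    for (std::size_t c = 0; c < cols; ++c) {
      y[c] /= sum;
    }
  }
  return out;
}
```

A reference like this can be handy for checking GPU results on small inputs within a tolerance.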
-
void nvte_scaled_softmax_backward(const NVTETensor incoming_grads, const NVTETensor softmax_results, NVTETensor output_grads, float scale_factor, cudaStream_t stream)
Compute the backward pass of the scaled softmax activation.
incoming_grads is the input tensor containing the gradients received from the following layer.
softmax_results is the output tensor of the corresponding forward softmax operation.
output_grads is the output tensor containing the computed gradients.
- Parameters:
incoming_grads – [in] Input gradient tensor for backward.
softmax_results – [in] Output tensor of softmax forward.
output_grads – [out] Output tensor.
scale_factor – [in] Scaling factor applied to the output gradient tensor.
stream – [in] CUDA stream used for the operation.
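The relationship between the three tensors can be sketched on the CPU as follows. This assumes the standard softmax Jacobian-vector product, multiplied by scale_factor through the chain rule; the name scaled_softmax_backward_ref and the flat row-major layout are hypothetical, and the library's kernels may organize the computation differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a library function). For each row, with
// y = softmax_results and dy = incoming_grads, it computes
//   dx[c] = scale_factor * y[c] * (dy[c] - sum_k dy[k] * y[k]).
std::vector<float> scaled_softmax_backward_ref(
    const std::vector<float>& incoming_grads,
    const std::vector<float>& softmax_results,
    std::size_t rows, std::size_t cols, float scale_factor) {
  assert(incoming_grads.size() == rows * cols);
  assert(softmax_results.size() == rows * cols);
  std::vector<float> output_grads(rows * cols);
  for (std::size_t r = 0; r < rows; ++r) {
    const float* dy = incoming_grads.data() + r * cols;
    const float* y = softmax_results.data() + r * cols;
    float* dx = output_grads.data() + r * cols;
    // Inner product of the incoming gradient with the softmax output.
    float dot = 0.0f;
    for (std::size_t c = 0; c < cols; ++c) {
      dot += dy[c] * y[c];
    }
    // Softmax Jacobian-vector product, scaled by scale_factor.
    for (std::size_t c = 0; c < cols; ++c) {
      dx[c] = scale_factor * y[c] * (dy[c] - dot);
    }
  }
  return output_grads;
}
```

Note that only the forward output is needed here, not the original input, which matches the parameter list above.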
-
void nvte_scaled_masked_softmax_forward(const NVTETensor input, const NVTETensor mask, NVTETensor softmax_results, float scale_factor, cudaStream_t stream)
Compute scaled masked softmax activation on the input.
- Parameters:
input – [in] Input tensor for softmax.
mask – [in] Mask for the input tensor.
softmax_results – [out] Output tensor.
scale_factor – [in] Scaling factor applied to the input tensor before the softmax.
stream – [in] CUDA stream used for the operation.
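A minimal CPU sketch of a masked variant is shown below. It rests on several assumptions that this page does not state: the mask has the same shape as the input (no broadcasting), a nonzero mask entry means "exclude this position", excluded positions receive probability zero, and a fully masked row yields all zeros. The name scaled_masked_softmax_ref is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical CPU reference (not a library function). Assumes a nonzero
// mask entry excludes the corresponding position from the softmax.
std::vector<float> scaled_masked_softmax_ref(
    const std::vector<float>& input, const std::vector<std::uint8_t>& mask,
    std::size_t rows, std::size_t cols, float scale_factor) {
  assert(input.size() == rows * cols && mask.size() == rows * cols);
  std::vector<float> out(rows * cols, 0.0f);
  for (std::size_t r = 0; r < rows; ++r) {
    const float* x = input.data() + r * cols;
    const std::uint8_t* m = mask.data() + r * cols;
    float* y = out.data() + r * cols;
    // Row maximum over the unmasked, scaled values.
    float max_val = -std::numeric_limits<float>::infinity();
    bool any_kept = false;
    for (std::size_t c = 0; c < cols; ++c) {
      if (m[c] == 0) {
        max_val = std::max(max_val, scale_factor * x[c]);
        any_kept = true;
      }
    }
    // Assumption: a fully masked row is left as all zeros.
    if (!any_kept) continue;
    float sum = 0.0f;
    for (std::size_t c = 0; c < cols; ++c) {
      if (m[c] == 0) {
        y[c] = std::exp(scale_factor * x[c] - max_val);
        sum += y[c];
      }
    }
    // Masked entries are still zero, so dividing leaves them at zero.
    for (std::size_t c = 0; c < cols; ++c) {
      y[c] /= sum;
    }
  }
  return out;
}
```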
-
void nvte_scaled_masked_softmax_backward(const NVTETensor incoming_grads, const NVTETensor softmax_results, NVTETensor output_grads, float scale_factor, cudaStream_t stream)
Compute the backward pass of the scaled masked softmax activation.
incoming_grads is the input tensor containing the gradients received from the following layer.
softmax_results is the output tensor of the corresponding forward softmax operation.
output_grads is the output tensor containing the computed gradients.
- Parameters:
incoming_grads – [in] Input gradient tensor for backward.
softmax_results – [in] Output tensor of softmax forward.
output_grads – [out] Output tensor.
scale_factor – [in] Scaling factor applied to the output gradient tensor.
stream – [in] CUDA stream used for the operation.
-
void nvte_scaled_upper_triang_masked_softmax_forward(const NVTETensor input, NVTETensor softmax_results, float scale_factor, cudaStream_t stream)
Compute scaled softmax activation using a 2D upper triangular mask on the input.
- Parameters:
input – [in] Input tensor for softmax.
softmax_results – [out] Output tensor.
scale_factor – [in] Scaling factor applied to the input tensor before the softmax.
stream – [in] CUDA stream used for the operation.
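The effect of an upper triangular mask can be sketched on the CPU as below, assuming the usual causal convention: in a square [seq_len x seq_len] matrix, row i only sees columns 0 through i, and entries above the diagonal receive probability zero. The name scaled_upper_triang_softmax_ref and the single-matrix layout are hypothetical; the library function works on batched NVTETensor inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a library function). Assumes entries
// strictly above the main diagonal are masked out.
std::vector<float> scaled_upper_triang_softmax_ref(
    const std::vector<float>& input, std::size_t seq_len, float scale_factor) {
  assert(input.size() == seq_len * seq_len);
  std::vector<float> out(input.size(), 0.0f);
  for (std::size_t i = 0; i < seq_len; ++i) {
    const float* x = input.data() + i * seq_len;
    float* y = out.data() + i * seq_len;
    // Only columns 0..i take part in the softmax for row i.
    float max_val = scale_factor * x[0];
    for (std::size_t j = 1; j <= i; ++j) {
      max_val = std::max(max_val, scale_factor * x[j]);
    }
    float sum = 0.0f;
    for (std::size_t j = 0; j <= i; ++j) {
      y[j] = std::exp(scale_factor * x[j] - max_val);
      sum += y[j];
    }
    for (std::size_t j = 0; j <= i; ++j) {
      y[j] /= sum;
    }
    // Columns j > i keep their initial value of zero.
  }
  return out;
}
```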
-
void nvte_scaled_upper_triang_masked_softmax_backward(const NVTETensor incoming_grads, const NVTETensor softmax_results, NVTETensor output_grads, float scale_factor, cudaStream_t stream)
Compute the backward pass of the scaled softmax activation using a 2D upper triangular mask.
incoming_grads is the input tensor containing the gradients received from the following layer.
softmax_results is the output tensor of the corresponding forward softmax operation.
output_grads is the output tensor containing the computed gradients.
- Parameters:
incoming_grads – [in] Input gradient tensor for backward.
softmax_results – [in] Output tensor of softmax forward.
output_grads – [out] Output tensor.
scale_factor – [in] Scaling factor applied to the output gradient tensor.
stream – [in] CUDA stream used for the operation.
-
void nvte_scaled_aligned_causal_masked_softmax_forward(const NVTETensor input, NVTETensor softmax_results, float scale_factor, cudaStream_t stream)
Compute scaled softmax activation using an implicit 2D mask aligned to the bottom right corner of the input matrix.
- Parameters:
input – [in] Input tensor for softmax.
softmax_results – [out] Output tensor.
scale_factor – [in] Scaling factor applied to the input tensor before the softmax.
stream – [in] CUDA stream used for the operation.
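One plausible reading of "aligned to the bottom right corner" is sketched below for a single [rows x cols] matrix with cols >= rows: row i may see columns 0 through i + (cols - rows), so the last row sees every column. This interpretation, the cols >= rows restriction, and the name scaled_aligned_causal_softmax_ref are assumptions made for illustration and are not confirmed by this page.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a library function). Assumes a causal mask
// whose diagonal ends in the bottom right corner of a [rows x cols] matrix.
std::vector<float> scaled_aligned_causal_softmax_ref(
    const std::vector<float>& input, std::size_t rows, std::size_t cols,
    float scale_factor) {
  assert(cols >= rows && input.size() == rows * cols);
  std::vector<float> out(input.size(), 0.0f);
  const std::size_t offset = cols - rows;  // shift of the diagonal
  for (std::size_t i = 0; i < rows; ++i) {
    const float* x = input.data() + i * cols;
    float* y = out.data() + i * cols;
    const std::size_t last = i + offset;  // last visible column for row i
    float max_val = scale_factor * x[0];
    for (std::size_t j = 1; j <= last; ++j) {
      max_val = std::max(max_val, scale_factor * x[j]);
    }
    float sum = 0.0f;
    for (std::size_t j = 0; j <= last; ++j) {
      y[j] = std::exp(scale_factor * x[j] - max_val);
      sum += y[j];
    }
    for (std::size_t j = 0; j <= last; ++j) {
      y[j] /= sum;
    }
    // Columns j > last keep their initial value of zero.
  }
  return out;
}
```

With rows equal to cols the offset is zero and the sketch reduces to a plain upper triangular mask.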
-
void nvte_scaled_aligned_causal_masked_softmax_backward(const NVTETensor incoming_grads, const NVTETensor softmax_results, NVTETensor output_grads, float scale_factor, cudaStream_t stream)
Compute the backward pass of the scaled softmax activation using an implicit 2D mask aligned to the bottom right corner of the input matrix.
incoming_grads is the input tensor containing the gradients received from the following layer.
softmax_results is the output tensor of the corresponding forward softmax operation.
output_grads is the output tensor containing the computed gradients.
- Parameters:
incoming_grads – [in] Input gradient tensor for backward.
softmax_results – [in] Output tensor of softmax forward.
output_grads – [out] Output tensor.
scale_factor – [in] Scaling factor applied to the output gradient tensor.
stream – [in] CUDA stream used for the operation.