softmax.h

Functions

void nvte_scaled_softmax_forward(const NVTETensor input, NVTETensor softmax_results, float scale_factor, cudaStream_t stream)

Compute scaled softmax activation on the input.

Parameters:
  • input[in] Input tensor for softmax.

  • softmax_results[out] Output tensor containing the softmax results.

  • scale_factor[in] Scalar by which the input tensor is multiplied before the softmax is applied.

  • stream[in] CUDA stream used for the operation.
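
As a rough illustration of the math only, the following CPU sketch computes softmax(scale_factor * x) row by row. It assumes the softmax is taken over the last dimension of a row-major matrix; the name ref_scaled_softmax and the std::vector interface are hypothetical and are not part of softmax.h, which operates on NVTETensor objects on a CUDA stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a library function): row-wise
// softmax(scale_factor * x) over a row-major rows x cols matrix.
std::vector<float> ref_scaled_softmax(const std::vector<float>& input,
                                      std::size_t rows, std::size_t cols,
                                      float scale_factor) {
  assert(cols > 0 && input.size() == rows * cols);
  std::vector<float> out(input.size());
  for (std::size_t r = 0; r < rows; ++r) {
    const std::size_t base = r * cols;
    // Row maximum of the scaled values, subtracted for numerical stability.
    float row_max = scale_factor * input[base];
    for (std::size_t c = 1; c < cols; ++c) {
      row_max = std::max(row_max, scale_factor * input[base + c]);
    }
    float sum = 0.0f;
    for (std::size_t c = 0; c < cols; ++c) {
      out[base + c] = std::exp(scale_factor * input[base + c] - row_max);
      sum += out[base + c];
    }
    // Normalize so that each row sums to one.
    for (std::size_t c = 0; c < cols; ++c) {
      out[base + c] /= sum;
    }
  }
  return out;
}
```

A reference like this can be useful for checking GPU results on small inputs, within a tolerance suited to the tensor data type.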

void nvte_scaled_softmax_backward(const NVTETensor incoming_grads, const NVTETensor softmax_results, NVTETensor output_grads, float scale_factor, cudaStream_t stream)

Compute the backward pass of the scaled softmax activation.

  • incoming_grads is the input tensor containing the gradients received from the following layer.

  • softmax_results is the output tensor of the corresponding forward softmax operation.

  • output_grads is the output tensor containing the computed gradients.

Parameters:
  • incoming_grads[in] Input gradient tensor for backward.

  • softmax_results[in] Output tensor of softmax forward.

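  • output_grads[out] Output tensor containing the computed gradients.

  • scale_factor[in] Scalar by which the computed gradients are multiplied before they are written to the output tensor.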

  • stream[in] CUDA stream used for the operation.
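
For orientation, the gradient of y = softmax(scale_factor * x) with respect to x is scale_factor * y_i * (g_i - sum_j g_j * y_j), where g is the incoming gradient. The CPU sketch below applies that formula row by row. It assumes a row-major layout with the softmax over the last dimension; the name ref_scaled_softmax_backward is hypothetical and the code is not the library's kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a library function): gradient of
// softmax(scale_factor * x) with respect to x, computed from the forward
// results y and the incoming gradients g for each row.
std::vector<float> ref_scaled_softmax_backward(
    const std::vector<float>& incoming_grads,
    const std::vector<float>& softmax_results,
    std::size_t rows, std::size_t cols, float scale_factor) {
  assert(incoming_grads.size() == rows * cols);
  assert(softmax_results.size() == rows * cols);
  std::vector<float> output_grads(rows * cols);
  for (std::size_t r = 0; r < rows; ++r) {
    const std::size_t base = r * cols;
    // dot = sum_j g_j * y_j for this row.
    float dot = 0.0f;
    for (std::size_t c = 0; c < cols; ++c) {
      dot += incoming_grads[base + c] * softmax_results[base + c];
    }
    // dx_i = scale_factor * y_i * (g_i - dot).
    for (std::size_t c = 0; c < cols; ++c) {
      output_grads[base + c] = scale_factor * softmax_results[base + c] *
                               (incoming_grads[base + c] - dot);
    }
  }
  return output_grads;
}
```

Because each row of the softmax results sums to one, the gradients in each row sum to zero, which makes a convenient sanity check.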

void nvte_scaled_masked_softmax_forward(const NVTETensor input, const NVTETensor mask, NVTETensor softmax_results, float scale_factor, cudaStream_t stream)

Compute scaled masked softmax activation on the input.

Parameters:
  • input[in] Input tensor for softmax.

  • mask[in] Mask for the input tensor.

  • softmax_results[out] Output tensor containing the softmax results.

  • scale_factor[in] Scalar by which the input tensor is multiplied before the softmax is applied.

  • stream[in] CUDA stream used for the operation.
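
The sketch below illustrates one possible reading of a masked softmax on the CPU. Several details are assumptions rather than documented behavior: a nonzero mask entry is taken to mean "exclude this position", the mask is given the same shape as the input (no broadcasting), excluded positions receive exactly zero probability, and a fully masked row produces all zeros. The name ref_scaled_masked_softmax is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference (not a library function). Assumptions:
// nonzero mask entry = position excluded; mask has the same
// rows x cols shape as the input; fully masked rows yield zeros.
std::vector<float> ref_scaled_masked_softmax(
    const std::vector<float>& input, const std::vector<std::uint8_t>& mask,
    std::size_t rows, std::size_t cols, float scale_factor) {
  assert(input.size() == rows * cols && mask.size() == rows * cols);
  std::vector<float> out(input.size(), 0.0f);
  for (std::size_t r = 0; r < rows; ++r) {
    const std::size_t base = r * cols;
    // Row maximum over the visible (unmasked) scaled values.
    bool any_visible = false;
    float row_max = 0.0f;
    for (std::size_t c = 0; c < cols; ++c) {
      if (mask[base + c] != 0) continue;
      const float v = scale_factor * input[base + c];
      row_max = any_visible ? std::max(row_max, v) : v;
      any_visible = true;
    }
    if (!any_visible) continue;  // Fully masked row stays all zeros.
    float sum = 0.0f;
    for (std::size_t c = 0; c < cols; ++c) {
      if (mask[base + c] != 0) continue;
      out[base + c] = std::exp(scale_factor * input[base + c] - row_max);
      sum += out[base + c];
    }
    // Masked entries are zero and remain zero after normalization.
    for (std::size_t c = 0; c < cols; ++c) {
      out[base + c] /= sum;
    }
  }
  return out;
}
```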

void nvte_scaled_masked_softmax_backward(const NVTETensor incoming_grads, const NVTETensor softmax_results, NVTETensor output_grads, float scale_factor, cudaStream_t stream)

Compute the backward pass of the scaled masked softmax activation.

  • incoming_grads is the input tensor containing the gradients received from the following layer.

  • softmax_results is the output tensor of the corresponding forward softmax operation.

  • output_grads is the output tensor containing the computed gradients.

Parameters:
  • incoming_grads[in] Input gradient tensor for backward.

  • softmax_results[in] Output tensor of softmax forward.

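  • output_grads[out] Output tensor containing the computed gradients.

  • scale_factor[in] Scalar by which the computed gradients are multiplied before they are written to the output tensor.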

  • stream[in] CUDA stream used for the operation.

void nvte_scaled_upper_triang_masked_softmax_forward(const NVTETensor input, NVTETensor softmax_results, float scale_factor, cudaStream_t stream)

Compute scaled softmax activation using a 2D upper triangular mask on the input.

Parameters:
  • input[in] Input tensor for softmax.

  • softmax_results[out] Output tensor containing the softmax results.

  • scale_factor[in] Scalar by which the input tensor is multiplied before the softmax is applied.

  • stream[in] CUDA stream used for the operation.
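
As an illustration of an upper triangular (causal) mask, the CPU sketch below treats every element above the main diagonal of a square seq_len x seq_len matrix as excluded, so row i only attends to columns 0 through i. The name ref_upper_triang_masked_softmax is hypothetical, and assigning exactly zero probability to excluded positions is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a library function): softmax of
// scale_factor * x over a square row-major matrix, where entries with
// column index j > row index i are excluded (upper triangular mask).
std::vector<float> ref_upper_triang_masked_softmax(
    const std::vector<float>& input, std::size_t seq_len,
    float scale_factor) {
  assert(input.size() == seq_len * seq_len);
  std::vector<float> out(input.size(), 0.0f);
  for (std::size_t i = 0; i < seq_len; ++i) {
    const std::size_t base = i * seq_len;
    // Only columns 0..i are visible in row i.
    float row_max = scale_factor * input[base];
    for (std::size_t j = 1; j <= i; ++j) {
      row_max = std::max(row_max, scale_factor * input[base + j]);
    }
    float sum = 0.0f;
    for (std::size_t j = 0; j <= i; ++j) {
      out[base + j] = std::exp(scale_factor * input[base + j] - row_max);
      sum += out[base + j];
    }
    for (std::size_t j = 0; j <= i; ++j) {
      out[base + j] /= sum;
    }
  }
  return out;
}
```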

void nvte_scaled_upper_triang_masked_softmax_backward(const NVTETensor incoming_grads, const NVTETensor softmax_results, NVTETensor output_grads, float scale_factor, cudaStream_t stream)

Compute the backward pass of the scaled softmax activation using a 2D upper triangular mask.

  • incoming_grads is the input tensor containing the gradients received from the following layer.

  • softmax_results is the output tensor of the corresponding forward softmax operation.

  • output_grads is the output tensor containing the computed gradients.

Parameters:
  • incoming_grads[in] Input gradient tensor for backward.

  • softmax_results[in] Output tensor of softmax forward.

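  • output_grads[out] Output tensor containing the computed gradients.

  • scale_factor[in] Scalar by which the computed gradients are multiplied before they are written to the output tensor.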

  • stream[in] CUDA stream used for the operation.

void nvte_scaled_aligned_causal_masked_softmax_forward(const NVTETensor input, NVTETensor softmax_results, float scale_factor, cudaStream_t stream)

Compute scaled softmax activation using an implicit 2D mask aligned to the bottom right corner of the input matrix.

Parameters:
  • input[in] Input tensor for softmax.

  • softmax_results[out] Output tensor containing the softmax results.

  • scale_factor[in] Scalar by which the input tensor is multiplied before the softmax is applied.

  • stream[in] CUDA stream used for the operation.
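
One way to picture a causal mask aligned to the bottom right corner: for a rows x cols matrix, the last row sees every column, and each earlier row sees one column fewer. The CPU sketch below encodes that as "row i sees columns 0 through i + (cols - rows)". It assumes rows <= cols as a simplification, and the name ref_aligned_causal_masked_softmax is hypothetical; neither is taken from the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a library function): causal softmax
// whose mask diagonal ends at the bottom right corner of a row-major
// rows x cols matrix. Simplifying assumption: rows <= cols.
std::vector<float> ref_aligned_causal_masked_softmax(
    const std::vector<float>& input, std::size_t rows, std::size_t cols,
    float scale_factor) {
  assert(rows <= cols && input.size() == rows * cols);
  std::vector<float> out(input.size(), 0.0f);
  const std::size_t offset = cols - rows;
  for (std::size_t i = 0; i < rows; ++i) {
    const std::size_t base = i * cols;
    // Number of visible columns in row i; the last row sees all of them.
    const std::size_t visible = i + offset + 1;
    float row_max = scale_factor * input[base];
    for (std::size_t j = 1; j < visible; ++j) {
      row_max = std::max(row_max, scale_factor * input[base + j]);
    }
    float sum = 0.0f;
    for (std::size_t j = 0; j < visible; ++j) {
      out[base + j] = std::exp(scale_factor * input[base + j] - row_max);
      sum += out[base + j];
    }
    for (std::size_t j = 0; j < visible; ++j) {
      out[base + j] /= sum;
    }
  }
  return out;
}
```

When rows equals cols, the offset is zero and this reduces to the ordinary upper triangular case.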

void nvte_scaled_aligned_causal_masked_softmax_backward(const NVTETensor incoming_grads, const NVTETensor softmax_results, NVTETensor output_grads, float scale_factor, cudaStream_t stream)

Compute the backward pass of the scaled softmax activation using an implicit 2D mask aligned to the bottom right corner of the input matrix.

  • incoming_grads is the input tensor containing the gradients received from the following layer.

  • softmax_results is the output tensor of the corresponding forward softmax operation.

  • output_grads is the output tensor containing the computed gradients.

Parameters:
  • incoming_grads[in] Input gradient tensor for backward.

  • softmax_results[in] Output tensor of softmax forward.

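  • output_grads[out] Output tensor containing the computed gradients.

  • scale_factor[in] Scalar by which the computed gradients are multiplied before they are written to the output tensor.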

  • stream[in] CUDA stream used for the operation.