activation.h

Activation functions.

Functions

void nvte_gelu(const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute GELU activation of the input.

Parameters
  • input[in] Input tensor for GELU activation.

  • output[inout] Output tensor.

  • stream[in] CUDA stream used for the operation.
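
As a rough illustration of the elementwise math, the sketch below is a hypothetical CPU reference, not part of activation.h. The header does not say which GELU formulation is used; this assumes the exact erf-based definition x * Phi(x), and a tanh-based approximation would give slightly different values. The name ref_gelu is made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// Assumption: exact GELU, x * Phi(x), with Phi the standard normal CDF.
inline float ref_gelu_scalar(float x) {
    return 0.5f * x * (1.0f + std::erf(x * 0.70710678f));  // 0.7071... = 1/sqrt(2)
}

// Applies GELU elementwise. The caller allocates output, mirroring the
// [inout] output parameter of nvte_gelu.
inline void ref_gelu(const std::vector<float>& input, std::vector<float>& output) {
    assert(output.size() == input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = ref_gelu_scalar(input[i]);
    }
}
```

A reference like this can be handy for comparing device results against a host computation within a tolerance.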

void nvte_dgelu(const NVTETensor grad, const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute GELU activation gradient.

Parameters
  • grad[in] Incoming gradient.

  • input[in] Input tensor for GELU activation.

  • output[inout] Output tensor holding the gradient with respect to input.

  • stream[in] CUDA stream used for the operation.
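
The backward pass multiplies the incoming gradient by the derivative of GELU at the forward input. The sketch below is a hypothetical CPU reference under the same assumption of exact erf-based GELU, whose derivative is Phi(x) + x * phi(x). The name ref_dgelu is illustrative only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// Assumption: exact GELU, so d/dx [x * Phi(x)] = Phi(x) + x * phi(x).
inline float ref_dgelu_scalar(float x) {
    const float cdf = 0.5f * (1.0f + std::erf(x * 0.70710678f));  // Phi(x)
    const float pdf = 0.39894228f * std::exp(-0.5f * x * x);      // phi(x), 1/sqrt(2*pi) prefactor
    return cdf + x * pdf;
}

// output[i] = grad[i] * GELU'(input[i]); all three buffers have equal size.
inline void ref_dgelu(const std::vector<float>& grad, const std::vector<float>& input,
                      std::vector<float>& output) {
    assert(grad.size() == input.size());
    assert(output.size() == input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = grad[i] * ref_dgelu_scalar(input[i]);
    }
}
```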

void nvte_geglu(const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute GeGLU of the input.

Parameters
  • input[in] Input tensor of shape [N, H * 2].

  • output[inout] Output tensor of shape [N, H], computed as GELU(input[N, :H]) x input[N, H:], where x is the elementwise product.

  • stream[in] CUDA stream used for the operation.
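
The split of each input row into an activated half and a gating half can be sketched on the CPU as follows. This is a hypothetical reference, not the library kernel. It assumes a dense row-major layout and the exact erf-based GELU; the names ref_geglu, N, and H as function arguments are chosen for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// Assumptions: exact (erf-based) GELU and dense row-major storage.
inline float ref_gelu_scalar(float x) {
    return 0.5f * x * (1.0f + std::erf(x * 0.70710678f));
}

// input holds N rows of 2*H floats; output holds N rows of H floats.
inline void ref_geglu(const std::vector<float>& input, std::vector<float>& output,
                      std::size_t N, std::size_t H) {
    assert(input.size() == N * 2 * H);
    assert(output.size() == N * H);
    for (std::size_t n = 0; n < N; ++n) {
        const float* row = input.data() + n * 2 * H;
        for (std::size_t h = 0; h < H; ++h) {
            // The first half of the row goes through GELU; the second half is the gate.
            output[n * H + h] = ref_gelu_scalar(row[h]) * row[H + h];
        }
    }
}
```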

void nvte_dgeglu(const NVTETensor grad, const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute GeGLU gradient.

Parameters
  • grad[in] Incoming gradient of shape [N, H].

  • input[in] Forward input tensor of shape [N, H * 2].

  • output[inout] Outgoing gradient of shape [N, H * 2].

  • stream[in] CUDA stream used for the operation.
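
Writing a = input[N, :H] and b = input[N, H:], the forward result is GELU(a) x b, so the outgoing gradient has two halves: grad x GELU'(a) x b for the first H columns and grad x GELU(a) for the last H columns. The sketch below is a hypothetical CPU reference under the assumptions of exact erf-based GELU and row-major storage; ref_dgeglu is an illustrative name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
inline float ref_gelu_scalar(float x) {
    return 0.5f * x * (1.0f + std::erf(x * 0.70710678f));
}

// Derivative of exact GELU: Phi(x) + x * phi(x).
inline float ref_dgelu_scalar(float x) {
    const float cdf = 0.5f * (1.0f + std::erf(x * 0.70710678f));
    const float pdf = 0.39894228f * std::exp(-0.5f * x * x);
    return cdf + x * pdf;
}

// grad: [N, H]; input and output: [N, 2*H], all row-major.
inline void ref_dgeglu(const std::vector<float>& grad, const std::vector<float>& input,
                       std::vector<float>& output, std::size_t N, std::size_t H) {
    assert(grad.size() == N * H);
    assert(input.size() == N * 2 * H);
    assert(output.size() == N * 2 * H);
    for (std::size_t n = 0; n < N; ++n) {
        for (std::size_t h = 0; h < H; ++h) {
            const float a = input[n * 2 * H + h];      // activated half
            const float b = input[n * 2 * H + H + h];  // gate half
            const float g = grad[n * H + h];
            output[n * 2 * H + h] = g * ref_dgelu_scalar(a) * b;  // gradient w.r.t. a
            output[n * 2 * H + H + h] = g * ref_gelu_scalar(a);   // gradient w.r.t. b
        }
    }
}
```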

void nvte_relu(const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute ReLU activation of the input.

Parameters
  • input[in] Input tensor for ReLU activation.

  • output[inout] Output tensor.

  • stream[in] CUDA stream used for the operation.

void nvte_drelu(const NVTETensor grad, const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute ReLU activation gradient.

Parameters
  • grad[in] Incoming gradient.

  • input[in] Input tensor for ReLU activation.

  • output[inout] Output tensor holding the gradient with respect to input.

  • stream[in] CUDA stream used for the operation.
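
The ReLU gradient passes the incoming gradient through wherever the forward input was positive and zeroes it elsewhere. The sketch below is a hypothetical CPU reference, not the library kernel. It assumes the gradient at exactly zero is taken as 0, which the header does not specify; ref_drelu is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// Assumption: the (sub)gradient at input == 0 is taken to be 0.
inline void ref_drelu(const std::vector<float>& grad, const std::vector<float>& input,
                      std::vector<float>& output) {
    assert(grad.size() == input.size());
    assert(output.size() == input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = input[i] > 0.0f ? grad[i] : 0.0f;
    }
}
```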

void nvte_swiglu(const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute SwiGLU activation of the input.

Parameters
  • input[in] Input tensor of shape [N, H * 2].

  • output[inout] Output tensor of shape [N, H], computed as Swish(input[N, :H]) x input[N, H:], where x is the elementwise product.

  • stream[in] CUDA stream used for the operation.
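
A minimal CPU sketch of the formula above follows. It is a hypothetical reference, not the library kernel, and it assumes Swish means x * sigmoid(x) (the beta = 1 form, also called SiLU) with row-major storage. The name ref_swiglu is made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// Assumption: Swish(x) = x * sigmoid(x), i.e. beta = 1.
inline float ref_swish_scalar(float x) {
    return x / (1.0f + std::exp(-x));
}

// input holds N rows of 2*H floats; output holds N rows of H floats.
inline void ref_swiglu(const std::vector<float>& input, std::vector<float>& output,
                       std::size_t N, std::size_t H) {
    assert(input.size() == N * 2 * H);
    assert(output.size() == N * H);
    for (std::size_t n = 0; n < N; ++n) {
        for (std::size_t h = 0; h < H; ++h) {
            const float a = input[n * 2 * H + h];      // activated half
            const float b = input[n * 2 * H + H + h];  // gate half
            output[n * H + h] = ref_swish_scalar(a) * b;
        }
    }
}
```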

void nvte_dswiglu(const NVTETensor grad, const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute SwiGLU gradient.

Parameters
  • grad[in] Incoming gradient of shape [N, H].

  • input[in] Forward input tensor of shape [N, H * 2].

  • output[inout] Outgoing gradient of shape [N, H * 2].

  • stream[in] CUDA stream used for the operation.
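
With s = sigmoid(a), the derivative of Swish is s + a * s * (1 - s), so the two halves of the outgoing gradient are grad x Swish'(a) x b and grad x Swish(a). The sketch below is a hypothetical CPU reference assuming the beta = 1 form of Swish and row-major storage; ref_dswiglu is an illustrative name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// Assumption: Swish(x) = x * sigmoid(x).
inline float ref_sigmoid(float x) {
    return 1.0f / (1.0f + std::exp(-x));
}

// grad: [N, H]; input and output: [N, 2*H], all row-major.
inline void ref_dswiglu(const std::vector<float>& grad, const std::vector<float>& input,
                        std::vector<float>& output, std::size_t N, std::size_t H) {
    assert(grad.size() == N * H);
    assert(input.size() == N * 2 * H);
    assert(output.size() == N * 2 * H);
    for (std::size_t n = 0; n < N; ++n) {
        for (std::size_t h = 0; h < H; ++h) {
            const float a = input[n * 2 * H + h];
            const float b = input[n * 2 * H + H + h];
            const float g = grad[n * H + h];
            const float s = ref_sigmoid(a);
            const float dswish = s + a * s * (1.0f - s);  // d/da [a * sigmoid(a)]
            output[n * 2 * H + h] = g * dswish * b;       // gradient w.r.t. a
            output[n * 2 * H + H + h] = g * a * s;        // gradient w.r.t. b
        }
    }
}
```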

void nvte_reglu(const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute ReGLU activation of the input.

Parameters
  • input[in] Input tensor of shape [N, H * 2].

  • output[inout] Output tensor of shape [N, H], computed as ReLU(input[N, :H]) x input[N, H:], where x is the elementwise product.

  • stream[in] CUDA stream used for the operation.
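
The same gating pattern with ReLU can be sketched on the CPU as below. This is a hypothetical reference assuming row-major storage, not the library kernel; ref_reglu is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// input holds N rows of 2*H floats; output holds N rows of H floats.
inline void ref_reglu(const std::vector<float>& input, std::vector<float>& output,
                      std::size_t N, std::size_t H) {
    assert(input.size() == N * 2 * H);
    assert(output.size() == N * H);
    for (std::size_t n = 0; n < N; ++n) {
        for (std::size_t h = 0; h < H; ++h) {
            const float a = input[n * 2 * H + h];      // activated half
            const float b = input[n * 2 * H + H + h];  // gate half
            output[n * H + h] = (a > 0.0f ? a : 0.0f) * b;
        }
    }
}
```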

void nvte_dreglu(const NVTETensor grad, const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute ReGLU gradient.

Parameters
  • grad[in] Incoming gradient of shape [N, H].

  • input[in] Forward input tensor of shape [N, H * 2].

  • output[inout] Outgoing gradient of shape [N, H * 2].

  • stream[in] CUDA stream used for the operation.
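
For ReGLU, the gradient with respect to the activated half is grad x b where that half was positive and zero elsewhere, and the gradient with respect to the gate half is grad x ReLU(a). The sketch below is a hypothetical CPU reference assuming row-major storage and a zero gradient at a == 0; ref_dreglu is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// Assumption: the ReLU (sub)gradient at exactly 0 is taken to be 0.
// grad: [N, H]; input and output: [N, 2*H], all row-major.
inline void ref_dreglu(const std::vector<float>& grad, const std::vector<float>& input,
                       std::vector<float>& output, std::size_t N, std::size_t H) {
    assert(grad.size() == N * H);
    assert(input.size() == N * 2 * H);
    assert(output.size() == N * 2 * H);
    for (std::size_t n = 0; n < N; ++n) {
        for (std::size_t h = 0; h < H; ++h) {
            const float a = input[n * 2 * H + h];
            const float b = input[n * 2 * H + H + h];
            const float g = grad[n * H + h];
            output[n * 2 * H + h] = a > 0.0f ? g * b : 0.0f;      // gradient w.r.t. a
            output[n * 2 * H + H + h] = a > 0.0f ? g * a : 0.0f;  // gradient w.r.t. b
        }
    }
}
```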

void nvte_qgelu(const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute QuickGELU activation of the input.

Parameters
  • input[in] Input tensor for QuickGELU activation.

  • output[inout] Output tensor, computed elementwise as input x sigmoid(1.702 x input), which approximates GELU.

  • stream[in] CUDA stream used for the operation.
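
The QuickGELU formula translates directly into a few lines of host code. The sketch below is a hypothetical CPU reference, not the library kernel; ref_qgelu is an illustrative name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// QuickGELU as documented: x * sigmoid(1.702 * x).
inline float ref_qgelu_scalar(float x) {
    return x / (1.0f + std::exp(-1.702f * x));
}

// Applies QuickGELU elementwise; the caller allocates output.
inline void ref_qgelu(const std::vector<float>& input, std::vector<float>& output) {
    assert(output.size() == input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = ref_qgelu_scalar(input[i]);
    }
}
```

At input 1 this gives roughly 0.846, compared with roughly 0.841 for the exact erf-based GELU.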

void nvte_dqgelu(const NVTETensor grad, const NVTETensor input, NVTETensor output, cudaStream_t stream)

Compute QuickGELU activation gradient.

Parameters
  • grad[in] Incoming gradient.

  • input[in] Input tensor for QuickGELU activation.

  • output[inout] Output tensor holding the gradient with respect to input.

  • stream[in] CUDA stream used for the operation.
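
Differentiating x * sigmoid(1.702 * x) gives s + 1.702 * x * s * (1 - s), with s = sigmoid(1.702 * x); the backward pass multiplies the incoming gradient by this factor. The sketch below is a hypothetical CPU reference, not the library kernel; ref_dqgelu is an illustrative name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not the library kernel.
// Derivative of x * sigmoid(k * x) with k = 1.702.
inline float ref_dqgelu_scalar(float x) {
    const float k = 1.702f;
    const float s = 1.0f / (1.0f + std::exp(-k * x));
    return s + k * x * s * (1.0f - s);
}

// output[i] = grad[i] * QuickGELU'(input[i]); all three buffers have equal size.
inline void ref_dqgelu(const std::vector<float>& grad, const std::vector<float>& input,
                       std::vector<float>& output) {
    assert(grad.size() == input.size());
    assert(output.size() == input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        output[i] = grad[i] * ref_dqgelu_scalar(input[i]);
    }
}
```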