Using the cuBLASDx API

The cuBLASDx library (preview) is a device-side API extension for performing BLAS calculations inside CUDA kernels. By fusing numerical operations into a single kernel, you can decrease latency and further improve the performance of your applications.

  • For API details, see the cuBLASDx documentation.

  • cuBLASDx is not part of the CUDA Toolkit; it is downloaded separately.
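
To make the idea of fusing numerical operations inside a kernel concrete, here is a minimal sketch in plain CUDA. It deliberately does not use the cuBLASDx API: the kernel name `fused_gemm_relu`, the helper `run_fused_check`, the tile size `N`, and the choice of ReLU as the fused step are all assumptions made for illustration. A single thread block multiplies one small tile and applies the follow-up operation before the result is written back, so no second kernel launch or extra pass over global memory is needed.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int N = 16;  // hypothetical tile size; one thread per output element

// Illustrative kernel (not a cuBLASDx call): computes C = relu(A * B)
// for a single N x N row-major tile within one thread block.
__global__ void fused_gemm_relu(const float* a, const float* b, float* c) {
    __shared__ float sa[N][N];
    __shared__ float sb[N][N];
    const int row = threadIdx.y;
    const int col = threadIdx.x;

    // Stage both input tiles in shared memory.
    sa[row][col] = a[row * N + col];
    sb[row][col] = b[row * N + col];
    __syncthreads();

    // Matrix multiply for this thread's output element.
    float acc = 0.0f;
    for (int k = 0; k < N; ++k) {
        acc += sa[row][k] * sb[k][col];
    }

    // Fused step: apply ReLU before the single write to global memory.
    c[row * N + col] = fmaxf(acc, 0.0f);
}

// Runs the kernel and returns the maximum absolute difference
// from a CPU reference (or -1 on a CUDA error).
float run_fused_check() {
    const size_t bytes = N * N * sizeof(float);
    std::vector<float> ha(N * N), hb(N * N), hc(N * N);
    for (int i = 0; i < N * N; ++i) {
        ha[i] = static_cast<float>(i % 7) - 3.0f;
        hb[i] = static_cast<float>(i % 5) - 2.0f;
    }

    float *da = nullptr, *db = nullptr, *dc = nullptr;
    if (cudaMalloc(&da, bytes) != cudaSuccess) return -1.0f;
    if (cudaMalloc(&db, bytes) != cudaSuccess) { cudaFree(da); return -1.0f; }
    if (cudaMalloc(&dc, bytes) != cudaSuccess) { cudaFree(da); cudaFree(db); return -1.0f; }
    cudaMemcpy(da, ha.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb.data(), bytes, cudaMemcpyHostToDevice);

    fused_gemm_relu<<<1, dim3(N, N)>>>(da, db, dc);
    // The blocking copy below also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(hc.data(), dc, bytes, cudaMemcpyDeviceToHost);
    cudaFree(da);
    cudaFree(db);
    cudaFree(dc);
    if (err != cudaSuccess) return -1.0f;

    float max_err = 0.0f;
    for (int r = 0; r < N; ++r) {
        for (int col = 0; col < N; ++col) {
            float ref = 0.0f;
            for (int k = 0; k < N; ++k) ref += ha[r * N + k] * hb[k * N + col];
            ref = std::fmax(ref, 0.0f);
            max_err = std::fmax(max_err, std::fabs(ref - hc[r * N + col]));
        }
    }
    return max_err;
}

int main() {
    const float max_err = run_fused_check();
    std::printf("max abs error vs CPU reference: %g\n", max_err);
    return (max_err >= 0.0f && max_err < 1e-4f) ? 0 : 1;
}
```

The sketch should build with `nvcc` alone. With cuBLASDx, the hand-written multiply loop would be replaced by a library-provided device function; refer to the cuBLASDx documentation for the actual types and calls, which are not reproduced here.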