How to Use cuBLASMp

This section explains how to use cuBLASMp in your application.

  • Communication Abstraction Library Usage
    • Communication Abstraction Library
    • Creating Communicator Handle with MPI
    • Communication Abstraction Library Data Types
    • Communication Abstraction Library API
  • Using cuBLASMp for Tensor Parallelism in Distributed Machine Learning
    • AllGather+GEMM and GEMM+ReduceScatter in terms of traditional PBLAS
    • On Python and cuBLASMp data ordering
    • AllGather+GEMM
    • GEMM+ReduceScatter
    • GEMM+AllReduce
    • General assumptions and limitations
  • cuBLASMp Logging
    • CUBLASMP_LOG_LEVEL
    • CUBLASMP_LOG_MASK
    • CUBLASMP_LOG_FILE
  • cuBLASMp Data Types
    • Data types
    • Enumerators
  • cuBLASMp C API
    • Library Management
    • Grid Management
    • Matrix Management
    • Matmul Properties
    • Utility
    • Logging
    • Dense Linear Algebra APIs
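
The subsections above follow the library's workflow: create the library handle and process grid, describe the distributed matrices, configure the matmul, and call the dense linear algebra APIs. Logging, in turn, is controlled through the three environment variables listed under cuBLASMp Logging. Below is a minimal sketch of enabling logging from inside a C program; the variable names come from that section, while the values shown are placeholder assumptions, so consult the cuBLASMp Logging section for the levels and masks your version actually defines.

    #include <stdlib.h>

    int main(void)
    {
        /* cuBLASMp reads its logging configuration from the environment,
           so set these variables before the library is initialized. The
           variable names are documented in the "cuBLASMp Logging" section;
           the values used here are placeholders. */
        setenv("CUBLASMP_LOG_LEVEL", "5", 1);           /* placeholder verbosity level */
        setenv("CUBLASMP_LOG_FILE", "cublasmp.log", 1); /* write log output to a file */

        /* ... initialize cuBLASMp and run the application ... */

        return 0;
    }

The same effect can be achieved by exporting the variables in the shell before launching the application (for example, on an mpirun command line), which avoids recompiling.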
