Changes and New Features

| Component | Category | Description |
|---|---|---|
| sharp_am/sharpd/libsharp_coll | Hardware | Added support for the following Mellanox Quantum switch capabilities: (1) performing data operations on new data types (unsigned short, short, and short floating point); (2) 1K OST payload. |
| sharp_am/sharpd | Resource Management | Added support for enabling and disabling reproducibility on the job level. |
| sharp_am/sharpd | Subnet Management | Added support for controlling the SA key for SA operations. |
| libsharp_coll | GPUDirect | Added support for CUDA GPUDirect and GPUDirect RDMA. |

Parameter Changes

| Parameter | Component | Description |
|---|---|---|
| control_path_version | sharp_am | New parameter: Defines the control path protocol version. When set to 0, the least common supported protocol is used. Devices that do not support the selected protocol are ignored. Default: 0 |
| max_compute_ports_per_agg_node | sharp_am | Modified behavior: When set to 0, the AN radix is set to the maximal radix value. Default: 0 |
| default_reproducibility | sharp_am | New parameter: Controls the default reproducibility mode for jobs. Default: TRUE |
| ib_sa_key | sharp_am | New parameter: Controls the SA key for SA operations. Default: 0x1 |
| coll_job_quota_max_payload_per_ost | sharp_job_quota | Modified behavior: The default value changed to 1024. |
| SHARP_COLL_MAX_PAYLOAD_SIZE | libsharp_coll | Removed |
| SHARP_COLL_NUM_SHARP_COLL_REQ | libsharp_coll | Removed |
| SHARP_COLL_ENABLE_REPRODUCIBLE_MODE | libsharp_coll | New parameter: Controls the job reproducibility mode: 0 = use default, 1 = no reproducibility, 2 = reproducibility. |
| SHARP_COLL_ENABLE_CUDA | libsharp_coll | New parameter: Enables CUDA GPUDirect. |
| SHARP_COLL_ENABLE_GPU_DIRECT_RDMA | libsharp_coll | New parameter: Enables GPUDirect RDMA. |
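
As an illustration of the libsharp_coll changes listed above, the following Python sketch builds an environment for a job launcher and flags parameters that this release removes. It is a minimal sketch under stated assumptions, not part of the SHARP distribution: the helper names `build_sharp_env` and `find_removed_params` are hypothetical, and the value "1" for enabling SHARP_COLL_ENABLE_CUDA and SHARP_COLL_ENABLE_GPU_DIRECT_RDMA is an assumption, since the accepted values are not listed here. Only the reproducibility mode values (0, 1, 2) come from the parameter list.

```python
# Reproducibility mode values as listed for SHARP_COLL_ENABLE_REPRODUCIBLE_MODE.
REPRODUCIBLE_MODES = {"default": 0, "off": 1, "on": 2}

# libsharp_coll parameters listed as removed in this release.
REMOVED_PARAMS = ("SHARP_COLL_MAX_PAYLOAD_SIZE", "SHARP_COLL_NUM_SHARP_COLL_REQ")


def build_sharp_env(base_env, reproducibility="default", cuda=False,
                    gpu_direct_rdma=False):
    """Return a copy of base_env with the new libsharp_coll parameters set.

    Hypothetical helper: it only prepares a mapping and does not call SHARP.
    """
    if reproducibility not in REPRODUCIBLE_MODES:
        raise ValueError(f"unknown reproducibility mode: {reproducibility!r}")
    env = dict(base_env)
    env["SHARP_COLL_ENABLE_REPRODUCIBLE_MODE"] = str(
        REPRODUCIBLE_MODES[reproducibility])
    # Assumption: "1" enables the feature. The variables are left unset
    # otherwise, so the library's own defaults apply.
    if cuda:
        env["SHARP_COLL_ENABLE_CUDA"] = "1"
    if gpu_direct_rdma:
        env["SHARP_COLL_ENABLE_GPU_DIRECT_RDMA"] = "1"
    return env


def find_removed_params(env):
    """Return the removed parameter names that are still present in env."""
    return [name for name in REMOVED_PARAMS if name in env]
```

A launcher script could call `find_removed_params(os.environ)` to warn about stale settings, then pass the result of `build_sharp_env(os.environ, reproducibility="on", cuda=True)` as the `env` argument of `subprocess.run`.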

Public API Changes

N/A