Changes and New Features

Feature/Change

Description

Syslog Capabilities

Added support for a new syslog capability to libsharp.

Syslog verbosity level can now be controlled using the SHARP_SYSLOG_VERBOSITY environment variable.

Dynamic Trees Allocation Algorithms

Added support for selecting one of two algorithms that determine how trees should be created for each SHARP job. One algorithm is optimized for SuperPOD fabrics, while the other is optimized for Quasi Fat Trees (QFTs).

For further information, see the Dynamic Trees Allocation Algorithms section.

REST API Jobs Query

Added support for retrieving the status of the currently active SHARP jobs, along with the structure of the trees assigned to them.

Note that this information is retrieved via the REST API and requires the use of UFM.

Unhealthy Ports

Added support in OpenSM for informing SHARP of dangling or unhealthy links, so that they are not used in SHARP jobs.

Bug Fixes

See the Bug Fixes section.

Parameter

Component

Description

dynamic_tree_algorithm

sharp_am

New parameter: Sets which algorithm should be used by the dynamic tree mechanism.

This parameter is ignored when dynamic_tree_allocation is false.

Possible values:

0 - SuperPOD oriented algorithm

1 - Quasi Fat Tree oriented algorithm

Default: 0 – SuperPOD oriented algorithm
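The way dynamic_tree_algorithm depends on dynamic_tree_allocation can be sketched in Python as follows. This is only an illustration of the rules listed above: the function name and the returned labels are hypothetical and are not part of sharp_am.

```python
from typing import Optional

# Labels copied from the parameter description above.
TREE_ALGORITHMS = {
    0: "SuperPOD oriented algorithm",
    1: "Quasi Fat Tree oriented algorithm",
}


def effective_tree_algorithm(dynamic_tree_allocation: bool,
                             dynamic_tree_algorithm: int = 0) -> Optional[str]:
    """Return the algorithm label that applies, or None if the setting is ignored.

    Hypothetical helper: it mirrors the documented rules only and does not
    communicate with sharp_am.
    """
    if not dynamic_tree_allocation:
        # The parameter is ignored when dynamic_tree_allocation is false.
        return None
    if dynamic_tree_algorithm not in TREE_ALGORITHMS:
        raise ValueError("dynamic_tree_algorithm must be 0 or 1")
    return TREE_ALGORITHMS[dynamic_tree_algorithm]
```

With the documented default of 0, enabling dynamic tree allocation without setting the parameter selects the SuperPOD oriented algorithm.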

app_resources_default_limit

sharp_am

Sets the default maximum number of trees that a single application can use in parallel.
Modified the range of possible values: -1 means no resource limit, and 0 means no resources by default.
Default: -1 – No resource limit
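The special values of app_resources_default_limit can be read as in the following Python sketch. The function is a hypothetical illustration of the value range described above, and treating values below -1 as invalid is an assumption.

```python
def describe_app_resource_limit(value: int = -1) -> str:
    """Explain an app_resources_default_limit value (illustrative helper only)."""
    if value < -1:
        # Assumption: values below -1 are not meaningful.
        raise ValueError("app_resources_default_limit must be -1 or greater")
    if value == -1:
        return "no resource limit"
    if value == 0:
        return "no resources by default"
    return f"at most {value} trees in parallel per application"
```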

max_quota

sharp_am

Deprecated parameter: This parameter is ignored and should not be used.

default_quota

sharp_am

Deprecated parameter: This parameter is ignored and should not be used.

SHARP_SYSLOG_VERBOSITY

libsharp

New parameter: Sets the libsharp syslog verbosity level. Possible values:

0 – Disable syslog

1 – Errors log level

2 – Warnings log level

3 – Info log level

Default: 1 – Errors log level
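Because SHARP_SYSLOG_VERBOSITY is an environment variable, it has to be set in the environment of the process that loads libsharp. The Python sketch below builds such an environment. The helper name is hypothetical, and the application path in the commented usage line is a placeholder.

```python
import os
from typing import Dict, Mapping, Optional

# Levels copied from the parameter description above.
SYSLOG_LEVELS = {0: "disabled", 1: "errors", 2: "warnings", 3: "info"}


def env_with_syslog_verbosity(level: int = 1,
                              base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with SHARP_SYSLOG_VERBOSITY set."""
    if level not in SYSLOG_LEVELS:
        raise ValueError("SHARP_SYSLOG_VERBOSITY must be 0, 1, 2, or 3")
    env = dict(os.environ if base_env is None else base_env)
    env["SHARP_SYSLOG_VERBOSITY"] = str(level)
    return env


# Example usage (placeholder application path):
# import subprocess
# subprocess.run(["./my_sharp_app"], env=env_with_syslog_verbosity(3))
```

Copying the environment instead of modifying os.environ keeps the setting local to the launched process.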

SHARP_GROUP_JOIN_MAD_TIMEOUT

libsharp

Sets the timeout, in milliseconds, before a GroupJoin MAD is retried.

Modified the default value.

Default: 3000 milliseconds

SHARP_GROUP_JOIN_MAD_RETRIES

libsharp

Sets the number of retries for GroupJoin MAD.

Modified the default value.

Default: 5 retries

SHARP_QP_CONFIRM_MAD_TIMEOUT

libsharp

Sets the timeout, in milliseconds, before a QP Allocation confirmation MAD is retried.

Modified the default value.

Default: 2000 milliseconds
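The timeout and retry parameters combine into an upper bound on how long a MAD exchange can block. The Python sketch below estimates that bound under a simple assumed model: one initial attempt followed by the configured number of retries, each waiting the full timeout. The actual retry behavior of libsharp is not described here and may differ.

```python
def worst_case_wait_ms(timeout_ms: int, retries: int) -> int:
    """Estimate the longest wait before giving up, in milliseconds.

    Assumed model (not confirmed by the release notes): one initial attempt
    plus `retries` retries, each waiting the full `timeout_ms`.
    """
    if timeout_ms < 0 or retries < 0:
        raise ValueError("timeout_ms and retries must be non-negative")
    return timeout_ms * (1 + retries)


# GroupJoin MAD with the defaults listed above:
# SHARP_GROUP_JOIN_MAD_TIMEOUT=3000, SHARP_GROUP_JOIN_MAD_RETRIES=5
group_join_default_ms = worst_case_wait_ms(3000, 5)
```

Under this model, the default GroupJoin settings allow up to 18000 milliseconds before the operation is abandoned.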

© Copyright 2023, NVIDIA. Last updated on May 24, 2023.