NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) Rev 3.5.1 LTS

Changes and New Features

| Feature/Change | Description |
|---|---|
| Added new capabilities to SHARP reservations | Added configuration parameters that control reservation scale-in behavior and whether one reservation can override another. |
| Bug Fixes | See Bug Fixes section. |

| Parameter | Component | Description |
|---|---|---|
| load_reservation_files | sharp_am | Boolean. Specifies whether sharp_am loads reservation information from the files it previously created when sharp_am starts. The default value has changed from False to True. Default: True |
| reservation_force_guid_assignment | sharp_am | New parameter. Boolean. Specifies whether a new reservation, or an update to an existing reservation, is accepted even if it includes hosts/GUIDs that are already assigned to a different reservation. When set to True and a host/GUID conflict occurs, the creation/update request is fulfilled and the other reservation is deleted. Default: False |
| reservation_stop_jobs_upon_scale_in | sharp_am | New parameter. Boolean. Specifies whether a scale-in action on a reservation is approved when active jobs are affected by it. A scale-in action is either a deletion of the entire reservation or an update request with a reduced set of hosts/GUIDs. If active jobs use at least one of the removed hosts/GUIDs, the behavior is as follows: when set to True, the affected jobs are stopped; when set to False, the jobs are not affected, the request is denied, and an error message is returned in response to the request. Default: True |
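The decision rules described for reservation_force_guid_assignment and reservation_stop_jobs_upon_scale_in can be sketched as a small Python model. This is an illustration only, not sharp_am code: the `Decision` type, the function names `check_guid_conflict` and `check_scale_in`, the error strings, and the dictionaries used to represent reservations and jobs are all assumptions made for this example. The sketch also treats the two parameters independently, since how they interact when an overridden reservation has active jobs is not described here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Decision:
    """Outcome of a reservation request in this illustrative model."""
    accepted: bool
    stopped_jobs: list[str]
    deleted_reservations: list[str]
    error: str = ""


def check_guid_conflict(reservations: dict[str, set[str]], name: str,
                        requested: set[str], force: bool) -> Decision:
    """Model of reservation_force_guid_assignment (hypothetical helper).

    `reservations` maps reservation name -> assigned GUIDs.
    `force` stands in for the parameter value.
    """
    # Other reservations that already hold at least one requested GUID.
    conflicting = sorted(
        other for other, guids in reservations.items()
        if other != name and guids & requested
    )
    if conflicting and not force:
        return Decision(False, [], [],
                        "GUIDs already assigned to: " + ", ".join(conflicting))
    # With force=True the request is fulfilled and the others are deleted.
    return Decision(True, [], conflicting)


def check_scale_in(current: set[str], requested: set[str],
                   jobs: dict[str, set[str]], stop_jobs: bool) -> Decision:
    """Model of reservation_stop_jobs_upon_scale_in (hypothetical helper).

    `requested` is the new GUID set; an empty set models deleting the
    whole reservation. `jobs` maps job id -> GUIDs the job is using.
    """
    removed = current - requested
    affected = sorted(job for job, used in jobs.items() if used & removed)
    if affected and not stop_jobs:
        # Jobs keep running; the request is denied with an error.
        return Decision(False, [], [],
                        "scale-in denied: active jobs use removed GUIDs")
    # Either no job is affected, or the affected jobs are stopped.
    return Decision(True, affected, [])


if __name__ == "__main__":
    reservations = {"res_a": {"g1", "g2"}, "res_b": {"g3"}}
    jobs = {"job1": {"g2"}}
    # Shrinking res_a to {g1} removes g2, which job1 is using.
    print(check_scale_in(reservations["res_a"], {"g1"}, jobs, stop_jobs=True))
    print(check_scale_in(reservations["res_a"], {"g1"}, jobs, stop_jobs=False))
    # Creating res_c with g3 conflicts with res_b.
    print(check_guid_conflict(reservations, "res_c", {"g3"}, force=True))
```

With the defaults listed above (reservation_force_guid_assignment False, reservation_stop_jobs_upon_scale_in True), this model denies a request that conflicts with another reservation, and approves a scale-in while stopping the jobs that use the removed hosts/GUIDs.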

© Copyright 2023, NVIDIA. Last updated on Dec 12, 2023.