TL/UCP Special Service Worker

This feature lets TL/UCP use a separate UCX/UCP worker for service collectives, which are invoked internally during setup. For example, service collectives can be restricted to TCP while regular collectives use InfiniBand.

The feature can be enabled by setting the UCC environment variable as follows:

UCC_TL_UCP_SERVICE_WORKER=1

You may pass the UCX configuration for the service worker using the "UCC_TL_UCP_SERVICE_" prefix. For example:

UCC_TL_UCP_SERVICE_NET_DEVICES=mlx5_0:1
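
As a minimal sketch, the two settings above can be exported together in the shell that launches the application. The device name mlx5_0:1 is only the example value used above and should be replaced with a device present on the system. How the variables reach remote processes depends on the launcher in use, which is not covered here.

```shell
# Enable the separate service worker for TL/UCP.
export UCC_TL_UCP_SERVICE_WORKER=1

# UCX configuration for the service worker, passed with the
# UCC_TL_UCP_SERVICE_ prefix (example device; adjust for your system).
export UCC_TL_UCP_SERVICE_NET_DEVICES=mlx5_0:1

# List the service-worker settings currently in the environment.
env | grep '^UCC_TL_UCP_SERVICE_' | sort
```

Listing the variables before starting the job is a quick way to confirm that both settings are present in the launching environment.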

For further UCC options, run ucc_info -f

© Copyright 2023, NVIDIA. Last updated on Feb 9, 2024.