Environment Variables

NVSHMEM provides a set of environment variables that allow users to configure the NVSHMEM implementation and receive information about the implementation.

NVSHMEM_VERSION

Value: Any

Prints the library version at start-up.

NVSHMEM_INFO

Value: Any

Prints help text describing all of these environment variables.

NVSHMEM_SYMMETRIC_SIZE

Value: Non-negative integer or floating point value with an optional character suffix

Specifies the size (in bytes) of the symmetric heap memory per PE. The resulting size is implementation-defined and must be at least as large as the integer ceiling of the product of the numeric prefix and the scaling factor. The allowed character suffixes for the scaling factor are as follows:

  • k or K multiplies by 2^{10} (kibibytes)
  • m or M multiplies by 2^{20} (mebibytes)
  • g or G multiplies by 2^{30} (gibibytes)
  • t or T multiplies by 2^{40} (tebibytes)

For example, the string “20m” is equivalent to the integer value 20971520, or 20 mebibytes. Similarly, the string “3.1M” is equivalent to the integer value 3250586 (3.1 × 2^{20} = 3250585.6, rounded up). Only one multiplier is recognized and any characters following the multiplier are ignored, so “20kk” is treated as “20k” and will not produce the same result as “20m”. The string “.5m” yields the same result as the string “0.5m”.

An invalid value for NVSHMEM_SYMMETRIC_SIZE is an error, which the NVSHMEM library shall report by either returning a nonzero value from nvshmem_init_thread or causing program termination.
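The parsing rules above can be sketched in a few lines of Python. This is only an illustration of the documented rules, not the NVSHMEM implementation: the function name parse_symmetric_size is hypothetical, and rejecting trailing characters when no multiplier is present is an assumption, since the text only says what happens to characters after a multiplier. The value returned is the documented lower bound; the size NVSHMEM actually allocates is implementation-defined.

```python
import math
import re
from decimal import Decimal

# Scaling factors for the documented suffixes (an empty suffix means plain bytes).
_SCALE = {"": 1, "k": 2**10, "m": 2**20, "g": 2**30, "t": 2**40}

# Non-negative integer or floating point prefix, then at most one multiplier.
_PATTERN = re.compile(r"(\d+\.?\d*|\.\d+)([kKmMgGtT]?)")


def parse_symmetric_size(text):
    """Return the minimum symmetric heap size in bytes implied by `text`.

    Hypothetical helper: mirrors the documented rules only.
    """
    text = text.strip()
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"invalid symmetric size: {text!r}")
    number, suffix = match.groups()
    # Assumption: without a multiplier, trailing characters are treated as invalid.
    if not suffix and match.end() != len(text):
        raise ValueError(f"invalid symmetric size: {text!r}")
    # Decimal keeps "3.1" exact, so the ceiling is taken on the exact product.
    return math.ceil(Decimal(number) * _SCALE[suffix.lower()])


if __name__ == "__main__":
    for value in ("20m", "3.1M", "20kk", ".5m"):
        print(value, parse_symmetric_size(value))
```

Using Decimal rather than float avoids any binary rounding in the product before the ceiling is applied.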

NVSHMEM_DEBUG

Value: WARN, INFO, TRACE, or any other value to enable basic debug output

Controls the debug information that is displayed from NVSHMEM.

NVSHMEM_DEBUG_FILE

Value: filename

Set the filename where debug output is written. The filename may contain %h for hostname and %p for pid.
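As an illustration of the %h and %p placeholders, the following Python sketch shows the kind of name a given pattern resolves to. The library performs this substitution itself; the helper expand_debug_filename and its parameters are hypothetical and exist only to preview the result.

```python
import os
import socket


def expand_debug_filename(template, hostname=None, pid=None):
    """Replace %h with a hostname and %p with a process ID (hypothetical helper)."""
    hostname = socket.gethostname() if hostname is None else hostname
    pid = os.getpid() if pid is None else pid
    return template.replace("%h", hostname).replace("%p", str(pid))


if __name__ == "__main__":
    # With explicit values, "nvshmem.%h.%p.log" becomes "nvshmem.node01.1234.log".
    print(expand_debug_filename("nvshmem.%h.%p.log", hostname="node01", pid=1234))
```

Including both placeholders gives each PE its own file when several PEs run on the same node.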

NVSHMEM_ENABLE_NIC_PE_MAPPING

Value: boolean

When not set or set to 0, a PE is assigned the NIC on the node that is closest to it by distance. When set to 1, NVSHMEM either assigns NICs to PEs on a round-robin basis or uses NVSHMEM_HCA_PE_MAPPING or NVSHMEM_HCA_LIST when they are specified.

NVSHMEM_HCA_PE_MAPPING

Value: HCA Mapping

Specifies the mapping of HCAs to PEs as a comma-separated list such that the GPU corresponding to the PE uses the given HCA for all transfers. Each entry in the comma-separated list is of the form hca_name:port:count. For example, mlx5_0:1:2,mlx5_0:2:2 means that PE0 and PE1 can be mapped to port 1 of mlx5_0, and PE2 and PE3 can be mapped to port 2 of mlx5_0.

NVSHMEM_ENABLE_NIC_PE_MAPPING must be set to 1 for this variable to be effective.
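The hca_name:port:count format can be made concrete with a short Python sketch that expands a mapping string into one (HCA, port) slot per PE, in list order, following the example above. The function expand_hca_pe_mapping is hypothetical, and handing out slots strictly in list order is an assumption drawn from that example; the actual assignment is made by the library.

```python
def expand_hca_pe_mapping(spec):
    """Expand "hca:port:count,..." into a list of (hca_name, port) slots.

    Hypothetical helper: index i of the result is the slot for the i-th PE,
    assuming PEs are assigned in list order as in the documented example.
    """
    slots = []
    for entry in spec.split(","):
        name, port, count = entry.strip().split(":")
        slots.extend([(name, int(port))] * int(count))
    return slots


if __name__ == "__main__":
    for pe, (hca, port) in enumerate(expand_hca_pe_mapping("mlx5_0:1:2,mlx5_0:2:2")):
        print(f"PE{pe} -> {hca} port {port}")
```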

NVSHMEM_HCA_LIST

Value: List of HCAs

Specifies a comma-separated list of HCAs to use in the NVSHMEM application. This is useful to skip disabled HCAs or HCAs operating in a mode other than InfiniBand. Each entry in the comma-separated list is of the form hca_name:port. For example, mlx5_1:1,mlx5_2:2 specifies use of port 1 of mlx5_1 and port 2 of mlx5_2 in the application. A ^ before the list indicates an exclusion list. For example, ^mlx5_1:1,mlx5_1:2 means not to use mlx5_1 port 1 and mlx5_1 port 2.

NVSHMEM_ENABLE_NIC_PE_MAPPING must be set to 1 for this variable to be effective.
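The list syntax, including the leading ^ for exclusion, can be illustrated with a small Python sketch. The helper parse_hca_list is hypothetical and only reflects the format described here; it does not check whether the named HCAs exist on the system.

```python
def parse_hca_list(spec):
    """Parse "hca:port,..." with an optional leading "^" (hypothetical helper).

    Returns (is_exclusion_list, [(hca_name, port), ...]).
    """
    exclude = spec.startswith("^")
    body = spec[1:] if exclude else spec
    entries = []
    for entry in body.split(","):
        name, port = entry.strip().split(":")
        entries.append((name, int(port)))
    return exclude, entries


if __name__ == "__main__":
    print(parse_hca_list("mlx5_1:1,mlx5_2:2"))
    print(parse_hca_list("^mlx5_1:1,mlx5_1:2"))
```

Note that the ^ applies to the whole list, not only to the first entry.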

NVSHMEM_MPI_LIB_NAME

Value: String

Name of the MPI library that is used for bootstrapping NVSHMEM. By default, NVSHMEM looks for libraries with the name libmpi.so.

NVSHMEM_SHMEM_LIB_NAME

Value: String

Name of the OpenSHMEM library that is used for bootstrapping NVSHMEM. By default, NVSHMEM looks for libraries with the name liboshmem.so.

NVSHMEM_BARRIER_DISSEM_KVAL

Value: Integer radix

Radix of the dissemination algorithm used for nvshmem_barrier and nvshmem_barrier_all collectives. By default, NVSHMEM uses radix 2.
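To give a sense of what the radix controls, the following Python sketch lists the peers a PE would signal in each round of a textbook radix-k dissemination barrier. This is the generic algorithm under stated assumptions, not NVSHMEM's implementation; the function dissemination_schedule is hypothetical. A larger radix means fewer rounds but more signals per round.

```python
def dissemination_schedule(pe, n_pes, radix=2):
    """Return, per round, the peers that `pe` signals in a radix-k dissemination barrier.

    Generic textbook sketch: in round r the distance is radix**r, and the PE
    signals peers at offsets j * distance for j = 1 .. radix - 1, skipping
    offsets that would wrap past the total number of PEs.
    """
    if radix < 2 or n_pes < 1:
        raise ValueError("radix must be >= 2 and n_pes must be >= 1")
    rounds = []
    distance = 1
    while distance < n_pes:
        peers = [(pe + j * distance) % n_pes
                 for j in range(1, radix) if j * distance < n_pes]
        rounds.append(peers)
        distance *= radix
    return rounds


if __name__ == "__main__":
    for radix in (2, 4):
        print(radix, dissemination_schedule(0, 8, radix))
```

For 8 PEs, this sketch takes three rounds with radix 2 and two rounds with radix 4.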

NVSHMEM_BARRIER_TG_DISSEM_KVAL

Value: Integer radix

Radix of the dissemination algorithm used for nvshmem_barrier_warp, nvshmem_barrier_group, nvshmem_barrier_all_warp, and nvshmem_barrier_all_group collectives. By default, NVSHMEM uses radix 2.