Getting Started with IMEX#

Basic Components#

This section provides information about the basic components in the IMEX service.

The IMEX Service#

The core component of IMEX is implemented as a stand-alone executable that runs as a UNIX daemon process. The IMEX installation package will install the required core components and register the service as a systemd service called nvidia-imex.

The IMEX Control App#

In addition to the IMEX service, the installation package also installs nvidia-imex-ctl, a tool that can be used to query the state of the IMEX service (refer to “The nvidia-imex-ctl” for more information).

Supported Platforms#

IMEX currently supports the following products and environments:

Hardware architectures:

  • x86_64

  • aarch64

NVLink network server architectures:

  • NVIDIA GB200 NVL72

OS environment:

IMEX is supported on the following major Linux OS distributions:

  • RHEL/CentOS 7.x and RHEL/CentOS 8.x

  • Ubuntu 20.04.x, Ubuntu 22.04.x, and Ubuntu 24.04.x

Other NVIDIA Software Packages#

To run the IMEX service, the target system must include a compatible Data Center GPU Driver package starting with version TBD for GB200 systems.

Note: During initialization, the IMEX service checks the version of the currently loaded kernel driver stack for compatibility. If the loaded driver stack version is not compatible, the service aborts.

Installation#

This section provides information about installing the IMEX service.

Managing the IMEX Service#

This section provides information about how to manage the IMEX service.

Starting the IMEX Service#

  • To start IMEX, run the following command.

# For Linux based OS distributions
sudo systemctl start nvidia-imex

Stopping the IMEX Service#

  • To stop IMEX, run the following command.

# For Linux based OS distributions
sudo systemctl stop nvidia-imex

Checking the IMEX Service Status#

  • To check the IMEX service status, run the following command.

# For Linux based OS distributions
sudo systemctl status nvidia-imex

For a detailed IMEX service status, use the nvidia-imex-ctl tool (refer to The nvidia-imex-ctl for more information).
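For scripting, the systemd state of the service can be condensed into a single line. The following is a minimal sketch and is not part of the IMEX package: the function name imex_service_state is hypothetical, and it relies only on the standard systemctl is-active and is-enabled subcommands.

```shell
# Hypothetical helper (not shipped with IMEX): summarize the systemd state
# of the nvidia-imex unit on one line.
imex_service_state() {
    # is-active prints a state such as "active", "inactive", or "failed";
    # is-enabled prints a state such as "enabled" or "disabled". Both exit
    # non-zero for some states, so the exit status is deliberately ignored.
    imex_active="$(systemctl is-active nvidia-imex 2>/dev/null || true)"
    imex_enabled="$(systemctl is-enabled nvidia-imex 2>/dev/null || true)"
    printf 'active=%s enabled=%s\n' "${imex_active:-unknown}" "${imex_enabled:-unknown}"
}
```

Calling imex_service_state after defining it prints one line in the form active=<state> enabled=<state>. If systemctl is unavailable, both fields fall back to unknown.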

Enabling the IMEX Service to Automatically Start at Boot#

  • To enable the IMEX service to automatically start at boot, run the following command.

# For Linux based OS distributions
sudo systemctl enable nvidia-imex

Disabling the IMEX Service from Automatically Starting at Boot#

  • To prevent the IMEX service from automatically starting at boot, run the following command.

# For Linux based OS distributions
sudo systemctl disable nvidia-imex

Checking IMEX Service System Log Messages#

  • To check the IMEX system log messages, run the following command.

# For Linux based OS distributions
sudo journalctl -u nvidia-imex
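
When the full journal is long, it is often enough to look at the most recent messages from the current boot. The following is a minimal sketch that assumes the standard journalctl options -b, -n, and --no-pager. The function name imex_recent_logs and the default of 50 lines are illustrative choices, not IMEX defaults.

```shell
# Hypothetical helper: show the last N nvidia-imex messages from the current boot.
imex_recent_logs() {
    # Number of lines to show; defaults to 50 when no argument is given.
    imex_log_lines="${1:-50}"
    journalctl -u nvidia-imex -b --no-pager -n "$imex_log_lines"
}
```

Run the function from a shell that has permission to read the system journal, for example imex_recent_logs 200. To follow new messages as they arrive, sudo journalctl -u nvidia-imex -f is the usual systemd idiom.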

IMEX Service Startup Options#

IMEX supports the following CLI options:

./nvidia-imex -h

/usr/bin/nvidia-imex -h
    [-h | --help]: Displays help information.
    [-v | --version]: Displays the imex version and exits.
    [-c | --config]: Provides the imex config file path/name, which controls all the config options.
    [-nogpu | --nogpu]: Runs in "NO GPU" mode, allowing testing of nvidia-imex communication and configuration without requiring or using a GPU.
    [-p | --port]: Specifies the port for IMEX peer communication. If specified, this overrides IMEX_SERVER_PORT in the config file.

Most IMEX parameters and options are specified through a text config file. When you install IMEX, a default config file is copied to a predefined location and is used by default. To use a config file in a different location, pass its path with the [-c | --config] command-line argument.

Note: On Linux-based installations, the default IMEX config file is in the /etc/nvidia-imex/ directory. If the default config file on the system has been modified, subsequent IMEX package updates offer options such as merge, keep, or overwrite for handling the existing file.
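
As an illustration of how these options combine, the sketch below assembles a command line that points IMEX at a non-default config file and optionally overrides the peer communication port. The function name, the example path, and the port value are hypothetical, and the sketch assumes that -c and -p each take their value as the next argument. Confirm this with nvidia-imex -h on your system before relying on it.

```shell
# Hypothetical helper: print an nvidia-imex command line for a given
# config file and, optionally, a peer communication port.
imex_command_line() {
    imex_config_path="$1"
    imex_port="${2:-}"
    imex_cmd="/usr/bin/nvidia-imex -c $imex_config_path"
    # Append -p only when a port was supplied; otherwise the port from
    # the config file (IMEX_SERVER_PORT) applies.
    if [ -n "$imex_port" ]; then
        imex_cmd="$imex_cmd -p $imex_port"
    fi
    printf '%s\n' "$imex_cmd"
}

# Example with an assumed path and port; this only prints the command.
imex_command_line /opt/imex/test-config.cfg 50005
```

Printing the command instead of running it makes it easy to review the options before starting the service by hand.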

IMEX Service File#

This section provides information about the IMEX service file.

On Linux-Based Systems#

On Linux-based systems, the installation package registers the IMEX service using the following systemd service unit file. To change the IMEX service start-up options, modify this service unit file, which is located at /lib/systemd/system/nvidia-imex.service.

[Unit]
Description=NVIDIA IMEX service
After=network-online.target
Requires=network-online.target

[Service]
User=root
PrivateTmp=false
Type=forking
TimeoutStartSec=infinity

ExecStart=/usr/bin/nvidia-imex -c /etc/nvidia-imex/config.cfg

LimitCORE=infinity

[Install]
WantedBy=multi-user.target
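
Before editing the unit file, it can help to confirm which start-up options are currently configured. The following is a minimal sketch that prints the ExecStart command from a unit file. The function name imex_exec_start is hypothetical, and the default path is the unit file location given in this section.

```shell
# Hypothetical helper: print the command configured in ExecStart=.
imex_exec_start() {
    imex_unit_file="${1:-/lib/systemd/system/nvidia-imex.service}"
    # Print everything after "ExecStart=" on the line(s) that define it.
    sed -n 's/^ExecStart=//p' "$imex_unit_file"
}
```

After changing the unit file, run sudo systemctl daemon-reload so that systemd rereads it, and then restart the service.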

Running IMEX Service as Non-Root#

On Linux-based systems, the IMEX service requires administrative (root) privileges to register with the GPU driver for memory fabric events. However, system administrators and advanced users can complete the following steps to run IMEX from a non-root account:

  1. If the IMEX service is running, stop it.

  2. IMEX requires access to the following directories:

    • /etc/nvidia-imex/: This directory contains the IMEX configuration files.

    • /var/log/: This directory stores the IMEX log file.

  3. Update the corresponding directory/file access to the appropriate user or user group.

  4. Change the read/write access of the following proc entry to the appropriate user or user group.
    /proc/driver/nvidia/capabilities/fabric-imex-mgmt
    The fabric-imex-mgmt file contains the following entries:

    • DeviceFileMinor: 4323

    • DeviceFileMode: 256

    • DeviceFileModify: 1

  5. Note down the value of DeviceFileMinor, for example 4323.

  6. Run the following command.
    echo DeviceFileModify: 0 > /proc/driver/nvidia/capabilities/fabric-imex-mgmt

  7. Change the read/write access of the following file to the appropriate user or user group:
    /dev/nvidia-caps/nvidia-cap<DeviceFileMinor>

  8. Use the DeviceFileMinor value noted in step 5 in the file name, for example:
    /dev/nvidia-caps/nvidia-cap4323

  9. After the required permissions are assigned, manually start the IMEX service process from the user/user group account.

  10. The NVIDIA driver creates or recreates the above /proc entry each time the driver loads, so repeat steps 1-8 after every driver reload or system boot.
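
The lookup of DeviceFileMinor and the resulting device file name can be scripted. The following is a minimal sketch, not an NVIDIA-provided tool: the function name imex_cap_device_path and the group name imex in the comments are hypothetical, and the sketch assumes the proc entry uses the "Name: value" layout shown in the steps above.

```shell
# Hypothetical helper: derive the /dev/nvidia-caps device path from the
# DeviceFileMinor entry of the fabric-imex-mgmt capability file.
imex_cap_device_path() {
    imex_cap_file="${1:-/proc/driver/nvidia/capabilities/fabric-imex-mgmt}"
    # Take the second field of the line whose first field is "DeviceFileMinor:".
    imex_minor="$(awk '$1 == "DeviceFileMinor:" { print $2 }' "$imex_cap_file")"
    if [ -z "$imex_minor" ]; then
        echo "DeviceFileMinor not found in $imex_cap_file" >&2
        return 1
    fi
    printf '/dev/nvidia-caps/nvidia-cap%s\n' "$imex_minor"
}

# Example use from a root shell, following the order of the steps above.
# The group name "imex" is an assumption; substitute your own user or group.
#   echo "DeviceFileModify: 0" > /proc/driver/nvidia/capabilities/fabric-imex-mgmt
#   chgrp imex "$(imex_cap_device_path)"
#   chmod g+rw "$(imex_cap_device_path)"
```

Because the minor number is read at run time, the same script can be reused after a driver reload without hard-coding a value such as 4323.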

When IMEX is configured as a systemd service, the system administrator must edit the IMEX service unit file to instruct systemd to run the IMEX service as a specific user/group. This user/group can be specified by using the User= and Group= directives in the [Service] section of the IMEX service unit file. The system administrator must ensure that the permissions of the proc entry and the associated file node are changed to the user/user group before the IMEX service starts at system boot time.

Note: System administrators can set up the necessary udev rules to automate the process of changing these proc entry permissions.
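
One way to set User= and Group= without modifying the packaged unit file in place is a systemd drop-in file, which systemd merges with the main unit. This is an alternative to editing the unit file directly and is not the documented IMEX procedure. The function name imex_write_user_dropin and the imex user and group names are hypothetical.

```shell
# Hypothetical helper: write a drop-in that sets User= and Group= for a unit.
# Arguments: drop-in directory, user name, group name.
imex_write_user_dropin() {
    imex_dropin_dir="$1"
    imex_user="$2"
    imex_group="$3"
    mkdir -p "$imex_dropin_dir"
    # Only the [Service] keys being overridden need to appear in a drop-in.
    cat > "$imex_dropin_dir/override.conf" <<EOF
[Service]
User=$imex_user
Group=$imex_group
EOF
}
```

For the real service, the drop-in directory would be /etc/systemd/system/nvidia-imex.service.d. From a root shell, call imex_write_user_dropin /etc/systemd/system/nvidia-imex.service.d imex imex, and then run systemctl daemon-reload so that systemd picks up the change.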

Running IMEX in NO GPU Mode#

You can run IMEX in NO GPU mode with the --nogpu command-line flag to test nvidia-imex communication and configuration without requiring or using a GPU.

Example:

./nvidia-imex -c config.cfg --nogpu