NMX Manager (NMX-M) Documentation v85.1.2000

Bring-Up

The Bring-Up process is a fully automated method for configuring and registering a designated switch tray with the NMX Manager. This process enables telemetry collection and NVLink domain management.

Each switch tray is configured so that both the nmx-controller and nmx-telemetry services are active, and a mutual TLS (mTLS)-secured gRPC connection is established and maintained.

The NMX Manager provides endpoints for securely managing switch credentials via switch profiles.

Default profile: username admin, password admin. NVIDIA recommends changing or overriding these credentials for enhanced security.

| Auth Required | Action | Endpoint | Description |
|---|---|---|---|
| ro-user, rw-user | Retrieve | GET /nmx/v1/switch-profiles | Retrieve a list of switch profiles |
| rw-user | Create | POST /nmx/v1/switch-profiles | Create a new switch profile |
| ro-user, rw-user | Retrieve | GET /nmx/v1/switch-profiles/{id} | Retrieve a specific switch profile |
| rw-user | Delete | DELETE /nmx/v1/switch-profiles/{id} | Delete a switch profile (except default) |
| rw-user | Update | PATCH /nmx/v1/switch-profiles/{id} | Update an existing switch profile |

Update password example

curl -X 'PATCH' \
  'https://<NMX-Manager-API>/nmx/v1/switch-profiles/<switch-profile-id>' \
  -H 'accept: */*' \
  -H 'Content-Type: application/json' \
  -d '{ "Password": "admin" }'

Bring-up is an asynchronous operation. The bring-up endpoints start an operation for one or more switches and track its progress.

| Auth Required | Action | Endpoint | Description |
|---|---|---|---|
| ro-user, rw-user | Retrieve | GET /nmx/v1/bring-up | Retrieve bring-up operations with optional filters (pending, in-progress, failed, completed) |
| rw-user | Create | POST /nmx/v1/bring-up | Initiate a new bring-up process for one or more switches |
| ro-user, rw-user | Retrieve | GET /nmx/v1/bring-up/{id} | Get bring-up status for a specific operation |

Usage Examples

The following examples demonstrate how to start a new bring-up operation using a POST request:

Single switch with default switch profile

curl -X 'POST' \
  'https://<NMX-Manager-API>/nmx/v1/bring-up' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'FmConfig=@<fm-config-file>' \
  -F 'ProfileID=<switch-profile-id>' \
  -F 'Switches={ "Address": "" }'

Multiple switches, no per-switch switch profile

curl -X 'POST' \
  'https://<NMX-Manager-API>/nmx/v1/bring-up' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'FmConfig=@<fm-config-file>' \
  -F 'ProfileID=' \
  -F 'Switches={ "Address": "<switch-A-IP-Address-or-hostname>" }' \
  -F 'Switches={ "Address": "<switch-B-IP-Address-or-hostname>" }'

Multiple switches, one with a custom switch profile

curl -X 'POST' \
  'https://<NMX-Manager-API>/nmx/v1/bring-up' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'FmConfig=@<fm-config-file>' \
  -F 'ProfileID=' \
  -F 'Switches={ "Address": "<switch-A-IP-Address-or-hostname>" }' \
  -F 'Switches={ "Address": "<switch-B-IP-Address-or-hostname>", "ProfileID": "<custom-switch-profile-id>" }'

Multiple switches, each with a custom switch profile

curl -X 'POST' \
  'https://<NMX-Manager-API>/nmx/v1/bring-up' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'FmConfig=@fm_config_72x1_C9_S9 2.cfg' \
  -F 'ProfileID=' \
  -F 'Switches={ "Address": "<switch-A-IP-Address-or-hostname>", "ProfileID": "<custom-switch-profile-A-id>" }' \
  -F 'Switches={ "Address": "<switch-B-IP-Address-or-hostname>", "ProfileID": "<custom-switch-profile-B-id>" }'
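The repeated `Switches` form fields in these requests can be easy to get wrong when they are assembled by a script. The following is a minimal Python sketch, not part of the NMX Manager tooling, that builds the same name/value pairs as the `-F` options. The helper name `bring_up_form_fields` and the example hostnames are hypothetical, and the `FmConfig` file part is deliberately left out.

```python
import json


def bring_up_form_fields(switches, profile_id=""):
    """Return (name, value) pairs mirroring the -F fields of the curl examples.

    switches: list of dicts with "Address" and, optionally, "ProfileID".
    profile_id: request-level ProfileID; empty by default, as in the examples.
    The FmConfig file part is not built here and must be attached separately.
    """
    fields = [("ProfileID", profile_id)]
    for switch in switches:
        entry = {"Address": switch["Address"]}
        if switch.get("ProfileID"):
            entry["ProfileID"] = switch["ProfileID"]
        # One repeated "Switches" field per switch, each holding a JSON object.
        fields.append(("Switches", json.dumps(entry)))
    return fields


fields = bring_up_form_fields(
    [
        {"Address": "switch-a.example"},
        {"Address": "switch-b.example", "ProfileID": "custom-profile-id"},
    ]
)
for name, value in fields:
    print(f"-F '{name}={value}'")
```

Each printed line corresponds to one `-F` option, so the output can be compared directly against a hand-written curl command.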

fm_config / SDN Config

  • Provide an fm_config / SDN config file that matches all switches included in this POST request.

  • For multiple topologies, use separate POST requests for each topology.

Find more information and examples:

Switch Profile

  • Ensure you have a switch profile with the required switch credentials. If not, create one in advance via POST /nmx/v1/switch-profiles.

  • A single bring-up request may specify a global switch profile for all switches, or a separate switch profile for each listed switch.

  • If a switch profile is specified for a particular switch, it takes precedence over the global profile specified in the request.
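The precedence rule above can be sketched in a few lines of Python. This only illustrates the rule as a client might reason about it; the resolution itself happens inside the NMX Manager. The function name `effective_profile` and the sample profile IDs are hypothetical, and the case where neither profile is given is not covered here.

```python
def effective_profile(switch, global_profile_id):
    """Per-switch ProfileID wins; otherwise fall back to the request-level one."""
    return switch.get("ProfileID") or global_profile_id


switches = [
    {"Address": "switch-a.example"},
    {"Address": "switch-b.example", "ProfileID": "profile-b"},
]
resolved = {s["Address"]: effective_profile(s, "profile-global") for s in switches}
print(resolved)
```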

If all initial validations pass, an HTTP 202 Accepted response is returned with a JSON body containing a bring-up operation ID to track the process:

{ "operationId": "682880baaf653727786b618f" }

To track operation progress:

curl -X 'GET' \
  'https://<NMX-Manager-API>/nmx/v1/bring-up/682880baaf653727786b618f' \
  -H 'accept: application/json'
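Because bring-up is asynchronous, a client usually repeats this GET until the operation finishes. The sketch below shows one way to do that in Python using only the standard library. The function names, the polling interval, and the poll limit are assumptions for illustration. Authentication and TLS settings are omitted because this page does not describe them.

```python
import json
import time
import urllib.request

TERMINAL_STATES = {"completed", "failed"}


def fetch_operation(base_url, operation_id):
    """GET /nmx/v1/bring-up/{id} and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{base_url}/nmx/v1/bring-up/{operation_id}",
        headers={"accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_operation(fetch, interval_s=10.0, max_polls=120, sleep=time.sleep):
    """Call fetch() until the operation Status is terminal or polls run out."""
    operation = fetch()
    for _ in range(max_polls - 1):
        if operation.get("Status") in TERMINAL_STATES:
            break
        sleep(interval_s)
        operation = fetch()
    return operation


# Offline demonstration with canned responses instead of a live NMX Manager.
canned = iter([{"Status": "in-progress"}, {"Status": "completed"}])
final = wait_for_operation(lambda: next(canned), interval_s=0)
print(final["Status"])  # prints "completed"
```

Against a real deployment, the `fetch` argument would be something like `lambda: fetch_operation("https://<NMX-Manager-API>", "<operation-id>")`. Passing the fetcher in as a callable keeps the loop testable without network access.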

Step-by-Step Bring-Up Workflow

Initial Response

When the bring-up process has not yet started, the switch status is marked as "pending." It changes as soon as the NMX Manager begins the bring-up process.

{
  "CreatedAt": "2025-05-15T07:07:31.428Z",
  "ID": "682880baaf653727786b618f",
  "Status": "in-progress",
  "Switches": [
    {
      "Address": "<switch-IP-Address-or-hostname>",
      "CurrentStep": "Initial bring-up task",
      "StartedAt": "2025-05-15T07:07:31.485Z",
      "Status": "pending",
      "StatusDetails": "",
      "UpdatedAt": "2025-05-15T07:07:31.485Z"
    }
  ],
  "UpdatedAt": "2025-05-15T07:07:31.485Z"
}


Bring-Up Execution Steps

Step 1: Pre Bring-up Validation

The NMX Manager sends an API request to the switch tray to check whether the nmx-controller and nmx-telemetry services are already active.

If detected, the bring-up process is skipped to avoid overwriting the existing configuration. Bring-up can only be performed once per switch. Even if services are later shut down, the NMX Manager remembers and blocks repeated bring-up attempts, unless the services are explicitly deregistered from the NMX Manager by the user.

{
  "CreatedAt": "2025-05-15T07:07:31.428Z",
  "ID": "682880baaf653727786b618f",
  "Status": "in-progress",
  "Switches": [
    {
      "Address": "<switch-IP-Address-or-hostname>",
      "CurrentStep": "Step 1: Is switch configured request",
      "StartedAt": "2025-05-15T07:07:31.485Z",
      "Status": "in-progress",
      "StatusDetails": "Step 1: Is switch configured request: sent to switch-gateway.",
      "UpdatedAt": "2025-05-15T07:07:31.515Z"
    }
  ],
  "UpdatedAt": "2025-05-15T07:07:31.515Z"
}

Step 2: Enable Cluster

The NMX Manager instructs the switch tray to start the nmx-controller and nmx-telemetry services required for cluster operations.

{
  "CreatedAt": "2025-05-15T07:07:31.428Z",
  "ID": "682880baaf653727786b618f",
  "Status": "in-progress",
  "Switches": [
    {
      "Address": "<switch-IP-Address-or-hostname>",
      "CurrentStep": "Step 2: Enable cluster request",
      "StartedAt": "2025-05-15T07:07:31.485Z",
      "Status": "in-progress",
      "StatusDetails": "Step 2: Enable cluster request: sent to switch-gateway.",
      "UpdatedAt": "2025-05-15T07:07:37.278Z"
    }
  ],
  "UpdatedAt": "2025-05-15T07:07:37.278Z"
}

Step 3: Import And Configure Certificates

The NMX Manager generates the required certificates and sends them to the switch. Certificates are stored locally and configured for both nmx-controller and nmx-telemetry.

Each configuration action is processed asynchronously via the NVOS API, and job success is verified by polling.

{
  "CreatedAt": "2025-05-15T07:07:31.428Z",
  "ID": "682880baaf653727786b618f",
  "Status": "in-progress",
  "Switches": [
    {
      "Address": "<switch-IP-Address-or-hostname>",
      "CurrentStep": "Step 3: Import certificates request",
      "StartedAt": "2025-05-15T07:07:31.485Z",
      "Status": "in-progress",
      "StatusDetails": "Step 3: Import certificates request: sent to switch-gateway.",
      "UpdatedAt": "2025-05-15T07:07:59.62Z"
    }
  ],
  "UpdatedAt": "2025-05-15T07:07:59.62Z"
}

Step 4: Enable mTLS For NMX Services

The NMX Manager sends a request to configure both services for mTLS encryption, ensuring secure gRPC communication. Each operation is tracked and validated.

{
  "CreatedAt": "2025-05-15T07:07:31.428Z",
  "ID": "682880baaf653727786b618f",
  "Status": "in-progress",
  "Switches": [
    {
      "Address": "<switch-IP-Address-or-hostname>",
      "CurrentStep": "Step 4: Enable encryption request",
      "StartedAt": "2025-05-15T07:07:31.485Z",
      "Status": "in-progress",
      "StatusDetails": "Step 4: Enable encryption request: sent to switch-gateway.",
      "UpdatedAt": "2025-05-15T07:08:39.559Z"
    }
  ],
  "UpdatedAt": "2025-05-15T07:08:39.559Z"
}

Step 5: Import and Configure SDN Config

The NMX Manager uploads the SDN config (fm_config) provided in the bring-up request to the switch file system. nmx-controller is configured accordingly. Jobs are tracked and confirmed.

{
  "CreatedAt": "2025-05-15T07:07:31.428Z",
  "ID": "682880baaf653727786b618f",
  "Status": "in-progress",
  "Switches": [
    {
      "Address": "<switch-IP-Address-or-hostname>",
      "CurrentStep": "Step 5: Install SDN request",
      "StartedAt": "2025-05-15T07:07:31.485Z",
      "Status": "in-progress",
      "StatusDetails": "Step 5: Install SDN request: sent to switch-gateway.",
      "UpdatedAt": "2025-05-15T07:09:02.492Z"
    }
  ],
  "UpdatedAt": "2025-05-15T07:09:02.492Z"
}

Step 6: Wait for NMX Controller Status And Register

The NMX Manager polls the controller until its addition-info field reports CONTROL_PLANE_STATE_CONFIGURED. Once confirmed, registration begins using a secure gRPC connection.

{
  "CreatedAt": "2025-05-15T07:07:31.428Z",
  "ID": "682880baaf653727786b618f",
  "Status": "in-progress",
  "Switches": [
    {
      "Address": "<switch-IP-Address-or-hostname>",
      "CurrentStep": "Step 6: Wait configured request",
      "StartedAt": "2025-05-15T07:07:31.485Z",
      "Status": "in-progress",
      "StatusDetails": "Step 6: Wait configured request: sent to switch-gateway.",
      "UpdatedAt": "2025-05-15T07:09:17.31Z"
    }
  ],
  "UpdatedAt": "2025-05-15T07:09:17.31Z"
}

Step 7: Register NMX Telemetry

NMX Manager initiates registration of nmx-telemetry over a secure gRPC connection. This is also tracked and processed by backend components.

{
  "CreatedAt": "2025-05-15T07:07:31.428Z",
  "ID": "682880baaf653727786b618f",
  "Status": "in-progress",
  "Switches": [
    {
      "Address": "<switch-IP-Address-or-hostname>",
      "CurrentStep": "Registration request",
      "StartedAt": "2025-05-15T07:07:31.485Z",
      "Status": "in-progress",
      "StatusDetails": "Step 7: NMX Telemetry Registration request: sent to inventory.",
      "UpdatedAt": "2025-05-15T07:09:55.687Z"
    }
  ],
  "UpdatedAt": "2025-05-15T07:09:55.687Z"
}
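The sample responses for Steps 1 through 7 carry the step number as a `Step N:` prefix in `CurrentStep`, in `StatusDetails`, or in both. The sketch below extracts that number for a rough progress indicator. It assumes the prefix format seen in these samples, which is not stated as a stable contract, and the function name `current_step_number` is hypothetical.

```python
import re

TOTAL_STEPS = 7  # Steps 1 to 7 as listed in this section
_STEP_RE = re.compile(r"^Step (\d+):")


def current_step_number(switch):
    """Return the step number for one entry of "Switches", or None if absent."""
    for key in ("CurrentStep", "StatusDetails"):
        match = _STEP_RE.match(switch.get(key, ""))
        if match:
            return int(match.group(1))
    return None


sample = {
    "CurrentStep": "Registration request",
    "StatusDetails": "Step 7: NMX Telemetry Registration request: sent to inventory.",
}
print(current_step_number(sample), "of", TOTAL_STEPS)  # prints "7 of 7"
```

A pending switch, whose `CurrentStep` is "Initial bring-up task" and whose `StatusDetails` is empty, yields `None` rather than a step number.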

Final Result: Registration Completed

If successful, both nmx-controller and nmx-telemetry are registered, and their ObjectIDs are returned.

{
  "CreatedAt": "2025-05-15T07:18:06.216Z",
  "ID": "682880baaf653727786b618f",
  "Status": "completed",
  "Switches": [
    {
      "Address": "<switch-IP-Address-or-hostname>",
      "CurrentStep": "Registration response",
      "NMX-Controller-ID": "682595b7799bc550eec18a77",
      "NMX-Telemetry-ID": "682595b8799bc550eec18a78",
      "StartedAt": "2025-05-15T07:18:06.298Z",
      "Status": "completed",
      "StatusDetails": "bring-up completed successfully",
      "UpdatedAt": "2025-05-15T07:20:25.754Z"
    }
  ],
  "UpdatedAt": "2025-05-15T07:20:25.761Z"
}
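A client that needs the returned ObjectIDs for later calls can read them from the completed response. The following is a minimal sketch based on the field names in the sample above. The function name `registered_service_ids` is hypothetical, and the input is a trimmed copy of the sample response.

```python
def registered_service_ids(operation):
    """Map each completed switch address to its controller and telemetry IDs."""
    result = {}
    for switch in operation.get("Switches", []):
        if switch.get("Status") == "completed":
            result[switch["Address"]] = {
                "controller": switch.get("NMX-Controller-ID"),
                "telemetry": switch.get("NMX-Telemetry-ID"),
            }
    return result


operation = {
    "Status": "completed",
    "Switches": [
        {
            "Address": "<switch-IP-Address-or-hostname>",
            "NMX-Controller-ID": "682595b7799bc550eec18a77",
            "NMX-Telemetry-ID": "682595b8799bc550eec18a78",
            "Status": "completed",
        }
    ],
}
ids = registered_service_ids(operation)
print(ids)
```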


Operational Considerations

Asynchronous Processing

  • Accepted bring-up POST requests return HTTP 202 Accepted with an operationId and a Location header.

  • Operations continue until each sub-task (per switch) reaches a terminal state: failed or completed.

  • Sub-tasks progress independently and in parallel.
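Since each switch is an independent sub-task, a status response can mix pending, in-progress, failed, and completed entries. The sketch below counts sub-tasks by status and reports whether every one of them has reached a terminal state. The function name `summarize_subtasks` and the sample addresses are hypothetical.

```python
from collections import Counter

TERMINAL_STATES = {"completed", "failed"}


def summarize_subtasks(operation):
    """Return (counts per status, True if every sub-task is terminal)."""
    switches = operation.get("Switches", [])
    counts = Counter(s.get("Status", "unknown") for s in switches)
    # An operation without sub-tasks is not treated as finished.
    finished = bool(switches) and all(
        s.get("Status") in TERMINAL_STATES for s in switches
    )
    return dict(counts), finished


operation = {
    "Switches": [
        {"Address": "switch-a.example", "Status": "completed"},
        {"Address": "switch-b.example", "Status": "in-progress"},
        {"Address": "switch-c.example", "Status": "failed"},
    ]
}
counts, finished = summarize_subtasks(operation)
print(counts, finished)
```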

Cancellation

  • Bring-up operations cannot be canceled once started.

Timeouts

  • If no progress is made within 10 minutes (configurable), the operation is marked as failed.

  • As long as one sub-task is progressing, the operation continues. Progress is indicated by the UpdatedAt field.
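A client can estimate how close an operation is to this timeout by comparing the operation-level `UpdatedAt` with the current time. The sketch below is a client-side approximation only, since this page does not specify how the server measures progress. The function names are hypothetical, and the 10-minute limit is the documented default, which may be configured differently on a given deployment. The parser accepts the variable-length fractional seconds seen in the sample responses, such as `.62Z`.

```python
import re
from datetime import datetime, timedelta, timezone

STALL_LIMIT = timedelta(minutes=10)  # documented default; configurable server-side

_TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z$")


def parse_timestamp(value):
    """Parse timestamps such as 2025-05-15T07:07:59.62Z into aware UTC datetimes."""
    match = _TS_RE.match(value)
    if not match:
        raise ValueError(f"unexpected timestamp format: {value!r}")
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    # Pad or truncate the fraction to microseconds (6 digits).
    fraction = (match.group(2) or "0")[:6].ljust(6, "0")
    return base.replace(microsecond=int(fraction), tzinfo=timezone.utc)


def looks_stalled(operation, now, limit=STALL_LIMIT):
    """True if the operation-level UpdatedAt is older than `limit` at time `now`."""
    return now - parse_timestamp(operation["UpdatedAt"]) > limit


operation = {"UpdatedAt": "2025-05-15T07:09:17.31Z"}
now = datetime(2025, 5, 15, 7, 20, 0, tzinfo=timezone.utc)
print(looks_stalled(operation, now))  # prints "True"
```

In live use, `now` would be `datetime.now(timezone.utc)`. Taking it as a parameter keeps the check deterministic.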

Troubleshooting

If bring-up fails, perform a cleanup:

  1. SSH into the switch.

  2. Retrieve existing certificates:

    nv show system security ca-certificate
    nv show system security certificate

  3. Delete NMX-added certificates:

    nv action delete system security certificate <certificate_name>
    nv action delete system security ca-certificate <ca_certificate_name>

  4. Remove SDN config:

    nv action delete sdn config apps nmx-controller type fm_config files fm_config.cfg

  5. Reset SDN configuration:

    nv action reset sdn factory-default

  6. Disable cluster state:

    nv set cluster state disabled
    nv config apply

  7. Clean up certificate and config files:

    rm /tmp/cert.p12 /tmp/ca-cert.crt /tmp/fm_config.cfg

© Copyright 2025, NVIDIA. Last updated on May 29, 2025.