Appendix - NDT Plugin
The NDT plugin is a self-contained Docker container with REST API support that is managed by UFM. It provides an NDT topology comparison (topo diff) capability: the user can compare the IB fabric managed by UFM against NDT files, which Microsoft uses to describe the network topology of IB clusters.
Main use cases:
Gain confidence in the IB fabric connectivity during cluster bring-up.
Gain confidence in specific parts of the IB fabric after component replacements.
Automatically detect any changes in topology.
The NDT plugin can be deployed in the following ways:
On UFM Appliance
On UFM Software
Detailed instructions on how to deploy the NDT plugin can be found on the mellanox/ufm-plugin-ndt page.
The following authentication types are supported:
basic (/ufmRest)
client (/ufmRestV2)
token (/ufmRestV3)
The following REST APIs are supported:
GET /help
GET /version
POST /upload_metadata
GET /list
POST /compare
POST /cancel
GET /reports
GET /reports/<report_id>
POST /delete
For detailed information on how to interact with the NDT plugin, refer to the NVIDIA UFM Enterprise > Rest API > NDT Plugin REST API.
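To illustrate how the authentication types listed above relate to the endpoint names, the following Python sketch builds request URLs. It is only a sketch: the `/plugin/ndt` path segment, the host name, and the helper name `ndt_url` are assumptions made for illustration and are not taken from this appendix. Check the NDT Plugin REST API reference for the actual paths.

```python
# Sketch only: the "/plugin/ndt" segment and the host used below are
# assumptions for illustration, not values taken from this appendix.
AUTH_PREFIXES = {
    "basic": "/ufmRest",
    "client": "/ufmRestV2",
    "token": "/ufmRestV3",
}

PLUGIN_PATH = "/plugin/ndt"  # hypothetical plugin mount point


def ndt_url(host: str, auth_type: str, endpoint: str) -> str:
    """Build the URL of an NDT plugin endpoint for a given authentication type."""
    try:
        prefix = AUTH_PREFIXES[auth_type]
    except KeyError:
        raise ValueError(f"unsupported authentication type: {auth_type!r}") from None
    return f"https://{host}{prefix}{PLUGIN_PATH}/{endpoint.lstrip('/')}"


if __name__ == "__main__":
    print(ndt_url("ufm.example.com", "basic", "/list"))
    print(ndt_url("ufm.example.com", "token", "/compare"))
```

Keeping the prefix table separate from the endpoint name means a client can switch authentication types without touching the code that issues the requests.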
An NDT file is a CSV file containing data relevant to the IB fabric connectivity.
The NDT plugin extracts the IB connectivity data based on the following five fields:
Start device
Start port
End device
End port
Link type
Switch to Switch NDT
By default, IB links are filtered by:
Link Type is Data
Start Device and End Device end with IBn, where n is a numeric value.
For TOR switches, Start port/End port field should be in the format Port N, where N is a numeric value.
For Director switches, Start port/End port should be in the format Blade N_Port i/j, where N is a leaf number, i is an internal ASIC number and j is a port number.
Examples:
Start Device | Start Port | End Device | End Port | Link Type
DSM07-0101-0702-01IB0 | Port 21 | DSM07-0101-0702-01IB1 | Blade 2_Port 1/1 | Data
DSM07-0101-0702-01IB0 | Port 22 | DSM07-0101-0702-01IB1 | Blade 2_Port 1/1 | Data
DSM07-0101-0702-01IB0 | Port 23 | DSM07-0101-0702-02IB1 | Blade 3_Port 1/1 | Data
DSM09-0101-0617-001IB2 | Port 33 | DSM09-0101-0721-001IB4 | Port 1 | Data
DSM09-0101-0617-001IB2 | Port 34 | DSM09-0101-0721-001IB4 | Port 2 | Data
DSM09-0101-0617-001IB2 | Port 35 | DSM09-0101-0721-001IB4 | Port 3 | Data
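The default switch-to-switch rules above can be sketched as a small Python filter. This is a transcription of the rules as written in this section, not the plugin's own implementation; the function and pattern names are illustrative, and the patterns actually used by the plugin are configurable and may differ.

```python
import re

# Patterns transcribed from the default rules described above (illustrative names).
DEVICE_RE = re.compile(r"IB\d+$")                          # device name ends with IBn
TOR_PORT_RE = re.compile(r"^Port \d+$")                    # e.g. "Port 21"
DIRECTOR_PORT_RE = re.compile(r"^Blade \d+_Port \d+/\d+$")  # e.g. "Blade 2_Port 1/1"


def is_switch_port(port: str) -> bool:
    """Return True if the port uses the TOR or the Director switch format."""
    return bool(TOR_PORT_RE.match(port) or DIRECTOR_PORT_RE.match(port))


def is_switch_to_switch(row: dict) -> bool:
    """Return True if an NDT row describes a switch-to-switch IB link."""
    return (
        row["Link Type"] == "Data"
        and DEVICE_RE.search(row["Start Device"]) is not None
        and DEVICE_RE.search(row["End Device"]) is not None
        and is_switch_port(row["Start Port"])
        and is_switch_port(row["End Port"])
    )


if __name__ == "__main__":
    example = {
        "Start Device": "DSM07-0101-0702-01IB0",
        "Start Port": "Port 21",
        "End Device": "DSM07-0101-0702-01IB1",
        "End Port": "Blade 2_Port 1/1",
        "Link Type": "Data",
    }
    print(is_switch_to_switch(example))
```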
Switch to Host NDT
An NDT file is a CSV file that contains more than just IB connectivity data.
The IB connectivity data is extracted based on the following five fields:
Start device
Start port
End device
End port
Link type
IB links should be filtered by the following:
Link type is Data
Start device or End device end with IBN, where N is a numeric value.
The port on the other end of the link should follow the persistent naming convention ibpXsYfZ, where X, Y and Z are numeric values.
For TOR switches, Start port/End port field will be in the format Port n, where n is a numeric value.
For Director switches, Start port/End port will be in the format Blade N_Port i/j, where N is a leaf number, i is an internal ASIC number and j is a port number.
Examples:
Start Device | Start Port | End Device | End Port | Link Type
DSM071081704019 | DSM071081704019 ibp11s0f0 | DSM07-0101-0514-01IB0 | Port 1 | Data
DSM071081704019 | DSM071081704019 ibp21s0f0 | DSM07-0101-0514-01IB0 | Port 2 | Data
DSM071081704019 | DSM071081704019 ibp75s0f0 | DSM07-0101-0514-01IB0 | Port 3 | Data
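As a rough sketch of the switch-to-host filtering described above, the following Python example reads NDT rows from CSV text and keeps only the host links. The comma-separated layout, the exact column header spelling, and the function name `host_links` are assumptions for illustration. The second sample row is made up to show a row being filtered out.

```python
import csv
import io
import re

SWITCH_RE = re.compile(r"IB\d+$")                # switch device name ends with IBN
HOST_PORT_RE = re.compile(r"ibp\d+s\d+f\d+$")    # persistent naming: ibpXsYfZ

# First data row is taken from the example table; the second row is invented
# to demonstrate filtering of non-IB data.
SAMPLE = """Start Device,Start Port,End Device,End Port,Link Type
DSM071081704019,DSM071081704019 ibp11s0f0,DSM07-0101-0514-01IB0,Port 1,Data
DSM071081704019,eth0,DSM07-0101-0514-01IB0,Port 9,Mgmt
"""


def host_links(csv_text: str) -> list:
    """Return (host, host_port, switch, switch_port) tuples for switch-to-host links."""
    links = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Link Type"] != "Data":
            continue
        if SWITCH_RE.search(row["End Device"]) and HOST_PORT_RE.search(row["Start Port"]):
            links.append((row["Start Device"], row["Start Port"],
                          row["End Device"], row["End Port"]))
        elif SWITCH_RE.search(row["Start Device"]) and HOST_PORT_RE.search(row["End Port"]):
            links.append((row["End Device"], row["End Port"],
                          row["Start Device"], row["Start Port"]))
    return links


if __name__ == "__main__":
    for link in host_links(SAMPLE):
        print(link)
```

The result always lists the host side first, whichever side of the NDT row it appeared on, which simplifies later comparison against the fabric.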
Comparison results are forwarded to syslog as events. Example of /var/log/messages content:
Dec 9 12:32:31 <server_ip> ad158f423225[4585]: NDT: missing in UFM "SAT111090310019/SAT111090310019 ibp203s0f0 - SAT11-0101-0903-19IB0/15"
Dec 9 12:32:31 <server_ip> ad158f423225[4585]: NDT: missing in UFM "SAT11-0101-0903-09IB0/27 - SAT11-0101-0905-01IB1-A/Blade 12_Port 1/9"
Dec 9 12:32:31 <server_ip> ad158f423225[4585]: NDT: missing in UFM "SAT11-0101-0901-13IB0/23 - SAT11-0101-0903-01IB1-A/Blade 08_Port 2/13"
For detailed information about how to check syslog, please refer to the NVIDIA UFM-SDN Appliance Command Reference Guide > UFM Commands > UFM Logs.
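For readers who want to post-process these events, the following Python sketch extracts the link endpoints from a syslog line. It assumes that every event follows the layout of the sample lines above (`NDT: <status> "<device>/<port> - <device>/<port>"`); only the "missing in UFM" status is shown in this appendix, and the function name `parse_ndt_event` is illustrative.

```python
import re

# Assumed layout, based on the sample lines: NDT: <status> "<dev>/<port> - <dev>/<port>"
EVENT_RE = re.compile(r'NDT: (?P<status>.+?) "(?P<link>.+)"$')


def parse_ndt_event(line: str):
    """Parse one syslog line; return a dict, or None if it is not an NDT event."""
    match = EVENT_RE.search(line.strip())
    if match is None:
        return None
    left, sep, right = match.group("link").partition(" - ")
    if not sep:
        return None
    # Split on the first "/" only, because Director ports also contain "/".
    start_dev, _, start_port = left.partition("/")
    end_dev, _, end_port = right.partition("/")
    return {
        "status": match.group("status"),
        "start": (start_dev, start_port),
        "end": (end_dev, end_port),
    }


if __name__ == "__main__":
    sample = ('Dec 9 12:32:31 <server_ip> ad158f423225[4585]: NDT: missing in UFM '
              '"SAT11-0101-0903-09IB0/27 - SAT11-0101-0905-01IB1-A/Blade 12_Port 1/9"')
    print(parse_ndt_event(sample))
```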
The minimum interval for periodic comparison is five minutes.
If an error occurs, the response includes an explanation.
For example, the request “POST /compare” without NDTs uploaded will return the following:
Response code: 400
Response:
{ "error": [ "No NDTs were uploaded for comparison" ] }
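A client can surface these explanations to the user. The following Python sketch pulls the messages out of an error response; it assumes that error bodies always carry an "error" key as in the example above, which this appendix shows for only one case, and the function name `ndt_errors` is illustrative.

```python
import json


def ndt_errors(status_code: int, body: str) -> list:
    """Return the error messages from a response, or an empty list on success."""
    if status_code < 400:
        return []
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Body is not JSON: return the raw text so nothing is lost.
        return [body]
    errors = payload.get("error", []) if isinstance(payload, dict) else []
    if isinstance(errors, str):
        errors = [errors]
    return list(errors)


if __name__ == "__main__":
    body = '{ "error": [ "No NDTs were uploaded for comparison" ] }'
    for message in ndt_errors(400, body):
        print(message)
```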
The configuration can be found in “ufm/conf/ndt.conf” and includes the following settings:
Log level (default: INFO)
Log size (default: 10240000)
Log file backup count (default: 5)
Reports number to save (default: 10)
NDT format check (default: enabled)
Switch to switch and host to switch patterns (default: see NDT format section)
For detailed information on how to export or import the configuration, refer to the NVIDIA UFM-SDN Appliance Command Reference Guide > UFM Commands > UFM Configuration Management.
Logs can be found in “ufm/logs/ndt.log”.
For detailed information on how to generate a debug dump, refer to the NVIDIA UFM-SDN Appliance Command Reference Guide > System Management > Configuration Management > File System.