The Cable Validation tool contains two sub-projects: Collector and Cable Agent.
The collector is the main module. It should be deployed and run on a host with management network access. An IB interface is not required on the host.
Deploy
Deploy the cables_bringup container on a host, as follows:
docker load -i /tmp/cables_bringup_<version>.tar.gz
docker run --name cables_bringup -itd --network=host cables_bringup
docker exec -it cables_bringup /bin/bash
Setting Docker Environment
Specifying the Network Interface
If the host has multiple network interfaces and the switches are connected through an interface other than the default management interface, specify that interface with the AGENTS_IFC_NAME environment variable. For example, if the interface name is eno3:
docker run --name cables_bringup -itd --network=host --env AGENTS_IFC_NAME=eno3 cables_bringup
Adding Hostnames
If the switches are not configured in the DNS server, use the --add-host option when running the container to add their hostnames. For example (assuming the switch name is switch-3245fa and its IP is 192.168.1.1):
docker run --name cables_bringup -itd --network=host --add-host=switch-3245fa:192.168.1.1 cables_bringup
Using Volumes
Volumes can be used for data persistence or for easier file transfer to the cables_bringup container. For data persistence, the volume must be mapped to /cable_bringup_root in the container. This volume can also be used for loading topology files. Example:
docker run --name cables_bringup -itd --network=host -v /opt/bringup_data:/cable_bringup_root cables_bringup
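The options above are shown one at a time. The following minimal Python sketch assembles a single docker run command from them, assuming that the three options can be combined in one invocation (the source only shows them separately). The function name build_run_command and its parameters are hypothetical helpers, not part of the tool.

```python
import shlex


def build_run_command(image="cables_bringup", name="cables_bringup",
                      ifc_name=None, extra_hosts=None, data_dir=None):
    """Return the docker run command as a list of arguments.

    ifc_name    -> value for the AGENTS_IFC_NAME environment variable
    extra_hosts -> dict of switch hostname -> IP, passed via --add-host
    data_dir    -> host directory mapped to /cable_bringup_root
    """
    cmd = ["docker", "run", "--name", name, "-itd", "--network=host"]
    if ifc_name:
        cmd += ["--env", f"AGENTS_IFC_NAME={ifc_name}"]
    for host, ip in (extra_hosts or {}).items():
        cmd.append(f"--add-host={host}:{ip}")
    if data_dir:
        cmd += ["-v", f"{data_dir}:/cable_bringup_root"]
    cmd.append(image)  # the image name must come last
    return cmd


if __name__ == "__main__":
    command = build_run_command(
        ifc_name="eno3",
        extra_hosts={"switch-3245fa": "192.168.1.1"},
        data_dir="/opt/bringup_data",
    )
    # Print a shell-ready command line for review before running it.
    print(shlex.join(command))
```

The sketch only prints the command; pass the list to subprocess.run if you want to execute it directly.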
Running bringup CLI
Run exec bringupcli in the container:
docker exec -it cables_bringup bringupcli
Alternatively, open a bash shell in the container and run bringupcli from anywhere within the container:
docker exec -it cables_bringup /bin/bash
bringupcli Usage
bringupcli accepts command-line arguments. See the usage below for more details:
root@r-ufm65:/# bringupcli -h
usage: bringupcli [-h] [-V] [-k]
Optional Arguments:
| Argument | Description |
|---|---|
| -h, --help | Show this help message and exit |
| -V, --version | Show program version number and exit |
| -k, --kill-other-sessions | Kill other CLI sessions if existent |
To initialize the tool, perform the following:
Load the fabric topology file:
load_topo <topo filename> (topo file extension)
load_ptp <topo filename> (Excel file extension)
load_ip <ip filename>
load <topo filename> <ip filename> (both topo and IPs)
Set the credentials for the switches. Use set_default_creds/set_node_creds to set the credentials.
Deploy the agent on all switches. Run:
deploy_all_agents
Run bringup GUI
Open the following URL in the browser: https://<bringup_machine_ip>/cables_validation
Enter default credentials in the login page.
User management is not supported in the current version. To change users manually, use the htpasswd Linux utility:
In the bringup container, locate the .htaccess file. It is located at ${BRINGUP_CONF_APACHE_PATH}/.htaccess
Use htpasswd to add, modify, or delete users.
The user may change the default self-signed certificate, located by default in the container at:
SSLCertificateFile ${BRINGUP_CONF_APACHE_PATH}/certs/cv-cert.crt
SSLCertificateKeyFile ${BRINGUP_CONF_APACHE_PATH}/private/cv-cert.key
Validations
show_switches: Show list of loaded switches as loaded from the topology file
check_switch_status: Check switch connectivity status (Ping/JSON-API/Agent)
start_validation: Push topology to switches and get validation reports
stop_validation: Unsubscribe from getting switches updates
Other commands
show_switch_history: Lists data files collected from switches over the past days (one week by default)
amber_show_latest: Shows latest collected amber data from switches
Troubleshooting
deploy_single_agent
deploy_all_agents
remove_all_agents
remove_single_agent
Complete CLI commands reference
load_topo - Loads topology file (topo file extension).
Usage: load_topo <filename> dns=true. Assumes that DNS is active and that the switches can be reached by hostname. The default is dns=true.
A topo file example:
MQM8700 sw-hdr-proton01
    CFG: main=4x
    P1 -4x-50G-> sw-hdr-proton02 P1
    P2 -4x-50G-> sw-hdr-proton02 P2
    P3 -4x-50G-> HCA_12 swx-proton03 mlx5_0/P1
    P4 -4x-50G-> HCA_12 swx-proton04 mlx5_2/P1
load_ptp - Loads PTP topology file (Excel file).
Usage: load_ptp <filename> sheets="sheet 1,my-sheet" dns=true. Assumes that DNS is active and that the switches can be reached by hostname. The default is dns=true.
If the sheets argument is provided, only the given sheets are loaded; otherwise, all sheets are loaded. An example of a sheet in the PTP file:
| Rack | U | Name | HCA/Port | Rack | U | Name | Port |
|---|---|---|---|---|---|---|---|
| 316 | 22 | c-csi-0329s | 1 | R113 | 22 | c-csi-mqm9700-0327 | 1 |
| 316 | 24 | c-csi-0331s | 1 | R113 | 22 | c-csi-mqm9700-0327 | 1 |
load_ip - Loads switch IP addresses. It can be used if DNS is inactive. Loads the IP/switch-name mapping, to allow reaching the switch via REST API to retrieve local topology, GUID, etc. The file format is pairs of IP address and hostname. This file is used in association with a topo file in case DNS is unavailable.
An IP file example:
# A comment
10.0.0.30 switch1
10.0.0.31 switch2
load - Loads both IP addresses and topo files. load inputs/my-topo loads inputs/my-topo.topo and inputs/my-topo.ip
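The IP file format described for load_ip (pairs of IP address and hostname, with # comment lines) can be checked before it is loaded. The following is a minimal Python sketch of such a check, assuming that the two fields are separated by whitespace and that a # also starts a comment in the middle of a line. The function parse_ip_file is a hypothetical helper and the sample addresses are illustrative; this is not the tool's own parser.

```python
import ipaddress


def parse_ip_file(text):
    """Return a dict of switch hostname -> IP address from IP-file text."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"expected '<ip> <hostname>', got: {raw!r}")
        ip, hostname = parts
        ipaddress.ip_address(ip)  # raises ValueError if the address is invalid
        mapping[hostname] = ip
    return mapping


if __name__ == "__main__":
    sample = "# A comment\n10.0.0.30 switch1\n10.0.0.31 switch2\n"
    for name, address in parse_ip_file(sample).items():
        print(name, address)
```

Running such a check first makes a malformed line fail early, with the offending line quoted in the error message.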
show_switches - Shows the list of loaded switches as loaded from the topology file.
Example output:
MQM8700 sw-hdr-proton01
-----------------------
MQM8700 sw-hdr-proton01 P3 --> swx-proton03 mlx5_0 P1
MQM8700 sw-hdr-proton01 P4 --> swx-proton04 mlx5_2 P1
MQM8700 ufm-sw-hdr01
--------------------
MQM8700 ufm-sw-hdr01 P1 --> ufm-sw-hdr02 P1
MQM8700 ufm-sw-hdr02
--------------------
MQM8700 ufm-sw-hdr02 P1 --> ufm-sw-hdr01 P1
set_default_creds - Sets the default switch/host credentials to override the built-in default credentials. These credentials are used for communication with any switch that does not have specific credentials.
set_default_creds user=<user> pwd=<pwd> [type=switch|host]
set_node_creds - Sets the credentials for a specific switch/host. It can be used when the switch credentials are different from the defaults.
set_node_creds <switch> user=<user> pwd=<pwd>
deploy_all_agents - Deploys agents on loaded switches that have no agents.
deploy_single_agent - Deploys agent on a specific switch.
remove_all_agents - Removes agents from loaded switches that have agents.
remove_single_agent - Removes an agent from a specific switch.
show_switch_history - Lists data files collected from switches over the past days. The past argument specifies the history interval, for example show_switch_history past=3d. By default, it is set to one week (past=1w).
amber_show_latest - Shows the latest collected amber data from switches
check_switch_status - Checks switch connectivity status (Ping/JSON-API/Agent).
Example output:
Host                           IP             ping  JSONAPI  Agent
-----------------------------  -------------  ----  -------  -----
sw-hdr-proton01.mtr.labs.mlnx  10.209.44.74   True  True     True
ufm-sw-hdr01.mtr.labs.mlnx     10.209.36.113  True  True     True
ufm-sw-hdr02.mtr.labs.mlnx     10.209.36.122  True  True     True
upgrade_switch_os - TBD
start_validation - Initiates the validation routine: pushes the topology to the switches and gets validation reports. The optional timeout argument sets the time after which validation stops (for example, timeout=20m or timeout=2h). If timeout is not provided, use the stop_validation command to stop validation. Usage: start_validation timeout=n (in seconds/minutes/hours/days).
stop_validation - Stops the validation routine and unsubscribes from switch updates.
version - Shows application version.
exit - Exits the application.
help - Shows a list of commands. For help on a specific command, run help <command>
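The timeout argument of start_validation and the past argument of show_switch_history take a number followed by a unit, as in 20m, 2h, 3d, and 1w. The Python sketch below converts such values to seconds, which can help when scripting around the CLI. The suffix s for seconds is an assumption (the source names the unit but shows no example), and parse_duration is a hypothetical helper, not the tool's own parser.

```python
import re

# Seconds per unit. "m", "h", "d" and "w" appear in the examples above;
# "s" for seconds is an assumption.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_duration(value):
    """Convert a value such as '20m' or '2h' to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhdw])", value.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


if __name__ == "__main__":
    for text in ("20m", "2h", "3d", "1w"):
        print(text, "=", parse_duration(text), "seconds")
```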
Bringup Server REST API
The collector has a web server listening on two internal ports, 8251 and 8252. These ports are not exposed outside the machine. The bringup server runs on the Apache server, which uses the default HTTP/HTTPS ports. Changing the internal ports is not recommended, as it requires changing the Apache service configuration. The Apache service uses a self-signed certificate, which users can replace with their own certificate. All REST APIs are available over HTTPS only. The supported REST APIs are listed below.
Login
To use a REST API, you need to have session credentials. If you want to use curl to access the REST API, you should log in first by going to the URL cablevalidation/login and saving the cookie. After that, you can use the saved cookie for subsequent requests.
# login and save cookie
curl -k -X POST -c cookies.txt -d "httpd_username=<user>" -d "httpd_password=<password>" https://127.0.0.1/cablevalidation/login
# use saved cookie for REST API requests
curl -k --cookie cookies.txt https://127.0.0.1/cablevalidation/report/validation
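The same login flow can be scripted without curl. The following Python sketch uses only the standard library: it posts the two form fields shown above, keeps the session cookie in memory, and reuses it for the report request. Certificate verification is disabled to mirror curl -k, which is reasonable only while the default self-signed certificate is in place. The function names are hypothetical, and the handling of the login response is an assumption.

```python
import http.cookiejar
import json
import ssl
import urllib.parse
import urllib.request


def build_opener():
    """Opener that keeps cookies and skips certificate checks (like curl -k)."""
    context = ssl.create_default_context()
    context.check_hostname = False       # must be disabled before verify_mode
    context.verify_mode = ssl.CERT_NONE
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(
        urllib.request.HTTPSHandler(context=context),
        urllib.request.HTTPCookieProcessor(jar),
    )


def login_request(base_url, user, password):
    """Build the POST request that the curl login command sends."""
    form = urllib.parse.urlencode(
        {"httpd_username": user, "httpd_password": password}
    ).encode("ascii")
    return urllib.request.Request(
        f"{base_url}/cablevalidation/login", data=form, method="POST"
    )


def login_and_fetch_report(base_url, user, password):
    """Log in, then fetch the validation report with the session cookie."""
    opener = build_opener()
    with opener.open(login_request(base_url, user, password)):
        pass  # the session cookie is now stored in the opener's cookie jar
    with opener.open(f"{base_url}/cablevalidation/report/validation") as resp:
        return json.load(resp)
```

For example, login_and_fetch_report("https://127.0.0.1", "<user>", "<password>") returns the parsed report as a Python dictionary.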
Retrieving Validation Report
Run:
GET https://<host-ip-or-name>/cablevalidation/report/validation
Validation Report Output Example
curl -k https://swx-proton01/cablevalidation/report/validation | python3 -m json.tool
{
    "report": "ValidationReport",
    "stats": {
        "in_progress": 3,
        "no_issues": 0,
        "not_started": 0
    },
    "issues": [
        {
            "timestamp": 1666176949.5110743,
            "node_desc": "MQM8700 sw-hdr-proton01",
            "issues": [
                [
                    "Wrong-neighbor",
                    "MQM8700 sw-hdr-proton01:P3",
                    "HCA_12 swx-proton03 mlx5_0:P1",
                    "None:PNA"
                ],
                [
                    "Wrong-neighbor",
                    "MQM8700 sw-hdr-proton01:P4",
                    "HCA_12 swx-proton04 mlx5_2:P1",
                    "HCA_12 swx-proton04 mlx5_0:P1"
                ]
            ]
        },
        {
            "timestamp": 1666176949.4999607,
            "node_desc": "MQM8700 ufm-sw-hdr02",
            "issues": [
                [
                    "Extra-cable",
                    "MQM8700 ufm-sw-hdr02:P2",
                    "NONE",
                    "MQM8700 ufm-sw-hdr01:P2"
                ],
                [
                    "Extra-cable",
                    "MQM8700 ufm-sw-hdr02:P3",
                    "NONE",
                    "MQM8700 ufm-sw-hdr01:P3"
                ],
                [
                    "Extra-cable",
                    "MQM8700 ufm-sw-hdr02:P7",
                    "NONE",
                    "MQM8700 ufm-sw-hdr01:P7"
                ]
            ]
        },
        {
            "timestamp": 1666176949.4870453,
            "node_desc": "MQM8700 ufm-sw-hdr01",
            "issues": [
                [
                    "Extra-cable",
                    "MQM8700 ufm-sw-hdr01:P2",
                    "NONE",
                    "MQM8700 ufm-sw-hdr02:P2"
                ],
                [
                    "Extra-cable",
                    "MQM8700 ufm-sw-hdr01:P3",
                    "NONE",
                    "MQM8700 ufm-sw-hdr02:P3"
                ],
                [
                    "Extra-cable",
                    "MQM8700 ufm-sw-hdr01:P7",
                    "NONE",
                    "MQM8700 ufm-sw-hdr02:P7"
                ]
            ]
        }
    ]
}
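A report of this shape can be condensed into a per-node count of issue types. The Python sketch below does so using only the fields visible in the example output: node_desc, and the first element of each issue entry, which holds the issue type. The meaning of the remaining elements is not documented here, so the sketch does not interpret them. summarize_report is a hypothetical helper and the sample is a shortened copy of the example.

```python
from collections import Counter


def summarize_report(report):
    """Return {node_desc: {issue_type: count}} for a validation report."""
    summary = {}
    for node in report.get("issues", []):
        # The first element of each issue entry is the issue type.
        counts = Counter(entry[0] for entry in node.get("issues", []))
        summary[node["node_desc"]] = dict(counts)
    return summary


# Shortened copy of the example report above.
SAMPLE_REPORT = {
    "report": "ValidationReport",
    "stats": {"in_progress": 3, "no_issues": 0, "not_started": 0},
    "issues": [
        {
            "timestamp": 1666176949.5110743,
            "node_desc": "MQM8700 sw-hdr-proton01",
            "issues": [
                ["Wrong-neighbor", "MQM8700 sw-hdr-proton01:P3",
                 "HCA_12 swx-proton03 mlx5_0:P1", "None:PNA"],
                ["Wrong-neighbor", "MQM8700 sw-hdr-proton01:P4",
                 "HCA_12 swx-proton04 mlx5_2:P1",
                 "HCA_12 swx-proton04 mlx5_0:P1"],
            ],
        },
        {
            "timestamp": 1666176949.4999607,
            "node_desc": "MQM8700 ufm-sw-hdr02",
            "issues": [
                ["Extra-cable", "MQM8700 ufm-sw-hdr02:P2", "NONE",
                 "MQM8700 ufm-sw-hdr01:P2"],
            ],
        },
    ],
}

if __name__ == "__main__":
    for node, counts in summarize_report(SAMPLE_REPORT).items():
        print(node, counts)
```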
Bringup commands support via REST API
Bringup commands can be run through the REST API as well as through the CLI.
Processing a Command
Run:
POST https://<host-ip-or-name>/cablevalidation/commands/{command_name} <command-data>
Process Command Example
The command body is a JSON dictionary of key-value arguments as described in the table below.
curl -k https://127.0.0.1/cablevalidation/commands/load_topo -d '{"files":["inputs/lab.topo"], "dns":true}' -X POST
Command load_topo completed successfully
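The curl call above can also be built in Python. The sketch below constructs the POST request for a command, with the arguments serialized as a JSON dictionary. Like curl -d, it sets no explicit Content-Type header; whether the server requires one is not stated in the source. command_request is a hypothetical helper, and sending the request still requires an authenticated session as described in the Login section.

```python
import json
import urllib.request


def command_request(base_url, command_name, arguments):
    """Build the POST request for a bringup command.

    arguments is a dict of key-value command arguments; it is sent as a
    JSON dictionary in the request body.
    """
    body = json.dumps(arguments).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/cablevalidation/commands/{command_name}",
        data=body,
        method="POST",
    )


if __name__ == "__main__":
    request = command_request(
        "https://127.0.0.1",
        "load_topo",
        {"files": ["inputs/lab.topo"], "dns": True},
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```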
Supported Commands
| Command | Async | Argument | Type | Mandatory |
|---|---|---|---|---|
| load_topo | False | | | |
| | | dns | bool | False |
| | | files | list | True |
| load_ip | False | | | |
| | | files | list | True |
| load_ptp | False | | | |
| | | dns | bool | False |
| | | sheets | list | False |
| | | files | str | True |
| set_default_creds | False | | | |
| | | user | str | True |
| | | pwd | str | True |
| | | type | str | False |
| set_node_creds | False | | | |
| | | user | str | True |
| | | pwd | str | True |
| | | type | str | True |
| deploy_all_agents | True | | | |
| deploy_single_agent | True | | | |
| | | switch | str | True |
| remove_all_agents | True | | | |
| remove_single_agent | True | | | |
| | | switch | str | True |
| start_validation | True | | | |
| stop_validation | True | | | |
Getting List of Supported Commands
The following command returns a JSON dictionary with all supported commands, their arguments, and whether each command is asynchronous or synchronous.
GET https://<host-ip-or-name>/cablevalidation/commands
Supported Commands Output Example
Output has been cut.
{
    "load_topo": {
        "args": {
            "dns": {
                "type": "bool",
                "mandatory": false
            },
            "files": {
                "type": "list",
                "mandatory": true
            }
        },
        "is_async": false
    }
}
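The commands dictionary can be used to check a command's arguments before posting them. The Python sketch below compares an argument dictionary with the spec shown in the example output, reporting missing mandatory arguments, unknown arguments, and type mismatches. It assumes that the type names bool, list, and str map to the corresponding JSON and Python types. check_arguments is a hypothetical helper, not part of the REST API.

```python
# Assumed mapping from the spec's type names to Python types.
_TYPES = {"bool": bool, "list": list, "str": str}

# Copy of the (truncated) example output above.
SAMPLE_SPEC = {
    "load_topo": {
        "args": {
            "dns": {"type": "bool", "mandatory": False},
            "files": {"type": "list", "mandatory": True},
        },
        "is_async": False,
    }
}


def check_arguments(spec, command, arguments):
    """Return a list of problems; an empty list means the arguments fit."""
    if command not in spec:
        return [f"unknown command: {command}"]
    arg_specs = spec[command].get("args", {})
    problems = []
    for name, info in arg_specs.items():
        if info.get("mandatory") and name not in arguments:
            problems.append(f"missing mandatory argument: {name}")
    for name, value in arguments.items():
        if name not in arg_specs:
            problems.append(f"unexpected argument: {name}")
            continue
        type_name = arg_specs[name]["type"]
        expected = _TYPES.get(type_name)
        if expected is not None and not isinstance(value, expected):
            problems.append(f"argument {name} should be {type_name}")
    return problems


if __name__ == "__main__":
    print(check_arguments(SAMPLE_SPEC, "load_topo",
                          {"files": ["inputs/lab.topo"], "dns": True}))
    print(check_arguments(SAMPLE_SPEC, "load_topo", {"dns": "yes"}))
```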
Rack View
Rack and unit information is available only when a PTP Excel file is loaded. Topo files do not contain this information, so rack view is not available for them.
Rack view is supported via two REST APIs.
Getting List of Racks
The following command returns a JSON list of all loaded racks.
GET https://<host-ip-or-name>/resources/racks
Racks List Output Example
[
    "1108",
    "1106"
]
Getting Rack View of a Specific Rack
The following command returns a JSON dictionary with rack details.
GET https://<host-ip-or-name>/resources/racks/{rack-name}
Rack View Output Example
{
    "name": "1108",
    "units": [
        {
            "nodedesc": "MSB7800 r-ufm-sw10",
            "ports": [
                {
                    "port": "P25",
                    "syndrome": "Wrong-neighbor"
                },
                {
                    "port": "P26",
                    "syndrome": "Wrong-neighbor"
                },
                {
                    "port": "P27",
                    "syndrome": "Active"
                },
                {
                    "port": "P28",
                    "syndrome": "Active"
                }
            ],
            "unit": "40"
        }
    ]
}
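A rack view of this shape can be filtered down to the ports that need attention. The Python sketch below lists every port whose syndrome is not Active, assuming that Active marks a healthy port (the source shows the value but does not define it). ports_with_issues is a hypothetical helper and the sample is a copy of the example output.

```python
def ports_with_issues(rack):
    """Return (unit, nodedesc, port, syndrome) for each non-Active port."""
    rows = []
    for unit in rack.get("units", []):
        for port in unit.get("ports", []):
            if port["syndrome"] != "Active":
                rows.append((unit["unit"], unit["nodedesc"],
                             port["port"], port["syndrome"]))
    return rows


# Copy of the example rack view above.
SAMPLE_RACK = {
    "name": "1108",
    "units": [
        {
            "nodedesc": "MSB7800 r-ufm-sw10",
            "ports": [
                {"port": "P25", "syndrome": "Wrong-neighbor"},
                {"port": "P26", "syndrome": "Wrong-neighbor"},
                {"port": "P27", "syndrome": "Active"},
                {"port": "P28", "syndrome": "Active"},
            ],
            "unit": "40",
        }
    ],
}

if __name__ == "__main__":
    for row in ports_with_issues(SAMPLE_RACK):
        print(*row)
```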
Build Collector
Note: This section is for development only.
Run build/build_collector. A new Docker image is created:
image /tmp/cablesbringup_<version>.tar.gz was created
Run bringup GUI from source
If you are running the bringup from the Docker container, open the following URL in the browser: https://<bringup_machine_ip>/cables_validation
If you are running the bringup from the source code, first run build/build_gui.sh to compile the GUI code. The build script creates the media directory under cables_validation/src/collector.
TBD: How to run GUI, without Apache
Build Agent
This section is for development only.
Running the build/build_agent script generates a new Docker image and saves it as /tmp/cables_agent_<version>.tar.gz. This file can be used to distribute the Docker image to other machines or to archive it.
To check whether the cable agent is running on the switch, run:
ssh admin@<switch-ip-or-name>
enable
show docker images
exit
If the cable agent is running on the switch, the following output is displayed:
----------------------------------------------------------------------------
Image              Version        Created            Size
----------------------------------------------------------------------------
cables_agent       latest         13 hours ago       788MB
Deploy on the Switch
Manual deployment of the agent on the switch is usually unnecessary; the recommended method is the deploy_all_agents or deploy_single_agent command from the bringup CLI. If manual deployment is required, run the following commands:
enable
configure terminal
no docker shutdown
image fetch scp://<user>:<pwd>@<hostname>/tmp/cables_agent_<version>.tar.gz cables_agent_latest.tar.gz
docker load cables_agent_latest.tar.gz
docker start cables_agent latest cables_agent now-and-init privileged network
For cleanup, run:
docker no start cables_agent
docker remove image cables_agent latest
image delete cables_agent_latest.tar.gz
To open a terminal in the container running on the switch, run:
enable
configure terminal
docker exec cables_agent /bin/bash
Cables Agent REST API
The agent has a web server listening on port 8251. The following two REST APIs are supported:
https://<switch-ip-or-name>:8251/resources/links
https://<switch-ip-or-name>:8251/resources/ports
Output Example of Links:
curl -k https://sw-hdr-proton01:8251/resources/links | python3 -m json.tool
[
    {
        "info": {
            "md5": "256477d766fa8d8853848c43c35982ba",
            "timestamp": 1659355401394591,
            "time": "2022-08-01 12:03:21.394601"
        },
        "src": {
            "Node Description": "MF0;sw-hdr-proton01:MQM8700/U1",
            "Guid": "0x0c42a1030079a6ec",
            "ip": "10.209.44.74",
            "Node Name": "sw-hdr-proton01"
        },
        "dests": {
            "4": {
                "Node Description": "swx-proton04 mlx5_2",
                "Guid": "0xb8cef6030083bea2",
                "LocalPort": "1"
            },
            "2": {
                "Node Description": "Quantum Mellanox Technologies",
                "Guid": "0xb8cef60300fbf210",
                "LocalPort": "2"
            },
            "3": {
                "Node Description": "swx-proton03 mlx5_0",
                "Guid": "0xb8cef6030083bf02",
                "LocalPort": "1"
            },
            "1": {
                "Node Description": "Quantum Mellanox Technologies",
                "Guid": "0xb8cef60300fbf210",
                "LocalPort": "1"
            }
        }
    }
]
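The links output keys each neighbor by a port number stored as a string, so the entries do not arrive in numeric order. The Python sketch below flattens the response into rows sorted by port number. It assumes that the keys of dests are ports on the source switch and that LocalPort is the port on the neighbor; the source does not define these fields. neighbor_table is a hypothetical helper and the sample is a shortened copy of the example.

```python
def neighbor_table(links):
    """Return (source name, source port, neighbor description, neighbor port)
    rows, sorted by source port number within each source switch."""
    rows = []
    for entry in links:
        src_name = entry["src"]["Node Name"]
        # Port keys are strings, so convert them to int for numeric sorting.
        for port, dest in sorted(entry["dests"].items(),
                                 key=lambda item: int(item[0])):
            rows.append((src_name, int(port),
                         dest["Node Description"], dest["LocalPort"]))
    return rows


# Shortened copy of the example links output above.
SAMPLE_LINKS = [
    {
        "src": {"Node Name": "sw-hdr-proton01", "ip": "10.209.44.74"},
        "dests": {
            "4": {"Node Description": "swx-proton04 mlx5_2",
                  "Guid": "0xb8cef6030083bea2", "LocalPort": "1"},
            "1": {"Node Description": "Quantum Mellanox Technologies",
                  "Guid": "0xb8cef60300fbf210", "LocalPort": "1"},
        },
    }
]

if __name__ == "__main__":
    for row in neighbor_table(SAMPLE_LINKS):
        print(*row)
```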
Output Example of Ports
curl -k https://sw-hdr-proton01:8251/resources/ports | python3 -m json.tool
[
    {
        "port": "IB1/10",
        "port_num": "10",
        "logical": "Down",
        "physical": "Polling"
    },
    {
        "port": "IB1/11",
        "port_num": "11",
        "logical": "Down",
        "physical": "Polling"
    },
    {
        "port": "IB1/12",
        "port_num": "12",
        "logical": "Down",
        "physical": "Polling"
    },
    {
        "port": "IB1/13",
        "port_num": "13",
        "logical": "Down",
        "physical": "Polling"
    }
]
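On a switch with many ports, a count per state is often easier to read than the full list. The Python sketch below groups the ports output by its logical and physical fields, using only the field names visible in the example. count_port_states is a hypothetical helper and the sample is a shortened copy of the example output.

```python
from collections import Counter


def count_port_states(ports):
    """Count ports per (logical, physical) state pair."""
    return Counter((port["logical"], port["physical"]) for port in ports)


# Shortened copy of the example ports output above.
SAMPLE_PORTS = [
    {"port": "IB1/10", "port_num": "10",
     "logical": "Down", "physical": "Polling"},
    {"port": "IB1/11", "port_num": "11",
     "logical": "Down", "physical": "Polling"},
]

if __name__ == "__main__":
    for (logical, physical), count in count_port_states(SAMPLE_PORTS).items():
        print(f"{logical}/{physical}: {count}")
```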