Collector
The collector is the main module that should be deployed and run on a host with management network access. It is important to note that an IB interface is not required on the host.
The Cable Validation tool can be deployed in two methods: as a standalone or as a UFM Enterprise plugin.
Deploying the Module as Standalone
Deploy the cables_bringup container on a host, as follows:
docker load -i /tmp/cables_bringup_<version>.tar.gz
docker run --name cables_bringup -itd --network=host cables_bringup
docker exec -it cables_bringup /bin/bash
Setting Docker Environment
Specifying the Network Interface
If the host system is equipped with multiple network interfaces and the switches are connected to the host through an interface that differs from the default management interface, the user can designate this particular interface by utilizing a specific environment variable, namely AGENTS_IFC_NAME. To illustrate, assuming the hypothetical interface name is eno3:
docker run --name cables_bringup -itd --network=host --env AGENTS_IFC_NAME=eno3 cables_bringup
Adding Hostnames
If the switches are not configured in the DNS server, you may add hostnames; the user may use the --add-host option when running the container. For example (assuming the switch name is switch-3245fa and its IP is 192.168.1.1):
docker run --name cables_bringup -itd --network=host --add-host=switch
-3245fa:192.168
.1.1
cables_bringup
Using Volumes
Volumes can be used for data persistence or easier file transfer to the cables_bringup container. The volume must be mapped to /cable_bringup_root in the container for data persistence. This volume can also be used for loading topology files. Example:
docker run --name cables_bringup -itd --network=host -v /opt/bringup_data:/cable_bringup_root cables_bringup
Overriding Apache Configuration
In the event that a host machine is running another Apache instance and utilizing the default ports 80/443, an alternative port may be designated for the bringup server by the user, these ports should be available and free. To accomplish this, the APACHE_HTTPS_PORT and APACHE_HTTP_PORT environment variables can be employed. Consider the following example:
docker run --name cables_bringup -itd --network=host --env APACHE_HTTP_PORT=9080
--env APACHE_HTTPS_PORT=9443
cables_bringup
Deploying the Module as a UFM Enterprise Plugin
Please note that Running Cable Validation as plugin is not supported on UFM Gen2.0.
Deploy the module as a UFM Enterprise plugin as follows:
docker load -i /tmp/cables_bringup_<version>.tar.gz
./manage_ufm_plugins.sh add -p cablevalidation
docker exec -it ufm-plugin-cablevalidation bash
Copy Files to the Plugin
Users have two methods for copying files to the Cable Validation plugin:
Copy the files to the plugin's data volume located at /opt/ufm/ufm_plugins_data/cablevalidation, which is mapped to /data/ inside the plugin container.
Use the 'docker cp' command to transfer the required files directly to the container.
Overriding the Apache Configuration
When using Cable Validation as a plugin, the default ports 80/443 are already in use by UFM Enterprise. Therefore, port 8280 will be used for HTTP, and 8633 for HTTPS by default. Users can opt to use different ports for the bring-up server, provided that these ports are available and free.
The plugin config.cfg file can be modified to update APACHE_HTTPS_PORT and APACHE_HTTP_PORT variables for that purpose. To make this adjustment, follow these steps:
Execute ./manage_ufm_plugins.sh add -p cablevalidation to add the Cable Validation plugin.
Stop the plugin using ./manage_ufm_plugins.sh stop -p cablevalidation
Use vim /opt/ufm/files/conf/plugins/cablevalidation/config.cfg to modify the 'APACHE_HTTPS_PORT' and 'APACHE_HTTP_PORT' variables.
Update and save the file.
Start the plugin again with ./manage_ufm_plugins.sh start -p cablevalidation.
With these changes, the new configuration will take effect, and Apache will run with the updated ports."
Run exec bringupcli in the container:
docker exec -it cables_bringup bringupcli
Alternatively, it is possible to run exec bash in the container and run bringcli from anywhere within the container:
docker exec -it cables_bringup
bringupcli Usage
bringupcli may have command line arguments, see usage below for more details:
root@r-ufm65:/# bringupcli -h
usage: bringupcli [-h] [-V] [-k]
Optional Arguments:
Argument |
Description |
-h, --help |
Show this help message and exit |
-V, --version |
Show program version number and exit |
-k, --kill-other-sessions |
Kill other CLI sessions if existent |
To initialize the tool, perform the following:
Load the fabric topology file:
load_topo <topo filename> topo file extension [cluster=<cluster name>] load_ptp <topo filename > excel file extension [cluster=<cluster name>] load_ip <ip filename> [cluster=<cluster name>] load <topo filename> <ip filename>(both topo and ips) [cluster=<cluster name>] load_clusters <clusters file>
Set the credentials for the switches. Use set_default_creds/set_switch_creds to set the credentials.
The argument `[save=true|false] default: true` can be used with both commands to indicate whether to save the credentials to a file or not.Deploy the agent on all switches. Run:
deploy_all_agents
Running bringup GUI
Open the following URL in the browser: https://<bringup_machine_ip>/cables_validation
Enter default credentials in the login page.
User management is not supported in the current version. To change it manually, use the htpasswd Linux utility.
In the bringup container, locate the .htaccees file
It is located at ${BRINGUP_CONF_APACHE_PATH}/.htaccess
Use htpasswd to add, modify or delete users.
user may change the default self signed certificate located by default in the container at:
SSLCertificateFile ${BRINGUP_CONF_APACHE_PATH}/certs/cv-cert.crt SSLCertificateKeyFile ${BRINGUP_CONF_APACHE_PATH}/
private
/cv-cert.key
To update a certificate, run the following command:
add_certificate <crt file> <key_file>: update the ssl certificate.
show_clusters: Show list of loaded clusters as loaded from the clusters file.
show_switches: Show list of loaded switches as loaded from the topology file
check_switch_status [cluster=<cluster>]: Check switch connectivity status (Ping/JSON-API/Agent )
start_validation [cluster=<cluster>]: Push topology to switches and get validation reports
stop_validation: Unsubscribe from getting switches updates
show_switch_history: Lists data files collected from switches in the last days
amber_show_latest: Shows latest collected amber data from switches
deploy_single_agent
deploy_all_agents
remove_all_agents
remove_single_agent
load_topo - Loads topology file (topo file extension).
load_topo <filename> dns=true [cluster=<cluster name >]–> assumes that DNS is active and you can access the switches by hostnames by default dns=true.
A topo file example:MQM8700 sw-hdr-proton01 CFG: main=4x P1 -4x-50G-> sw-hdr-proton02 P1 P2 -4x-50G-> sw-hdr-proton02 P2 P3 -4x-50G-> HCA_12 swx-proton03 mlx5_0/P1 P4 -4x-50G-> HCA_12 swx-proton04 mlx5_2/P1
load_ptp - Loads PTP topology file (Excel file).
load_ptp <filename> sheets="sheet 1,my-sheet" dns=true [cluster=<cluster name >]–> assumes that DNS is active and that you can access the switches by hostnames by the default setting of dns=true.
If sheets argument is provided, only given sheets are loaded, otherwise, all sheets will be loaded. An example of sheet in the ptp file:rack
U
Name
HCA/Port
Rack
U
Name
Port
316
22
c-csi-0329s
1
R113
22
c-csi-mqm9700-0327
1
316
24
c-csi-0331s
1
R113
22
c-csi-mqm9700-0327
1
Please be aware that the designated port can be indicated either as a singular numerical value or as a combination of two numbers separated by a forward slash in the case of a split port. Concerning the port numbers for Host Channel Adapters (HCA), the following mapping convention should be applied: 1 represents mlx5_0 P1, 2 represents mlx5_1 P1, and so on.
Moreover, it is mandatory for the Precision Time Protocol (PTP) file to incorporate a "Legend" sheet, which contains vital details regarding switch and host patterns. The below is an example:
Name
Model
Switch/HCA
Speed
c-csi-mqm*
MQM9700
switch
4x-100G
c-csi-0*
HCA_2
hca
4x-100G
Load_ip [cluster=<cluster name >]-Loads switch ip addresses, can be used if DNS is inactive. Loads the IP/switch-name mapping, to allow reaching the switch via REST API to retrieve local topology, GUID, etc. The file format is pairs of IP addresses and hostname. This file will be used in association with a ‘topo’ file in case DNS is unavailable.
An IP file example:
# A comment
10.0
.30
switch110.0
.0.31
switch2load [cluster=<cluster name >]- Loads both IP addresses and topo files. load inputs/my-topo loads inputs/my-topo.topo and inputs/my-topo.ip.
load_clusters <clusters file> - Clusters file should have the following format, where topo file should be in xlsx format and the ip file is optional, if it is not provided, dns will be considered as true when loading topo.
# cluster_name, topo_file, [ip_file] CLUSTER1, cluster1_topo.xlsx, cluster1.ip CLUSTER2, cluster2_topo.xlsx,
show_switches [cluster=<cluster name >]- Shows the list of loaded switches as loaded from the topology file. If the cluster name is provided, show the switch in the given cluster only.
Example output:MQM8700 sw-hdr-proton01 ----------------------- MQM8700 sw-hdr-proton01 P3 --> swx-proton03 mlx5_0 P1 MQM8700 sw-hdr-proton01 P4 --> swx-proton04 mlx5_2 P1 MQM8700 ufm-sw-hdr01 -------------------- MQM8700 ufm-sw-hdr01 P1 --> ufm-sw-hdr02 P1 MQM8700 ufm-sw-hdr02 -------------------- MQM8700 ufm-sw-hdr02 P1 --> ufm-sw-hdr01 P1
set_default_creds - Sets the default switch/host credentials to override the built-in default credentials. These credentials are used for communication with any switch that does not have specific credentials.
set_default_creds user=<user> pwd=<pwd> [type=
switch
|host] [save=true
|false
]set_node_creds - Sets the credentials for a specific switch/host, it can be used when the switch credentials are different than the defaults.
set_node_creds <
switch
> user=<user> pwd=<pwd> [save=true
|false
]deploy_all_agents - Deploys agents on loaded switches that have no agents.
deploy_single_agent - Deploys agent on a specific switch.
remove_all_agents - Removes agents from loaded switches that have agents.
remove_single_agent - Removes an agent from a specific switch.
show_switch_history - Lists data files collected from switches in the last days show_switch_history past=3d. Past argument can be used to specify the history interval, by default it is set to one week past=1w.
amber_show_latest - Shows the latest collected amber data from switches
check_switch_status [cluster=<cluster name>] - Checks switch connectivity status (Ping/JSON-API/Agent). If the cluster is provided, the check will be done for the switches in the provided cluster only.
Example output:Host IP ping JSONAPI Agent ----------------------------- ------------- ---- ---- ----- sw-hdr-proton01.mtr.labs.mlnx
209.44
.74
True True True ufm-sw-hdr01.mtr.labs.mlnx10.209
.36.113
True True True ufm-sw-hdr02.mtr.labs.mlnx10.209
.36.122
True True Trueadd_certificate <crt file> <key_file> - Updates the SSL certificate file used by Apache for secure connections. The provided file should be a valid SSL certificate file in crt format. The old certificate file will be backed up before replacing it with the new one.
start_validation - Initiates validation routine: pushes topology to switches and gets validation reports timeout (an optional argument), in which validation stops. (For example timeout=20m or timeout=2h). If timeout is not provided, use the stop_validation command to stop it. start_validation timeout=n (in seconds/minutes/hours/days).
stop_validation - Stops validation routine. Unsubscribe from getting switches updates.
version - Shows application version.
exit - Exits the application.
help - Shows a list of commands. For help on a specific command, run help <command>
The collector has a web server listening on two internal ports 8251 and 8252. These ports are not advertised outside the machine. The bringup server is running on the Apache server which uses the default http/https ports. It is not recommended to change the internal ports, as this requires changing the Apache service configuration. The Apache service uses a self signed certificate, that the user can change to his own certificate. All REST APIs can run only with https.
Please note that for all the following REST API URLs, the <host> attribute is the host IP or the hostname with the correct port number in case it is not the default one. For example:
https://10.20.30.40/ # the default port: 433
The following are the supported REST APIs:
Login
To use a REST API, you need to have session credentials. If you want to use curl to access the REST API, you should log in first by going to the URL cablevalidation/login and saving the cookie. After that, you can use the saved cookie for subsequent requests.
# login and save cookie
curl -k -X POST -c cookies.txt -d "httpd_username=<user>"
-d "httpd_password=<password>"
https://<host>/cablevalidation/login
# use saved cookie for
REST API requests
curl -k --cookie cookies.txt https://127.0.0.1/cablevalidation/report/validation
Retrieving Validation Report
Run:
GET https://<host>/cablevalidation/report/validation
Validation Report Output Example
curl -k https://swx-proton01/cablevalidation/report/validation | python3 -m json.tool
{
"report"
: "ValidationReport"
,
"stats"
: {
"in_progress"
: 3
,
"no_issues"
: 0
,
"not_started"
: 0
},
"issues"
: [
{
"timestamp"
: 1666176949.5110743
,
"node_desc"
: "MQM8700 sw-hdr-proton01"
,
"issues"
: [
[
"Wrong-neighbor"
,
"MQM8700 sw-hdr-proton01:P3"
,
"HCA_12 swx-proton03 mlx5_0:P1"
,
"None:PNA"
],
[
"Wrong-neighbor"
,
"MQM8700 sw-hdr-proton01:P4"
,
"HCA_12 swx-proton04 mlx5_2:P1"
,
"HCA_12 swx-proton04 mlx5_0:P1"
]
]
},
{
"timestamp"
: 1666176949.4999607
,
"node_desc"
: "MQM8700 ufm-sw-hdr02"
,
"issues"
: [
[
"Extra-cable"
,
"MQM8700 ufm-sw-hdr02:P2"
,
"NONE"
,
"MQM8700 ufm-sw-hdr01:P2"
],
[
"Extra-cable"
,
"MQM8700 ufm-sw-hdr02:P3"
,
"NONE"
,
"MQM8700 ufm-sw-hdr01:P3"
],
[
"Extra-cable"
,
"MQM8700 ufm-sw-hdr02:P7"
,
"NONE"
,
"MQM8700 ufm-sw-hdr01:P7"
]
]
},
{
"timestamp"
: 1666176949.4870453
,
"node_desc"
: "MQM8700 ufm-sw-hdr01"
,
"issues"
: [
[
"Extra-cable"
,
"MQM8700 ufm-sw-hdr01:P2"
,
"NONE"
,
"MQM8700 ufm-sw-hdr02:P2"
],
[
"Extra-cable"
,
"MQM8700 ufm-sw-hdr01:P3"
,
"NONE"
,
"MQM8700 ufm-sw-hdr02:P3"
],
[
"Extra-cable"
,
"MQM8700 ufm-sw-hdr01:P7"
,
"NONE"
,
"MQM8700 ufm-sw-hdr02:P7"
]
]
}
]
}
Bringup Commands Support via REST API
The processing of bringup commands is not limited to the CLI; it can also be accomplished through the REST API.
Processing a Command
Run:
POST https://<host>/cablevalidation/commands/{command_name} <command-data>
Supported Commands
Command |
Async |
Argument |
Type |
Mandatory |
load_topo |
False |
|||
dns |
bool |
False |
||
files |
list |
True |
||
cluster |
str |
False |
||
load_ip |
False |
|||
files |
list |
True |
||
cluster |
str |
False |
||
load_ptp |
False |
|||
dns |
bool |
False |
||
sheets |
list |
False |
||
files |
str |
True |
||
cluster |
str |
False |
||
load |
False |
|||
file_prefix |
str |
True |
||
cluster |
str |
False |
||
Load_clusters |
False |
|||
file |
str |
True |
||
set_default_creds |
False |
|||
user |
str |
True |
||
pwd |
str |
True |
||
type |
str |
False |
||
save |
bool |
False |
||
set_node_creds |
False |
|||
user |
str |
True |
||
pwd |
str |
True |
||
type |
str |
True |
||
save |
bool |
False |
||
deploy_all_agents |
True |
|||
deploy_single_agent |
True |
|||
switch |
str |
True |
||
remove_all_agents |
True |
|||
remove_single_agent |
True |
|||
switch |
str |
True |
||
start_validation |
True |
|||
cluster |
str |
False |
||
stop_validation |
True |
|||
add_certificate |
False |
|||
crt_file |
str |
True |
||
key_file |
str |
True |
||
check_switch_status |
True |
|||
show_switches |
False |
|||
name_pattern |
str |
False |
||
show_switch_history |
False |
|||
switches |
str |
False |
||
past |
str |
False |
||
amber_show_latest |
False |
|||
filter |
str |
False |
||
exit |
False |
Process Command Example
The command body is a JSON dictionary of key-value arguments as described in the table below.
curl -k https://127.0.0.1/cablevalidation/commands/load_topo -d '{"files":["inputs/lab.topo"], "dns":true}' -X POST
Command load_topo completed successfully
Supported Commands
Getting Command Output
GET https://<host>/cablevalidation/commands/{command_name}/output
timestamp is an optional argument that enables the user to obtain only the output generated after a particular point in time. It is included in the following format: GET https://<host>/cablevalidation/commands/{command_name}/output?timestamp=<val>.
The response to this request takes the form of a JSON dictionary, containing the following details:
command: the processed command.
request_ts: timestamp of the request made by the user, if a timestamp was provided; otherwise, it is set to 0.
last_ts: the timestamp of the most recent message in the output, which the user can utilize for subsequent requests.
status: represents the current status of the command, which can be either "Completed" or "InProgress".
content: the actual output log of the command.
Command Output Example
curl -k https://localhost/cablevalidation/commands/deploy_all_agents/output 2> /dev/null | python -m json.tool
{
"command"
: "deploy_all_agents"
,
"content"
: [
"Will install agent on 10.209.44.74"
,
"Will install agent on 10.209.36.113"
,
"Will install agent on 10.209.36.122"
,
Getting Commands Processing Status
GET https://<host>/cablevalidation/commands/status
The response to the request provides a JSON dictionary that conveys pertinent information regarding the processing status of commands, which may fall into one of two categories:
Idle - In this scenario, the user is at liberty to initiate a new command.
Executing <command> - In this instance, the processor is currently engaged in executing a command, and as such, is incapable of processing any new commands until the current operation is complete.
Getting a List of Supported Commands
The following command returns a JSON dictionary with all supported commands as well as their arguments and if it async or sync.
GET https://<host>/cablevalidation/commands
Supported Commands Output Example
Output has been cut.
{
"load_topo"
: {
"args"
: {
"dns"
: {
"type"
: "bool"
,
"mandatory"
: false
},
"files"
: {
"type"
: "list"
,
"mandatory"
: true
}
},
"is_async"
: false
}
}
Getting Help on Command
GET https://<host>/cablevalidation/commands/{command_name}/help
The response to the request is in the form of a JSON dictionary, which provides the following details:
command: The name of the command that was executed.
help: A list of output lines that convey relevant information about the command.
Command Help Example
curl -k https://localhost/cablevalidation/commands/load_topo/help 2> /dev/null | python -m json.tool
{
"command"
: "load_topo"
,
"help"
: [
""
,
" load_topo filename dns=true/false"
,
""
,
" default dns=true"
,
" If no dns server to resolve hostnames in topo file, you should set dns=false and provide IP addresses file."
,
" when true, no need to provide IP addresses."
,
""
Rack View
Rack and unit information can be shown when loading a PTP Excel file, however, topo files do not contain such information, therefore, rack view is not available.
Rack view is supported via two REST APIs.
Getting List of Racks
The following command returns a JSON list of all loaded racks.
GET https://<host>/resources/racks
Racks List Output Example
[
"1108"
,
"1106"
]
Getting Rack View of a Specific Rack
The following command returns a JSON dictionary with rack details.
GET https://<host>/resources/racks/{rack-name}
Rack View Output Example
{
"name"
: "1108"
,
"units"
: [
{
"nodedesc"
: "MSB7800 r-ufm-sw10"
,
"ports"
: [
{
"port"
: "P25"
,
"syndrome"
: "Wrong-neighbor"
},
{
"port"
: "P26"
,
"syndrome"
: "Wrong-neighbor"
},
{
"port"
: "P27"
,
"syndrome"
: "Active"
},
{
"port"
: "P28"
,
"syndrome"
: "Active"
}
],
"unit"
: "40"
}
]
}