Deployment
The Telemetry Agent is packaged in a Docker image that should be loaded and deployed on a supporting Mellanox Spectrum® Ethernet Switch. This section describes how to deploy the Docker image on the switch.
The NEO application features automated deployment of the Telemetry Agent on Mellanox Spectrum switch systems. For more information, please refer to the NEO Telemetry Agent Appendix in the NEO User Manual.
Before deploying the Telemetry Agent on the switch, make sure that Docker is enabled on the switch. For example, on Mellanox Onyx, you can check whether Docker is enabled using the "show docker" command and, if needed, enable it using the "docker no shutdown" command.
To deploy the Docker image, perform the following steps:
Download the NEO Telemetry Agent from the Mellanox customer portal and copy it to a remote server.
Connect to the Mellanox switch via SSH.
Enter the switch CLI mode:
switch > enable
switch # configure terminal
Copy the Docker image from the remote server, for example:
switch (config) # image fetch scp://admin:qwerty@10.20.30.100/docker_files/docker_images/telemetry-agent_<version>.img.gz
Make sure that the Docker service is running.
switch (config) # no docker shutdown
Load the image, using the docker load <image_name> command:
switch (config) # docker load telemetry-agent_<version>.img.gz
Once the image is loaded on the switch, deploy it using the following command:
switch (config) # docker start mellanox/telemetry-agent <version> <container name> now-and-init cpus 0.5 memory 300 privileged network sdk
Run the configuration write command:
switch (config) # configuration write
The telemetry agent must create trust with the switch in order to allow telemetry on LAGs and MLAGs. Run:
switch (config) # docker exec <container-name> "/opt/telemetry/utils/create_trust.sh"
Copy the key generated and printed on your screen:
switch (config) # docker exec <container-name> /opt/telemetry/utils/create_trust.sh
Running exec_name: [/opt/telemetry/utils/create_trust.sh]
Generating public/private rsa key pair.
Created directory '/root/.ssh'.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
root@switch
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
|                 |
+-----------------+
ssh-rsa Some1Random2Generated3Key4With5Random6Chars7 root@switch
And run the following command:
switch (config) # ssh client user admin authorized-key sshv2 "ssh-rsa Some1Random2Generated3Key4With5Random6Chars7 root@switch"
The Telemetry Agent waits for the Mellanox SDK to be installed. Install it by running the following from the switch prompt:
switch (config) # docker
switch (config) # copy-sdk <container-name> to /
Once the Mellanox SDK is installed, the Telemetry Agent service should start automatically in the container. To verify that the Telemetry Agent is running, do the following:
Make sure that the container has been loaded and started: look for the name of your newly created container in the output of the "show docker ps" command. If the container name is listed, run:
switch (config) # docker exec <container-name> "/bin/bash"
This opens a standard Linux prompt inside the container. Run:
/etc/init.d/telemetryd status
If the service is running, the output should look like the following:
# /etc/init.d/telemetryd status
Telemetry agent status: Telemetry agent is running
To leave the container and return to the switch CLI context, run the "exit" command.
Run initial telemetry configuration:
switch (config) # docker exec <container-name> "bash /opt/telemetry/utils/telemetry_agent_init.sh 127.0.0.1 7654"
Save the configuration.
switch (config) # configuration write
For initial settings and configuration instructions, see Initial Settings and Configuration.
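The trust step in the procedure above involves copying the public key printed by create_trust.sh into the "ssh client user admin authorized-key sshv2" command. The following is a minimal Python sketch of that copy step, assuming the script output has already been captured as text. The function names and the placeholder key are hypothetical and are not part of the Telemetry Agent or of Onyx.

```python
import re

# An OpenSSH RSA public key line has the form "ssh-rsa <key> [comment]".
PUBLIC_KEY_RE = re.compile(r"^ssh-rsa \S+(?: \S+)?$")


def extract_public_key(output):
    """Return the first line of the captured output that looks like an RSA public key."""
    for line in output.splitlines():
        candidate = line.strip()
        if PUBLIC_KEY_RE.match(candidate):
            return candidate
    raise ValueError("no ssh-rsa public key found in the output")


def authorized_key_command(public_key, user="admin"):
    """Build the switch CLI command that authorizes the key for the given user."""
    return 'ssh client user {} authorized-key sshv2 "{}"'.format(user, public_key)


# Placeholder output for illustration only; a real key is much longer.
sample_output = """Your public key has been saved in /root/.ssh/id_rsa.pub.
The key's randomart image is:
ssh-rsa PlaceholderKeyNotReal root@switch
"""

print(authorized_key_command(extract_public_key(sample_output)))
```

The sketch only builds the command string; it still has to be entered at the switch (config) prompt.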
The Telemetry Agent is running and waiting for a valid configuration in the config file. To set the initial configuration, access the Telemetry Agent Docker container using the following command:
docker exec neo-agent /bin/bash
The path to the config file is /opt/telemetry/conf/tm.ini.
The default structure of the tm.ini file is as follows:
[Controller]
controller_ip=127.0.0.1
controller_port=7654
enable_telemetry=false
min_polling_interval_in_ms=100
error_ack_check_interval=60
system_error_ack_timeout=60
session_error_ack_timeout=30
update_active_ports_interval=300
calc_rates=false
# max wjh packets buffer - min value: 1, max value: max_buffer_packets*max_messages_per_interval<5000
max_packets_buffer=1250
max_messages_per_interval=4
[Logging]
log_level=INFO
[OS]
switch_os=Onyx
sub_type=Ethernet
sample_down_ports=false
enable_lag_mlag_discovery=true
[Collector]
# clean json message and remove empty fields
clean_json=true
# counter chunk to limit interface counters per message
counters_chunk=false
# size of interfaces to send per message
counters_chunk_size=64
# connection timeout in seconds - min value: 1
connection_timeout=3
# max collector messages in queue - min value: 2, max value: 20
max_collector_messages=10
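The limits stated in the comments of the file can be checked before an edited tm.ini is put into use. The following is a minimal Python sketch using the standard configparser module; it is not part of the Telemetry Agent, and the function name is hypothetical. One assumption is worth noting: the comment says the product of max_packets_buffer and max_messages_per_interval must be below 5000, yet the defaults give exactly 1250 * 4 = 5000, so the sketch treats 5000 as allowed.

```python
import configparser

# Reduced copy of the default file, keeping only the keys that have stated limits.
SAMPLE_TM_INI = """
[Controller]
max_packets_buffer=1250
max_messages_per_interval=4

[Collector]
connection_timeout=3
max_collector_messages=10
"""


def check_tm_ini(text):
    """Return a list of problems found; an empty list means all checks passed."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    controller = parser["Controller"]
    collector = parser["Collector"]
    problems = []

    buffer_size = controller.getint("max_packets_buffer")
    messages = controller.getint("max_messages_per_interval")
    if buffer_size < 1:
        problems.append("max_packets_buffer must be at least 1")
    # Assumption: 5000 itself is accepted, since the defaults multiply to 5000.
    if buffer_size * messages > 5000:
        problems.append("max_packets_buffer * max_messages_per_interval exceeds 5000")
    if collector.getint("connection_timeout") < 1:
        problems.append("connection_timeout must be at least 1 second")
    if not 2 <= collector.getint("max_collector_messages") <= 20:
        problems.append("max_collector_messages must be between 2 and 20")
    return problems


print(check_tm_ini(SAMPLE_TM_INI))
```

The sketch assumes that all four keys are present; a missing key would need its own check.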
The configuration keys are listed in the following table:
Section | Key | Type | Default Value | Optional Values | Description |
Controller | controller_ip | String | 127.0.0.1 | IPs | Controller IP |
Controller | controller_port | Int | 7654 | ports | Controller port |
Controller | enable_telemetry | Boolean | false | true/false | Must be set to true for telemetry to start |
Controller | calc_rates | Boolean | false | true/false | Return telemetry counter as rates according to the interval which exists for the counter session instead of raw data. |
Controller | max_packets_buffer | Int | 1250 | 1-5000 | Maximum WJH packets buffer. The configured value is the amount of WJH events sent per interval (i.e. max_packets_buffer*max_messages_per_interval). |
Controller | max_messages_per_interval | Int | 4 | - | Max WJH messages sent per interval |
Logging | log_level | String | INFO | INFO/DEBUG/ERROR | You can view the logs in /opt/telemetry/log/telemetry.log |
OS | enable_lag_mlag_discovery | Boolean | true | true/false | Enable LAG/MLAG discovery using NOS |
Collector | clean_json | String | true | true/false | Clean JSON message and remove empty fields – can decrease performance |
Collector | counters_chunk | String | false | true/false | Counter chunk to limit interface counters per message |
Collector | counters_chunk_size | Int | 64 | - | Size of interfaces to send per message |
Collector | connection_timeout | Int | 3 | ≥1 | Connection timeout in seconds |
Collector | max_collector_messages | Int | 10 | 2-20 | Maximum collector messages in queue |
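The calc_rates key switches counters from raw values to rates over the session interval. The document does not state the exact formula the agent uses, so the following Python sketch only illustrates the general idea under an assumption: the rate is the difference between two consecutive samples divided by the interval in seconds. The function name is hypothetical.

```python
def counter_rate(previous_value, current_value, interval_seconds):
    """Convert two raw counter samples into a per-second rate (assumed formula)."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (current_value - previous_value) / interval_seconds


# A counter that grows from 1000 to 4000 over a 3-second interval.
print(counter_rate(1000, 4000, 3))  # prints 1000.0
```

Counter wrap-around and resets are not handled in this sketch.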
For the Telemetry Agent to start connection attempts to the controller, the controller_ip and controller_port parameters must be changed to the correct values for your controller, and the enable_telemetry parameter must be set to "true". This can be done using the telemetry configuration script located in the /opt/telemetry/utils directory in the Docker container:
/opt/telemetry/utils/telemetry_agent_init.sh <controller-ip> <controller-port>
The Telemetry Agent will then try to establish a connection with the controller.
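The three settings named above can also be changed by editing tm.ini directly. The following Python sketch shows that edit with the standard configparser module. It is an illustration only and makes no claim about how telemetry_agent_init.sh is implemented; the function name and the controller address 192.0.2.10 are hypothetical.

```python
import configparser
import io


def point_agent_at_controller(ini_text, controller_ip, controller_port):
    """Return a copy of the tm.ini text with the controller set and telemetry enabled."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    controller = parser["Controller"]
    controller["controller_ip"] = controller_ip
    controller["controller_port"] = str(controller_port)
    controller["enable_telemetry"] = "true"
    buffer = io.StringIO()
    # Write key=value without spaces, matching the layout of the default file.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


original = """[Controller]
controller_ip=127.0.0.1
controller_port=7654
enable_telemetry=false
"""

print(point_agent_at_controller(original, "192.0.2.10", 7654))
```

Note that configparser discards comment lines when it writes a file back, so the comments in the default tm.ini would be lost with this approach.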
To upgrade an existing version of the Telemetry Agent, remove the old container and image, and then install the new version.
Find the container name: run the "show docker ps" command and locate the container whose image is the Telemetry Agent image.
switch (config) # show docker ps
-------------------------------------------------------------------------------------------
Container           Image:Version               Created             Status
-------------------------------------------------------------------------------------------
neo-agent           telemetry-agent:2.4.9-      27 minutes ago      Up 27 minutes
Stop the docker container and remove the image:
docker no start <container-name>
docker remove image <image-name>
Next, refer to the "Deploying the Docker Image on Mellanox Onyx-Based Systems" section to install the new version of the Telemetry Agent.
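The lookup and removal steps of the upgrade procedure can be sketched in Python as follows, assuming the output of "show docker ps" has been captured as text without the prompt line and has the column layout shown in this section. The function names are hypothetical, and the removal commands are only assembled in the form given above, not executed.

```python
def parse_show_docker_ps(output):
    """Return one dict per container row in the captured "show docker ps" output."""
    containers = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, dashed separator lines, and the header row.
        if len(fields) < 2 or fields[0] == "Container" or set(fields[0]) == {"-"}:
            continue
        image, _, version = fields[1].partition(":")
        containers.append({"name": fields[0], "image": image, "version": version})
    return containers


def removal_commands(container):
    """Build the stop and image removal commands for one container."""
    return [
        "docker no start {}".format(container["name"]),
        "docker remove image {}".format(container["image"]),
    ]


# Sample row copied from this section; the version string is truncated in the source.
sample = """-------------------------------------------------------
Container    Image:Version            Created         Status
-------------------------------------------------------
neo-agent    telemetry-agent:2.4.9-   27 minutes ago  Up 27 minutes
"""

for entry in parse_show_docker_ps(sample):
    print(removal_commands(entry))
```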