NVIDIA Cumulus NetQ 4.0 Release Notes

4.0.1 Release Notes

Open Issues in 4.0.1

Issue ID | Description | Affects | Fixed

2794608
Sometimes the NetQ upgrade from 4.0.0 to 4.0.1 fails because the upgrade process cannot connect to the Docker registry running on the master node (the registry listens on localhost:5000). The Docker registry container comes up without error but is not reachable on the host network. The issue occurs on cluster setups when you reboot the NetQ 4.0.0 telemetry server before upgrading.
The netq-admin-app logs return an error message similar to the following:
    kubectl logs -l app=netq-app-admin --tail 1000
    Get http://localhost:5000/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Because the Docker registry is not reachable, any Docker push command to it times out. You can test this with the following command:
    sudo docker push localhost:5000/master-operator:4.0.0
To work around this issue, restart the Docker registry container:
    kubectl get pods -n kube-system | grep docker-registry | cut -f1 -d' ' | xargs kubectl delete pod -n kube-system
Then restart the NetQ upgrade.
Affects: 4.0.0-4.0.1
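
For issue 2794608, a quick probe can show whether the registry answers on the host network before and after you restart the registry pod. The following is a minimal sketch and not part of the NetQ tooling: the function name registry_reachable and the 5-second default timeout are assumptions, and it assumes that curl is installed and that the registry serves the /v2/ endpoint shown in the error message without authentication.

```shell
#!/bin/sh
# Hypothetical helper (not a NetQ command): succeeds only if the registry
# answers an HTTP request on its /v2/ endpoint within the timeout.
#   $1  registry base URL, for example http://localhost:5000
#   $2  timeout in seconds (assumed default: 5)
registry_reachable() {
  curl --silent --fail --output /dev/null --max-time "${2:-5}" "$1/v2/"
}
```

A possible use on the master node, once the function is defined in the current shell:

    registry_reachable http://localhost:5000 && echo "registry reachable" || echo "registry not reachable"

If the probe still fails after the registry pod is recreated, restarting the upgrade is unlikely to succeed.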

2711101
When RoCE (RDMA over Converged Ethernet) data collection is enabled in Cumulus Linux 4.3.z and 4.4.z, you can experience high dual uplink convergence times.
To work around this issue, disable RoCE monitoring:
1. Edit '/etc/netq/commands/cl4-netq-commands.yml' and comment out the following lines:
    #- period: "60"
    #  key: "roce"
    #  isactive: true
    #  command: "/usr/lib/cumulus/mlxcmd --json roce counters"
    #  parser: "local"
2. Delete the '/var/run/netq/netq_commands.yml' file:
    $ sudo rm /var/run/netq/netq_commands.yml
3. Restart the NetQ agent:
    $ netq config agent restart
Affects: 4.0.0-4.0.1
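
Step 1 of the workaround for issue 2711101 can be scripted when many switches need the change. The following is a minimal sketch under stated assumptions, not an NVIDIA-provided tool: the function name comment_out_roce is hypothetical, and it assumes the commands file is block-style YAML in which the RoCE entry is a list item containing a key: "roce" line. It prints the modified file to standard output and does not edit anything in place.

```shell
#!/bin/sh
# Hypothetical helper: print the YAML file given as $1 with the list item
# that contains key: "roce" commented out; all other lines pass through.
comment_out_roce() {
  awk '
    # Number of leading spaces on a line.
    function indent(s) { match(s, /^ */); return RLENGTH }
    # Print the buffered item, prefixed with "#" if it is the roce item.
    function flush(   i) {
      for (i = 1; i <= n; i++) print (hit ? "#" : "") buf[i]
      n = 0; hit = 0
    }
    {
      # A new list item, or a non-blank line indented no deeper than the
      # dash of the current item, closes the buffered item.
      if ($0 ~ /^ *- / || (n > 0 && $0 !~ /^ *$/ && indent($0) <= dash)) {
        flush()
        dash = ($0 ~ /^ *- /) ? indent($0) : -1
      }
      buf[++n] = $0
      if ($0 ~ /key: *"?roce"? *$/) hit = 1
    }
    END { flush() }
  ' "$1"
}
```

Review the output, for example with diff against the original file, before copying it over '/etc/netq/commands/cl4-netq-commands.yml', then continue with steps 2 and 3.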

2690469
While upgrading an on-premises deployment from version 2.4.x to 3.x.y then to 4.x, the upgrade fails during the NetQ application stage.
To work around this issue, run the following command on the NetQ telemetry server, then start the upgrade again:
    netq install opta activate-job config-key EhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIiw3T2sweW9kR3Y4Wk9sTHU3MkwrQTRjNkhhQkU3bVpBNVlZVjEvWWgyZGJBPQ==
Affects: 3.2.1-3.3.1, 4.0.0-4.0.1

2663534
Validation check filtering is only applied to errors in validation results and is not applied to warnings in validation results.
Affects: 4.0.0-4.0.1

2663274
You cannot set a validation filter for sensor validations.
Affects: 4.0.0-4.0.1

2661988
Rerunning a validation in the UI or the CLI can return the same error if the query includes special characters, such as + or :.
Affects: 4.0.0-4.0.1

2555854 (NETQ-8245)
NetQ Agent: If you downgrade a NetQ Agent to version 3.0.0 from any higher release, you must also update the default commands file in the /etc/netq/commands/ directory to prevent the NetQ Agent from becoming rotten.
Affects: 3.0.0-3.3.1, 4.0.0-4.0.1

2555197 (NETQ-7966)
NetQ CLI: Occasionally, when a command response contains a large number of objects, the NetQ CLI does not display all results in the console. When this occurs, view all results using the json format option.
Affects: 3.3.0-3.3.1, 4.0.0-4.0.1

2549649 (NETQ-5737)
NetQ UI: Warnings might appear during the post-upgrade phase for a Cumulus Linux switch upgrade job. They are caused by services that have not yet been restored by the time the job is complete. Cumulus Networks recommends waiting five minutes, creating a network snapshot, then comparing it to the pre-upgrade snapshot. If the comparison shows no differences for the services, you can ignore the warnings. If there are differences, troubleshoot the relevant services.
Affects: 3.0.0-3.3.1, 4.0.0-4.0.1

Fixed Issues in 4.0.1

Issue ID | Description | Affects

4.0.0 Release Notes

Open Issues in 4.0.0

Issue ID | Description | Affects | Fixed

2794608
Sometimes the NetQ upgrade from 4.0.0 to 4.0.1 fails because the upgrade process cannot connect to the Docker registry running on the master node (the registry listens on localhost:5000). The Docker registry container comes up without error but is not reachable on the host network. The issue occurs on cluster setups when you reboot the NetQ 4.0.0 telemetry server before upgrading.
The netq-admin-app logs return an error message similar to the following:
    kubectl logs -l app=netq-app-admin --tail 1000
    Get http://localhost:5000/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Because the Docker registry is not reachable, any Docker push command to it times out. You can test this with the following command:
    sudo docker push localhost:5000/master-operator:4.0.0
To work around this issue, restart the Docker registry container:
    kubectl get pods -n kube-system | grep docker-registry | cut -f1 -d' ' | xargs kubectl delete pod -n kube-system
Then restart the NetQ upgrade.
Affects: 4.0.0-4.0.1

2711101
When RoCE (RDMA over Converged Ethernet) data collection is enabled in Cumulus Linux 4.3.z and 4.4.z, you can experience high dual uplink convergence times.
To work around this issue, disable RoCE monitoring:
1. Edit '/etc/netq/commands/cl4-netq-commands.yml' and comment out the following lines:
    #- period: "60"
    #  key: "roce"
    #  isactive: true
    #  command: "/usr/lib/cumulus/mlxcmd --json roce counters"
    #  parser: "local"
2. Delete the '/var/run/netq/netq_commands.yml' file:
    $ sudo rm /var/run/netq/netq_commands.yml
3. Restart the NetQ agent:
    $ netq config agent restart
Affects: 4.0.0-4.0.1

2690469
While upgrading an on-premises deployment from version 2.4.x to 3.x.y then to 4.x, the upgrade fails during the NetQ application stage.
To work around this issue, run the following command on the NetQ telemetry server, then start the upgrade again:
    netq install opta activate-job config-key EhVuZXRxLWVuZHBvaW50LWdhdGV3YXkYsagDIiw3T2sweW9kR3Y4Wk9sTHU3MkwrQTRjNkhhQkU3bVpBNVlZVjEvWWgyZGJBPQ==
Affects: 3.2.1-3.3.1, 4.0.0-4.0.1

2663534
Validation check filtering is only applied to errors in validation results and is not applied to warnings in validation results.
Affects: 4.0.0-4.0.1

2663274
You cannot set a validation filter for sensor validations.
Affects: 4.0.0-4.0.1

2661988
Rerunning a validation in the UI or the CLI can return the same error if the query includes special characters, such as + or :.
Affects: 4.0.0-4.0.1

2555854 (NETQ-8245)
NetQ Agent: If you downgrade a NetQ Agent to version 3.0.0 from any higher release, you must also update the default commands file in the /etc/netq/commands/ directory to prevent the NetQ Agent from becoming rotten.
Affects: 3.0.0-3.3.1, 4.0.0-4.0.1

2555197 (NETQ-7966)
NetQ CLI: Occasionally, when a command response contains a large number of objects, the NetQ CLI does not display all results in the console. When this occurs, view all results using the json format option.
Affects: 3.3.0-3.3.1, 4.0.0-4.0.1

2549649 (NETQ-5737)
NetQ UI: Warnings might appear during the post-upgrade phase for a Cumulus Linux switch upgrade job. They are caused by services that have not yet been restored by the time the job is complete. Cumulus Networks recommends waiting five minutes, creating a network snapshot, then comparing it to the pre-upgrade snapshot. If the comparison shows no differences for the services, you can ignore the warnings. If there are differences, troubleshoot the relevant services.
Affects: 3.0.0-3.3.1, 4.0.0-4.0.1

Fixed Issues in 4.0.0

Issue ID | Description | Affects

2611898
Fixed an issue where deleting a snapshot did not remove the snapshot card from the workbench. The workbench might still refresh before the deleted snapshot's card is removed, and you might notice a brief flashing during the refresh. This is expected behavior and you can safely ignore it.

2553453 (NETQ-7318)
The netqd daemon logs a traceback to /var/log/netqd.log when the OPTA server is unreachable and netq show commands are run.
Affects: 3.1.0-3.3.1

2549319 (NETQ-5571)
NetQ UI: The legend and segment colors on the Switches and Upgrade History card graphs sometimes do not match. These cards appear on the lifecycle management dashboard (Manage Switch Assets view). Hover over the graph to view the correct values.
Affects: 3.0.0-3.3.1