The Unhealthy Ports view shows all the unhealthy nodes in the fabric and the OpenSM health policy of the healthy/unhealthy nodes.

The Subnet Manager examines the behavior of subnet nodes (switches and hosts). When it finds that a node is “unhealthy” according to the conditions specified below, the node is displayed in the Unhealthy Ports window. Once a node is declared “unhealthy,” the Subnet Manager can ignore, report, isolate, or disable the node. The user can control both the action performed and the conditions that cause a node to be declared “unhealthy.” The user can also “clear” nodes that were previously marked as “unhealthy.”

The information is displayed in a tabular form and includes the unhealthy port’s state, source node, source port, source port GUID, peer node, peer port, peer GUID, peer LID, condition, and status time.

Warning: This feature requires the OpenSM parameter hm_unhealthy_ports_checks to be set to TRUE (default).

Warning: This feature is not available in "Monitoring Only Mode."

The following are the conditions that would declare a node as “unhealthy”:

Reboot - The node was rebooted more than 10 times during the last 900 seconds

Flapping - Several links of the node were found in the Initializing state in 5 out of the 10 previous sweeps

Unresponsive - A port did not respond to one of the SMPs, and the MAD status was TIMEOUT, in 5 out of the 7 previous SM sweeps

Noisy Node - The node sent more than 250 traps of type 129, 130, or 131, with an interval of less than 60 seconds between consecutive traps

Seterr - The node responded with a bad status to SET SMPs (PortInfo, SwitchInfo, VLArb, SL2VL, or Pkeys)

Illegal - Illegal MAD fields were discovered when MADs/fields were checked during receive_process

Manual - The user explicitly marked the node as unhealthy/healthy

Link Level Retransmission (LLR) - The retransmission-per-second counter exceeded its threshold

All conditions except LLR generate an Unhealthy port event. LLR generates a High Data retransmission event.
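The threshold-style conditions above (for example, Reboot and Flapping) can be pictured as sliding-window checks. The following Python sketch is purely illustrative and is not OpenSM's implementation: the class names TimeWindowCounter and SweepHistory are hypothetical, and "5 out of 10" is assumed here to mean "at least 5 of the last 10 sweeps."

```python
from collections import deque


class TimeWindowCounter:
    """Counts events inside a sliding time window (e.g. reboots in 900 s).

    Hypothetical helper for illustration only.
    """

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, now):
        """Record an event at time `now` (seconds).

        Returns True when more than `max_events` events fall in the window.
        """
        self._timestamps.append(now)
        # Drop events that are older than the window.
        while now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_events


class SweepHistory:
    """Tracks whether a condition was seen in the last N sweeps.

    Hypothetical helper; assumes "K out of N" means at least K hits.
    """

    def __init__(self, required_hits, history_size):
        self.required_hits = required_hits
        # deque(maxlen=...) discards the oldest sweep automatically.
        self._results = deque(maxlen=history_size)

    def record(self, condition_seen):
        """Record one sweep result; return True when the threshold is met."""
        self._results.append(bool(condition_seen))
        return sum(self._results) >= self.required_hits


# Reboot: more than 10 reboots during the last 900 seconds.
reboots = TimeWindowCounter(max_events=10, window_seconds=900)
reboot_flags = [reboots.record(t * 60) for t in range(11)]  # one reboot per minute

# Flapping: condition seen in 5 out of the 10 previous sweeps.
flapping = SweepHistory(required_hits=5, history_size=10)
flap_flags = [flapping.record(seen) for seen in [True, False] * 5]

print(reboot_flags[-1], flap_flags[-1])
```

In this sketch the tenth reboot inside the window does not trip the check, but the eleventh does, matching the "more than 10" wording. The Unresponsive condition would use the same SweepHistory shape with 5 hits over 7 sweeps.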

To clear a node from the Unhealthy Ports Tab, do the following:

Go to the Unhealthy Ports window under Managed Elements. In the Unhealthy Ports table, right-click the desired port and mark it as healthy.



To mark a node as permanently healthy, do the following:

Open the /opt/ufm/files/conf/health-policy.conf.user_ext file. Enter the node and the port information and set it as "Healthy." Run the /opt/ufm/scripts/sync_hm_port_health_policy_conf.sh script.