Date: 2026-02-19

116 Port Xmit Discards 1 1 Minor 200 300 Port Communication Error Total number of outbound packets discarded by the port when the port is down or congested. Reasons include: output port is not in the active state; packet length exceeded NeighborMTU; Switch Lifetime Limit exceeded; Switch HOQ Lifetime Limit exceeded; packets discarded while in VLStalled state.

117 Port Xmit Constraint Errors 1 1 Minor 200 300 Port Communication Error Total number of packets not transmitted from the switch physical port for the following reasons: FilterRawOutbound is true and the packet is raw; PartitionEnforcementOutbound is true and the packet fails the partition key check or IP version check.

120 Excessive Buffer Overrun Errors 1 1 Minor 100 300 Port Communication Error The number of times that OverrunErrors consecutive flow control update periods occurred, each having at least one overrun error. Message: ExcessiveBufferOverrunErrors counter threshold exceeded. Threshold is %d, received value is %d.

121 VL15 Dropped 1 1 Minor 50 300 Port Communication Error Number of incoming VL15 packets dropped due to resource limitations (e.g., lack of buffers) in the port. Message: VL15Dropped counter threshold exceeded. Threshold is %d, received value is %d.
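The counter events above share a message template of the form "Threshold is %d, received value is %d". The following is a minimal sketch of how such a message could be produced. It assumes that the numeric field after the severity is the threshold and that "exceeded" means strictly greater than; the function name `threshold_message` is hypothetical and not part of the product.

```python
from typing import Optional

# Message template copied from the VL15 Dropped row above.
VL15_TEMPLATE = (
    "VL15Dropped counter threshold exceeded. "
    "Threshold is %d, received value is %d."
)


def threshold_message(template: str, threshold: int, value: int) -> Optional[str]:
    """Return the formatted message when value exceeds threshold, else None.

    Assumption: "exceeded" means strictly greater than the threshold.
    """
    if value > threshold:
        return template % (threshold, value)
    return None


# Example with the default VL15 Dropped threshold of 50.
print(threshold_message(VL15_TEMPLATE, 50, 73))
```

With a received value at or below the threshold, the sketch returns `None` and no message is produced.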

118 Port Receive Constraint Errors 1 1 Minor 200 300 Port Communication Error Total number of packets received on the switch physical port that are discarded for the following reasons: FilterRawInbound is true and the packet is raw; PartitionEnforcementInbound is true and the packet fails the partition key check or IP version check.

145 System Image GUID changed 1 0 Info 1 300 Port Communication Error System Image GUID has changed for the specific LID

115 Port Receive Switch Relay Errors 1 1 Minor 9999 300 Port Fabric Configuration Total number of packets received on the port that were discarded because they could not be forwarded by the switch relay. Reasons for this include: DLID mapping; VL mapping; looping (output port = input port).

256 Bad M_Key 1 0 Minor 1 300 Port Fabric Configuration Found bad M_Key. Check your HCA driver or partition settings. SM Trap. Management Key (M_Key): Enforces the control of a master subnet manager. Administered by the subnet manager and used in certain subnet management packets. Message: Bad M_Key: port1(lid %(lid)d, #%(portn)d) %(pkey)08x, port2(lid%(lid2)d #%(portn2)d)

257 Bad P_Key 1 0 Minor 1 300 Port Fabric Configuration Found a bad P_Key. Check your partitioning settings. SM Trap. Partition Key (P_Key): Enforces membership. Administered through the subnet manager by the partition manager (PM). Message: Bad P_Key: port1(lid %(lid)d, #%(portn)d) %(pkey)08x, port2(lid%(lid2)d #%(portn2)d)

258 Bad Q_Key 1 0 Minor 1 300 Port Fabric Configuration Found bad Q_Key. Security error. SM Trap. Queue Key (Q_Key): Enforces access rights for reliable and unreliable datagram service (RAW datagram service type not included). Message: Bad Q_Key: port1(lid %(lid)d, #%(portn)d) %(pkey)08x, port2(lid%(lid2)d #%(portn2)d)

259 Bad P_Key Switch External Port 1 0 Critical 1 300 Port Fabric Configuration Found a bad P_Key. Check your partitioning settings. SM Trap. Partition Key (P_Key): Enforces membership. Administered through the subnet manager by the partition manager (PM). Message: Bad P_Key switch external port: port1(lid %(lid)d, #%(portn)d) %(pkey)08x, port2(lid%(lid2)d #%(portn2)d)
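The key-violation messages above use named placeholders such as `%(lid)d` and `%(pkey)08x`, which match Python's `%`-style mapping format. The sketch below shows how such a template expands, assuming it is rendered that way; the field values are hypothetical and chosen only for illustration.

```python
# Template copied from the Bad P_Key row above.
BAD_PKEY_TEMPLATE = (
    "Bad P_Key: port1(lid %(lid)d, #%(portn)d) %(pkey)08x, "
    "port2(lid%(lid2)d #%(portn2)d)"
)

# Hypothetical trap fields, for illustration only.
trap_fields = {"lid": 12, "portn": 3, "pkey": 0x7FFF, "lid2": 20, "portn2": 1}

message = BAD_PKEY_TEMPLATE % trap_fields
print(message)
# Prints: Bad P_Key: port1(lid 12, #3) 00007fff, port2(lid20 #1)
```

Note that `%(pkey)08x` prints the key as eight zero-padded lowercase hexadecimal digits.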

64 GID Address In Service 1 0 Info 1 300 Port Fabric Notification New GID is connected to the Fabric

65 GID Address Out of Service 1 0 Warning 1 300 Port Fabric Notification Existing GID is disconnected from the Fabric

66 New MCast Group Created 1 0 Info 1 300 Port Fabric Notification New Multicast Group is created in SM

67 MCast Group Deleted 1 0 Info 1 300 Port Fabric Notification Multicast Group is removed from SM.
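Each row in this list follows the same field pattern, so it can be split mechanically. The sketch below is one possible parser. The column names (`to_log`, `alarm`, `threshold`, `ttl`) are assumptions inferred from the field pattern, since the header row is not part of this section. The related object, category, and description are left together in `rest` because object names can span several words.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: ID, name, two 0/1 flags, severity, threshold, TTL, remainder.
ROW_PATTERN = re.compile(
    r"^(?P<event_id>\d+) (?P<name>.+?) (?P<to_log>[01]) (?P<alarm>[01]) "
    r"(?P<severity>Info|Warning|Minor|Critical) "
    r"(?P<threshold>\d+(?:\.\d+)?(?:[eE][+-]?\d+)?) (?P<ttl>\d+) (?P<rest>.*)$"
)


class EventRow(NamedTuple):
    event_id: int
    name: str
    to_log: bool
    alarm: bool
    severity: str
    threshold: float
    ttl: int
    rest: str  # related object, category, and description, unsplit


def parse_row(line: str) -> Optional[EventRow]:
    """Parse one event row; return None if the line does not fit the pattern."""
    match = ROW_PATTERN.match(line.strip())
    if match is None:
        return None
    return EventRow(
        event_id=int(match["event_id"]),
        name=match["name"],
        to_log=match["to_log"] == "1",
        alarm=match["alarm"] == "1",
        severity=match["severity"],
        threshold=float(match["threshold"]),
        ttl=int(match["ttl"]),
        rest=match["rest"],
    )


row = parse_row(
    "66 New MCast Group Created 1 0 Info 1 300 Port Fabric Notification "
    "New Multicast Group is created in SM"
)
print(row)
```

Lines that do not start with an event ID do not match the pattern and yield `None`.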

328 Link is Up 1 0 Info 1 0 Link Fabric Topology Event is sent upon discovery of a new link

329 Link is Down 1 0 Warning 1 0 Link Fabric Topology Event is sent when an existing link is removed

144 Capability Mask Modified 0 0 Info 1 300 Port Fabric Notification Capability Mask of the specific LID is modified

602 UFM Server Failover 1 1 Critical 1 0 Site Fabric Notification Failover in UFM Server (in HA mode)

391 Switch Module Removed 1 0 Info 1 0 Switch Fabric Notification Module (line card, FAN or PS) is removed from the switch

331 Node is Down 1 0 Warning 1 0 Site Fabric Topology Node is disconnected or down

332 Node is Up 1 0 Info 1 300 Site Fabric Topology Node is connected or up

907 Switch is Down 1 1 Critical 1 0 Site Fabric Topology Switch is disconnected or down

908 Switch is Up 1 1 Info 1 300 Site Fabric Topology Switch is connected or up

370 Gateway Ethernet Link State Changed 1 0 Warning 1 0 Gateway Gateway Gateway Ethernet Physical link has changed state

371 Gateway Re-register Event Received 1 0 Warning 1 0 Gateway Gateway 10GbE Gateway received a re-register event from the SM.

372 Number of Gateways is Changed 1 0 Warning 1 0 Gateway Gateway Change in the number of 10GbE Gateways has been detected

373 Gateway will be Rebooted 1 0 Warning 1 0 Gateway Gateway 10GbE Gateway is about to reboot

374 Gateway Reloading Finished 1 0 Info 1 0 Gateway Gateway 10GbE Gateway has finished reloading.

110 Symbol Error 1 1 Warning 200 300 Port Hardware Total number of minor link errors detected on one or more physical lanes

111 Link Error Recovery 1 1 Minor 1 300 Port Hardware Total number of times the Port Training state machine has successfully completed the link error recovery process

112 Link Downed 1 1 Critical 1 300 Port Hardware Total number of times the Port Training state machine has failed the link error recovery process and downed the link.

113 Port Receive Errors 1 1 Minor 5 300 Port Hardware Total number of packets containing an error that were received on a port. These errors include: local physical errors (CRC, VCRC, FCCRC and all physical errors that cause entry into the BAD PACKET or BAD PACKET DISCARD states of the packet receiver state machine); data packet errors; link packet errors; packets discarded due to buffer overrun.

114 Port Receive Remote Physical Errors 1 0 Minor 5 300 Port Hardware Total number of packets marked with the EBP delimiter received on the port

119 Local Link Integrity Errors 1 1 Minor 5 300 Port Hardware The number of times that the frequency of packets containing local physical errors has exceeded LocalPhyErrors. Message: LocalLinkIntegrityErrors counter threshold exceeded. Threshold is %d, received value is %d

122 Congested Bandwidth (%) Threshold Reached 1 1 Minor 10 300 Port Hardware Percent of Congested Bandwidth has exceeded defined threshold. Note: a different threshold can be set specifically for Tier 4 ports.

131 Non-optimal link width (1X instead of 4X) 1 1 Minor 1 0 Port Hardware 4X link operates as 1X link

132 Non-optimal link width (1X or 4X instead of 12X) 1 1 Minor 1 0 Port Hardware 12X link operates as 4X or 1X link

701 Non-optimal Link Speed 1 1 Minor 1 0 Port Hardware DDR link operates as SDR; QDR link operates as DDR or SDR; FDR link operates as QDR, DDR, or SDR; or EDR link operates as FDR, QDR, DDR, or SDR
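The downgrade cases in the row above amount to an ordering on link speeds: the event applies whenever a link runs at a slower speed than it supports. A minimal sketch of that check follows; the function name and the `supported`/`active` parameters are hypothetical.

```python
# InfiniBand link speeds named in the row above, slowest to fastest.
SPEED_ORDER = ["SDR", "DDR", "QDR", "FDR", "EDR"]


def is_non_optimal_speed(supported: str, active: str) -> bool:
    """Return True when the link runs slower than the speed it supports."""
    return SPEED_ORDER.index(active) < SPEED_ORDER.index(supported)


print(is_non_optimal_speed("EDR", "FDR"))  # EDR link running as FDR
print(is_non_optimal_speed("DDR", "DDR"))  # link running at its full speed
```

An unknown speed name raises `ValueError` from `list.index`, which keeps unexpected input from being silently treated as optimal.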

140 Excessive Buffer Overrun Threshold Reached 1 0 Minor 1 300 Port Hardware SM Trap. This error is detected when the number of consecutive flow control update periods with at least one overrun error in each period exceeds the OverrunErrors threshold given in the PortInfo attribute. Message: Excessive Buffer Overrun Threshold is reached: lid %(lid)d, port #%(portn)d
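The condition described in the row above can be sketched as a run-length check over per-period overrun counts. This is only an illustration of the stated rule, not the switch's actual implementation; the function name and the input sequence are hypothetical.

```python
from typing import Iterable


def overrun_threshold_reached(
    overruns_per_period: Iterable[int], overrun_errors: int
) -> bool:
    """Return True once the number of consecutive flow control update
    periods with at least one overrun error exceeds overrun_errors."""
    consecutive = 0
    for count in overruns_per_period:
        # A period without overrun errors resets the run.
        consecutive = consecutive + 1 if count > 0 else 0
        if consecutive > overrun_errors:
            return True
    return False


# Four consecutive periods with errors exceed a threshold of 3.
print(overrun_threshold_reached([1, 2, 0, 1, 1, 1, 1], 3))
# A clean period in the middle keeps the longest run at 3.
print(overrun_threshold_reached([1, 1, 1, 0, 1, 1, 1], 3))
```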

141 Flow Control Update Watchdog Timer Expired 1 0 Warning 1 300 Port Hardware SM Trap. The error indicates a failure of the flow control machine at the other end of the link. If the timer expires without receiving an update, a flow control update error has occurred. Message: Flow Control Update watchdog timer has expired: lid %(lid)d, port #%(portn)d

392 Module Temperature Threshold Reached 1 0 Info 40 0 Module Hardware Temperature detected by the module sensor is too high and has exceeded the defined threshold.

350 Environment Added 1 0 Info 1 0 Env Logical Model New Logical Environment is created

351 Environment Removed 1 0 Info 1 0 Env Logical Model Logical Environment is deleted

306 Logical Server Added 1 0 Info 1 0 Logical Server Logical Model New Logical Server or Logical Servers Group is created

307 Logical Server Removed 1 0 Info 1 0 Logical Server Logical Model Logical Server or Logical Servers Group is deleted

352 Network Added 1 0 Info 1 0 Network Logical Model New Network is created

353 Network Removed 1 0 Info 1 0 Network Logical Model Network is deleted

340 Network Interface Added 1 0 Info 1 0 Logical Server Logical Model New Network Interface is created

341 Network Interface Removed 1 0 Info 1 0 Logical Server Logical Model Network Interface is deleted

313 Compute Resource Allocated 1 0 Info 1 0 Logical Server Logical Model A resource is allocated to the Logical Server

312 Compute Resource Released 1 0 Info 1 0 Logical Server Logical Model A resource is released from the Logical Server

317 Logical Server Compute Resource is Up 1 1 Warning 1 0 Logical Server Logical Model An allocated resource is Up or Connected back

316 Logical Server Compute Resource is Down 1 1 Critical 1 0 Logical Server Logical Model An allocated resource is Down or Disconnected

301 Logical Server State Changed 1 0 Info 1 0 Logical Server Logical Model Logical Server state is changed

302 Logical Server State Change Failed 1 0 Minor 1 0 Logical Server Logical Model Logical Server has failed to change the state. RM (Resource Manager) Event. Indicates error in Logical Server state change. This error might be caused by any error condition related to the Logical Server resources allocation. Message: Logical Server changed state from %s to %s

308 Logical Server Resources Allocated 1 0 Info 1 0 Logical Server Logical Model New resources are allocated to the Logical Server

314 Logical Server Additional Resources Allocated 1 0 Info 1 0 Logical Server Logical Model Additional resources are allocated to the Logical Server

315 Logical Server Resources Released 1 0 Info 1 0 Logical Server Logical Model Resources were released from the Logical Server

336 Port Action Succeeded 1 0 Info 1 0 Port Maintenance Port Management Action (reset, disable) succeeded

337 Port Action Failed 1 0 Minor 1 0 Port Maintenance Port Management Action (reset, disable) failed

338 Device Action Succeeded 1 0 Info 1 0 Port Maintenance Device Management Action succeeded

339 Device Action Failed 1 0 Minor 1 0 Port Maintenance Device Management Action failed

385 Switch FW Upgrade Started 1 0 Info 1 0 Switch Maintenance Switch FW Upgrade process has started

386 Switch SW Upgrade Started 1 0 Info 1 0 Switch Maintenance Switch SW Upgrade process has started

381 Switch Upgrade Failed 1 0 Info 1 0 Switch Maintenance Switch SW or FW Upgrade process failed

388 Host FW Upgrade Started 1 0 Info 1 0 Computer Maintenance Host FW Upgrade process has started

389 Host SW Upgrade Started 1 0 Info 1 0 Computer Maintenance Host SW Upgrade process has started

383 Host Upgrade Failed 1 0 Info 1 0 Computer Maintenance Host SW or FW Upgrade process failed

502 Device Upgrade Finished 1 0 Info 1 300 Device Maintenance Device SW or FW Upgrade has finished

909 Director Switch is Down 1 1 Critical 1 300 Site Fabric Topology Director Switch is disconnected or down

910 Director Switch is Up 1 1 Info 1 0 Site Fabric Topology Director Switch is connected or up

911 Module Temperature Low Threshold Reached 1 1 Warning 60 300 Module Hardware Temperature detected by the module sensor is too high and has exceeded the low threshold

912 Module Temperature High Threshold Reached 1 1 Critical 60 300 Module Hardware Temperature detected by the module sensor is too high and has exceeded the high threshold

913 Module High Voltage 1 1 Warning 10 420 Switch Module Status Sensor Voltage Threshold Exceeded

914 Module High Current 1 1 Warning 10 420 Switch Module Status Sensor Current Threshold Exceeded

394 Module Status FAULT 1 1 Critical 1 420 Switch Module Status Module Status FAULT

545 SM is not responding 1 1 Critical 1 300 Grid Maintenance SM is not responding

915 BER_ERROR 1 1 Critical 1e-8 420 Port Hardware Effective BER Error on port exceeded the threshold

916 BER_WARNING 1 1 Warning 1e-13 420 Port Hardware Effective BER Warning on port exceeded the threshold
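The two BER rows above define a two-level check on the effective bit error rate. A minimal sketch using the default thresholds from those rows follows. It assumes the comparison is strictly greater than and that only the most severe matching event is reported; the real behavior may differ, and the function name is hypothetical.

```python
from typing import Optional

BER_ERROR_THRESHOLD = 1e-8     # default threshold from the BER_ERROR row
BER_WARNING_THRESHOLD = 1e-13  # default threshold from the BER_WARNING row


def classify_ber(effective_ber: float) -> Optional[str]:
    """Return the most severe BER event that applies, or None.

    Assumption: a threshold is "exceeded" when the rate is strictly greater.
    """
    if effective_ber > BER_ERROR_THRESHOLD:
        return "BER_ERROR"
    if effective_ber > BER_WARNING_THRESHOLD:
        return "BER_WARNING"
    return None


for rate in (1e-6, 1e-10, 1e-15):
    print(rate, classify_ber(rate))
```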

1300 SM_SAKEY_VIOLATION 1 1 Warning 5 300 Port Fabric Notification "SA Key Violation Committed"

1301 SM_SGID_SPOOFED 1 1 Warning 5 300 Port Fabric Notification "SGID spoofed by VPort/port"

1302 SM_RATE_LIMIT_EXCEEDED 1 1 Warning 5 300 Port Fabric Notification "Rate Limit Exceeded"

1303 SM_MULTICAST_GROUPS_LIMIT_EXCEEDED 1 1 Warning 5 300 Port Fabric Notification "Multicast Groups Limit Exceeded"

1304 SM_SERVICES_LIMIT_EXCEEDED 1 1 Warning 5 300 Port Fabric Notification "Services Limit Exceeded"

1305 SM_EVENT_SUBSCRIPTION_LIMIT_EXCEEDED 1 1 Warning 5 300 Port Fabric Notification "Event Subscription Limit Exceeded"

1500 New cable detected 1 0 Info 1 0 Link Hardware "New cable was detected"

1502 Cable detected in a new location 1 0 Warning 1 0 Link Hardware "Cable detected in a new location"