Event Notifications

NVIDIA MLNX-OS User Manual v3.11.1004

The OS supports a variety of events. Events are written to the system log file and can optionally be sent to the system administrator by email or SNMP trap, or printed directly to the terminal.

The following table presents the supported events and maps them to their relevant MIB OID.

| Event Name | Event Description | MIB OID | Comments |
|---|---|---|---|
| asic-chip-down | ASIC (chip) down | Mellanox-EFM-MIB: asicChipDown | Not supported |
| cpu-util-high | CPU utilization has risen too high | Mellanox-EFM-MIB: cpuUtilHigh | N/A |
| disk-space-low | File system free space has fallen too low | Mellanox-EFM-MIB: diskSpaceLow | N/A |
| health-module-status | Health module status changed | Mellanox-EFM-MIB: systemHealthStatus | N/A |
| insufficient-fans | Insufficient amount of fans in system | Mellanox-EFM-MIB: insufficientFans | N/A |
| insufficient-fans-recover | Insufficient amount of fans in system recovered | Mellanox-EFM-MIB: insufficientFansRecover | N/A |
| insufficient-power | Insufficient power supply | Mellanox-EFM-MIB: insufficientPower | N/A |
| interface-down | An interface’s link state has changed to DOWN | RFC1213: linkdown (SNMPv1) | Supported for InfiniBand interfaces for 1U and blade systems |
| interface-up | An interface’s link state has changed to UP | RFC1213: linkup (SNMPv1) | Supported for InfiniBand interfaces for 1U and blade systems |
| internal-bus-error | Internal bus (I2C) error | Mellanox-EFM-MIB: internalBusError | N/A |
| internal-link-speed-mismatch | There is a mismatch in the speeds of the internal links between spine and leaf modules | Mellanox-EFM-MIB: internalSpeedMismatch | Supported only for modular switches |
| liveness-failure | A process in the system is detected as hung | Not implemented | N/A |
| low-power | Low power supply | Mellanox-EFM-MIB: lowPower | N/A |
| low-power-recover | Low power supply recover | Mellanox-EFM-MIB: lowPowerRecover | N/A |
| paging-high | Paging activity has risen too high | N/A | Not supported |
| power-redundancy-mismatch | Power redundancy mismatch | Mellanox-EFM-MIB: powerRedundancyMismatch | Supported only for modular switches |
| process-crash | A process in the system has crashed | Mellanox-EFM-MIB: procCrash | N/A |
| process-exit | A process in the system unexpectedly exited | Mellanox-EFM-MIB: procUnexpectedExit | N/A |
| send-test | Send a test notification | testTrap | Run the CLI command “snmp-server notify send-test” |
| snmp-authtrap | An SNMPv3 request has failed authentication | Not implemented | N/A |
| temperature-too-high | Temperature is too high | Mellanox-EFM-MIB: asicOverTemp | N/A |
| unexpected-shutdown | Unexpected system shutdown | Mellanox-EFM-MIB: unexpectedShutdown | N/A |
| cli-line-executed | | | |
| disk-io-high | | | |
| entity-state-change | | | |
| expected-shutdown | | | |
| memusage-high | | | |
| netusage-high | | | |
| sm-restart | | | |
| sm-start | | | |
| sm-stop | | | |
| unexpected-cluster-join | | | |
| unexpected-cluster-leave | | | |
| unexpected-cluster-size | | | |
| user-login | | | |
| user-logout | | | |

To print events to the terminal, set the severity level of the events to be printed. Run:

switch (config) # logging monitor events notice

This command prints system events of severity “notice” to the screen. For example, when an interface-down event occurs, the following is printed to the screen:

switch (config) #
Wed Jul 10 11:30:42 2022: Interface 1/17 changed state to DOWN
Wed Jul 10 11:30:43 2022: Interface 1/18 changed state to DOWN
switch (config) #
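If the terminal output is captured to a file, messages like the ones above can be filtered with a short script. The following Python sketch assumes every message has the same shape as the example (a timestamp, a colon, then "Interface <id> changed state to <STATE>"). The function name and the regular expression are illustrative and are not part of the OS.

```python
import re
from datetime import datetime

# Assumed line format, copied from the example output:
#   Wed Jul 10 11:30:42 2022: Interface 1/17 changed state to DOWN
LINE_RE = re.compile(
    r"(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}): "
    r"Interface (?P<port>\S+) changed state to (?P<state>UP|DOWN)"
)


def parse_interface_events(text):
    """Return (timestamp, port, state) tuples found in captured terminal text."""
    events = []
    for match in LINE_RE.finditer(text):
        ts = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
        events.append((ts, match.group("port"), match.group("state")))
    return events
```

Lines that do not match the assumed pattern, such as the CLI prompt, are ignored.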

To configure the OS to send you emails for all configured events and failures:

  1. Set the mailhub to the IP address of your mail server (for example, a Microsoft Exchange server). Run:

    switch (config) # email mailhub <IP address>

  2. Add your email address for notifications. Run:

    switch (config) # email notify recipient <email address>

  3. Configure the system to send notifications for a specific event. Run:

    switch (config) # email notify event <event name>

  4. Show the list of events for which an email is sent. Run:

    switch (config) # show email events
    Failure events for which emails will be sent:
    process-crash: A process in the system has crashed
    unexpected-shutdown: Unexpected system shutdown

    Informational events for which emails will be sent:
    asic-chip-down: ASIC (Chip) Down
    cpu-util-high: CPU utilization has risen too high
    cpu-util-ok: CPU utilization has fallen back to normal levels
    disk-io-high: Disk I/O per second has risen too high
    disk-io-ok: Disk I/O per second has fallen back to acceptable levels
    disk-space-low: Filesystem free space has fallen too low
    ...

  5. Have the system send you a test email. Run:

    switch (config) # email send-test

    The last command should generate the following email:

    -----Original Message-----
    From: Admin User [mailto:do-not-reply@switch.]
    Sent: Sunday, May 01, 2011 11:17 AM
    To: <name>
    Subject: System event on switch: Test email for event notification

    ==== System information:
    Hostname: switch
    Version: <version> 2011-05-01 14:56:31 ...
    Date: 2011/05/01 08:17:29
    Uptime: 17h 8m 28.060s

    This is a test email.
    ==== Done.
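When the same email setup has to be applied to several switches, the command sequence from the steps above can be generated rather than typed. The following Python sketch only builds the list of config-mode commands; how they are delivered to a switch (console, SSH session, and so on) is outside its scope. The function name is hypothetical, and the sample values are taken from the examples elsewhere on this page.

```python
def email_setup_commands(mailhub, recipient, events):
    """Build the config-mode commands for the email setup steps, in order."""
    if not events:
        raise ValueError("at least one event name is required")
    commands = [
        f"email mailhub {mailhub}",
        f"email notify recipient {recipient}",
    ]
    # One "email notify event" command per event name.
    commands += [f"email notify event {name}" for name in events]
    # Review the resulting event list, then request a test email.
    commands += ["show email events", "email send-test"]
    return commands


if __name__ == "__main__":
    for line in email_setup_commands(
        "10.0.8.11", "user@example.com", ["process-crash", "interface-down"]
    ):
        print(line)
```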

email autosupport enable

email autosupport enable
no email autosupport enable

Sends automatic support notifications via email.
The no form of the command stops sending automatic support notifications via email.

Syntax Description

N/A

Default

N/A

Configuration Mode

config

History

3.2.3000

Example

switch (config) # email autosupport enable

Related Commands

Notes

email autosupport event

email autosupport event <event>
no email autosupport event

Specifies for which events to send auto-support notification emails.
The no form of the command resets the auto-support event selection to its default.

Syntax Description

event

  • process-crash – a process has crashed

  • process-exit – a process unexpectedly exited

  • liveness-failure – a process is detected as hung

  • cpu-util-high – CPU utilization has risen too high

  • cpu-util-ok – CPU utilization has fallen back to normal levels

  • paging-high – paging activity has risen too high

  • paging-ok – paging activity has fallen back to normal levels

  • disk-space-low – filesystem free space has fallen too low

  • disk-space-ok – filesystem free space is back in the normal range

  • memusage-high – memory usage has risen too high

  • memusage-ok – memory usage has fallen back to acceptable levels

  • netusage-high – network utilization has risen too high

  • netusage-ok – network utilization has fallen back to acceptable levels

  • disk-io-high – disk I/O per second has risen too high

  • disk-io-ok – disk I/O per second has fallen back to acceptable levels

  • unexpected-cluster-join – node has unexpectedly joined the cluster

  • unexpected-cluster-leave – node has unexpectedly left the cluster

  • unexpected-cluster-size – the number of nodes in the cluster is unexpected

  • unexpected-shutdown – unexpected system shutdown

  • interface-up – an interface’s link state has changed to up

  • interface-down – an interface's link state has changed to down

  • user-login – a user has logged into the system

  • user-logout – a user has logged out of the system

  • health-module-status – health module status

  • temperature-too-high – temperature has risen too high

  • low-power – low power supply

  • low-power-recover – low power supply recover

  • insufficient-power – insufficient power supply

  • power-redundancy-mismatch – power redundancy mismatch

  • insufficient-fans – insufficient amount of fans in system

  • insufficient-fans-recover – insufficient amount of fans in system recovered

  • asic-chip-down – ASIC (chip) down

  • internal-bus-error – internal bus (I2C) error

  • internal-link-speed-mismatch – internal links speed mismatch

Default

N/A

Configuration Mode

config

History

3.2.3000

Example

switch (config) # email autosupport event process-crash

Related Commands

Notes

email autosupport ssl mode

email autosupport ssl mode {none | tls | tls-none}
no email autosupport ssl mode

Configures type of security to use for auto-support email.
The no form of the command resets auto-support email security mode to its default.

Syntax Description

none

Does not use TLS to secure auto-support email.

tls

Uses TLS over the default server port to secure auto-support email and does not send an email if TLS fails.

tls-none

Attempts TLS over the default server port to secure auto-support email, and falls back on plaintext if this fails.

Default

tls-none

Configuration Mode

config

History

3.2.3000

Example

switch (config) # email autosupport ssl mode tls

Related Commands

Notes

email autosupport ssl cert-verify

email autosupport ssl cert-verify
no email autosupport ssl cert-verify

Verifies server certificates.
The no form of the command does not verify server certificates.

Syntax Description

N/A

Default

N/A

Configuration Mode

config

History

3.2.3000

Example

switch (config) # email autosupport ssl cert-verify

Related Commands

Notes

email autosupport ssl ca-list

email autosupport ssl ca-list {<ca-list-name> | default_ca_list | none}
no email autosupport ssl ca-list

Configures supplemental CA certificates for verification of server certificates.
The no form of the command removes supplemental CA certificate list.

Syntax Description

default_ca_list

Default supplemental CA certificate list

none

No supplemental list (uses built-in list only)

Default

default_ca_list

Configuration Mode

config

History

3.2.3000

Example

switch (config) # email autosupport ssl ca-list default_ca_list

Related Commands

Notes

email dead-letter

email dead-letter {cleanup max-age <duration> | enable}
no email dead-letter

Configures settings for saving undeliverable emails.
The no form of the command disables saving of undeliverable emails.

Syntax Description

duration

Example: “5d4h3m2s” for 5 days, 4 hours, 3 minutes, 2 seconds

enable

Saves dead-letter files for undeliverable emails

Default

Save dead letter is enabled
The default duration is 14 days

Configuration Mode

config

History

3.1.0000

Example

switch (config) # email dead-letter enable

Related Commands

show email

Notes
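The duration format used by the max-age parameter can be checked before it is sent to the switch. The following Python sketch converts a string such as “5d4h3m2s” into a timedelta. It assumes the units appear in the order days, hours, minutes, seconds and that any of them may be omitted; the manual only shows the full form, so whether the CLI accepts partial forms is an assumption.

```python
import re
from datetime import timedelta

# Assumed grammar: optional <n>d, <n>h, <n>m, <n>s, in that order.
DURATION_RE = re.compile(r"(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def parse_duration(text):
    """Convert a duration such as '5d4h3m2s' into a timedelta."""
    match = DURATION_RE.fullmatch(text)
    # An empty string also matches the pattern, so reject it explicitly.
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
```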

email domain

email domain <hostname-or-ip-address>
no email domain

Sets the domain name from which the emails appear to come (provided that the return address is not already fully-qualified). This is used in conjunction with the system hostname to form the full name of the host from which the email appears to come.
The no form of the command clears email domain override.

Syntax Description

hostname-or-ip-address

Hostname or IP address of email domain

Default

No email domain

Configuration Mode

config

History

3.1.0000

Example

switch (config) # email domain my_domain

Related Commands

show email

Notes

email mailhub

email mailhub <hostname-or-ip-address>
no email mailhub

Sets the mail relay to be used to send notification emails.
The no form of the command clears the mail relay to be used to send notification emails.

Syntax Description

hostname-or-ip-address

Hostname or IP address

Default

N/A

Configuration Mode

config

History

3.1.0000

Example

switch (config) # email mailhub 10.0.8.11

Related Commands

show email [events]

Notes

email autosupport mailhub

email autosupport mailhub <hostname-or-ip-address>
no email autosupport mailhub

Sets the mail relay to be used for sending autosupport notification emails.
The no form of the command clears the mail relay to be used for sending autosupport notification emails.

Syntax Description

<hostname-or-ip-address>

The mail hub hostname or IP address

Default

N/A

Configuration Mode

config

History

3.7.1000

Example

switch (config) # email autosupport mailhub 10.10.10.1

Related Commands

show email

Notes

email autosupport recipient

email autosupport recipient <email-addr>
no email autosupport recipient

Sets the recipient for autosupport emails.
The no form of the command clears the configured autosupport recipient.

Syntax Description

email-addr

The autosupport recipient email address

Default

N/A

Configuration Mode

config

History

3.7.1000

Example

switch (config) # email autosupport recipient user@example.com

Related Commands

show email

Notes

email mailhub-port

email mailhub-port <port number>
no email mailhub-port

Sets the mail relay port to be used to send notification emails.
The no form of the command resets the port to its default.

Syntax Description

port number

Port number

Default

25

Configuration Mode

config

History

3.1.0000

Example

switch (config) # email mailhub-port 125

Related Commands

show email

Notes

email notify event

email notify event <event>
no email notify event <event>

Enables sending email notifications for the specified event type.
The no form of the command disables sending email notifications for the specified event type.

Syntax Description

event

Available event names:

  • process-crash – a process has crashed

  • process-exit – a process unexpectedly exited

  • liveness-failure – a process is detected as hung

  • cpu-util-high – CPU utilization has risen too high

  • cpu-util-ok – CPU utilization has fallen back to normal levels

  • paging-high – paging activity has risen too high

  • paging-ok – paging activity has fallen back to normal levels

  • disk-space-low – filesystem free space has fallen too low

  • disk-space-ok – filesystem free space is back in the normal range

  • memusage-high – memory usage has risen too high

  • memusage-ok – memory usage has fallen back to acceptable levels

  • netusage-high – network utilization has risen too high

  • netusage-ok – network utilization has fallen back to acceptable levels

  • disk-io-high – disk I/O per second has risen too high

  • disk-io-ok – disk I/O per second has fallen back to acceptable levels

  • unexpected-cluster-join – node has unexpectedly joined the cluster

  • unexpected-cluster-leave – node has unexpectedly left the cluster

  • unexpected-cluster-size – the number of nodes in the cluster is unexpected

  • unexpected-shutdown – unexpected system shutdown

  • interface-up – an interface’s link state has changed to up

  • interface-down – an interface's link state has changed to down

  • user-login – a user has logged into the system

  • user-logout – a user has logged out of the system

  • health-module-status – health module status

  • temperature-too-high – temperature has risen too high

  • low-power – low power supply

  • low-power-recover – low power supply recover

  • insufficient-power – insufficient power supply

  • power-redundancy-mismatch – power redundancy mismatch

  • insufficient-fans – insufficient amount of fans in system

  • insufficient-fans-recover – insufficient amount of fans in system recovered

  • asic-chip-down – ASIC (chip) down

  • internal-bus-error – internal bus (I2C) error

  • internal-link-speed-mismatch – internal links speed mismatch

Default

No events are enabled

Configuration Mode

config

History

3.1.0000

Example

switch (config) # email notify event process-crash

Related Commands

email autosupport event
show email
show email events

Notes

This does not affect auto-support emails. Auto-support can be disabled overall, but if it is enabled, all auto-support events are sent as emails.

email notify recipient

email notify recipient <email-addr> [class {info | failure} | detail]
no email notify recipient <email-addr> [class {info | failure} | detail]

Adds an email address to the list of addresses to which to send email notifications of events.
The no form of the command removes an email address from the list of addresses to which to send email notifications of events.

Syntax Description

email-addr

Email address of intended recipient.

class

Specifies which types of events are sent to this recipient.

info

Sends informational events to this recipient.

failure

Sends failure events to this recipient.

detail

Sends detailed event emails to this recipient.

Default

N/A

Configuration Mode

config

History

3.1.0000

Example

switch (config) # email notify recipient user2@autosupport.mydomain.com

Related Commands

show email

Notes

email return-addr

email return-addr <username>
no email return-addr

Sets the username or fully-qualified return address from which email notifications are sent.

  • If the string provided contains an “@” character, it is considered to be fully-qualified and used as-is.

  • Otherwise, it is treated as a username, and “@<hostname>.<domain>” is appended. The default username is “do-not-reply”; it can be changed (for example, to “admin”) if a mail server along the path rejects fictitious addresses.

The no form of the command resets this attribute to its default.

Syntax Description

username

Username

Default

do-not-reply

Configuration Mode

config

History

3.1.0000

Example

switch (config) # email return-addr user1

Related Commands

show email

Notes

email return-host

email return-host
no email return-host

Includes the hostname in the return address for emails.
The no form of the command does not include the hostname in the return address for emails.

Syntax Description

N/A

Default

No return host

Configuration Mode

config

History

3.1.0000

Example

switch (config) # no email return-host

Related Commands

show email

Notes

This only takes effect if the return address does not contain an “@” character
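The return-address rules described for email return-addr, email domain, and email return-host can be summarized in a short Python sketch. It follows the stated rules: an address containing “@” is used as-is; otherwise the hostname and domain are appended. Two details are assumptions, not documented behavior: that an unset domain is simply omitted (consistent with the “do-not-reply@<hostname>” reply address in the show email example), and that a bare username is returned when neither hostname nor domain applies. The function and parameter names are hypothetical.

```python
def reply_address(return_addr="do-not-reply", hostname="", domain="",
                  include_host=True):
    """Compose the address that notification emails appear to come from."""
    if "@" in return_addr:
        return return_addr  # fully qualified: used as-is
    parts = []
    if include_host and hostname:
        parts.append(hostname)  # "email return-host" enabled
    if domain:
        parts.append(domain)  # "email domain" override, if configured
    if not parts:
        return return_addr  # assumption: nothing to append
    return f"{return_addr}@{'.'.join(parts)}"
```

For example, with hostname “switch” and domain “my_domain”, the username “user1” would become “user1@switch.my_domain” under these assumptions.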

email send-test

email send-test

Sends a test email to all configured event and failure recipients.

Syntax Description

N/A

Default

N/A

Configuration Mode

config

History

3.1.0000

Example

switch (config) # email send-test

Related Commands

show email [events]

Notes

email ssl mode

email ssl mode {none | tls | tls-none}
no email ssl mode

Sets the security mode(s) to try for sending email.
The no form of the command resets the email SSL mode to its default.

Syntax Description

none

No security mode, operates in plaintext

tls

Attempts to use TLS on the regular mailhub port, with STARTTLS. If this fails, it gives up.

tls-none

Attempts to use TLS on the regular mailhub port, with STARTTLS. If this fails, it falls back on plaintext.

Default

tls-none

Configuration Mode

config

History

3.2.3000

Example

switch (config) # email ssl mode tls-none

Related Commands

show email

Notes
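The difference between the three security modes is easiest to see as control flow. The following Python sketch models only the decision logic described in the syntax table. The two callables, send_tls and send_plain, are hypothetical stand-ins for "deliver using STARTTLS" and "deliver in plaintext" and are expected to raise an exception on failure; this is not the switch's actual implementation.

```python
def deliver(mode, send_tls, send_plain):
    """Try the transports allowed by `mode`; return the name of the one used."""
    if mode == "none":
        send_plain()  # no security mode: plaintext only
        return "plaintext"
    if mode not in ("tls", "tls-none"):
        raise ValueError(f"unknown mode: {mode}")
    try:
        send_tls()
        return "tls"
    except Exception:
        if mode == "tls":
            raise  # tls: give up, no plaintext fallback
    send_plain()  # tls-none: fall back on plaintext
    return "plaintext"
```

With mode “tls” a TLS failure means the email is not sent, while “tls-none” trades confidentiality for delivery by retrying in plaintext.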

email ssl cert-verify

email ssl cert-verify
no email ssl cert-verify

Enables verification of SSL/TLS server certificates for email.
The no form of the command disables verification of SSL/TLS server certificates for email.

Syntax Description

N/A

Default

N/A

Configuration Mode

config

History

3.2.3000

Example

switch (config) # email ssl cert-verify

Related Commands

show email

Notes

This command has no impact unless TLS is used.

email ssl ca-list

email ssl ca-list {<ca-list-name> | default-ca-list | none}
no email ssl ca-list

Specifies the list of supplemental certificate authority (CA) certificates, if any, from the certificate configuration database to be used for verifying server certificates when sending email using TLS.
The no form of the command uses no list of supplemental certificates.

Syntax Description

ca-list-name

Specifies CA list name

default-ca-list

Uses default supplemental CA certificate list

none

Uses no list of supplemental certificates

Default

default-ca-list

Configuration Mode

config

History

3.2.3000

Example

switch (config) # email ssl ca-list none

Related Commands

show email

Notes

This command has no impact unless TLS is used, and certificate verification is enabled.

show email

show email

Displays the email configuration.

Syntax Description

N/A

Default

N/A

Configuration Mode

Any command mode

History

3.1.0000

Example

switch (config) # show email
Mail hub: 10.0.8.70
Mail hub port: 25
Domain override:
Return address: do-not-reply
Include hostname in return address: yes

Current reply address: do-not-reply@<hostname>

Security mode: tls-none
Verify server cert: yes
Supplemental CA list: default-ca-list

Dead letter settings:
Save dead.letter files: yes
Dead letter max age: 14 days

Email notification recipients:
No recipients configured.

Autosupport emails
Enabled: no
Recipient:
Mail hub:
Security mode: tls-none
Verify server cert: yes
Supplemental CA list: default-ca-list

Related Commands

Notes

show email events

show email events

Displays list of events for which notification emails are sent.

Syntax Description

N/A

Default

N/A

Configuration Mode

Any command mode

History

3.1.0000

Example

switch (config) # show email events
Failure events for which emails will be sent:
expected-shutdown: Expected system shutdown
process-crash: A process in the system has crashed
unexpected-shutdown: Unexpected system shutdown

Informational events for which emails will be sent:
asic-chip-down: ASIC (Chip) Down
cpu-util-high: CPU utilization has risen too high
cpu-util-ok: CPU utilization has fallen back to normal levels
disk-io-high: Disk I/O per second has risen too high
disk-io-ok: Disk I/O per second has fallen back to acceptable levels
disk-space-low: Filesystem free space has fallen too low
disk-space-ok: Filesystem free space is back in the normal range
health-module-status: Health module Status
insufficient-fans: Insufficient amount of fans in system
insufficient-fans-recover: Insufficient amount of fans in system recovered
insufficient-power: Insufficient power supply
internal-bus-error: Internal bus (I2C) Error
internal-link-speed-mismatch: Internal links speed mismatch
liveness-failure: A process in the system was detected as hung
low-power: Low power supply
low-power-recover: Low power supply Recover
memusage-high: Memory usage has risen too high
memusage-ok: Memory usage has fallen back to acceptable levels
netusage-high: Network utilization has risen too high
netusage-ok: Network utilization has fallen back to acceptable levels
paging-high: Paging activity has risen too high
paging-ok: Paging activity has fallen back to normal levels
power-redundancy-mismatch: Power redundancy mismatch
process-exit: A process in the system unexpectedly exited
sm-restart: Subnet Manager restarted for parameter change
sm-start: Subnet Manager started
sm-stop: Subnet Manager stopped
temperature-too-high: Temperature has risen too high
unexpected-cluster-join: A node has unexpectedly joined the cluster
unexpected-cluster-leave: A node has unexpectedly left the cluster
unexpected-cluster-size: The number of nodes in the cluster is unexpected

All events for which autosupport emails will be sent:
liveness-failure: A process in the system was detected as hung
process-crash: A process in the system has crashed

Related Commands

Notes
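The output of show email events is regular enough to be parsed by a script, for example to audit which events are enabled. The following Python sketch assumes the layout shown in the example: section headings end with a colon, and each entry has the form "name: description". The function name is hypothetical, and the layout may differ between software versions.

```python
def parse_email_events(output):
    """Group 'name: description' entries under their section headings."""
    sections = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # Section heading, e.g. "Failure events for which emails will be sent:"
            current = sections.setdefault(line[:-1], {})
        elif current is not None and ": " in line:
            name, description = line.split(": ", 1)
            current[name] = description
    return sections
```

The echoed prompt line is skipped because it appears before any section heading.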

© Copyright 2023, NVIDIA. Last updated on Sep 8, 2023.