Appendix D: Platform-Dependent Workarounds

Some Grace platforms require temporary (or permanent) alterations to their configurations to work around known issues, such as hardware errata. These workarounds are described in the following sections by the corresponding Grace platform.

Note

Unless mentioned here, later releases of RHEL 9 do not require platform-dependent workarounds.

D.1 All Grace Platforms

  • The RHEL 9.2 installation media does not carry a patch that is required to resolve an issue with the ast driver that is used to interface with the AST2600 BMC.

    The absence of this patch can manifest as a variety of issues, including kernel hangs and distorted output from the on-board VGA port. Until the system is installed and running with kernel version 5.14.0-284.30.1.el9_2 or later (available through the RHEL 9.2.z update stream), a temporary workaround is required on all Grace platforms to avoid undefined behavior (refer to RHSA-2023:5069 - Security Advisory for more information). As a side effect of this workaround, the on-board VGA port is inaccessible, so a serial console solution (for example, SOL) must be used for console access to the system.

    • To temporarily deploy this workaround for the duration of the current boot:

      1. During boot, stop at the grub menu, select the boot entry, and press the e key to edit the entry.

      2. Append modprobe.blacklist=ast to the end of the list of kernel boot parameters.

      3. Boot the entry by pressing Ctrl-X or F10.

    • To permanently deploy this workaround so that it is always active upon boot:

      1. With administrative privileges, run the following command to add a boot parameter to all kernels:

        grubby --update-kernel=ALL --args="modprobe.blacklist=ast"
        
      2. Reboot the system.

    • To permanently remove this workaround so that it is no longer active upon boot:

      1. With administrative privileges, run the following command to remove the boot parameter from all kernels:

        grubby --update-kernel=ALL --remove-args="modprobe.blacklist=ast"
        
      2. Reboot the system.

    • To verify the presence of the workaround, complete one of the following options:

      • Evaluate the kernel boot parameters that were set for the current boot by running the following command:

        cat /proc/cmdline | grep ast
        

        When nothing is returned, the workaround is not active.

      • Alternatively, evaluate the loaded modules for the current boot by running the following command:

        lsmod | grep ast
        

        When nothing is returned, the workaround is active.

    Note

    When this workaround is applied using the temporary deployment method from the RHEL Installer, it is automatically included in the installed system.

    Refer to Red Hat Modifying Kernel Boot Parameters for additional guidance about modifying kernel boot parameters.
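The `grep ast` checks above match any line that merely contains the substring `ast`, so a stricter comparison can help when the verification is scripted. The following is a minimal sketch rather than an official tool: the helper names `cmdline_has_param` and `module_is_listed` are hypothetical, and the sketch assumes that boot parameters are separated by single spaces and that the module name is the first column of the `lsmod` output.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if parameter $1 appears as a whole,
# space-separated token in the command-line string $2.
cmdline_has_param() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: read `lsmod` output on stdin and succeed only if a
# module named exactly $1 is listed in the first column.
module_is_listed() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# Report the state of the ast workaround for the current boot.
if [ -r /proc/cmdline ]; then
    if cmdline_has_param "modprobe.blacklist=ast" "$(cat /proc/cmdline)"; then
        echo "modprobe.blacklist=ast is set for the current boot"
    else
        echo "modprobe.blacklist=ast is not set for the current boot"
    fi
fi
```

The module check can be run the same way, for example `lsmod | module_is_listed ast && echo "ast is loaded"`.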

  • Due to a firmware bug that can cause an invalid memory access in the kernel, Grace platforms from some vendors might experience a crash during the installation or at boot time.

    This issue can impact any version of RHEL that is supported on the Grace platform. Until this issue is resolved, a workaround is required to allow a successful boot.

    • To temporarily deploy this workaround for the duration of the current boot:

      1. During boot, stop at the grub menu, select the desired boot entry, and press the e key to edit the entry.

      2. Append video=simplefb:off to the end of the list of kernel boot parameters.

      3. Boot the entry by pressing Ctrl-X or F10.

    • To permanently deploy this workaround so that it is always active upon boot:

      1. With administrative privileges, run the following command to add a boot parameter to all kernels:

        grubby --update-kernel=ALL --args="video=simplefb:off"
        
      2. Reboot the system.

    • To verify the presence of the workaround, evaluate the kernel boot parameters set for the current boot by running the following command:

      cat /proc/cmdline | grep simplefb
      

      When nothing is returned, the workaround is not active.

    Note

    When this workaround is applied using the temporary deployment method from the RHEL Installer, it is automatically included in the installed system.

    Refer to Red Hat Modifying Kernel Boot Parameters for additional guidance about modifying kernel boot parameters.
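`/proc/cmdline` only reflects the current boot. To check whether a parameter is configured persistently for every installed kernel, the output of `grubby --info=ALL` can be inspected instead. The sketch below is illustrative only: the helper name `kernels_missing_param` is hypothetical, and it assumes that each boot entry in the output has a `kernel="..."` line followed by an `args="..."` line.

```shell
#!/bin/sh
# Hypothetical helper: read `grubby --info=ALL` output on stdin and print the
# kernel path of every boot entry whose args= line lacks the parameter $1.
# Assumes each entry has a kernel="..." line followed by an args="..." line.
kernels_missing_param() {
    awk -v param="$1" '
        /^kernel=/ {
            kernel = $0
            gsub(/^kernel="|"$/, "", kernel)   # strip the key and the quotes
        }
        /^args=/ {
            args = $0
            gsub(/^args="|"$/, "", args)
            n = split(args, tok, " ")          # split on whitespace
            found = 0
            for (i = 1; i <= n; i++) if (tok[i] == param) found = 1
            if (!found) print kernel
        }
    '
}

# Example (run with administrative privileges on the target system):
#   grubby --info=ALL | kernels_missing_param "video=simplefb:off"
```

An empty result suggests that every boot entry already carries the parameter.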

  • RHEL 9.3 and 9.4 kernels include a TPM patch that can cause kernel crashes with varying stack traces.

    The fix for this issue is available starting with kernel 5.14.0-427.30.1.el9_4. NVIDIA recommends that you update to this kernel version or later.
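To check whether the running kernel is at or above the version that carries the fix, the release string can be compared in version order. This is a minimal sketch under stated assumptions: the helper name `version_ge` is hypothetical, it relies on the version ordering of GNU `sort -V`, and it assumes the `uname -r` string has the form `<version>.el9_<minor>.<arch>` so that the architecture suffix can be trimmed before comparing.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version string $1 sorts at or after $2.
# Relies on the version ordering implemented by GNU `sort -V`.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="5.14.0-427.30.1.el9_4"
# Trim everything after the ".el9_N" part (for example ".aarch64+64k").
running=$(uname -r | sed -E 's/(\.el[0-9_]+).*/\1/')

if version_ge "$running" "$required"; then
    echo "running kernel $running is at or above $required"
else
    echo "running kernel $running is older than $required"
fi
```

The comparison is only meaningful on RHEL 9.3 and 9.4 kernels; other distributions use different release string formats.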

  • RHEL does not support NVIDIA BlueField-3 with the in-box driver.

    • Starting with RHEL 9.4, Red Hat added the BF3 card to the disabled hardware list, and kernel modules for devices on this list will not be loaded. For example, when a system has a BF3, the following kernel messages are displayed:

      [ 3.590881] Warning: Disabled Hardware is detected:
      mlx5_core:15B3:A2DC @ 0000:01:00.0 is no longer enabled in this
      release.
      [ 3.590892] mlx5_core: probe of 0000:01:00.0 failed with error -13
      
    • A device that is present in the disabled hardware list is not allowed to be configured. This is accomplished by failing the driver’s probe entry point with EACCES (-13). In the mlx5_core module, this causes the following crash to occur when the system is rebooted or shut down:

      [46609.598966] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
      [46609.608623] Mem abort info:
      [46609.612101] ESR = 0x0000000096000005
      [46609.616546] EC = 0x25: DABT (current EL), IL = 32 bits
      [46609.622591] SET = 0, FnV = 0
      [46609.626297] EA = 0, S1PTW = 0
      [46609.630084] FSC = 0x05: level 1 translation fault
      [46609.635656] Data abort info:
      [46609.639177] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
      [46609.645383] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      [46609.651125] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
      [46609.657142] user pgtable: 64k pages, 48-bit VAs,pgdp=000010012c34de00
      [46609.664382] [0000000000000000] pgd=0000000000000000,p4d=0000000000000000, pud=0000000000000000
      [46609.673850] Internal error: Oops: 0000000096000005 [#1] SMP
      
      [46609.680099] Modules linked in: binfmt_misc nvidia_uvm(OE)
      nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
      nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct rfkill nft_chain_nat
      nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables
      nfnetlink vfat fat nvidia_vgpu_vfio(OE) nvidia_drm(OE)
      nvidia_modeset(OE) ast nvidia(OE) mlx5_ib drm_shmem_helper video
      i2c_algo_bit drm_kms_helper acpi_ipmi fb_sys_fops ib_uverbs ipmi_ssif
      syscopyarea i2c_smbus ipmi_devintf sysfillrect ib_core spi_nor
      sysimgblt arm_cspmu_module arm_spe_pmu ipmi_msghandler mtd
      coresight_stm coresight_tmc stm_core coresight_funnel cppc_cpufreq
      coresight fuse drm xfs libcrc32c crct10dif_ce mlx5_core ghash_ce
      sha2_ce sha256_arm64 sha1_ce mlxfw nvme sbsa_gwdt psample nvme_core
      tls nvme_common pci_hyperv_intf spi_tegra210_quad acpi_power_meter
      rndis_host cdc_ether usbnet mii dm_mirror dm_region_hash dm_log
      dm_mod
      
      [46609.763629] CPU: 72 PID: 83142 Comm: reboot Kdump: loaded Tainted:
      G OE ------- --- 5.14.0-427.13.1.el9_4.aarch64+64k #1
      
      [46609.776816] Hardware name: Supermicro Super Server/G2DMH-GI, BIOS 2.0 07/16/2024
      [46609.785016] pstate: 63400009 (nZCv daif +PAN UAO +TCO +DIT -SSBS BTYPE=-)
      [46609.792771] pc : shutdown+0x20/0x84 [mlx5_core]
      [46609.798101] lr : pci_device_shutdown+0x38/0x70
      [46609.803257] sp : ffff8000c694fa70
      [46609.807221] x29: ffff8000c694fa70 x28: ffff100043fb5d00 x27: 0000000000000000
      [46609.815087] x26: ffff8000813d6158 x25: 0000000000000001 x24: ffff8000826d0050
      [46609.822944] x23: ffff00009401a140 x22: ffff800082a1d7b0 x21: ffff00009400d0c0
      [46609.830801] x20: ffff800003417580 x19: 0000000000000000 x18: ffffffffffffffff
      [46609.838657] x17: 0000000000000000 x16: ffff800080e7d604 x15: ffff80014694f86d
      [46609.846515] x14: 0000000000000029 x13: 0000000000000000 x12: ffff100020031840
      [46609.854371] x11: 0000000000000003 x10: 0000000000000002 x9: ffff8000807f98e8
      [46609.862209] x8 : 000000000000ffff x7: ffff8000c694f8a0 x6: 0000000000000402
      [46609.870035] x5 : ffffffc0002497e0 x4: ffff8000c694f8a0 x3: dead000000000122
      [46609.877844] x2 : 0000000000000000 x1: ffff8000033d89f8 x0: ffff00009401a000
      [46609.885633] Call trace:
      [46609.888599] shutdown+0x20/0x84 [mlx5_core]
      [46609.893383] pci_device_shutdown+0x38/0x70
      [46609.898024] device_shutdown+0x12c/0x230
      [46609.902484] kernel_restart+0x4c/0x94
      [46609.906673] _do_sys_reboot+0x228/0x250
      [46609.911115] __arm64_sys_reboot+0x28/0x30
      [46609.915646] invoke_syscall.constprop.0+0x7c/0xd0
      
  • Starting with RHEL 9.4 kernel version 5.14.0-427.18.1.el9_4, Red Hat moved BF3 from the disabled hardware list to the unmaintained hardware list.

    The unmaintained list allows modules to be probed and used, but with the understanding that it is unsupported. Here is an example of the kernel message that is displayed when unmaintained hardware is detected:

    [49.945691] Warning: Unmaintained Hardware is detected: mlx5_core:15B3:A2DC @ 0006:03:00.0
    

    To avoid the crash during a reboot or a shutdown, NVIDIA recommends that you update to RHEL kernel 5.14.0-427.18.1.el9_4 or later. You can also use an out-of-box driver that supports BF3 because RHEL bypasses the disabled and unmaintained hardware logic when a device is supported by an installed out-of-box driver. MLNX_OFED 24.04 and later is an example of an out-of-box driver that supports BF3.
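To see which of the two lists applies on a given boot, the kernel log can be searched for the warning strings shown above. The following sketch is illustrative only: the helper name `classify_hw_warnings` is hypothetical, and it matches the warning text for any device, not only BF3.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print which hardware
# warning, if any, it contains: "disabled", "unmaintained", or "none".
classify_hw_warnings() {
    log=$(cat)
    case $log in
        *"Disabled Hardware is detected"*) echo "disabled" ;;
        *"Unmaintained Hardware is detected"*) echo "unmaintained" ;;
        *) echo "none" ;;
    esac
}

# Example (reading the kernel log may require administrative privileges):
#   dmesg | classify_hw_warnings
```

A result of `disabled` corresponds to the case in which the mlx5_core probe fails and the crash on reboot or shutdown can occur.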

  • NVIDIA is aware of an issue when running RHEL on Grace platforms where some cpufreq policy sysfs nodes fail to enumerate.

    When this happens, the following message is printed to the kernel log:

    cpufreq: cpufreq_online: ->get() failed
    

    There will also be at least one missing cpufreq “policy” directory that corresponds to the CPUs that failed to enumerate, which prevents the CPU governor policy from being changed for these CPUs.

    /sys/devices/system/cpu/cpufreq/policy##
    

    To work around this issue and refresh the sysfs nodes, reload the cppc_cpufreq module:

    sudo rmmod cppc_cpufreq
    sudo modprobe cppc_cpufreq
    
  • Some Grace systems with NVIDIA BlueField-3 might see the following message in the kernel log at boot time:

    [0.739556] pci_bus 0006:00: Some PCI device resources are unassigned, try booting with pci=realloc
    

    The following commands can be used to confirm that the PCIe BAR was not assigned:

    # Print the PCI BDF for the BlueField-3 SoC Management Interface
    lspci | grep "BlueField-3 SoC Management Interface" | awk '{print $1}'
    0006:03:00.2

    # Use the BDF to verify whether the BAR was assigned
    lspci -v -s "0006:03:00.2" | grep "Memory at"
    Memory at <ignored> (64-bit, prefetchable) [disabled]
    

    To avoid this issue, NVIDIA recommends that you add the pci=realloc kernel parameter. To permanently deploy this workaround so that it is always active upon boot:

    1. With administrative privileges, run the following command to add a boot parameter to all kernels:

      grubby --update-kernel=ALL --args="pci=realloc"
      
    2. Reboot the system.
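The two manual lspci steps above can be combined so that the BDF does not need to be copied by hand, which is convenient for re-checking after the reboot. The following is a sketch under stated assumptions: the helper names `first_bdf_matching` and `has_unassigned_bar` are hypothetical, and an unassigned BAR is assumed to appear as `Memory at <ignored>`, as in the sample output above.

```shell
#!/bin/sh
# Hypothetical helper: read `lspci -D` output on stdin and print the BDF of
# the first device whose description contains the substring $1.
first_bdf_matching() {
    awk -v name="$1" 'index($0, name) { print $1; exit }'
}

# Hypothetical helper: read `lspci -v` output on stdin and succeed if any
# memory BAR is reported as unassigned ("Memory at <ignored>").
has_unassigned_bar() {
    grep -q 'Memory at <ignored>'
}

if command -v lspci >/dev/null 2>&1; then
    # -D always prints the PCI domain, so the BDF can be passed back to -s.
    bdf=$(lspci -D | first_bdf_matching "BlueField-3 SoC Management Interface")
    if [ -z "$bdf" ]; then
        echo "No BlueField-3 SoC Management Interface was found"
    elif lspci -v -s "$bdf" | has_unassigned_bar; then
        echo "$bdf: a memory BAR was not assigned"
    else
        echo "$bdf: no unassigned memory BAR was reported"
    fi
fi
```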

D.2 Multi-Socket Grace Platforms

The RHEL 9.2 installation media does not carry a patch that is required to work around NVIDIA hardware erratum T241-FABRIC-4. This erratum affects Grace systems with three- and four-socket configurations, for example, Grace Hopper x4. Until the system is installed and running with kernel version 5.14.0-284.30.1.el9_2 or later (available through the RHEL 9.2.z update stream), a temporary workaround is required on impacted Grace platforms to avoid undefined behavior (refer to RHSA-2023:5069 - Security Advisory for more information).

  • The temporary workaround restricts the system to a single-socket configuration, which reduces a four-socket server to a quarter of its total compute capacity.

    Caution

    Be careful when profiling a Grace Hopper x4 system with this temporary workaround.

  • To temporarily deploy this workaround for the duration of the current boot:

    1. During boot, stop at the grub menu, select the desired boot entry, and press the e key to edit the entry.

    2. Append nr_cpus=72 to the end of the list of kernel boot parameters.

    3. Boot the entry by pressing Ctrl-X or F10.

  • To permanently deploy this workaround so that it is always active upon boot:

    1. With administrative privileges, run the following command to add a boot parameter to all kernels:

      grubby --update-kernel=ALL --args="nr_cpus=72"
      
    2. Reboot the system.

  • To permanently remove this workaround, so that it is not active upon boot:

    1. With administrative privileges, run the following command to remove the boot parameter from all kernels:

      grubby --update-kernel=ALL --remove-args="nr_cpus=72"
      
    2. Reboot the system.

  • To verify the presence of the workaround, complete one of the following tasks:

    • Evaluate the kernel boot parameters set for the current boot.

      cat /proc/cmdline | grep nr_cpus
      

      When nothing is returned, the workaround is not active.

    • Evaluate the CPU configuration for the current boot.

      lscpu | grep -E "NUMA node[0-9]+"
      

      Only one of the nodes should be populated (with CPUs 0-71).

    Note

    When this workaround is applied using the temporary deployment method from the RHEL Installer, it is automatically included in the installed system.

    Refer to Red Hat Modifying Kernel Boot Parameters for additional guidance about modifying kernel boot parameters.
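The two verification steps above can be combined into one report that shows the configured nr_cpus limit next to the number of CPUs that are actually online. This is a minimal sketch: the helper name `nr_cpus_limit` is hypothetical, and it assumes that boot parameters are separated by spaces.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first nr_cpus= parameter found
# in the command-line string $1, or nothing if the parameter is absent.
nr_cpus_limit() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^nr_cpus=//p' | head -n 1
}

if [ -r /proc/cmdline ]; then
    limit=$(nr_cpus_limit "$(cat /proc/cmdline)")
    online=$(getconf _NPROCESSORS_ONLN)
    if [ -n "$limit" ]; then
        echo "nr_cpus=$limit is set; $online CPUs are online"
    else
        echo "nr_cpus is not set; $online CPUs are online"
    fi
fi
```

With the workaround active, the limit and the online CPU count are both expected to be 72.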