megacli and megaclisas-status kill controller's FW #136

@romeor

Hello,
I've installed megacli and megaclisas-status from your repository and ran into an issue with my hardware. First, my HW:

Linux pve2 5.19.17-1-pve #1 SMP PREEMPT_DYNAMIC PVE 5.19.17-1 (Mon, 14 Nov 2022 20:25:12  x86_64 GNU/Linux
18:00.0 RAID bus controller: Broadcom / LSI MegaRAID 12GSAS/PCIe Secure SAS39xx

The RAID controller is a 3916, to be precise, running the latest FW:

Firmware Package Build = 52.22.0-4571
Firmware Version = 5.220.02-3691
PSOC FW Version = 0x0017
PSOC Part Number = 15987-231-8GB
NVDATA Version = 5.2200.21-0585
CBB Version = 23.25.01.00
Bios Version = 7.22.00.0_0x07160300
HII Version = 07.22.03.00
HIIA Version = 07.22.03.00
Driver Name = megaraid_sas
Driver Version = 07.719.03.00-rc1


System Information
        Manufacturer: Supermicro
        Product Name: SYS-110P-WTR

The issue: as soon as I ran

megacli -AdpAllInfo -aALL or megaclisas-status (or the periodic run of megaclisas-statusd)

my system froze for a while. I could neither read from nor write to disk, and dmesg was full of these errors:

[ 1661.722811] megaraid_sas 0000:18:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[ 1661.722829] megaraid_sas 0000:18:00.0: FW in FAULT state Fault code:0x10000 subcode:0x0 func:megasas_wait_for_outstanding_fusion
[ 1661.722848] megaraid_sas 0000:18:00.0: resetting fusion adapter scsi0.
[ 1661.723202] megaraid_sas 0000:18:00.0: Outstanding fastpath IOs: 4
[ 1668.382749] megaraid_sas 0000:18:00.0: Waiting for FW to come to ready state
[ 1691.286479] megaraid_sas 0000:18:00.0: FW now in Ready state
[ 1691.286483] megaraid_sas 0000:18:00.0: FW now in Ready state
[ 1691.286684] megaraid_sas 0000:18:00.0: Current firmware supports maximum commands: 5101       LDIO threshold: 0
[ 1691.286687] megaraid_sas 0000:18:00.0: Performance mode :Balanced (latency index = 8)
[ 1691.286688] megaraid_sas 0000:18:00.0: FW supports sync cache        : Yes
[ 1691.286691] megaraid_sas 0000:18:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[ 1691.398489] megaraid_sas 0000:18:00.0: FW supports atomic descriptor : Yes
[ 1693.890459] megaraid_sas 0000:18:00.0: FW provided supportMaxExtLDs: 1       max_lds: 240
[ 1693.890471] megaraid_sas 0000:18:00.0: controller type       : MR(8192MB)
[ 1693.890476] megaraid_sas 0000:18:00.0: Online Controller Reset(OCR)  : Enabled
[ 1693.890479] megaraid_sas 0000:18:00.0: Secure JBOD support   : Yes
[ 1693.890482] megaraid_sas 0000:18:00.0: NVMe passthru support : Yes
[ 1693.890484] megaraid_sas 0000:18:00.0: FW provided TM TaskAbort/Reset timeout        : 6 secs/60 secs
[ 1693.890485] megaraid_sas 0000:18:00.0: JBOD sequence map support     : Yes
[ 1693.890486] megaraid_sas 0000:18:00.0: PCI Lane Margining support    : Yes
[ 1701.562362] megaraid_sas 0000:18:00.0: megasas_get_ld_map_info DCMD timed out, RAID map is disabled
[ 1708.170289] megaraid_sas 0000:18:00.0: Waiting for FW to come to ready state
[ 1728.026073] megaraid_sas 0000:18:00.0: FW now in Ready state
[ 1728.026077] megaraid_sas 0000:18:00.0: FW now in Ready state
[ 1728.026300] megaraid_sas 0000:18:00.0: Current firmware supports maximum commands: 5101       LDIO threshold: 0
[ 1728.026303] megaraid_sas 0000:18:00.0: Performance mode :Balanced (latency index = 8)
[ 1728.026304] megaraid_sas 0000:18:00.0: FW supports sync cache        : Yes
[ 1728.026306] megaraid_sas 0000:18:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[ 1728.402068] megaraid_sas 0000:18:00.0: FW supports atomic descriptor : Yes
[ 1728.550065] megaraid_sas 0000:18:00.0: FW provided supportMaxExtLDs: 1       max_lds: 240
[ 1728.550068] megaraid_sas 0000:18:00.0: controller type       : MR(8192MB)
[ 1728.550069] megaraid_sas 0000:18:00.0: Online Controller Reset(OCR)  : Enabled
[ 1728.550070] megaraid_sas 0000:18:00.0: Secure JBOD support   : Yes
[ 1728.550071] megaraid_sas 0000:18:00.0: NVMe passthru support : Yes
[ 1728.550072] megaraid_sas 0000:18:00.0: FW provided TM TaskAbort/Reset timeout        : 6 secs/60 secs
[ 1728.550074] megaraid_sas 0000:18:00.0: JBOD sequence map support     : Yes
[ 1728.550074] megaraid_sas 0000:18:00.0: PCI Lane Margining support    : Yes
[ 1736.149985] megaraid_sas 0000:18:00.0: megasas_get_ld_map_info DCMD timed out, RAID map is disabled
[ 1742.837909] megaraid_sas 0000:18:00.0: Waiting for FW to come to ready state
[ 1762.581695] megaraid_sas 0000:18:00.0: FW now in Ready state
[ 1762.581700] megaraid_sas 0000:18:00.0: FW now in Ready state
[ 1762.581901] megaraid_sas 0000:18:00.0: Current firmware supports maximum commands: 5101       LDIO threshold: 0
[ 1762.581904] megaraid_sas 0000:18:00.0: Performance mode :Balanced (latency index = 8)
[ 1762.581905] megaraid_sas 0000:18:00.0: FW supports sync cache        : Yes
[ 1762.581907] megaraid_sas 0000:18:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[ 1762.985689] megaraid_sas 0000:18:00.0: FW supports atomic descriptor : Yes
[ 1763.145688] megaraid_sas 0000:18:00.0: FW provided supportMaxExtLDs: 1       max_lds: 240
[ 1763.145690] megaraid_sas 0000:18:00.0: controller type       : MR(8192MB)
[ 1763.145692] megaraid_sas 0000:18:00.0: Online Controller Reset(OCR)  : Enabled
[ 1763.145693] megaraid_sas 0000:18:00.0: Secure JBOD support   : Yes
[ 1763.145694] megaraid_sas 0000:18:00.0: NVMe passthru support : Yes
[ 1763.145695] megaraid_sas 0000:18:00.0: FW provided TM TaskAbort/Reset timeout        : 6 secs/60 secs
[ 1763.145697] megaraid_sas 0000:18:00.0: JBOD sequence map support     : Yes
[ 1763.145698] megaraid_sas 0000:18:00.0: PCI Lane Margining support    : Yes
[ 1763.145699] megaraid_sas 0000:18:00.0: return -EBUSY from megasas_refire_mgmt_cmd 4362 cmd 0x5 opcode 0x10b0100
[ 1763.145732] megaraid_sas 0000:18:00.0: return -EBUSY from megasas_mgmt_fw_ioctl 8408 cmd 0x5 opcode 0x10b0100 cmd->cmd_status_drv 0x3
[ 1763.145782] megaraid_sas 0000:18:00.0: waiting for controller reset to finish
[ 1763.205697] megaraid_sas 0000:18:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000
[ 1763.205984] megaraid_sas 0000:18:00.0: Adapter is OPERATIONAL for scsi:0
[ 1763.206131] megaraid_sas 0000:18:00.0: Snap dump wait time   : 15
[ 1763.206132] megaraid_sas 0000:18:00.0: Reset successful for scsi0.
[ 1763.206295] megaraid_sas 0000:18:00.0: 10672 (722633074s/0x0020/DEAD) - Fatal firmware error: Line 188 in fw\raid\utils.c

[ 1763.206572] megaraid_sas 0000:18:00.0: 10675 (722633081s/0x0020/CRIT) - Controller encountered an error and was reset
[ 1763.211401] megaraid_sas 0000:18:00.0: scanning for scsi0...
[ 1763.211666] megaraid_sas 0000:18:00.0: 10719 (722633106s/0x0020/DEAD) - Fatal firmware error: Line 188 in fw\raid\utils.c

[ 1763.211963] megaraid_sas 0000:18:00.0: 10722 (722633113s/0x0020/CRIT) - Controller encountered an error and was reset
[ 1763.218960] megaraid_sas 0000:18:00.0: scanning for scsi0...
[ 1763.221603] megaraid_sas 0000:18:00.0: 10765 (722633133s/0x0020/DEAD) - Fatal firmware error: Line 188 in fw\raid\utils.c

[ 1763.221742] megaraid_sas 0000:18:00.0: 10768 (722633140s/0x0020/CRIT) - Controller encountered an error and was reset
[ 1763.226380] megaraid_sas 0000:18:00.0: scanning for scsi0...

Nothing like this happens with megaraidsas-status or with the latest storcli, which I got from the Broadcom site.

Could you please fix this or add storcli? An Ubuntu package is available from the Broadcom site: https://www.broadcom.com/products/storage/raid-controllers/megaraid-9560-16i
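
In case it helps with reproducing this, the reset loop in the dmesg output above can be condensed with a small script. This is only a rough sketch: the `summarize` function and the event labels are hypothetical names made up for illustration, and the message fragments it looks for are copied from the log in this report, so other driver or firmware versions may word them differently.

```python
import re

# Matches kernel log lines such as
# "[ 1661.722829] megaraid_sas 0000:18:00.0: FW in FAULT state ..."
LINE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+megaraid_sas\s+(?P<dev>\S+):\s+(?P<msg>.*)$"
)

# Message fragments copied from the log in this report.
# The labels on the right are hypothetical, chosen only for this sketch.
MARKERS = {
    "FW in FAULT state": "fault",
    "Waiting for FW to come to ready state": "wait_ready",
    "DCMD timed out": "dcmd_timeout",
    "Fatal firmware error": "fatal_fw_error",
    "Reset successful": "reset_ok",
}


def summarize(lines):
    """Count known megaraid_sas events and measure the time they span."""
    counts = {label: 0 for label in MARKERS.values()}
    first_ts = None
    last_ts = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a megaraid_sas line with a timestamp
        ts = float(match.group("ts"))
        if first_ts is None:
            first_ts = ts
        last_ts = ts
        for fragment, label in MARKERS.items():
            if fragment in match.group("msg"):
                counts[label] += 1
    span = 0.0 if first_ts is None else last_ts - first_ts
    return {"counts": counts, "span_seconds": span}
```

Saving the dmesg output to a file and passing the open file object to `summarize` returns the per-event counts and the number of seconds between the first and last megaraid_sas lines, which makes it easier to compare a run of megacli against a run of storcli.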
