XenServer Org / XSO-808

XenServer 7.2 Kernel uses old bnx2x driver that causes major packet loss

    Details

    • Type: Bug
    • Status: To Do
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 7.2
    • Fix Version/s: None
    • Component/s: Dom0, Networking
    • Labels:
      None
    • Team:
    • Internal JIRA Reference:
      CA-285272

      Description

      XenServer 7.2 includes an old version of the bnx2x kernel module - 1.714.1.

      This module version can trigger a serious bug in the Broadcom firmware used on many HP, Dell and QLogic network cards, resulting in major, intermittent packet loss and network disruption after booting.

      HP recommends a minimum bnx2x module version of 7.12.37.1-1 or newer.

      We hit this issue when upgrading from XenServer 7 to 7.2 earlier this week, and it nearly destroyed our whole pool because the packet loss went undetected by both XenServer and the rolling pool upgrade process.

      I suggest:

      • Immediately issue a hotfix to upgrade the bnx2x module version to 7.12.37.1-1 or newer.
      • Investigate adding health checks between the pool master and slaves for packet loss or network connectivity issues on the management network, at least during a rolling pool upgrade if not during normal operation.

      For further information, see my blog post: https://smcleod.net/tech/Drop-Broadcom/

      The driver included with XenServer 7.2 that triggers the problem is 1.714.1:

      filename:       /lib/modules/4.4.0+10/updates/bnx2x.ko
      version:        1.714.1
      license:        GPL
      description:    QLogic BCM57710/57711/57711E/57712/57712_MF/57800/57800_MF/57810/57810_MF/57840/57840_MF Driver
      author:         Eliezer Tamir
      srcversion:     927337210F53311B18D0D7E
      alias:          pci:v000014E4d0000163Fsv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000163Esv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000163Dsv*sd*bc*sc*i*
      alias:          pci:v00001077d000016ADsv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016ADsv*sd*bc*sc*i*
      alias:          pci:v00001077d000016A4sv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A4sv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016ABsv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016AFsv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A2sv*sd*bc*sc*i*
      alias:          pci:v00001077d000016A1sv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A1sv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000168Dsv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016AEsv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000168Esv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A9sv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A5sv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000168Asv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000166Fsv*sd*bc*sc*i*
      alias:          pci:v000014E4d00001663sv*sd*bc*sc*i*
      alias:          pci:v000014E4d00001662sv*sd*bc*sc*i*
      alias:          pci:v000014E4d00001650sv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000164Fsv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000164Esv*sd*bc*sc*i*
      depends:        mdio,libcrc32c,ptp,vxlan
      vermagic:       4.4.0+10 SMP mod_unload modversions
      parm:           pri_map: Priority to HW queue mapping (uint)
      parm:           num_queues: Set number of queues (default is as a number of CPUs) (int)
      parm:           disable_iscsi_ooo: Disable iSCSI OOO support (uint)
      parm:           disable_tpa: Disable the TPA (LRO) feature (uint)
      parm:           int_mode: Force interrupt mode other than MSI-X (1 INT#x; 2 MSI) (uint)
      parm:           dropless_fc: Pause on exhausted host ring (uint)
      parm:           poll: Use polling (for debug) (uint)
      parm:           mrrs: Force Max Read Req Size (0..3) (for debug) (int)
      parm:           debug: Default debug msglevel (uint)
      parm:           num_vfs: Number of supported virtual functions (0 means SR-IOV is disabled) (uint)
      parm:           autogreeen: Set autoGrEEEn (0:HW default; 1:force on; 2:force off) (uint)
      parm:           native_eee:int
      parm:           eee:set EEE Tx LPI timer with this value; 0: HW default; -1: Force disable EEE.
      parm:           tx_switching: Enable tx-switching (uint)

      By contrast, XenServer 7.0 ships driver version 1.713.04, which does not seem to trigger the issue:

      filename:       /lib/modules/3.10.0+10/extra/bnx2x.ko
      version:        1.713.04
      license:        GPL
      description:    QLogic BCM57710/57711/57711E/57712/57712_MF/57800/57800_MF/57810/57810_MF/57840/57840_MF Driver
      author:         Eliezer Tamir
      srcversion:     13EAA521200A40118055D63
      alias:          pci:v000014E4d0000163Fsv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000163Esv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000163Dsv*sd*bc*sc*i*
      alias:          pci:v00001077d000016ADsv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016ADsv*sd*bc*sc*i*
      alias:          pci:v00001077d000016A4sv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A4sv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016ABsv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016AFsv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A2sv*sd*bc*sc*i*
      alias:          pci:v00001077d000016A1sv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A1sv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000168Dsv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016AEsv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000168Esv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A9sv*sd*bc*sc*i*
      alias:          pci:v000014E4d000016A5sv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000168Asv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000166Fsv*sd*bc*sc*i*
      alias:          pci:v000014E4d00001663sv*sd*bc*sc*i*
      alias:          pci:v000014E4d00001662sv*sd*bc*sc*i*
      alias:          pci:v000014E4d00001650sv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000164Fsv*sd*bc*sc*i*
      alias:          pci:v000014E4d0000164Esv*sd*bc*sc*i*
      depends:        mdio,libcrc32c,ptp
      vermagic:       3.10.0+10 SMP mod_unload modversions
      parm:           pri_map: Priority to HW queue mapping (uint)
      parm:           num_queues: Set number of queues (default is as a number of CPUs) (int)
      parm:           disable_iscsi_ooo: Disable iSCSI OOO support (uint)
      parm:           disable_tpa: Disable the TPA (LRO) feature (uint)
      parm:           int_mode: Force interrupt mode other than MSI-X (1 INT#x; 2 MSI) (uint)
      parm:           dropless_fc: Pause on exhausted host ring (uint)
      parm:           poll: Use polling (for debug) (uint)
      parm:           mrrs: Force Max Read Req Size (0..3) (for debug) (int)
      parm:           debug: Default debug msglevel (uint)
      parm:           num_vfs: Number of supported virtual functions (0 means SR-IOV is disabled) (uint)
      parm:           autogreeen: Set autoGrEEEn (0:HW default; 1:force on; 2:force off) (uint)
      parm:           native_eee:int
      parm:           eee:set EEE Tx LPI timer with this value; 0: HW default; -1: Force disable EEE.
      parm:           tx_switching: Enable tx-switching (uint)
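      To check which hosts in a pool carry the affected module, the version field can be pulled out of `modinfo bnx2x` and compared with the version reported above. The following is a rough sketch, assuming Python 3 is available on the host being checked; the helper names and the `KNOWN_BAD_VERSIONS` set are hypothetical and contain only the one version this report identifies as problematic.

```python
import subprocess

# Only the version named in this report; extend with care.
KNOWN_BAD_VERSIONS = {"1.714.1"}


def version_from_modinfo(modinfo_output: str) -> str:
    """Extract the `version:` field from modinfo output (one field per line)."""
    for line in modinfo_output.splitlines():
        key, sep, value = line.partition(":")
        # Exact match so that `srcversion:` is not picked up by mistake.
        if sep and key.strip() == "version":
            return value.strip()
    raise ValueError("no version field in modinfo output")


def is_affected(version: str) -> bool:
    """True if the version is one this report identifies as problematic."""
    return version in KNOWN_BAD_VERSIONS


def installed_bnx2x_version() -> str:
    """Ask modinfo for the bnx2x module installed on this host."""
    out = subprocess.run(
        ["modinfo", "bnx2x"],
        capture_output=True, text=True, check=True,
    ).stdout
    return version_from_modinfo(out)
```

      On a host with the module installed, `is_affected(installed_bnx2x_version())` would flag the 7.2 driver shown above and pass the 7.0 driver. The check is a plain string match, so it makes no claim about any other version.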

       

            People

            • Assignee: Simon Crowe (simoncro)
            • Reporter: Sam McLeod (s_mcleod)
            • Votes: 6
            • Watchers: 14
