XenServer Org / XSO-810

XenServer crashes and reboots due to a bnx2x driver error

    Details

    • Type: Bug
    • Status: Done
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 7.2
    • Fix Version/s: 8.0
    • Component/s: Dom0, Driver, Networking
    • Labels:
      None
    • Environment:

      XenServer 6.5, 7.0, 7.1 and 7.2

      Broadcom Limited NetXtreme II BCM57800 1/10 Gigabit Ethernet

      Dell PowerEdge R730xd

    • Team:
    • Internal JIRA Reference:
      CA-285272

      Description

      We are having a problem with the bnx2x driver for the Broadcom Limited NetXtreme II BCM57800 1/10 Gigabit Ethernet card in Dell PowerEdge R730xd servers.

      We tested XenServer 6.5, 7.0, 7.1 and 7.2, and the problem occurred in every version. We also downgraded the firmware for the tests on XenServer 7.1, and the problem persisted.

      The problem occurs when a VM that uses the bnx2x-driven network interface in NAT mode acts as the network gateway. This is the only way we have been able to reproduce it.

      /var/log/kern.log

       

      [ 29.804048] bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.714.1 ($DateTime: 2016/10/20 21:22:05 $)

      Oct 16 16:10:26 SRV01 kernel: [1022316.276658] -----------[ cut here ]-----------
      Oct 16 16:10:26 SRV01 kernel: [1022316.276679] WARNING: CPU: 8 PID: 0 at net/sched/sch_generic.c:306 dev_watchdog+0x193/0x260()
      Oct 16 16:10:26 SRV01 kernel: [1022316.276682] NETDEV WATCHDOG: eth0 (bnx2x): transmit queue 24 timed out
      Oct 16 16:10:26 SRV01 kernel: [1022316.276685] Modules linked in: tun bnx2fc(O) cnic(O) uio fcoe libfcoe libfc scsi_transport_fc openvswitch nf_defrag_ipv6 8021q garp mrp stp llc ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_multiport xt_conntrack nf_conntrack iptable_filter dm_multipath ipmi_devintf x86_pkg_temp_thermal coretemp crc32_pclmul aesni_intel mxm_wmi dcdbas dm_mod sg aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper lpc_ich sb_edac edac_core shpchp mfd_core ipmi_si ipmi_msghandler hed tpm_tis tpm wmi nls_utf8 isofs nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip_tables x_tables sd_mod usb_storage tg3 bnx2x(O) mdio ahci libcrc32c libahci vxlan ip6_udp_tunnel ehci_pci udp_tunnel libata ptp ehci_hcd pps_core megaraid_sas(O) scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod ipv6 autofs4
      Oct 16 16:10:26 SRV01 kernel: [1022316.276750] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G O 4.4.0+2 #1
      Oct 16 16:10:26 SRV01 kernel: [1022316.276752] Hardware name: Dell Inc. PowerEdge R730xd/072T6D, BIOS 2.4.3 01/17/2017
      Oct 16 16:10:26 SRV01 kernel: [1022316.276754] 0000000000000000 ffff88018d503db0 ffffffff8131abe3 ffff88018d503df8
      Oct 16 16:10:26 SRV01 kernel: [1022316.276756] ffffffff8186cd01 ffff88018d503de8 ffffffff81071d1e 0000000000000018
      Oct 16 16:10:26 SRV01 kernel: [1022316.276758] ffff88006e478000 000000000000005b ffff88006e488780 0000000000000008
      Oct 16 16:10:26 SRV01 kernel: [1022316.276761] Call Trace:
      Oct 16 16:10:26 SRV01 kernel: [1022316.276763] <IRQ> [<ffffffff8131abe3>] dump_stack+0x63/0x90
      Oct 16 16:10:26 SRV01 kernel: [1022316.276779] [<ffffffff81071d1e>] warn_slowpath_common+0x9e/0xc0
      Oct 16 16:10:26 SRV01 kernel: [1022316.276783] [<ffffffff81071d8c>] warn_slowpath_fmt+0x4c/0x50
      Oct 16 16:10:26 SRV01 kernel: [1022316.276788] [<ffffffff81506283>] dev_watchdog+0x193/0x260
      Oct 16 16:10:26 SRV01 kernel: [1022316.276790] [<ffffffff815060f0>] ? dev_deactivate_queue.constprop.34+0x60/0x60
      Oct 16 16:10:26 SRV01 kernel: [1022316.276798] [<ffffffff810ceaaf>] call_timer_fn+0x5f/0x140
      Oct 16 16:10:26 SRV01 kernel: [1022316.276806] [<ffffffff815060f0>] ? dev_deactivate_queue.constprop.34+0x60/0x60
      Oct 16 16:10:26 SRV01 kernel: [1022316.276810] [<ffffffff810d0210>] run_timer_softirq+0x220/0x2a0
      Oct 16 16:10:26 SRV01 kernel: [1022316.276816] [<ffffffff81076199>] __do_softirq+0x129/0x290
      Oct 16 16:10:26 SRV01 kernel: [1022316.276821] [<ffffffff810764d2>] irq_exit+0x42/0x90
      Oct 16 16:10:26 SRV01 kernel: [1022316.276827] [<ffffffff813c5285>] xen_evtchn_do_upcall+0x35/0x50
      Oct 16 16:10:26 SRV01 kernel: [1022316.276835] [<ffffffff815a256e>] xen_do_hypervisor_callback+0x1e/0x40
      Oct 16 16:10:26 SRV01 kernel: [1022316.276839] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
      Oct 16 16:10:26 SRV01 kernel: [1022316.276852] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
      Oct 16 16:10:26 SRV01 kernel: [1022316.276860] [<ffffffff8100c460>] ? xen_safe_halt+0x10/0x20
      Oct 16 16:10:26 SRV01 kernel: [1022316.276867] [<ffffffff81020d27>] ? default_idle+0x57/0xf0
      Oct 16 16:10:26 SRV01 kernel: [1022316.276872] [<ffffffff8102145f>] ? arch_cpu_idle+0xf/0x20
      Oct 16 16:10:26 SRV01 kernel: [1022316.276879] [<ffffffff810aad52>] ? default_idle_call+0x32/0x40
      Oct 16 16:10:26 SRV01 kernel: [1022316.276884] [<ffffffff810aafac>] ? cpu_startup_entry+0x1ec/0x330
      Oct 16 16:10:26 SRV01 kernel: [1022316.276891] [<ffffffff81013c18>] ? cpu_bringup_and_idle+0x18/0x20
      Oct 16 16:10:26 SRV01 kernel: [1022316.276897] --[ end trace 7185181c3b7a6807 ]--
      Oct 16 16:10:26 SRV01 kernel: [1022316.276971] ULP_STOP
      Oct 16 16:10:29 SRV01 kernel: [1022319.284616] bnx2fc: ERROR:bnx2fc_destroy_timer - Destroy compl not received!!
      Oct 16 16:10:29 SRV01 kernel: [1022319.321021] bnx2x: [bnx2x_stats_comp:211(eth0)]timeout waiting for stats finished
      Oct 16 16:10:29 SRV01 kernel: [1022319.340708] bnx2x: [bnx2x_stats_comp:211(eth0)]timeout waiting for stats finished
      Oct 16 16:10:31 SRV01 kernel: [1022321.338043] [bnx2x_clean_tx_queue:1610(eth0)]timeout waiting for queue[0]: txdata->tx_pkt_prod(10870) != txdata->tx_pkt_cons(10867)
      Oct 16 16:10:33 SRV01 kernel: [1022323.342897] [bnx2x_clean_tx_queue:1610(eth0)]timeout waiting for queue[24]: txdata->tx_pkt_prod(21080) != txdata->tx_pkt_cons(21078)
      Oct 16 16:10:35 SRV01 kernel: [1022325.346875] [bnx2x_clean_tx_queue:1610(eth0)]timeout waiting for queue[0]: txdata->tx_pkt_prod(10870) != txdata->tx_pkt_cons(10867)
      Oct 16 16:10:37 SRV01 kernel: [1022327.346693] [bnx2x_clean_tx_queue:1610(eth0)]timeout waiting for queue[24]: txdata->tx_pkt_prod(21080) != txdata->tx_pkt_cons(21078)
      Oct 16 16:10:47 SRV01 kernel: [1022337.358143] [bnx2x_state_wait:339(eth0)]timeout waiting for state 0
      Oct 16 16:10:47 SRV01 kernel: [1022337.358159] bnx2x: [bnx2x_del_all_macs:9363(eth0)]Failed to delete MACs: -16
      Oct 16 16:10:47 SRV01 kernel: [1022337.358169] bnx2x: [bnx2x_chip_cleanup:10192(eth0)]Failed to schedule DEL commands for UC MACs list: -16
      Oct 16 16:10:57 SRV01 kernel: [1022347.365685] [bnx2x_state_wait:339(eth0)]timeout waiting for state 9
      Oct 16 16:11:07 SRV01 kernel: [1022357.372322] [bnx2x_state_wait:339(eth0)]timeout waiting for state 2
      Oct 16 16:11:07 SRV01 kernel: [1022357.372337] bnx2x: [bnx2x_func_stop:9963(eth0)]FUNC_STOP ramrod failed. Running a dry transaction
      Oct 16 16:11:08 SRV01 kernel: [1022357.716180] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:08 SRV01 kernel: [1022357.716196] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:08 SRV01 kernel: [1022357.914792] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:08 SRV01 kernel: [1022357.914800] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:08 SRV01 kernel: [1022358.113492] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:08 SRV01 kernel: [1022358.113503] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:08 SRV01 kernel: [1022358.312791] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:08 SRV01 kernel: [1022358.312798] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:08 SRV01 kernel: [1022358.511390] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:08 SRV01 kernel: [1022358.511396] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:09 SRV01 kernel: [1022358.710778] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:09 SRV01 kernel: [1022358.710784] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:09 SRV01 kernel: [1022358.909658] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:09 SRV01 kernel: [1022358.909663] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:09 SRV01 kernel: [1022359.108643] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:09 SRV01 kernel: [1022359.108649] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:09 SRV01 kernel: [1022359.308253] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:09 SRV01 kernel: [1022359.308260] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:09 SRV01 kernel: [1022359.507128] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:09 SRV01 kernel: [1022359.507134] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:10 SRV01 kernel: [1022359.706759] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:10 SRV01 kernel: [1022359.706765] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:10 SRV01 kernel: [1022359.905790] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:10 SRV01 kernel: [1022359.905795] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:10 SRV01 kernel: [1022360.104738] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:10 SRV01 kernel: [1022360.104744] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:10 SRV01 kernel: [1022360.304349] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:10 SRV01 kernel: [1022360.304356] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:10 SRV01 kernel: [1022360.503290] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:10 SRV01 kernel: [1022360.503295] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:11 SRV01 kernel: [1022360.702854] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:11 SRV01 kernel: [1022360.702860] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:11 SRV01 kernel: [1022360.901803] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:11 SRV01 kernel: [1022360.901809] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:11 SRV01 kernel: [1022361.100929] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:11 SRV01 kernel: [1022361.100935] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:11 SRV01 kernel: [1022361.300654] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:11 SRV01 kernel: [1022361.300660] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:11 SRV01 kernel: [1022361.499581] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:11 SRV01 kernel: [1022361.499587] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:12 SRV01 kernel: [1022361.699250] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:12 SRV01 kernel: [1022361.699255] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:12 SRV01 kernel: [1022361.898170] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:12 SRV01 kernel: [1022361.898176] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
      Oct 16 16:11:12 SRV01 kernel: [1022362.097206] bnx2x: [bnx2x_issue_dmae_with_comp:762(eth0)]DMAE timeout!
      Oct 16 16:11:12 SRV01 kernel: [1022362.097211] bnx2x: [bnx2x_write_dmae:823(eth0)]DMAE returned failure -1
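
To make an excerpt like the one above easier to triage, here is a minimal Python sketch that summarizes the bnx2x messages in a kernel log. It is not part of the original report. The function name `summarize` and the regular expressions are illustrative assumptions, derived only from the message formats visible in this excerpt; other driver versions may word these messages differently. The backlog is a plain producer-minus-consumer difference and ignores counter wrap-around.

```python
import errno
import re
from collections import Counter

# Patterns below are assumptions based on the messages in this excerpt only.
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>\w+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)
TX_STUCK_RE = re.compile(
    r"timeout waiting for queue\[(?P<queue>\d+)\]: "
    r"txdata->tx_pkt_prod\((?P<prod>\d+)\) != "
    r"txdata->tx_pkt_cons\((?P<cons>\d+)\)"
)
# Matches the "[function:line(iface)]" tag that bnx2x prefixes to its messages.
DRIVER_TAG_RE = re.compile(r"\[(?P<func>bnx2x_\w+):\d+\((?P<iface>\w+)\)\]")
# Matches "Failed to ...: -N" at the end of a line (N is treated as an errno).
ERRNO_RE = re.compile(r"Failed to [^:]+: -(?P<code>\d+)\s*$")


def summarize(lines):
    """Return a dict summarizing watchdog, TX backlog and driver errors."""
    summary = {
        "watchdog": [],          # (interface, driver, queue) tuples
        "tx_backlog": {},        # queue index -> last seen prod - cons
        "functions": Counter(),  # driver function name -> message count
        "errnos": {},            # numeric code -> symbolic errno name
    }
    for line in lines:
        m = WATCHDOG_RE.search(line)
        if m:
            summary["watchdog"].append(
                (m["iface"], m["driver"], int(m["queue"]))
            )
        m = TX_STUCK_RE.search(line)
        if m:
            # Packets queued for transmit but never completed by the NIC.
            backlog = int(m["prod"]) - int(m["cons"])
            summary["tx_backlog"][int(m["queue"])] = backlog
        m = DRIVER_TAG_RE.search(line)
        if m:
            summary["functions"][m["func"]] += 1
        m = ERRNO_RE.search(line)
        if m:
            code = int(m["code"])
            summary["errnos"][code] = errno.errorcode.get(code, "unknown")
    return summary


if __name__ == "__main__":
    # Path taken from the report; adjust for the host being inspected.
    with open("/var/log/kern.log", errors="replace") as fh:
        result = summarize(fh)
    for key, value in result.items():
        print(key, value)
```

Run against the excerpt above, a summary of this kind should show the watchdog firing on transmit queue 24 of eth0, unfinished packets on queues 0 and 24, and the `-16` return codes decoded to `EBUSY` on Linux.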

       

       

        Attachments

        1. dmesg.txt
          226 kB
        2. kern.log
          92 kB
        3. SECH85.txt
          99 kB

            People

            • Assignee:
              Unassigned
            • Reporter:
              joaoreis João Reis
            • Votes:
              2
            • Watchers:
              13
