Details
Bug
Resolution: Incomplete
Major
None
7.3
None
Supermicro microblade MBI-6219G-T
https://www.supermicro.com/products/MicroBlade/module/MBI-6219G-T.cfm
(issue is not switch dependent)
Description
Hi there!
On XS 7.2 (with patch XS72E015 applied) we very often hit a bond flapping problem with the igb driver.
The problem occurs from time to time, and we cannot reproduce it on demand.
We have this problem almost daily...
What happens?
The igb driver hangs on one of the network interfaces and stops working (sometimes the link flaps UP/DOWN) until we run modprobe -r igb followed by modprobe igb. After that everything works again. A server reboot also helps.
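The workaround described above can be sketched as a small shell function. This is only an illustration, not a tested fix: the function name `reload_igb` and the `DRY_RUN` variable are made up for this example. Unloading igb takes down every interface that uses the driver, so it is best run from a local or out-of-band console rather than over one of the affected links.

```shell
#!/bin/sh
# Hypothetical helper: reload the igb module as described in the report.
# DRY_RUN (an assumption of this sketch) defaults to 1, so the commands
# are only printed; set DRY_RUN=0 to actually run them as root.
reload_igb() {
    for cmd in "modprobe -r igb" "modprobe igb"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: $cmd"
        else
            # Intentionally unquoted so the command string splits into words.
            $cmd || return 1
        fi
    done
}
```

Calling `reload_igb` with no settings only prints the two commands; `DRY_RUN=0 reload_igb` would perform the actual reload.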
[root@xen ~]# ethtool -i eth0
driver: igb
version: 5.3.5.3
firmware-version: 1.63, 0x800009fd
bus-info: 0000:01:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
DMESG:
[Mon Feb 19 20:24:23 2018] -----------[ cut here ]-----------
[Mon Feb 19 20:24:23 2018] WARNING: CPU: 7 PID: 0 at net/sched/sch_generic.c:306 dev_watchdog+0x193/0x260()
[Mon Feb 19 20:24:23 2018] NETDEV WATCHDOG: eth0 (igb): transmit queue 0 timed out
[Mon Feb 19 20:24:23 2018] Modules linked in: tun nfsv3 nfs fscache iptable_filter openvswitch nf_defrag_ipv6 nf_conntrack libcrc32c 8021q garp mrp stp llc dm_multipath ipmi_devintf x86_pkg_temp_thermal coretemp crc32_pclmul dm_mod aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul sg glue_helper i2c_i801 shpchp
ipmi_si ipmi_msghandler video tpm_tis tpm nls_utf8 isofs nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip_tables x_tables hid_generic usbhid hid raid1 md_mod sd_mod ahci libahci libata xhci_pci igb(O) xhci_hcd ptp pps_core scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod ipv6 autofs4
[Mon Feb 19 20:24:23 2018] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G O 4.4.0+10 #1
[Mon Feb 19 20:24:23 2018] Hardware name: Supermicro Super Server/B2SS2-F, BIOS 2.0a 06/10/2017
[Mon Feb 19 20:24:23 2018] 0000000000000000 ffff880087fc3db0 ffffffff8131bb63 ffff880087fc3df8
[Mon Feb 19 20:24:23 2018] ffffffff8186eb0a ffff880087fc3de8 ffffffff81071d6e 0000000000000000
[Mon Feb 19 20:24:23 2018] ffff880080d7c000 0000000000000010 ffff880080d7b700 0000000000000007
[Mon Feb 19 20:24:23 2018] Call Trace:
[Mon Feb 19 20:24:23 2018] <IRQ> [<ffffffff8131bb63>] dump_stack+0x63/0x90
[Mon Feb 19 20:24:23 2018] [<ffffffff81071d6e>] warn_slowpath_common+0x9e/0xc0
[Mon Feb 19 20:24:23 2018] [<ffffffff81071ddc>] warn_slowpath_fmt+0x4c/0x50
[Mon Feb 19 20:24:23 2018] [<ffffffff815083e3>] dev_watchdog+0x193/0x260
[Mon Feb 19 20:24:23 2018] [<ffffffff81508250>] ? dev_deactivate_queue.constprop.34+0x60/0x60
[Mon Feb 19 20:24:23 2018] [<ffffffff810ceb4f>] call_timer_fn+0x5f/0x140
[Mon Feb 19 20:24:23 2018] [<ffffffff81508250>] ? dev_deactivate_queue.constprop.34+0x60/0x60
[Mon Feb 19 20:24:23 2018] [<ffffffff810d02b0>] run_timer_softirq+0x220/0x2a0
[Mon Feb 19 20:24:23 2018] [<ffffffff810761e9>] __do_softirq+0x129/0x290
[Mon Feb 19 20:24:23 2018] [<ffffffff81076522>] irq_exit+0x42/0x90
[Mon Feb 19 20:24:23 2018] [<ffffffff813c64d5>] xen_evtchn_do_upcall+0x35/0x50
[Mon Feb 19 20:24:23 2018] [<ffffffff815a4dae>] xen_do_hypervisor_callback+0x1e/0x40
[Mon Feb 19 20:24:23 2018] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[Mon Feb 19 20:24:23 2018] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[Mon Feb 19 20:24:23 2018] [<ffffffff8100c460>] ? xen_safe_halt+0x10/0x20
[Mon Feb 19 20:24:23 2018] [<ffffffff81020d67>] ? default_idle+0x57/0xf0
[Mon Feb 19 20:24:23 2018] [<ffffffff8102149f>] ? arch_cpu_idle+0xf/0x20
[Mon Feb 19 20:24:23 2018] [<ffffffff810aadb2>] ? default_idle_call+0x32/0x40
[Mon Feb 19 20:24:23 2018] [<ffffffff810ab00c>] ? cpu_startup_entry+0x1ec/0x330
[Mon Feb 19 20:24:23 2018] [<ffffffff81013c18>] ? cpu_bringup_and_idle+0x18/0x20
[Mon Feb 19 20:24:23 2018] --[ end trace 2e96dee36a582c18 ]--
[Mon Feb 19 20:24:24 2018] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[Mon Feb 19 20:24:24 2018] igb 0000:01:00.1 eth1: igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
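Since the problem cannot be reproduced on demand, it may help to check the kernel log for the watchdog message shown above. The following is a minimal sketch, assuming the message keeps the exact wording seen in this trace; the function name `watchdog_ifaces` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print each
# interface that reported an igb transmit-queue timeout, once per name.
watchdog_ifaces() {
    sed -n 's/.*NETDEV WATCHDOG: \([^ ]*\) (igb): transmit queue [0-9]* timed out.*/\1/p' | sort -u
}
```

It could be used as `dmesg | watchdog_ifaces`, for example from a periodic job that records when an interface hit the timeout.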
The latest igb version is now 5.3.5.15 (12-19-2017):
https://sourceforge.net/projects/e1000/files/igb%20stable/
It is unclear whether upgrading will help, though.
The problem is also described here:
https://sourceforge.net/p/e1000/bugs/549/?limit=25
https://discussions.citrix.com/topic/385033-xen-70-p27-intel-network-flapping-in-lacpslb-bond/
Without a bond it crashes only very rarely.