Details
Bug
Resolution: Done
Critical
None
7.0
None
HP DL360 Gen9 (Xeon E5-2620 v3), XenServer 7 (build 125380c)
Linux PV guests (Debian, Linux 3.16 kernel)
Description
I have been seeing random crashes on domU machines lately. This has happened on three different guests, about three times per week. They are all Linux PV guests running Debian with either the 3.16.36-1+deb8u1 or the 3.16.7-ckt25-2+deb8u3 kernel. HVM guests seem to work fine.
I haven't found a clear trigger for the crashes, so I cannot reproduce them reliably; they just happen at random intervals. After a crash, XenServer restarts the machine and it works normally until the next crash.
In /var/log/xen/hypervisor.log I get entries like:
[2016-10-19 09:59:53] (XEN) [2393005.780063] Pagetable walk from fffffffffffffff8:
[2016-10-19 09:59:53] (XEN) [2393005.780067] L4[0x1ff] = 0000001482a16067 0000000000001816
[2016-10-19 09:59:53] (XEN) [2393005.780071] L3[0x1ff] = 0000001482a18067 0000000000001818
[2016-10-19 09:59:53] (XEN) [2393005.780074] L2[0x1ff] = 0000000000000000 ffffffffffffffff
[2016-10-19 09:59:53] (XEN) [2393005.780085] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
[2016-10-19 09:59:53] (XEN) [2393005.780088] Domain 439 (vcpu#0) crashed on cpu#1:
[2016-10-19 09:59:53] (XEN) [2393005.780092] ----[ Xen-4.6.1-xs128153 x86_64 debug=n Not tainted ]----
[2016-10-19 09:59:53] (XEN) [2393005.780094] CPU: 1
[2016-10-19 09:59:53] (XEN) [2393005.780097] RIP: e033:[<ffffffff8151aca3>]
[2016-10-19 09:59:53] (XEN) [2393005.780099] RFLAGS: 0000000000010246 EM: 1 CONTEXT: pv guest (d439v0)
[2016-10-19 09:59:53] (XEN) [2393005.780104] rax: 0000000000000004 rbx: 0000000000000003 rcx: 0000000018331000
[2016-10-19 09:59:53] (XEN) [2393005.780106] rdx: 000000000000002f rsi: 0000000000000000 rdi: 0000000000000000
[2016-10-19 09:59:53] (XEN) [2393005.780108] rbp: 000000001831b4f4 rsp: 0000000000000000 r8: 0000000000000000
[2016-10-19 09:59:53] (XEN) [2393005.780110] r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
[2016-10-19 09:59:53] (XEN) [2393005.780112] r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
[2016-10-19 09:59:53] (XEN) [2393005.780114] r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000001526e0
[2016-10-19 09:59:53] (XEN) [2393005.780116] cr3: 00000013b4f71000 cr2: 000000000000b888
[2016-10-19 09:59:53] (XEN) [2393005.780119] ds: 002b es: 002b fs: 0000 gs: 003f ss: e02b cs: e033
[2016-10-19 09:59:53] (XEN) [2393005.780121] Guest stack trace from rsp=0000000000000000:
[2016-10-19 09:59:53] (XEN) [2393005.780123] Stack empty.
All of the crashes seem to occur somewhere in create_bounce_frame:
[2016-09-25 01:34:40] (XEN) [289092.579668] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
[2016-09-26 15:55:46] (XEN) [427158.510171] domain_crash_sync called from entry.S: fault at ffff82d080236a44 entry.o#create_bounce_frame+0xbc/0x13a
[2016-09-27 11:09:31] (XEN) [496384.070132] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
[2016-09-28 11:37:28] (XEN) [584460.662464] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
[2016-10-04 03:48:18] (XEN) [1074710.520232] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
[2016-10-05 05:09:32] (XEN) [1165984.610292] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
[2016-10-05 13:18:36] (XEN) [1195328.410385] domain_crash_sync called from entry.S: fault at ffff82d080236a1b entry.o#create_bounce_frame+0x93/0x13a
[2016-10-12 15:31:42] (XEN) [1808114.799880] domain_crash_sync called from entry.S: fault at ffff82d080236a44 entry.o#create_bounce_frame+0xbc/0x13a
[2016-10-14 01:10:48] (XEN) [1929260.360172] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
[2016-10-14 15:15:47] (XEN) [1979959.800093] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
[2016-10-17 04:04:10] (XEN) [2198862.070361] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
[2016-10-19 09:59:53] (XEN) [2393005.780085] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
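
In case it helps with triage, here is a rough Python sketch for tallying which offsets inside create_bounce_frame the faults land on. It assumes the "domain_crash_sync" lines always have the shape shown above; the function names (tally_fault_sites, symbol_base) are my own and are not part of any Xen tooling.

```python
import re
import sys
from collections import Counter

# Matches the tail of a "domain_crash_sync" line as quoted in this report, e.g.
#   fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
# The exact line shape is an assumption based on the excerpts above.
CRASH_RE = re.compile(
    r"fault at (?P<addr>[0-9a-f]{16}) "
    r"(?P<obj>\S+)#(?P<sym>\w+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def tally_fault_sites(lines):
    """Count crashes per (symbol, offset) pair."""
    counts = Counter()
    for line in lines:
        if "domain_crash_sync" not in line:
            continue
        m = CRASH_RE.search(line)
        if m:
            counts[(m["sym"], int(m["off"], 16))] += 1
    return counts


def symbol_base(addr_hex, offset):
    """Start address of the symbol: fault address minus the offset."""
    return int(addr_hex, 16) - offset


if __name__ == "__main__":
    # Log path taken from this report; pass another path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/xen/hypervisor.log"
    with open(path, errors="replace") as fh:
        counts = tally_fault_sites(fh)
    for (sym, off), n in counts.most_common():
        print(f"{sym}+{off:#x}: {n}")
```

Run against the twelve lines listed above, this should group the crashes into three fault sites (+0x66, +0x93 and +0xbc), with +0x66 by far the most common.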