XenServer Org / XSO-630

Random domU crashes on create_bounce_frame

    Details

    • Type: Bug
    • Status: Done
    • Priority: Critical
    • Resolution: Done
    • Affects Version/s: 7.0
    • Fix Version/s: None
    • Component/s: VM Lifecycle
    • Labels: None
    • Environment:

      HP DL360 Gen9 (Xeon E5-2620 v3), XenServer 7 (build 125380c)
      Linux PV guests (Debian, Linux 3.16 kernel)

      Description

      I have been seeing random crashes on domU machines lately. This has happened on three different guests, about three times per week. They are all Linux PV guests running Debian with either the 3.16.36-1+deb8u1 or the 3.16.7-ckt25-2+deb8u3 kernel. HVM guests seem to work fine.

      I haven't found a clear trigger for the crashes and thus cannot reproduce them reliably; they just happen at random intervals. After a crash, XenServer restarts the guest, which then works normally until the next crash.

      In /var/log/xen/hypervisor.log I get entries like:

      [2016-10-19 09:59:53] (XEN) [2393005.780063] Pagetable walk from fffffffffffffff8:
      [2016-10-19 09:59:53] (XEN) [2393005.780067]  L4[0x1ff] = 0000001482a16067 0000000000001816
      [2016-10-19 09:59:53] (XEN) [2393005.780071]  L3[0x1ff] = 0000001482a18067 0000000000001818
      [2016-10-19 09:59:53] (XEN) [2393005.780074]  L2[0x1ff] = 0000000000000000 ffffffffffffffff 
      [2016-10-19 09:59:53] (XEN) [2393005.780085] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
      [2016-10-19 09:59:53] (XEN) [2393005.780088] Domain 439 (vcpu#0) crashed on cpu#1:
      [2016-10-19 09:59:53] (XEN) [2393005.780092] ----[ Xen-4.6.1-xs128153  x86_64  debug=n  Not tainted ]----
      [2016-10-19 09:59:53] (XEN) [2393005.780094] CPU:    1
      [2016-10-19 09:59:53] (XEN) [2393005.780097] RIP:    e033:[<ffffffff8151aca3>]
      [2016-10-19 09:59:53] (XEN) [2393005.780099] RFLAGS: 0000000000010246   EM: 1   CONTEXT: pv guest (d439v0)
      [2016-10-19 09:59:53] (XEN) [2393005.780104] rax: 0000000000000004   rbx: 0000000000000003   rcx: 0000000018331000
      [2016-10-19 09:59:53] (XEN) [2393005.780106] rdx: 000000000000002f   rsi: 0000000000000000   rdi: 0000000000000000
      [2016-10-19 09:59:53] (XEN) [2393005.780108] rbp: 000000001831b4f4   rsp: 0000000000000000   r8:  0000000000000000
      [2016-10-19 09:59:53] (XEN) [2393005.780110] r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
      [2016-10-19 09:59:53] (XEN) [2393005.780112] r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
      [2016-10-19 09:59:53] (XEN) [2393005.780114] r15: 0000000000000000   cr0: 0000000080050033   cr4: 00000000001526e0
      [2016-10-19 09:59:53] (XEN) [2393005.780116] cr3: 00000013b4f71000   cr2: 000000000000b888
      [2016-10-19 09:59:53] (XEN) [2393005.780119] ds: 002b   es: 002b   fs: 0000   gs: 003f   ss: e02b   cs: e033
      [2016-10-19 09:59:53] (XEN) [2393005.780121] Guest stack trace from rsp=0000000000000000:
      [2016-10-19 09:59:53] (XEN) [2393005.780123]   Stack empty.
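
For readers decoding the "Pagetable walk" lines, here is a minimal sketch of how a 64-bit virtual address splits into 4-level x86-64 page-table indices. The helper name `pagetable_indices` is made up for illustration and is not part of Xen. It shows that fffffffffffffff8 maps to index 0x1ff at every level, matching the L4/L3/L2 slots in the log, where the walk stops at an L2 entry of zero. That address is also what rsp=0 minus 8 wraps to in 64-bit arithmetic, which may be related to the empty guest stack reported in the dump, though that is only an observation and not a confirmed diagnosis.

```python
MASK64 = (1 << 64) - 1


def pagetable_indices(vaddr):
    """Split a 64-bit virtual address into 4-level paging indices.

    Hypothetical helper for illustration; each level uses 9 bits,
    and the low 12 bits are the offset within a 4 KiB page.
    """
    vaddr &= MASK64
    return {
        "L4": (vaddr >> 39) & 0x1FF,
        "L3": (vaddr >> 30) & 0x1FF,
        "L2": (vaddr >> 21) & 0x1FF,
        "L1": (vaddr >> 12) & 0x1FF,
        "offset": vaddr & 0xFFF,
    }


# Address from the "Pagetable walk from" line in the log.
fault_addr = 0xFFFFFFFFFFFFFFF8
indices = pagetable_indices(fault_addr)

# rsp was 0 in the register dump; one 8-byte slot below that wraps around.
wrapped = (0 - 8) & MASK64
```

Here `indices["L4"]`, `indices["L3"]` and `indices["L2"]` all come out as 0x1ff, and `wrapped` equals `fault_addr`.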
      

      All of them seem to occur somewhere in create_bounce_frame:

      [2016-09-25 01:34:40] (XEN) [289092.579668] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
      [2016-09-26 15:55:46] (XEN) [427158.510171] domain_crash_sync called from entry.S: fault at ffff82d080236a44 entry.o#create_bounce_frame+0xbc/0x13a
      [2016-09-27 11:09:31] (XEN) [496384.070132] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
      [2016-09-28 11:37:28] (XEN) [584460.662464] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
      [2016-10-04 03:48:18] (XEN) [1074710.520232] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
      [2016-10-05 05:09:32] (XEN) [1165984.610292] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
      [2016-10-05 13:18:36] (XEN) [1195328.410385] domain_crash_sync called from entry.S: fault at ffff82d080236a1b entry.o#create_bounce_frame+0x93/0x13a
      [2016-10-12 15:31:42] (XEN) [1808114.799880] domain_crash_sync called from entry.S: fault at ffff82d080236a44 entry.o#create_bounce_frame+0xbc/0x13a
      [2016-10-14 01:10:48] (XEN) [1929260.360172] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
      [2016-10-14 15:15:47] (XEN) [1979959.800093] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
      [2016-10-17 04:04:10] (XEN) [2198862.070361] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
      [2016-10-19 09:59:53] (XEN) [2393005.780085] domain_crash_sync called from entry.S: fault at ffff82d0802369ee entry.o#create_bounce_frame+0x66/0x13a
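
To see how the crashes are distributed across offsets, the lines above can be tallied with a short script. This is only a sketch: the regular expression is written against the exact line format shown in this report, and the function name `tally_offsets` is invented for illustration. Other Xen builds may format these lines differently.

```python
import re
from collections import Counter

# Matches the domain_crash_sync lines as they appear in this report and
# captures the offset into create_bounce_frame (for example "0x66").
CRASH_RE = re.compile(
    r"domain_crash_sync called from entry\.S: fault at [0-9a-f]+ "
    r"entry\.o#create_bounce_frame\+(0x[0-9a-f]+)/0x[0-9a-f]+"
)


def tally_offsets(lines):
    """Count how often each create_bounce_frame offset appears in lines."""
    counts = Counter()
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Two lines copied from the list above, used as a small sample.
sample = [
    "[2016-09-25 01:34:40] (XEN) [289092.579668] domain_crash_sync called "
    "from entry.S: fault at ffff82d0802369ee "
    "entry.o#create_bounce_frame+0x66/0x13a",
    "[2016-09-26 15:55:46] (XEN) [427158.510171] domain_crash_sync called "
    "from entry.S: fault at ffff82d080236a44 "
    "entry.o#create_bounce_frame+0xbc/0x13a",
]
sample_counts = tally_offsets(sample)
```

To run it against the real log, pass an open file handle for /var/log/xen/hypervisor.log to `tally_offsets`. For the twelve lines listed above, the tally is nine crashes at +0x66, two at +0xbc and one at +0x93.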
      

        Attachments

        1. crash.log
          51 kB
          Anssi Kolehmainen

            People

            Assignee:
            rosslagerwall Ross Lagerwall
            Reporter:
            akolehma Anssi Kolehmainen
            Votes: 1
            Watchers: 4
