  1. XenServer Org
  2. XSO-1021

[PATCH] null pointer dereference in dmesg when SMB server unreachable

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • 8.2
    • Dom0
    • None
    • XSI-1225

    Description

      On Citrix Hypervisor 8.2 CU1, we reproduced a user issue involving an SMB ISO SR.

      Whenever the SMB share goes down (to reproduce, we simply turn the TrueNAS server off), a null pointer dereference appears in the kernel logs within a few minutes. You might need to run `df -h` on the host to trigger it:

      [Mon Apr 11 15:17:02 2022] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      [Mon Apr 11 15:17:02 2022] PGD 51dce067 P4D 51dce067 PUD 51928067 PMD 0
      [Mon Apr 11 15:17:02 2022] Oops: 0000 [#2] SMP NOPTI
      [Mon Apr 11 15:17:02 2022] CPU: 1 PID: 9654 Comm: df Tainted: G      D    O      4.19.0+1 #1
      [Mon Apr 11 15:17:02 2022] Hardware name: Dell Inc. Vostro 3550/0917G2, BIOS A07 07/18/2011
      [Mon Apr 11 15:17:02 2022] RIP: e030:SMB2_query_info_free+0x8/0x10 [cifs]
      [Mon Apr 11 15:17:02 2022] Code: c0 31 f6 48 c7 c7 80 d0 5e c0 31 c0 e8 55 80 b0 c0 eb d9 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 07 <48> 8b 38 e9 60 22 fe ff 66 66 66 66 90 48 83 ec 30 4d 63 c0 48 8d
      [Mon Apr 11 15:17:02 2022] RSP: e02b:ffffc90040fbfbc8 EFLAGS: 00010246
      [Mon Apr 11 15:17:02 2022] RAX: 0000000000000000 RBX: ffffc90040fbfd50 RCX: 0000000000000000
      [Mon Apr 11 15:17:02 2022] RDX: ffff8880529b6170 RSI: ffff8880522d0200 RDI: ffffc90040fbfd78
      [Mon Apr 11 15:17:02 2022] RBP: ffffc90040fbfe00 R08: 0000000000000000 R09: 0000000000000000
      [Mon Apr 11 15:17:02 2022] R10: 0000000000007ff0 R11: 00000108ebd988bc R12: ffff8880529b0800
      [Mon Apr 11 15:17:02 2022] R13: ffffc90040fbfc30 R14: ffff8880525cd600 R15: 0000000000000000
      [Mon Apr 11 15:17:02 2022] FS:  00007f2e09fc2740(0000) GS:ffff88805a280000(0000) knlGS:0000000000000000
      [Mon Apr 11 15:17:02 2022] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      [Mon Apr 11 15:17:02 2022] CR2: 0000000000000000 CR3: 000000004f854000 CR4: 0000000000040660
      [Mon Apr 11 15:17:02 2022] Call Trace:
      [Mon Apr 11 15:17:02 2022]  smb2_queryfs+0x13a/0x310 [cifs]
      [Mon Apr 11 15:17:02 2022]  ? lookup_fast+0xcb/0x2b0
      [Mon Apr 11 15:17:02 2022]  ? __follow_mount_rcu.isra.42+0x3c/0xf0
      [Mon Apr 11 15:17:02 2022]  ? walk_component+0x48/0x280
      [Mon Apr 11 15:17:02 2022]  ? legitimize_path.isra.44+0x28/0x50
      [Mon Apr 11 15:17:02 2022]  ? terminate_walk+0x55/0xb0
      [Mon Apr 11 15:17:02 2022]  cifs_statfs+0xb0/0x290 [cifs]
      [Mon Apr 11 15:17:02 2022]  statfs_by_dentry+0x99/0x120
      [Mon Apr 11 15:17:02 2022]  vfs_statfs+0x16/0xc0
      [Mon Apr 11 15:17:02 2022]  user_statfs+0x50/0x90
      [Mon Apr 11 15:17:02 2022]  __do_sys_statfs+0x20/0x50
      [Mon Apr 11 15:17:02 2022]  do_syscall_64+0x4e/0x100
      [Mon Apr 11 15:17:02 2022]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [Mon Apr 11 15:17:02 2022] RIP: 0033:0x7f2e09acf787
      [Mon Apr 11 15:17:02 2022] Code: 2d 00 64 c7 00 16 00 00 00 b8 ff ff ff ff c3 48 8b 15 fd 66 2d 00 f7 d8 64 89 02 48 83 c8 ff c3 0f 1f 00 b8 89 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d9 66 2d 00 f7 d8 64 89 01 48
      [Mon Apr 11 15:17:02 2022] RSP: 002b:00007ffdb6353bf8 EFLAGS: 00000206 ORIG_RAX: 0000000000000089
      [Mon Apr 11 15:17:02 2022] RAX: ffffffffffffffda RBX: 00000000010df5a0 RCX: 00007f2e09acf787
      [Mon Apr 11 15:17:02 2022] RDX: 00007ffdb6353f30 RSI: 00007ffdb6353c00 RDI: 00000000010df5a0
      [Mon Apr 11 15:17:02 2022] RBP: 0000000000000001 R08: 00000000010df501 R09: 0000000000000000
      [Mon Apr 11 15:17:02 2022] R10: 0000000000000002 R11: 0000000000000206 R12: 00007ffdb6353d40
      [Mon Apr 11 15:17:02 2022] R13: 00007ffdb6353d40 R14: 0000000000000000 R15: 0000000000000001
      [Mon Apr 11 15:17:02 2022] Modules linked in: arc4 md4 sha512_ssse3 sha512_generic cmac nls_utf8 cifs ccm fscache bnx2fc(O) cnic(O) uio fcoe libfcoe libfc scsi_transport_fc openvswitch nsh nf_nat_ipv6 nf_nat_ipv4 nf_conncount nf_nat 8021q garp mrp stp llc dm_multipath ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_multiport xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper dm_mod dcdbas dell_smm_hwmon psmouse sg i2c_i801 lpc_ich ip_tables x_tables sr_mod cdrom sd_mod xhci_pci xhci_hcd ahci r8169 libahci serio_raw realtek libata ehci_pci ehci_hcd video backlight scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod xen_wdt ipv6 crc_ccitt
      [Mon Apr 11 15:17:02 2022] CR2: 0000000000000000
      [Mon Apr 11 15:17:02 2022] ---[ end trace 613fe9e5e7f12df4 ]---
      [Mon Apr 11 15:17:02 2022] RIP: e030:SMB2_query_info_free+0x8/0x10 [cifs]
      [Mon Apr 11 15:17:02 2022] Code: c0 31 f6 48 c7 c7 80 d0 5e c0 31 c0 e8 55 80 b0 c0 eb d9 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 48 8b 07 <48> 8b 38 e9 60 22 fe ff 66 66 66 66 90 48 83 ec 30 4d 63 c0 48 8d
      [Mon Apr 11 15:17:02 2022] RSP: e02b:ffffc900418b3bc8 EFLAGS: 00010246
      [Mon Apr 11 15:17:02 2022] RAX: 0000000000000000 RBX: ffffc900418b3d50 RCX: 0000000000000000
      [Mon Apr 11 15:17:02 2022] RDX: ffff8880529b6170 RSI: ffff888004490200 RDI: ffffc900418b3d78
      [Mon Apr 11 15:17:02 2022] RBP: ffffc900418b3e00 R08: 0000000000000000 R09: 0000000000000000
      [Mon Apr 11 15:17:02 2022] R10: 0000000000007ff0 R11: 0000000000000000 R12: ffff8880529b0800
      [Mon Apr 11 15:17:02 2022] R13: ffffc900418b3c30 R14: ffff8880525cd600 R15: 0000000000000000
      [Mon Apr 11 15:17:02 2022] FS:  00007f2e09fc2740(0000) GS:ffff88805a280000(0000) knlGS:0000000000000000
      [Mon Apr 11 15:17:02 2022] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      [Mon Apr 11 15:17:02 2022] CR2: 0000000000000000 CR3: 000000004f854000 CR4: 0000000000040660
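
To check other hosts for the same signature, the trace above can be matched mechanically. The following is a minimal sketch, not part of the original report: the function name `find_cifs_oops` and the regular expressions are illustrative assumptions based only on the log format shown here, and other kernel versions may format these lines differently.

```python
import re

# Patterns derived from the oops shown in this report (assumed format).
OOPS_RE = re.compile(r"BUG: unable to handle kernel NULL pointer dereference")
RIP_RE = re.compile(
    r"RIP: [0-9a-f]+:(?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(?P<module>\w+)\]"
)


def find_cifs_oops(log_text):
    """Return (symbol, module) pairs for NULL-dereference oopses in the cifs module.

    Only the first kernel RIP line following each "BUG:" line is considered,
    so the register dump repeated after "end trace" is not counted twice.
    """
    hits = []
    awaiting_rip = False
    for line in log_text.splitlines():
        if OOPS_RE.search(line):
            awaiting_rip = True
            continue
        if awaiting_rip:
            match = RIP_RE.search(line)
            if match:
                if match.group("module") == "cifs":
                    hits.append((match.group("symbol"), match.group("module")))
                awaiting_rip = False
    return hits
```

Passing the text output of `dmesg -T` from an affected host to `find_cifs_oops` should yield one entry per oops, with `SMB2_query_info_free` as the faulting symbol in the case reported here.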

      On XCP-ng, the same situation was reported to make hosts unresponsive in production.

      Applying the following patch from kernel.org's 4.19 branch fixed it for us: `df -h` now answers correctly, with a list of devices that doesn't include the SMB share while it's disconnected, and without blocking other operations:
      https://github.com/xcp-ng-rpms/kernel/commit/2dd7b1f8feca463393e01b491d7e95b6fb6b3615
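
When verifying the patched kernel, it can help to probe the mount point from a child process so that a hang or crash in the statfs path does not take the calling shell with it. This is a rough sketch under stated assumptions: `probe_statfs` is a hypothetical helper, the mount path passed to it is whatever path the SR is mounted on (not given in this report), and `os.statvfs` is expected to reach the same statfs path on Linux that `df` exercises in the trace above.

```python
import subprocess
import sys

# One-liner executed in the child: a single statvfs call on the given path.
_CHILD_CODE = "import os, sys; os.statvfs(sys.argv[1])"


def probe_statfs(path, timeout_s=10.0):
    """Call statvfs on `path` in a child process.

    Returns "ok" if the call succeeded, "failed" if the child exited with an
    error or was killed, and "timeout" if it did not finish in time.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", _CHILD_CODE, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        returncode = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: a task stuck in uninterruptible sleep may not
        # exit, so we do not wait for it again here.
        proc.kill()
        return "timeout"
    return "ok" if returncode == 0 else "failed"
```

On an unpatched host with the share down, the child would be expected to report "failed" or "timeout" rather than "ok"; the exact behavior after the fix depends on how the CIFS client reports the unreachable server.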

          People

            Assignee: Unassigned
            Reporter: Samuel Verschelde (stormi)