  XenServer Org / XSO-887

Coalesce is corrupting volume group metadata

    Details

    • Type: Bug
    • Status: Backlog
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 7.3, 7.4, 7.5
    • Fix Version/s: None
    • Component/s: Storage
    • Labels:
      None
    • Environment:
    • Internal JIRA Reference:
      XSI-88

      Description

      Hi,

      This bug report is related to XSO-837 and XSO-855. As I haven't received any constructive answer, I investigated the problem myself.

      After I resolved some LVM metadata corruption in a previous issue (XSO-883), I hit another LVM metadata corruption this week (not related to vgs): I lost 3 volume groups! It took me hours to restore a stable environment (unplugging stale VBDs, deleting stale snapshots, copying some disks, ...).

      This time, the last process that wrote to LVM was an 'lvcreate' command with the tag 'journaler':

      ['/sbin/lvcreate', '-n', 'coalesce_039c80ce-d70f-4b0d-a185-a95f6ce3b6aa_1', '-L', '4', 'VG_XenStorage-6010cef0-b5ef-a604-bfd3-a1fde94d0d6f', '--addtag', 'journaler']
      

      After investigating your code, it seems that the 'SR._coalesce' function in '/opt/xensource/sm/cleanup.py' runs some LVM commands on the SR without holding any lock! The calls 'self.journaler.create' and 'self.journaler.remove' create and delete logical volumes on the SR without a lock, and they trash the volume groups when they run at the wrong time:

         def _coalesce(self, vdi): 
             if self.journaler.get(vdi.JRN_RELINK, vdi.uuid): 
                 # this means we had done the actual coalescing already and just  
                 # need to finish relinking and/or refreshing the children 
                 Util.log("==> Coalesce apparently already done: skipping") 
             else: 
                 # JRN_COALESCE is used to check which VDI is being coalesced in  
                 # order to decide whether to abort the coalesce. We remove the  
                 # journal as soon as the VHD coalesce step is done, because we  
                 # don't expect the rest of the process to take long 
                 self.journaler.create(vdi.JRN_COALESCE, vdi.uuid, "1") 
                 vdi._doCoalesce() 
                 self.journaler.remove(vdi.JRN_COALESCE, vdi.uuid) 
      
                 util.fistpoint.activate("LVHDRT_before_create_relink_journal",self.uuid) 
      
                 # we now need to relink the children: lock the SR to prevent ops  
                 # like SM.clone from manipulating the VDIs we'll be relinking and  
                 # rescan the SR first in case the children changed since the last  
                 # scan 
                 self.journaler.create(vdi.JRN_RELINK, vdi.uuid, "1") 
      
             self.lock() 
             try: 
                 self.scan() 
                 vdi._relinkSkip() 
             finally: 
                 self.unlock() 
      
             vdi.parent._reloadChildren(vdi) 
             self.journaler.remove(vdi.JRN_RELINK, vdi.uuid) 
             self.deleteVDI(vdi)
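
To illustrate the class of failure suspected here, the following is a toy sketch of a lost update caused by an unsynchronized read-modify-write. Everything in it is an assumption made for illustration: `ToyMetadata`, its `seqno` field and the volume names are hypothetical and do not reflect LVM's real metadata format or the actual code path in 'cleanup.py'. The interleaving is written out by hand so the result is deterministic.

```python
class ToyMetadata:
    """Hypothetical stand-in for shared volume group metadata (not LVM's format)."""

    def __init__(self):
        self.seqno = 0
        self.volumes = set()

    def read(self):
        # Return a private copy, as a process would after reading from disk.
        return self.seqno, set(self.volumes)

    def write(self, seqno, volumes):
        # Replace the whole metadata area with the caller's view.
        self.seqno = seqno
        self.volumes = set(volumes)


def interleaved_unlocked(meta):
    """Two writers read the same snapshot before either one writes back."""
    seq_a, vols_a = meta.read()
    seq_b, vols_b = meta.read()
    vols_a.add("coalesce_journal")
    meta.write(seq_a + 1, vols_a)
    vols_b.add("new_vdi")
    meta.write(seq_b + 1, vols_b)  # silently discards writer A's change
    return meta


def serialized(meta):
    """The same two updates, each completing before the next one starts."""
    for name in ("coalesce_journal", "new_vdi"):
        seq, vols = meta.read()
        vols.add(name)
        meta.write(seq + 1, vols)
    return meta


lost = interleaved_unlocked(ToyMetadata())
kept = serialized(ToyMetadata())
print(sorted(lost.volumes), lost.seqno)  # ['new_vdi'] 1
print(sorted(kept.volumes), kept.seqno)  # ['coalesce_journal', 'new_vdi'] 2
```

In the unlocked interleaving one volume disappears from the metadata and the sequence number advances only once; a lock around each read-modify-write is what forces the serialized order.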
      
      

      Currently, I'm running a patched version of 'cleanup.py' which locks the entire '_coalesce' function, but this is suboptimal, as only some of the calls need a lock on the SR:

         def _coalesce(self, vdi): 
             if self.journaler.get(vdi.JRN_RELINK, vdi.uuid): 
                 # this means we had done the actual coalescing already and just  
                 # need to finish relinking and/or refreshing the children 
                 Util.log("==> Coalesce apparently already done: skipping") 
             else: 
                 self.lock() 
                 try: 
                     # JRN_COALESCE is used to check which VDI is being coalesced in  
                     # order to decide whether to abort the coalesce. We remove the  
                     # journal as soon as the VHD coalesce step is done, because we  
                     # don't expect the rest of the process to take long 
                     self.journaler.create(vdi.JRN_COALESCE, vdi.uuid, "1") 
                     vdi._doCoalesce() 
                     self.journaler.remove(vdi.JRN_COALESCE, vdi.uuid) 
      
                     util.fistpoint.activate("LVHDRT_before_create_relink_journal",self.uuid) 
      
                     # we now need to relink the children: lock the SR to prevent ops  
                     # like SM.clone from manipulating the VDIs we'll be relinking and  
                     # rescan the SR first in case the children changed since the last  
                     # scan 
                     self.journaler.create(vdi.JRN_RELINK, vdi.uuid, "1") 
                 finally: 
                     self.unlock() 
      
             self.lock() 
             try: 
                 self.scan() 
                 vdi._relinkSkip() 
      
                 vdi.parent._reloadChildren(vdi) 
                 self.journaler.remove(vdi.JRN_RELINK, vdi.uuid) 
                 self.deleteVDI(vdi) 
             finally:
                 self.unlock()
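
A narrower variant might hold the SR lock only around the journal operations. The sketch below shows that idea under stated assumptions: `FakeSR`, `FakeJournaler`, `sr_locked`, `coalesce_narrow_lock` and the journal type strings are hypothetical stand-ins that only record call order, not the real classes from 'cleanup.py'. It also assumes the coalesce step itself makes no LVM metadata changes; if it does (for example by resizing a volume), it would need the lock as well, and that has not been checked here.

```python
from contextlib import contextmanager


class FakeSR:
    """Hypothetical stand-in that records call order; not the real SR class."""

    def __init__(self):
        self.events = []

    def lock(self):
        self.events.append("lock")

    def unlock(self):
        self.events.append("unlock")


class FakeJournaler:
    """Hypothetical journaler; each call stands for an LVM create or remove."""

    def __init__(self, sr):
        self.sr = sr

    def create(self, jtype, uuid, value):
        self.sr.events.append("journal.create:" + jtype)

    def remove(self, jtype, uuid):
        self.sr.events.append("journal.remove:" + jtype)


@contextmanager
def sr_locked(sr):
    """Hold the SR lock for the duration of the with-block, even on error."""
    sr.lock()
    try:
        yield
    finally:
        sr.unlock()


def coalesce_narrow_lock(sr, journaler, uuid, do_coalesce):
    # Journal creation touches the volume group, so take the lock for it.
    with sr_locked(sr):
        journaler.create("coalesce", uuid, "1")
    # The long-running data copy runs without the SR lock (see assumption above).
    do_coalesce()
    # Swap the coalesce journal for the relink journal under one lock hold.
    with sr_locked(sr):
        journaler.remove("coalesce", uuid)
        journaler.create("relink", uuid, "1")


sr = FakeSR()
coalesce_narrow_lock(sr, FakeJournaler(sr), "039c80ce", lambda: sr.events.append("coalesce"))
print(sr.events)
```

The trade-off is that other SR operations are blocked only for the short journal updates rather than for the whole coalesce, at the cost of having to audit every remaining unlocked call for LVM side effects.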
      

      Please don't ask me to provide full logs from XS7.5, as the '_coalesce' function is the same in every version of XenServer...

      I'm posting these bug reports to help you provide a better (open source) product, but I don't understand why such nasty bugs have never been reported before... They are really damaging your reputation and the trust in your product.

      Regards,
      Nicolas Michaux

            People

            • Assignee:
              Unassigned
              Reporter:
              sarabanjina Nicolas Michaux
            • Votes:
              1
              Watchers:
              6
