  1. XenServer Org
  2. XSO-884

vdi_activate should run locked on LVM backend storage

    Details

    • Type: Bug
    • Status: Backlog
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 7.3, 7.4, 7.5
    • Fix Version/s: None
    • Component/s: Storage
    • Labels:
      None
    • Environment:

      We take snapshots of running VMs every night on a pool of three XenServer 7.2 hosts (fully patched) connected to two iSCSI SRs with multipath.

    • Team:
    • Internal JIRA Reference:
      XSI-76

      Description

      Since resolving the LVM metadata corruption from the previous issue (XSO-883), I have noticed another problem during nightly backups: occasionally, at random, a VDI cannot be activated and fails with this error:

      Raising exception [46, The VDI is not available [opterr=Command ['/sbin/lvchange', '-ay', '/dev/VG_XenStorage-ead98b75-7449-80e9-a54d-1b0a0c9449ac/VHD-98d8d764-49d1-4f4d-851d-0029a1767358'] failed (5): /dev/disk/by-scsid/23133613436326131/mapper: Checksum error
      

      The volume group itself shows no checksum error, and if I run the backup again a few minutes later, it works fine.

      I investigated and found that 'vdi_activate' and 'vdi_deactivate' do not take the SR lock when they run on an LVM-based SR (they do take it on a file-based SR). I think these commands should lock the SR, because they read LVM metadata that another process on another node may be writing at the same time (I think this is what happened to my backup).
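
      To illustrate the behaviour described above, here is a minimal sketch of a dispatcher that takes the SR lock only for operations listed as exclusive. It is not the actual sm code: `run_operation`, `LockSpy` and the shortened operation list are hypothetical names introduced for this example, and the in-process lock only stands in for the real SR lock.

```python
import threading

# Illustrative subset only; the real list lives in /opt/xensource/sm/LVHDSR.py.
OPS_EXCLUSIVE = ["sr_scan", "vdi_create", "vdi_snapshot"]


class LockSpy(object):
    """Stand-in for the SR lock that counts how often it was taken."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquired = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquired += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False  # never swallow exceptions from the handler


def run_operation(op, handler, sr_lock, exclusive_ops=OPS_EXCLUSIVE):
    """Run handler() under sr_lock only if op is listed as exclusive."""
    if op in exclusive_ops:
        with sr_lock:
            return handler()
    return handler()


lock = LockSpy()

# With the original list, activation runs without the SR lock.
run_operation("vdi_activate", lambda: "activated", lock)
print(lock.acquired)  # 0

# With the two operations added, activation runs under the SR lock.
patched = OPS_EXCLUSIVE + ["vdi_activate", "vdi_deactivate"]
run_operation("vdi_activate", lambda: "activated", lock, patched)
print(lock.acquired)  # 1
```

      The point of the sketch is only that membership in the exclusive list decides whether an operation is serialised against the others.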

      I added these two commands to the exclusive operations of the LVM backend SR (/opt/xensource/sm/LVHDSR.py):

      OPS_EXCLUSIVE = [ 
             "sr_create", "sr_delete", "sr_attach", "sr_detach", "sr_scan", 
             "sr_update", "vdi_create", "vdi_delete", "vdi_resize", "vdi_snapshot", 
             "vdi_clone", "vdi_activate", "vdi_deactivate" ]
      

      So far it works fine: I can see (in /var/log/SMlog) the SR lock being acquired on VDI (de)activation, which should prevent this kind of error.
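
      A check like this can be scripted. The sketch below pulls SR lock events out of log lines; the message format ("lock: acquired <path>/sr") and the sample lines are assumptions made for illustration, so adjust the pattern to what your SMlog actually prints. `sr_lock_events` is a hypothetical helper, not part of sm.

```python
import re

# Assumed message format; adapt it to the lines found in /var/log/SMlog.
LOCK_EVENT = re.compile(r"lock: (acquired|released) (\S+/sr)\b")


def sr_lock_events(lines):
    """Return (event, lock_path) pairs for SR lock messages in log lines."""
    events = []
    for line in lines:
        match = LOCK_EVENT.search(line)
        if match:
            events.append((match.group(1), match.group(2)))
    return events


# Hypothetical sample lines, reusing the UUID from the volume group name
# in the error message of this report.
sample = [
    "SM: [1234] lock: acquired /var/lock/sm/ead98b75-7449-80e9-a54d-1b0a0c9449ac/sr",
    "SM: [1234] vdi_activate",
    "SM: [1234] lock: released /var/lock/sm/ead98b75-7449-80e9-a54d-1b0a0c9449ac/sr",
]

for event, path in sr_lock_events(sample):
    print(event, path)
```

      On a host, the same function can be fed the open log file instead of the sample list, since it only iterates over lines.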

      Kind regards,
      Nicolas Michaux

            People

            • Assignee:
              Unassigned
              Reporter:
              sarabanjina Nicolas Michaux
            • Votes:
              0
              Watchers:
              3

              Dates

              • Created:
                Updated: