Status: Done
Affects Version/s: 7.2
Fix Version/s: None
We are using Xen Orchestra to take nightly delta backups of running VMs on a pool of 3 XenServer 7.1 hosts (fully patched) connected to 2 iSCSI SRs with multipath.
The backup, which takes snapshots through XAPI, sometimes corrupts a volume group. There seems to be a race condition when multiple VMs are snapshotted at the same time.
The backup calls XAPI to snapshot several VMs in parallel (2 concurrent snapshots by default).
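To check whether concurrency is the trigger, the snapshot step could be driven with a configurable limit. The sketch below is only an illustration, not Xen Orchestra's actual code: `snapshot_all` and `snapshot_fn` are hypothetical names, and `snapshot_fn` is assumed to wrap whatever performs a single snapshot (for example a XAPI `VM.snapshot` call).

```python
from concurrent.futures import ThreadPoolExecutor


def snapshot_all(vm_ids, snapshot_fn, concurrency=2):
    """Call snapshot_fn(vm_id) for each VM, at most `concurrency` at a time.

    snapshot_fn is a hypothetical callable supplied by the caller; it is
    assumed to take one VM identifier and perform one snapshot.
    Returns a dict mapping vm_id to the result, or to the exception raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Submit everything up front; the pool enforces the concurrency cap.
        futures = {vm_id: pool.submit(snapshot_fn, vm_id) for vm_id in vm_ids}
        for vm_id, future in futures.items():
            try:
                results[vm_id] = future.result()
            except Exception as exc:  # keep going so one failure does not hide others
                results[vm_id] = exc
    return results
```

Running the same set of VMs with `concurrency=1` and then `concurrency=2` would show whether the corruption only appears when snapshots overlap.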
Sometimes everything runs smoothly, but most of the time we lose a volume group after the backup:
FAILED in util.pread: (rc 5) stdout: '', stderr: ' /dev/disk/by-scsid/23133613436326131/mapper: Checksum error
After analyzing the logs, it appears that the pool master itself is corrupting the volume group (according to /etc/lvm/backup, no slave has written to the volume group).
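For anyone repeating this check, the following sketch pulls out the header fields that show which host last wrote the metadata. It assumes the usual LVM2 text backup format, where `description`, `creation_host` and `creation_time` appear as unindented `key = value` lines; `lvm_backup_header` is a hypothetical helper name, not part of LVM or XenServer.

```python
def lvm_backup_header(text):
    """Extract the top-level header fields from an LVM metadata backup.

    Assumes the LVM2 text format: unindented `key = value` lines, string
    values in double quotes, optional trailing `# comment`.
    """
    wanted = ("description", "creation_host", "creation_time")
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key = key.rstrip()
        # Indented keys (inside the volume group block) keep their leading
        # whitespace and therefore do not match the wanted names.
        if not sep or key not in wanted:
            continue
        value = value.strip()
        if value.startswith('"'):
            # Quoted string: keep what lies between the first and last quote.
            value = value[1:value.rindex('"')]
        else:
            # Bare value: drop any trailing comment.
            value = value.split("#", 1)[0].strip()
        fields[key] = value
    return fields
```

Reading the file under /etc/lvm/backup for the affected volume group on each host and comparing `creation_host` and `creation_time` gives a quick view of which host performed the most recent metadata write.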
We have logs of the issue from all servers of the pool if necessary.