XSO-633: Filesystem corruption due to iSCSI failure

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: 7.0, 7.1
    • Component/s: Storage
    • Labels: None
    • Environment: XenServer 7.0 fully patched;
      Ubuntu 16.10 VMs fully patched with tools installed;
      Synology iSCSI server

    Description

      Hello,

      This is not the first time that a storage error, caused by a mere
      switch reboot, has made all six Linux VMs on my two-host cluster fail
      with errors like these:

      [638272.534481] blk_update_request: I/O error, dev xvda, sector 7421440
      [638272.534526] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 279237 (offset 0 size 0 starting block 927681)
      [638272.534527] Buffer I/O error on device xvda1, logical block 927424
      [638272.534557] blk_update_request: I/O error, dev xvda, sector 7421464
      [638272.534583] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 279237 (offset 0 size 0 starting block 927689)
      [638272.534583] Buffer I/O error on device xvda1, logical block 927427
      [638272.534608] Buffer I/O error on device xvda1, logical block 927428
      [638272.534632] Buffer I/O error on device xvda1, logical block 927429
      [638272.534656] Buffer I/O error on device xvda1, logical block 927430
      [638272.534680] Buffer I/O error on device xvda1, logical block 927431
      [638272.534704] Buffer I/O error on device xvda1, logical block 927432
      [638272.534729] blk_update_request: I/O error, dev xvda, sector 7421592
      [638272.534754] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 279237 (offset 0 size 0 starting block 927700)
      [638272.534755] Buffer I/O error on device xvda1, logical block 927443
      [638272.534779] blk_update_request: I/O error, dev xvda, sector 7421624
      [638272.534804] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 279237 (offset 0 size 0 starting block 927704)
      [638272.534805] Buffer I/O error on device xvda1, logical block 927447
      [638272.534830] blk_update_request: I/O error, dev xvda, sector 7421680
      [638272.534854] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 279237 (offset 0 size 0 starting block 927712)
      [638272.534855] Buffer I/O error on device xvda1, logical block 927454
      [638272.534880] blk_update_request: I/O error, dev xvda, sector 38365182
      [638272.534909] blk_update_request: I/O error, dev xvda, sector 8680560
      [638272.534935] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 279241 (offset 0 size 0 starting block 1085071)
      [638272.534936] blk_update_request: I/O error, dev xvda, sector 35184032
      [638272.534961] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 261125 (offset 0 size 0 starting block 4398005)
      [638272.534963] blk_update_request: I/O error, dev xvda, sector 8391744
      [638272.534988] Buffer I/O error on dev xvda1, logical block 1048712,
      lost async page write
      [638272.535023] blk_update_request: I/O error, dev xvda, sector 7420448
      [638272.535048] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 279231 (offset 0 size 0 starting block 927557)
      [638272.535050] Buffer I/O error on dev xvda1, logical block 1048608,
      lost async page write
      [638272.535083] Buffer I/O error on dev xvda1, logical block 1048592,
      lost async page write
      [638272.535116] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 279231 (offset 0 size 0 starting block 927564)
      [638272.535118] EXT4-fs warning (device xvda1): ext4_end_bio:314: I/O
      error -5 writing to inode 279231 (offset 0 size 0 starting block 927489)
      [638272.535120] Buffer I/O error on dev xvda1, logical block 1059096,
      lost async page write
      [638272.535156] Buffer I/O error on dev xvda1, logical block 1049713,
      lost async page write
      [638272.535189] Buffer I/O error on dev xvda1, logical block 0, lost
      async page write
      [638272.535219] Buffer I/O error on dev xvda1, logical block 1, lost
      async page write
      [638272.535249] Buffer I/O error on dev xvda1, logical block 1049740,
      lost async page write
      [638272.535369] Aborting journal on device xvda1-8.
      [638272.535997] EXT4-fs (xvda1): previous I/O error to superblock detected
      [638283.389147] EXT4-fs error (device xvda1):
      ext4_journal_check_start:56: Detected aborted journal
      [638283.389187] EXT4-fs (xvda1): Remounting filesystem read-only
      [638283.390705] EXT4-fs error (device xvda1):
      ext4_journal_check_start:56: Detected aborted journal
      [638283.399867] EXT4-fs error (device xvda1):
      ext4_journal_check_start:56: Detected aborted journal
      [638283.400420] EXT4-fs error (device xvda1):
      ext4_journal_check_start:56: Detected aborted journal

      That is not okay, but it is understandable. What is really not okay,
      however, is that after rebooting, every one of the Linux VMs had
      filesystem corruption that required a manual fsck.

      The only VM I tried to fsck before rebooting had far worse corruption,
      to the point that I suspected fsck was actually trashing the
      filesystem, so I stopped, rebooted, and ran fsck again, only to find a
      filesystem trashed so badly that I lost all data on that box.

      I don't know whether the filesystem was trashed from the beginning or
      whether the fsck run before shutdown, while the filesystem was still
      mounted read-only, did the damage, but I suspect the latter.
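      As a side note on that suspicion: the e2fsck(8) documentation warns
      that running a repairing fsck on a mounted filesystem, even one
      mounted read-only, is generally not safe, because the kernel still
      holds cached metadata for it. Below is a minimal sketch, assuming a
      standard Linux guest and that the device path appears verbatim as a
      mount source in /proc/mounts, of the kind of pre-check one might run
      before attempting a repair fsck; the default device path is taken from
      the log above and is illustrative only.

      #!/usr/bin/env python3
      """Illustrative pre-check (a sketch, not a supported tool): refuse a
      repairing fsck while the target device is still listed in /proc/mounts."""

      import sys

      def is_mounted(device: str) -> bool:
          """Return True if `device` appears as a mount source in /proc/mounts."""
          with open("/proc/mounts") as mounts:
              return any(line.split()[0] == device for line in mounts)

      if __name__ == "__main__":
          # /dev/xvda1 matches the device named in the log above; adjust as needed.
          dev = sys.argv[1] if len(sys.argv) > 1 else "/dev/xvda1"
          if is_mounted(dev):
              # A read-only remount after a journal abort still leaves cached
              # metadata in the kernel; repairing the device underneath it is
              # exactly the kind of second hit suspected above.
              print(f"{dev} is still mounted; detach the disk or boot rescue "
                    "media before running a repair fsck on it.")
              sys.exit(1)
          print(f"{dev} is not mounted; a repair fsck should be safer to attempt.")

      In a case like this, the repair would more safely be done from rescue
      media or with the virtual disk attached to another, healthy guest, so
      that nothing holds the filesystem open while it is being rewritten.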

      The three Windows VMs seemed unaffected.

      Let me know if, and how, I can help debug the issue before the logs
      are rotated away.
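      On preserving the evidence: something as small as the sketch below
      would keep the kernel logs from being rotated away and give a quick
      count of the error patterns quoted above. The archive path and the
      Ubuntu-style log locations (/var/log/kern.log*, /var/log/syslog*) are
      assumptions for illustration, not anything prescribed by XenServer.

      #!/usr/bin/env python3
      """Sketch of a log-preservation pass for an affected guest: copy the
      kernel logs aside before logrotate discards them and count the
      I/O-error and journal-abort lines of the kind quoted above."""

      import glob
      import pathlib
      import re
      import shutil

      MARKERS = re.compile(
          r"blk_update_request: I/O error"
          r"|Buffer I/O error"
          r"|EXT4-fs (?:warning|error)"
          r"|Aborting journal"
      )

      ARCHIVE = pathlib.Path("/root/xso-633-logs")    # assumed destination
      ARCHIVE.mkdir(parents=True, exist_ok=True)

      hits = 0
      for log in sorted(glob.glob("/var/log/kern.log*")
                        + glob.glob("/var/log/syslog*")):
          shutil.copy2(log, ARCHIVE)                  # keep a verbatim copy
          if log.endswith((".gz", ".xz")):
              continue                                # only scan plain-text files
          with open(log, errors="replace") as fh:
              hits += sum(1 for line in fh if MARKERS.search(line))

      print(f"archived logs to {ARCHIVE}; {hits} matching error lines found")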

      Bye,


          People

            Assignee: Gary Kirkpatrick (garyk)
            Reporter: Daniele Orlandi (vihai)
            Votes: 1
            Watchers: 7
