Uploaded image for project: 'XenServer Org'
  1. XenServer Org
  2. XSO-445

Import/Export speed is a nightmare

    Details

    • Type: Bug
    • Status: Done (View Workflow)
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: All
    • Fix Version/s: 7.1
    • Component/s: API, Networking, Storage, XenCenter
    • Labels:
      None
    • Environment:

      XenServer 6.0-6.5 SP1 (fully patched), Windows Server 2008, XenCenter 6.0.x to 6.6.90.3063
      2x 10 GE (LACP trunk) to 1x 10 GE at the Backup System

      Description

      Since this topic is nearly as old as XenServer itself and noone cared about it until now, I declared it as a big bug, because it's impacts are from annoying time-waste to broken backups/backup windows.
      We have an MS Exchange 2007 VM with a database of about 400 GB + OS etc. and usually export the whole VM once a month to have a disaster recovery, which could be imported on any XenServer, if something really bad happens.
      The export-process now takes ridiculous 12+ (twelve!) hours.

      The environment details are as following:
      XenServer 6.5 SP1 (upgraded since about 6.0)
      2x Xeon E5-2690 (2.9-3.8 GHz octacore)
      128 GB RAM
      7x 600 GB 2.5" 10k SAS via P420i/2 GB as LVM, local storage repo
      Dom0 set to 4 GB (tested with 3 GB before, will see if it makes a difference, which I don't expect)

      Backupserver:
      2x Xeon E5450 (3 GHz QuadCore)
      24 GB RAM
      6x 600 GB 3.5" 15k SAS (Seagate Cheetah 15k[6/7]) as RAID 5

      To decrease the total time to export (it'd take days if not), I parallelize the exports to 2 exports at once. That means I start with Exchange and iterate through the list of VMs - with the result, that while exporting Exchange, all other VMs are getting exported and the Exchange export is still running and script is waiting for it, to continue moving stuff to tape then.

      I'm triggering the exports via xe.exe from XenCenter (currently tying Dundee Beta 2 v. 6.6.90.3063) by a small windows batch file.
      The main part is just basics:
      %XenServer% vm-export vm=%1 compress=false filename=%BackupPath2%%1.xva

      Means, for speed-reasons, I disable compression - as the tapedrive does it itself and the Exchange-DB and other VMs probably won't compress that good anyways.
      From time to time, I clean VMs by using "sdelete -z" to zero empty space.

      What I could see:
      xe.exe consumes only core at the backupsystem, means everything is single-threadded - whyever.
      xe.exe from Exchange export, is usually hitting it's one-core-max-usage, what seems to limit that.
      Dom0 has a load average of, currently: 3.12, 3.04, 2.98
      stunnel consumes most time by about 100-150% and xapi about 40-50%, another process, "fe" (whatever it is) takes ~10-12 and then some drops of performance for tapdisk/qemu-dm processes.

      Measured by iostat, tps are mostly between 300 and 400, with peaks up to 1200+ and a read-speed of 20-40 MB/s with short peaks up to over 350 MBs.
      Load on the backupsystem is jumping around a bit:
      from 200-800 mbit/s, disk at 20-100 MB/s and a total CPU-usage of about 33%.
      CPU-usage is mostly constat around 30%.
      Network currently holds 500-700 mbit/s, HDDs time of max. activity jumps between about 9 and 90%, very rarely hitting 100%.

      Attached a screenshot while exporting (backupserver at the left).
      The blue lines at "Datenträger" (Data disks) is the saturation of the RAID set, green is the amount of data transfered.
      What we could see from the screenshot:
      1. RAIDs are not constantly at their limits
      2. Network is far away from limiting anything

      That leaves, that there is some optimisation at the CPU-usage/process handling needed.
      Probably some multi-tasking stuff for xe.exe, that splits the various steps needet to convert data from XenServer(API?) to a .xva file on the harddisk.

      For an imagination of how many ppl. already tried to fix that or have been that much affected, that they tried fo find help at the forums, there is an over-the-years-growed thread:
      https://discussions.citrix.com/topic/237200-vm-importexport-performance/
      (So imagine how many ppl. just didn't post/only read but still are affected)

      I'm willing to make more analyzes or giving needed details/doing export tests with a prepared VM or whatever you need, but please, please, please work on that, since it's a major bug for using and maintaining XenServer and by growing VMs and datastores, that problem won't decrease.

      I hope you understand our (see the list of ppl. contributed to the thread) pain.

      Regards

      • Christof

        Attachments

        1. 2016-01-09 16_46_34-load_overview.png
          2016-01-09 16_46_34-load_overview.png
          125 kB
        2. 2016-04-21 14_19_00-XenCenter.png
          2016-04-21 14_19_00-XenCenter.png
          3 kB
        3. 2016-04-26 20_41_32-Import XVA.png
          2016-04-26 20_41_32-Import XVA.png
          8 kB
        4. BrokenCIFS-sambaShare.png
          BrokenCIFS-sambaShare.png
          15 kB
        5. BrokenCIFS-sambaShare.png
          BrokenCIFS-sambaShare.png
          15 kB
        6. Capture.JPG
          Capture.JPG
          55 kB
        7. export7.0.1.png
          export7.0.1.png
          40 kB
        8. exports-12790c-devSnapshot.png
          exports-12790c-devSnapshot.png
          14 kB
        9. ExportXEdom0_firstBroken_2ndAsExpected.png
          ExportXEdom0_firstBroken_2ndAsExpected.png
          23 kB
        10. ExportXEdom0.png
          ExportXEdom0.png
          75 kB
        11. ExportXEexe_win.png
          ExportXEexe_win.png
          90 kB
        12. ExportXenCenter.png
          ExportXenCenter.png
          89 kB
        13. TaskmanagerCIFSTaget_dom0-XEused.png
          TaskmanagerCIFSTaget_dom0-XEused.png
          12 kB
        14. TaskmanagerCIFSTaget_Win-XEused.png
          TaskmanagerCIFSTaget_Win-XEused.png
          22 kB
        15. TaskmanagerCIFSTaget_Win-XEused2.png
          TaskmanagerCIFSTaget_Win-XEused2.png
          48 kB
        16. xapi-strace1.tar.bz2
          132 kB
        17. xapi-strace2.tar.bz2
          701 kB
        18. XS701benchmarks.JPG
          XS701benchmarks.JPG
          121 kB
        19. xso-445-v1.zip
          14.59 MB
        20. xso-445-v1-for-build-127920.zip
          15.04 MB
        21. xso-445-v2-1MB.zip
          12.90 MB

          Activity

            People

            • Assignee:
              philippeg philippe gabriel
              Reporter:
              cgiesers Christof Giesers
            • Votes:
              12 Vote for this issue
              Watchers:
              28 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: