
January 18, 2014


EMC XtremIO and VMware EFI

After a couple of weeks of troubleshooting by EMC/XtremIO and VMware engineers, the cause was traced to EFI boot handing off a 7MB block to the XtremIO array. That block filled the queue, which then never cleared because it was waiting for more data to complete the communication (i.e., a deadlock). This seems to happen only with VMs using EFI firmware (tested with Windows 2012 and Windows 2012 R2), and the fault is on the XtremIO end.

The good news is that the problem can be mitigated by lowering the Disk.DiskMaxIOSize setting on each ESXi host from the default of 32MB (32767; the value is in KB) to 4MB (4096). You can find it in vCenter > Host > Configuration > Advanced Settings (bottom one) > Disk > Disk.DiskMaxIOSize. The XtremIO team is working on a permanent fix; in the meantime, the workaround can be applied hot with no impact to active operations (potentially a minor increase in host CPU load, as ESXi breaks I/O larger than 4MB into 4MB chunks).
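To make the effect of the 4MB cap concrete, here is a rough Python sketch of the arithmetic. It is only a simplified model, not how ESXi implements splitting internally; the function names `split_io` and `esxcli_set_command` are made up for illustration. The second helper just builds the `esxcli` equivalent of the Advanced Settings change described above, for anyone who prefers the host shell to the vCenter client.

```python
from typing import List

KB_PER_MB = 1024


def split_io(io_size_kb: int, max_io_kb: int = 4096) -> List[int]:
    """Model how one large I/O is cut into chunks of at most max_io_kb.

    Simplified illustration only: sizes are in KB, matching the unit
    of Disk.DiskMaxIOSize.
    """
    if io_size_kb <= 0 or max_io_kb <= 0:
        raise ValueError("sizes must be positive")
    full_chunks, remainder = divmod(io_size_kb, max_io_kb)
    return [max_io_kb] * full_chunks + ([remainder] if remainder else [])


def esxcli_set_command(max_io_kb: int = 4096) -> str:
    """Build the esxcli command string that sets Disk.DiskMaxIOSize (in KB)."""
    return (
        "esxcli system settings advanced set "
        f"-o /Disk/DiskMaxIOSize -i {max_io_kb}"
    )


if __name__ == "__main__":
    # The 7MB EFI boot block from this post, under the 4MB workaround.
    print(split_io(7 * KB_PER_MB))  # [4096, 3072]
    print(esxcli_set_command())
```

With the cap at 4096, the 7MB block reaches the array as a 4MB request followed by a 3MB request rather than as a single oversized I/O.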

On the other points of latency spikes and small block I/O, these have been corrected by installing Service Pack 2 for the XIOS 2.2 code. The upgrade includes new InfiniBand firmware, which addresses the cause of the latency (problems with active/active controller communication that cascaded out to host I/O processing), and tweaks to alert thresholds and definitions. The latter change relegates the ~20% small block I/O alert to an informational event so as to clean up the alerts dashboard. The net result of SP2: latency normally <0.5ms, “spiking” to 5ms at most (very momentarily), and an empty alerts pane.

Many thanks to the EMC/XtremIO team and VMware/Darius of the EFI group.

We’re still working on the capacity points from the prior post, but that’s not a technical problem, and word has it that compression is on the mid-term road map, which will even the feature count between XtremIO and Pure…


By Chris Gurley, MCSE, CCNA
Last updated: January 18, 2014

3 Comments
  1. Sunil
    Apr 23 2014

    Thanks for your reviews. They were of great help to me.
    We are also planning to use XtremIO for our VDI environment with Win 7 64-bit. We will have both persistent and non-persistent (PVS) desktops, and I want to know the actual dedupe ratio you got in both scenarios, because the ratio you mentioned is really low.
    I would appreciate it if you could help me with the original dedupe figures and ANY areas of CONCERN that I should look out for.


  2. Chris
    Apr 23 2014

    Hey Sunil,

    Our XtremIO deployment runs a server virtualization load with heavy SQL emphasis, so for your VDI use case, the dedupe answer may vary.

    Will your VMs be clones or individual builds from something like WDS or SCCM? That seems to affect things, with cloning being the most dedupe-friendly.

    Regarding concerns, hopefully this won’t be the case when your XtremIO deployment comes around, but the term “non-disruptive” hasn’t matched our experience so far with any servicing of the array. Prior to XIOS 2.2.3, all code updates were disruptive (array downtime of several hours); 2.2.4 is the next release and is purported to be non-disruptive, so wait and see. Also, during the replacement of a storage controller to address a bad FC port, the array panicked, which resulted in 3 hours of downtime. Engineering did a debrief on it and says they know why it happened and how to avoid it in the future, but again, wait and see.

    We haven’t had a smooth road with XtremIO yet, but we’re holding out hope that it will get better. The first benchmark of that will be the next code update. Then every day without issues after that will make it a bit better.

    Please let me know how your experience goes, too. I’d like to know how XtremIO works for others.



Trackbacks & Pingbacks

  1. EMC XtremIO Gen2 and VMware | bcTechNet
