Windows Firewall applies least-privilege policy


If a server has multiple network connections, Windows Firewall applies the least-privilege (most restrictive) firewall profile to all of them. Thus, if a server has two network connections, one with domain access and one with private (non-domain) access, Windows will classify the second NIC as residing in a “Private” or “Public” network, not “Domain”. The impact is that Windows then takes the Group Policy firewall settings for “Private” or “Public” and applies them to the “Domain” connection as well. The only way to relax or disable the restriction is to configure the policy for the “Public” or “Private” profile as well.
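As a toy illustration of that selection logic (nothing below is a Windows API; only the profile names come from the behavior described above), the effect amounts to:

```python
# Toy model of the behavior described above -- not Windows code.
# On systems where a single firewall profile governs every connection,
# the most restrictive profile among all NICs wins.
RESTRICTIVENESS = {"Domain": 0, "Private": 1, "Public": 2}

def effective_profile(connection_profiles):
    """Return the one profile whose rules end up applied to all NICs."""
    return max(connection_profiles, key=lambda p: RESTRICTIVENESS[p])

print(effective_profile(["Domain", "Private"]))  # -> Private
print(effective_profile(["Domain", "Public"]))   # -> Public
```

This is why a dual-homed server with one non-domain NIC ends up governed by the “Private” or “Public” rules even on its domain-facing connection.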

ESX 4.1: Local users cannot log in


If you regularly SSH into your ESX hosts, this may be old news to you. But if you’re like me and mostly manage your ESX hosts via vSphere Client, you might have a surprise waiting for you when you upgrade to ESX & ESXi 4.1. With the advent of ESX Active Directory integration, VMware kindly decided to impose some new changes and requirements for local user accounts. What does this mean to you?

For me, it meant that when I tried to SSH into my ESX host, I ran into “Access is denied.” And with only one non-root user account on the system, that meant no remote access to the host itself. Root is restricted to interactive (console) access, so that was no help. Thankfully, the Dell Remote Access Card (DRAC) put me on the console, so to speak, and let me poke around as root.

The solution, though, came from a Google search, a somewhat unhelpful VMware KB article (1024235), and a little connecting of the dots. AD integration places a new dependency on the local “Administrators” role. If local user accounts aren’t in that role, they can’t get in.

Oddly enough, vSphere Client has to be targeted directly at the ESX host (not vCenter) to edit the role and local users. Looking while connected through vCenter won’t get you anywhere. So, here we go:

VCE: Virtual Computing Environment


Are you familiar with VCE? If not, add it to your IT acronym dictionary; it’ll be something you hear more about in the future if virtualization, shared storage, converged networks, and/or server infrastructure are in your purview. VCE stands for “Virtual Computing Environment” and is a consortium of Cisco, EMC, VMware, and Intel (funny…take three of those companies’ initials and you get V-C-E). The goal, which they seem to be realizing, is to deliver a “datacenter in a box” (or multiple boxes, if your environment is large), and in a lot of ways, I think they have something going…

The highlights for quick consumption:

- a VCE Vblock is an encapsulated, manufactured product (SAN, servers, and network fully assembled at the VCE factory)
- a Vblock solution is sized to your environment based on profiling of 200,000+ virtual environments
- one of VCE’s most-marketed advantages is a single support contact and services center for all components (no more finger-pointing)
- because a Vblock follows “recipes” for performance needs and profiles, upgrades also come in fixed increments
- Cisco UCS blade increments come in “packs” of four (4) blades; EMC disks come in five (5) RAID-group “packs”
- Vblock-0 is good for 300-800 VMs; Vblock-1 is for 800-3,000 VMs; Vblock-2 supports 3,000-6,000 VMs
- when the VM threshold for a Vblock size is crossed, Vblocks can be aggregated
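If it helps, the sizing points above reduce to a toy helper. The model names and VM ceilings come from the briefing; the function itself is hypothetical, and real sizing would of course go through VCE’s own profiling:

```python
# Hypothetical sizing sketch based on the VM ranges quoted above.
# (Model names and ceilings are from the briefing; the logic is illustrative.)
VBLOCK_CAPACITY = [("Vblock-0", 800), ("Vblock-1", 3000), ("Vblock-2", 6000)]

def vblocks_needed(vm_count):
    """Pick the smallest model that covers vm_count; aggregate past 6,000 VMs."""
    for model, ceiling in VBLOCK_CAPACITY:
        if vm_count <= ceiling:
            return (model, 1)
    units = -(-vm_count // 6000)  # ceiling division: aggregate Vblock-2 units
    return ("Vblock-2", units)

print(vblocks_needed(500))   # -> ('Vblock-0', 1)
print(vblocks_needed(9000))  # -> ('Vblock-2', 2)
```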
Those are the general facts. So what does all that mean for interested organizations? Is it a good fit for you? Here are some takeaways I drew from the points above as well as the rest of the briefing by our VCE, EMC, and Cisco reps…

SANs: EMC VMAXe and HP 3PAR V400


If you’re in the market for a new enterprise-class storage array, both EMC and HP/3PAR have good options for you. Toward the end of 2011, we began evaluating solutions from these two vendors, with whom we have history and solid relationships. On the EMC side, we grew up through a CX300 in 2006 and into two CX3-40s in 2008. At the end of 2008, we deployed a 3PAR T400 at our production site and brought that CX3-40 back to consolidate it with the one at our HQ. Three years on, our needs call for new technology.

As is the nature of technology, storage has made leaps and bounds since 2008. What once was unique and elevating to 3PAR–wide striping and simplified provisioning from one big pool of disks–has become commonplace in arrays of all classes. We used to liken the old way to replacing the carpet in a furnished room. It’s a real chore when you have to painstakingly push all the chairs and tables into a corner (or out of the room altogether!) to improve or replace the carpet. With disk abstraction and data-shifting features, though, changes and optimizations can be made without the headaches.

SAN Winner: HP 3PAR V400


At the end of the day, it wasn’t the minor technological differences that made the decision for us. Sure, we believed that EMC’s VMAXe was the truly enterprise-class array. The ace, though, was product positioning.

We have two SANs. We have a CLARiiON CX3-40 from EMC, which is legacy and, as the market sometimes calls it, monolithic. It needs to go. We also have a 3PAR T400, which is as flexible as the day we bought it and, thanks to its architecture, has plenty of life left in it (even though we acquired it in 2008). Thus, when the cards were on the table, only HP could offer a “free” upgrade to our T400 as well as the new V400.

The upgrade turns our T-series into a multi-tier array with SSD, FC, and NL, and the V-series replaces our aging CLARiiON. EMC tried to compete, but all they could offer was a “deal” less appealing than the original single-array proposition.

Honestly, I felt bad for them, because there was nothing they could do unless they literally took a deep loss (no funny money about it). HP’s solution was the equivalent of two new, capable, flexible, low-maintenance SANs. EMC had only just learned how to be flexible and match 3PAR, so their older arrays (one of which was part of their attempted counter-offer) just couldn’t compare.

It’s going to be another hard sell in 2-3 years when we open the next RFP, because HP/3PAR will now have a monopoly on the floor. Who knows, though? Maybe HP will stumble with their new golden egg, or maybe EMC will figure out how to undercut HP with price while not sacrificing features. For now, the trophy goes to HP. Congrats.

HP 3PAR: AO Update…Sorta


I wish there were an awesome update that I’ve just been too preoccupied to post, but it’s more of a “well…” After talking with HP/3PAR folks a couple of months back and re-architecting things again, our setup is running pretty well in a tiered config, but the caveats in the prior post remain. Furthermore, there are a few stipulations that I think HP/3PAR should provide customers, or that customers should consider themselves, before buying into the tiered concept.

Critical mass of each media type: Think of it like failover capacity (in my case, vSphere clusters). If I have only three hosts in my cluster, I have to leave at least 33% capacity free on each to handle the loss of one host. But if I have five hosts, or even ten, I only have to leave 20% (or, for ten hosts, 10%) free to account for a host loss.

Tiered media works the same way, though it feels uber-wasteful unless you have a ton of stale/archive data. Our config included only 24 near-line SATA disks (and the tiered upgrade to our existing array had only 16). While that adds 45TB+ of capacity, realistically those disks can only handle between 1,000 and 2,000 IOPS. Tiering (AO) accounts for these things, but seems a little underqualified when it comes to virtual environments. Random seeks are the enemy of SATA, and when AO throws tiny chunks of hundreds of VMs onto only two dozen SATA disks (then subtract RAID/parity overhead), it can get bad fast. I’ve found this to be especially true of OS files. Windows leaves quite a few of them alone after boot…so AO moves them down a tier. Now run some maintenance and reboot those boxes–ouch!
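The cluster math above is quick to sketch. The 75-IOPS-per-disk figure for 7.2K near-line drives is a common rule of thumb I’m assuming here, not a 3PAR-published number; the disk count is ours from above:

```python
# Back-of-the-envelope math from the paragraph above.
# Assumption: ~75 random IOPS per 7.2K NL-SATA spindle (rule of thumb).
def ha_reserve_fraction(hosts, failures=1):
    """Fraction of each host's capacity to keep free to absorb host loss."""
    return failures / hosts

def nl_tier_iops(disks, iops_per_disk=75):
    """Rough raw random-IOPS ceiling of a near-line tier, before RAID penalty."""
    return disks * iops_per_disk

print(ha_reserve_fraction(3))   # ~0.33 with three hosts
print(ha_reserve_fraction(10))  # 0.1 with ten hosts
print(nl_tier_iops(24))         # 1800 -- inside the 1,000-2,000 range quoted
```

The same critical-mass logic applies to a disk tier: too few spindles in the slow tier and the reserve you must keep (in performance headroom rather than capacity) dominates.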

Installing ESXi 4.1 with Boot from SAN


We’ve been running ESX since the days of v2.5, but with the news that v4.1 will be the last “fat” version with a RedHat service console, we decided it was time to transition to ESXi. The 30+ step guide below describes our process using an EMC CLARiiON CX3 SAN and Dell hosts with redundant Qlogic HBAs (fiber environment).

1. Document network/port mappings in vSphere Client on the existing ESX server
2. Put host into maintenance mode
3. Shut down host
4. Remove host from Storage Group in EMC Navisphere
5. Create a dedicated Storage Group per host for the boot LUN in Navisphere
6. Create the 5GB boot LUN for the host
7. Add the boot LUN to the host’s Storage Group
8. Connect to the host console via the Dell Remote Access Card (DRAC)
9. Attach ESXi media via DRAC virtual media
10. Power on host (physically or via the DRAC)
11. Press CTRL+Q to enter Qlogic FastUtil
12. Select Configuration Settings
13. Select Adapter Settings
14. Change “HBA BIOS” to Enabled [ESC]
15. Select “Selectable Boot”
16. Change to “Enabled”
17. Change primary from zeroes to the disk with the address of the owning SP
18. Compare the address with the EMC Storage Processor (SP) Front-End Port addresses in Navisphere
19. If the disk shows as “LUNZ”, do not enable Selectable Boot or configure a disk (skip to step 21)
20. Escape and save settings, but don’t leave the utility
21. At the main menu, select “Select Host Adapter”
22. Change to the next adapter
23. Repeat steps 12 through 20
24. Exit the utility, reboot, and press F2 to enter Setup
25. Change Boot Hard Disk Sequence to put the SAN disk first
26. Exit BIOS and reboot
27. Press F11 for boot menu
28. Select Virtual CD
29. In Setup, on the “Select a Disk” screen, select the remote 5GB LUN
30. Press F11 to begin install
31. Press Enter to reboot the host
32. After setup completes, configure password, time, and network
33. Add host to vCenter and configure networking (per step 1)
34. Add LUNs to the host’s Storage Group in Navisphere
35. Rescan for storage on the host in vCenter

EMC XtremIO Gen2 and VMware


Synopsis: My organization recently received and deployed one X-Brick of EMC’s initial GA release of the XtremIO Gen2 400GB storage array (10TB raw flash; 7.47TB usable physical). Since this is a virgin product, virtually no community support or feedback exists, so this is a shout-out for other org/user experience in the field.

Breakdown: We are a fully virtualized environment running on VMware ESXi 5.5 on modern Dell hardware and Cisco Nexus switches (converged to the host; fiber to the storage), and originally sourced on 3PAR storage. After initial deployment of the XtremIO array in early December (2013), we began our migration of VMs, beginning with lesser-priority, yet still production guests.

Within 24 hours, we encountered our first issue when one VM became unresponsive and, upon a soft reboot (Windows 2012 guest), failed to boot–it hung at the Windows logo. Without going into too much detail, we hit an All Paths Down situation against the XtremIO array, and even after rebooting the host, we still could not boot that initial guest. Only when we migrated it (Storage vMotion) back to our 3PAR array could we boot the VM successfully.

Over the past two weeks, we’ve seen unfortunately low levels of deduplication (1.4-1.6 to 1), which we expected to be on par with Pure Storage’s offering. Apparently, Pure’s roughly 4.0-5.0 to 1 reduction ratio is due, at least in part, to their use of compression on their arrays. We were unaware of this feature difference until mid-implementation (when the ratio disparity became clear).
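For perspective, the arithmetic behind those ratios is straightforward. The 7.47TB figure is our X-Brick’s usable physical capacity from above; the helper and the 4.5:1 “Pure-like” ratio are illustrative, not vendor-published guarantees:

```python
# Illustrative arithmetic only -- the 7.47 TB usable figure is our X-Brick's;
# the ratios are the observed (1.4-1.6:1) and quoted (~4-5:1) ranges above.
def effective_capacity(physical_tb, reduction_ratio):
    """Logical data a dedupe/compression array can hold at a given ratio."""
    return physical_tb * reduction_ratio

print(effective_capacity(7.47, 1.5))  # ~11.2 TB at the ratio we observed
print(effective_capacity(7.47, 4.5))  # ~33.6 TB at a Pure-like ratio
```

The gap between those two numbers is why we ran shy of a full migration: at ~1.5:1, the array simply holds a third of what a ~4.5:1 array of the same physical size would.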

Additionally, the pattern of VMs going dark in part and then wholly after a reboot (which is normally the solution to most Windows problems 🙂) has repeated on three VMs to date (one mere minutes ago), resolved only by migrating back to 3PAR, even if only temporarily. The guests in question have all been running Windows Server 2012 R2, using virtual hardware versions 9 and 10, atop ESXi 5.5.0 build 1331820. The HBAs are QLogic 8262 CNAs running 5.5-compatible firmware/drivers.

Specific to the XtremIO array, we’ve also experienced an unusual number of “small block IO” alerts, even though we have isolated vSphere HA heartbeats to two volumes/datastores. Furthermore, we see routine albeit momentary latency spikes into the tens of milliseconds, which are as yet unexplained but could plausibly correlate with the small-block I/O, which XtremIO forthrightly does not handle well due to its 4KB fixed-block deduplication. In contrast, Pure Storage doesn’t handle large-block I/O as well due to its variable-block deduplication process.

Aside from these issues (which are significant), the XtremIO array has generally performed well during normal operations with a full load (we migrated as much of our production environment as it would hold–we ran shy of a full migration due to insufficient data reduction ratios).

Without further ado, if you have personal field experience with an XtremIO array in a virtual environment (preferably server virtualization, but VDI is fine, too), please post your stories here. Perhaps a cause for these woes will emerge from the collective exchange. Thank you.

Update (one hour later, 12/26/13): we seem to have determined that XtremIO has an issue with Windows Server 2012 R2 guests and/or EFI firmware boot. Our environment has both 2012 and 2012 R2 guests, but our 2012 guests exclusively use BIOS firmware and our 2012 R2 guests exclusively use EFI firmware. We have not yet been able to test EFI on 2012 (non-R2) or BIOS on 2012 R2. All guests (2008 R2 and later) use LSI Logic SAS controllers.

EMC XtremIO and VMware EFI

After a couple of weeks of troubleshooting by EMC/XtremIO and VMware engineers, the problem was traced to EFI boot handing off a 7MB block to the XtremIO array, which filled the queue and never cleared because the array kept waiting for more data to complete the exchange (i.e., a deadlock). This appears to happen only with EFI-firmware VMs (tested with Windows 2012 and Windows 2012 R2), and the issue is on the XtremIO end.

The good news is that the problem can be mitigated by adjusting the Disk.DiskMaxIOSize setting on each ESXi host from the default 32MB (32768) down to 4MB (4096). You can find it in vCenter > Host > Configuration > Advanced Settings (the bottom one) > Disk > Disk.DiskMaxIOSize. In the meantime, the XtremIO team is working on a permanent fix, and the workaround can be applied hot with no impact to active operations (aside from a potentially minor increase in host CPU load as ESXi breaks >4MB I/Os into 4MB chunks).
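To see why the cap sidesteps the deadlock, consider how one guest I/O is carved up under each setting. This chunking sketch is illustrative (it is not ESXi code); the 7MB transfer size and the 32768/4096 KB values are the ones from above:

```python
# Illustrative sketch, not ESXi internals: how one guest I/O is carved into
# the transfers the host actually issues, given Disk.DiskMaxIOSize (in KB).
def split_io(io_kb, max_io_kb):
    """Break one guest I/O (KB) into host-issued pieces no larger than the cap."""
    full, rem = divmod(io_kb, max_io_kb)
    return [max_io_kb] * full + ([rem] if rem else [])

print(split_io(7168, 32768))  # default 32MB cap: the 7MB EFI read goes out whole -> [7168]
print(split_io(7168, 4096))   # 4MB cap: [4096, 3072] -- no single 7MB transfer reaches the array
```

With the cap at 4096 KB, the problematic 7MB EFI boot read never arrives as one transfer, so the queue condition that triggered the deadlock never forms.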