January 6, 2011

VMware & Link-State Tracking

If you’re running a VMware vSphere cluster on a two-tier (or greater) Cisco network, you might be in the same situation I was. We built in redundancy when we planned our core and access switches, but the design had one significant flaw (see the simplified diagram to the right). Pretend all of those lines are redundant paths. Looks good so far, right? If CoreA goes down, ESX(i) can still send traffic up through AccessB to CoreB. The reverse applies if CoreB is down, and likewise for either of the Access switches.

The catch comes for VMs on ESX(i) when one of the Core- switches goes down. ESX(i) balances VMs across the ports in the Virtual Machine port group(s). If a port goes down, it will smartly move the VM(s) to another port that is up. If an “upstream” hop like CoreB goes down, though, ESX(i) doesn’t know about that event, so it keeps its VMs in place, oblivious to the fact that the VMs on AccessB ports are as good as dead to the world. [Enter Link-State Tracking]
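Before getting to the fix, it may help to see the blind spot in miniature. The Python sketch below is a toy model, not VMware code: the `vmnic` names, the `pick_uplink` helper, and the one-uplink-per-access-switch topology are all assumptions made for illustration. It shows that a failover check based only on the local link state keeps a VM on an uplink whose upstream path is already broken.

```python
# Toy model (not VMware code) of failover detection based on local link state only.
# The vmnic names, helper names, and topology are hypothetical.

# Assumed topology: each NIC reaches exactly one access switch and one core switch.
PATHS = {
    "vmnic0": ["AccessA", "CoreA"],
    "vmnic1": ["AccessB", "CoreB"],
}


def pick_uplink(preferred_order, link_up):
    """Return the first uplink whose *local* link is up, or None if all are down."""
    for nic in preferred_order:
        if link_up[nic]:
            return nic
    return None


def path_healthy(nic, failed_switches):
    """True only if every switch on the NIC's path is still alive."""
    return all(switch not in failed_switches for switch in PATHS[nic])


# CoreB fails, but the cable between the host and AccessB is still lit.
link_up = {"vmnic0": True, "vmnic1": True}
failed_switches = {"CoreB"}

chosen = pick_uplink(["vmnic1", "vmnic0"], link_up)
print(chosen, path_healthy(chosen, failed_switches))  # vmnic1 False
```

The VM stays on `vmnic1` because nothing the host can see has changed. Only when the local link itself drops (`link_up["vmnic1"] = False`) does the same check move the VM to `vmnic0`.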

Link-state tracking (LST) is a feature in Cisco IOS 12.2(54)SG and later (and possibly a few minor revisions earlier) that enables Cisco switches and routers to manage the link state of some ports based on the status of other ports. In our case, LST can be configured on AccessB to watch the uplink port(s) to CoreB and act if they go down. See the example config below:

switch# config t
switch(config)# link state track 1
switch(config)# int GigabitEthernet1/48
switch(config-if)# link state group 1 upstream
switch(config-if)# int GigabitEthernet1/1
switch(config-if)# link state group 1 downstream
switch(config-if)# int GigabitEthernet1/2
switch(config-if)# link state group 1 downstream

In this config, we have specified that the last port on our 48-port Cisco Catalyst switch (e.g. a 4900 series) is the “upstream” link to our core switch, CoreB. Then we add the two ports that our ESX(i) server uses for VMs as “downstream” ports. Once this configuration is in place, if GigabitEthernet1/48 goes down (unplugged, issues on CoreB, etc.), AccessB will put GigabitEthernet1/1 and 1/2 into an “err-disabled” state (down), so our ESX(i) server will know that it needs to choose new paths for the VMs that are traversing AccessB.
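The behavior of that group can be sketched in a few lines of Python. This is a toy model for illustration, not Cisco code: the `LinkStateGroup` class and its method names are made up, and the port names are shortened. It follows the rule that downstream ports are disabled only when every upstream port in the group is down, and are brought back once an upstream port recovers.

```python
# Toy model of a link-state group (not Cisco code; class and method names are hypothetical).

class LinkStateGroup:
    def __init__(self, upstream, downstream):
        self.upstream = list(upstream)
        self.downstream = list(downstream)
        # Every port starts out up.
        self.state = {port: "up" for port in self.upstream + self.downstream}

    def set_upstream(self, port, is_up):
        """Record an upstream link change, then update the downstream ports."""
        if port not in self.upstream:
            raise ValueError(f"{port} is not an upstream port in this group")
        self.state[port] = "up" if is_up else "down"
        self._reconcile()

    def _reconcile(self):
        # Downstream ports follow the group: err-disabled only if ALL upstream ports are down.
        any_upstream_up = any(self.state[p] == "up" for p in self.upstream)
        for port in self.downstream:
            self.state[port] = "up" if any_upstream_up else "err-disabled"


# Mirrors the config above: Gi1/48 is upstream, Gi1/1 and Gi1/2 are downstream.
group = LinkStateGroup(["Gi1/48"], ["Gi1/1", "Gi1/2"])

group.set_upstream("Gi1/48", False)  # uplink to CoreB fails
print(group.state)  # {'Gi1/48': 'down', 'Gi1/1': 'err-disabled', 'Gi1/2': 'err-disabled'}

group.set_upstream("Gi1/48", True)   # uplink recovers
print(group.state)  # {'Gi1/48': 'up', 'Gi1/1': 'up', 'Gi1/2': 'up'}
```

With two upstream ports in the same group, losing only one of them leaves the downstream ports up, which is usually what you want when the uplinks are redundant.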

Of course, another solution to this topology would be to physically reconfigure it as a mesh with CoreA-to-AccessB and CoreB-to-AccessA links, but then you encounter spanning-tree and other factors at multiple levels. Even if that is your end game, link-state tracking is a great intermediate step in the meantime.

For more info on whether beaconing or link-state tracking is your best fit, check out VMware’s blog:
Beaconing Demystified: Using Beaconing to Detect Link Failures

——————————————————

By Chris Gurley, MCSE, CCNA
Last updated: February 16, 2011

9 Comments
  1. Feb 16 2011

    Nice post! Many times overlooked when configuring ESX with redundant NICs especially in a +2-tier network.
    Rgds,
    Didier

  2. Chris
    Feb 16 2011

    Thanks, Didier! ‘Appreciate the feedback. Yeah, it only took us one “non-impacting” network maintenance to realize how much we needed this :).

    ~Chris

  3. fred
    Jun 11 2013

    What should the corresponding Network Adapter Failover Detection Policy be on an ESX(i) 5.1 DvS when Link State Tracking is enabled on the physical switches as you describe?

    vSphere 5.1 offers 2 choices:
    1) Link Status Only
    2) Beacon Probing

    • Chris
      Jun 13 2013

      Sorry for the late reply; it seems that our site email notifications haven’t been going out recently.

      I believe that “Link Status Only” is still appropriate since the pSwitch will shut the downstream ports when the upstream fails. That’s the beauty of LST on the pSwitches. Thanks!

      • fred
        Jun 14 2013

        Thanks!

  4. Ankit Soni
    Oct 25 2013

    Excellent article. We too fell prey to a similar situation. This article definitely pointed us in the right direction. Thanks a ton for sharing your experience.

