Great Minds Think Alike – Cisco and VMware Agree On Sharing vs. Limiting

Sept. 13th, 2010

While reading a very informative blog post by Aaron Delp regarding VMware’s new NetIOC capability, I realized that Cisco and VMware are on the same page when it comes to server network traffic control. VMware’s NetIOC best practices plainly state exactly what Cisco has been advocating for so long – the user should plan for network contention but should not needlessly limit bandwidth in the absence of contention. 

VMware’s NetIOC Best Practices states the following:  

  • Page 5: “[Using shares instead of limits] means that unused capacity will be redistributed to other contending flows and won’t go to waste.”
  • Page 24: “Best practice 1: When using bandwidth allocation, use “shares” instead of “limits,” as the former has greater flexibility for unused capacity redistribution. Partitioning the available network bandwidth among different types of network traffic flows using limits has shortcomings. For instance, allocating 2Gbps bandwidth by using a limit for the virtual machine resource pool provides a maximum of 2Gbps bandwidth for all the virtual machine traffic even if the team is not saturated. In other words, limits impose hard limits on the amount of the bandwidth usage by a traffic flow even when there is network bandwidth available.”

Brad Hedlund wrote an excellent blog post discussing how Cisco uses QoS (shares) while HP Virtual Connect uses only rate limiting (limits), and he provides some great animated graphics and analogies. I encourage you to read it…and then come back. :)

Two questions: 1. What are shares vs. limits? 2. How does Cisco UCS + VMware compare to HP Virtual Connect + VMware?
  

Shares (minimums) vs. Limits (maximums)

  • A “share” is a VMware term for a relative value (relative to the share values assigned to other flows) that determines a particular traffic flow’s minimum bandwidth percentage during times of congestion. The greater a flow’s ratio of shares, the more bandwidth it receives during congestion. For example, let’s assume Flow A is assigned 10 shares, Flow B 5 shares, and Flow C 25 shares. The sum of shares is 40 (10+5+25). Flow A is guaranteed a minimum of 25% of the bandwidth (10/40), Flow B a minimum of 12.5% (5/40), and Flow C a minimum of 62.5% (25/40). 25% + 12.5% + 62.5% = 100% of the dvUplink bandwidth (whatever speed the link is operating at). During times of no congestion, each flow can consume up to 100% of the link bandwidth; during times of congestion, each flow is guaranteed its minimum ‘share’ of the bandwidth. (A quick sketch of this math appears right after this list.)
  • A “limit” is a VMware term for a static value that defines, in absolute units of Mbps, the maximum bandwidth a particular flow may consume across the overall vDS. In other words, you define a set bandwidth in Mbps and the flow cannot exceed that maximum even if there is no contention on the dvUplinks. The bandwidth may be available, but the flow is not eligible to use it.
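
To make the share math concrete, here’s a minimal Python sketch – purely illustrative, not VMware code. The flow names and share values come from the example above.

    # Translate NetIOC-style shares into guaranteed minimum percentages.
    # Shares only matter under congestion; with no congestion, any flow
    # may burst up to the full link speed.
    def share_minimums(shares):
        total = sum(shares.values())
        return {flow: s / total * 100 for flow, s in shares.items()}

    shares = {"Flow A": 10, "Flow B": 5, "Flow C": 25}
    for flow, pct in sorted(share_minimums(shares).items()):
        print(f"{flow}: guaranteed minimum {pct:.1f}% under congestion")
    # Flow A: guaranteed minimum 25.0% under congestion
    # Flow B: guaranteed minimum 12.5% under congestion
    # Flow C: guaranteed minimum 62.5% under congestion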

To spin Brad’s highway analogy a slightly different way, I think of these two approaches to traffic control as “HOV Lanes” (shares) vs. “metered on-ramps” (limits) on the highway:  

An HOV lane guarantees a minimum of one lane of highway “bandwidth” for high-occupancy vehicles. Can high-occupancy vehicles use other lanes if those lanes aren’t being used? Absolutely. If there are lots of high-occupancy vehicles and very little other traffic, are the high-occupancy vehicles forced to crowd into one lane? Absolutely not. What is the maximum number of lanes high-occupancy vehicles can use? No maximum. What is the minimum number of lanes they are guaranteed? At least one.

A “metered on-ramp” enforces a maximum number of cars entering the highway in an attempt to avoid congestion on the highway. The on-ramp light allows only so many cars onto the highway during a given time period, regardless of how busy the highway is. Do most on-ramp metering systems adjust to the actual presence of highway congestion? No. They are programmed to allow a maximum number of cars onto the highway per minute, period. Will you be frustrated sitting idle in line on the on-ramp, waiting on the light while seeing that the highway is wide open? Absolutely.

Highway Analogy: QoS and Rate Limiting

Questions You May Have:  

  • Which is more important? Metered on-ramps (limits) or HOV lanes (shares)?
    IMHO, if you can only have one, pick HOV lanes (shares). Metered on-ramps do not take into account congestion throughout the highway system, and they don’t guarantee any vehicle access to a highway lane free of congestion. HOV lanes (shares) can guarantee lanes to cars as the cars move through the highway system over great distances.

  • Can metered on-ramps (limits) co-exist with HOV lanes (shares)? Absolutely!
    You may want to set a hard limit on transmit or receive (a maximum) while letting different traffic classes (flows) flex their bandwidth usage during times of no congestion. (See the sketch after this list.)

  • Can there be more than one HOV lane?
    Absolutely! You didn’t read Brad’s post above, did you? ;)

  • Can there be limits placed on on-ramps (egress traffic on a NIC) and off-ramps (ingress traffic on a NIC)?
    Absolutely! (But it depends on whose product you’re using. See below.)
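
To show limits and shares co-existing, here’s a toy Python allocator – my own sketch, not VMware’s or Cisco’s actual scheduler, and the flow names, demands, and rates are made up. A limit caps a flow at all times; shares drive a weighted max-min split of contended capacity, so bandwidth one flow doesn’t use is redistributed to the others instead of going to waste.

    # Allocate link bandwidth among flows. Each flow has a demand, a share
    # count, and an optional hard limit (all rates in Mbps).
    def allocate(capacity, flows):
        # A limit applies at all times, even on an idle link.
        want = {n: min(d, l) if l is not None else d
                for n, (d, s, l) in flows.items()}
        alloc, remaining, active = {}, capacity, dict(want)
        # Weighted max-min fairness: satisfy flows asking for less than
        # their share-weighted slice, then redistribute what they leave.
        while active and remaining > 0:
            total = sum(flows[n][1] for n in active)
            fair = {n: remaining * flows[n][1] / total for n in active}
            done = [n for n in active if active[n] <= fair[n]]
            if not done:                  # all still contending: split by shares
                alloc.update(fair)
                return alloc
            for n in done:
                alloc[n] = active.pop(n)  # fully satisfied flow
                remaining -= alloc[n]
        alloc.update({n: 0 for n in active})
        return alloc

    link = 10_000                         # one 10GE uplink
    flows = {                             # (demand, shares, limit)
        "VM traffic": (9000, 25, None),
        "vMotion":    (8000, 10, 3000),   # hard-capped at 3Gbps, always
        "Backup":     ( 100,  5, None),
    }
    print(allocate(link, flows))
    # Backup gets its full 100; its unused slice is redistributed to
    # VM traffic (~7071) and vMotion (~2829) in a 25:10 share ratio,
    # so all 10,000 Mbps is used. vMotion can never exceed 3000, even
    # on an idle link – exactly the waste the best practice warns about.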

Cisco UCS + VMware vs. HP Virtual Connect + VMware Design Comparison

A proper highway design takes into account both bidirectional ramp metering and highway HOV lanes, not just on-ramp metering. The same goes for network design. Simply rate limiting what the NIC transmits addresses only part of the problem and misses the bigger picture – server-to-server communication over the fabric.

  • VMware provides HOV lanes for on-ramps (shares per flow per dvUplink) plus metered on- and off-ramps (vDS-wide egress rate limiting per flow).
  • Cisco UCS provides metered on-ramps (egress rate limiting per Palo logical NIC), metered off-ramps (ingress rate limiting per Palo logical NIC), HOV lanes for on-ramps and off-ramps (shares per priority on Palo/Menlo NIC ingress/egress), and HOV lanes on the highways between ramps (fabric-wide shares per flow).
  • HP Virtual Connect provides metered on-ramps (egress rate limiting per Virtual Connect FlexNIC), but no metered off-ramps (no ingress rate limiting per Virtual Connect FlexNIC), no HOV lanes for on-ramps or off-ramps, and no HOV lanes between ramps.

  

In the HP Virtual Connect + VMware example, you don’t have any HOV lanes. Virtual Connect only provides metered on-ramps (the FlexNIC Tx speed limit). Once frames leave the HP server NIC and enter the on-ramp towards Virtual Connect, all frames are equal…first in, first out. Good luck if a backup or vMotion starves out your VMs’ production traffic somewhere inside the Virtual Connect domain (red arrows in the graphic below). No amount of configuration in VMware or in the external network can control how Virtual Connect prioritizes traffic during times of internal congestion. In addition, FlexNIC speed limits needlessly cap traffic during times of no contention/congestion – like on-ramp metering at 2 am on a Sunday!?!

  

In the Cisco UCS + VMware example, the user has full traffic control – rate limiting in and out for VMware dvUplinks, rate limiting ingress and egress for Cisco NICs, plus traffic shaping (shares per priority) in the fabric from end to end. In addition, when a rate limit is applied to a Cisco Palo interface, the rate limit is reflected in the OS as the speed of that individual Palo interface. For example, with one Palo card I could have 58 Palo NICs presented to the OS, each running at a different speed based on its rate limit as defined in the UCS Manager Service Profile.
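
As a quick illustration of that last point, here’s a hypothetical sketch – the vNIC names and rate limits below are invented, not from a real Service Profile:

    # Each Palo (Cisco VIC) logical NIC's rate limit is what the OS reports
    # as that interface's link speed, so several vNICs carved from one
    # physical card can each report a different speed.
    vnics = {"vmnic0": 4000, "vmnic1": 2000, "vmnic2": 500}   # limits in Mbps
    for name, limit in vnics.items():
        print(f"{name}: OS-reported link speed = {limit} Mb/s")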

  

In summary, let’s reread VMware’s NetIOC Best Practice #1. “When using bandwidth allocation, use “shares” instead of “limits,” as the former has greater flexibility for unused capacity redistribution. Partitioning the available network bandwidth among different types of network traffic flows using limits has shortcomings.”  

Like Cisco, they ‘get’ server networking. Kudos to them.  

(Special Thanks to Doron Chosnek and Brad TerEick for their input)

Comments:
  • http://www.eplus.com Don Mann

    Very nice post Sean. Just want to make sure a few things are clear. I am a big fan of Network IO Control, but it does not have inbound share prioritization. NetIOC only covers outbound; if you need to manage inbound, they recommend rate limiting (found in the NetIOC best practices doc). Also – it is important to note that NetIOC only prioritizes traffic types – VM traffic vs. vMotion, etc. – you cannot define priorities of one VM’s traffic over another (CoS/QoS flexibility). A very good direction we are moving in the VM space! I still think the datacenter QoS story is best – but I am still trying to find docs/details on traffic queuing/priority at the NIC/CNA. I have not found this detail on Palo or the Q/E cards. I’m trying the QLogic utility in our lab now – but don’t have Palo cards to test at the moment.

    -don

    • http://www.mseanmcgee.com M. Sean McGee

      Hi Don,
      Thanks for reading and for posting the clarifications. Good stuff for readers to know.

      Sounds like a “how to configure QoS on UCS” blog post is in order… :)

      Thanks again,
      -sean

      • Dusted

        Thanks for this post – a great read.

        “Sounds like a “how to configure QoS on UCS” blog post is in order… :)”

        That would be a fantastic blog post for this UCS customer trying to make sense of all this…

  • Pingback: BRAD HEDLUND .com » VMware 10GE QoS Design Deep Dive with Cisco UCS, Nexus

  • http://twitter.com/dpironet Didier Pironet

    Hi,
    Great post tackling sharing of network bandwidth resources!
    I’m not familiar with Cisco UCS and therefore my question might be dumb :)

    On the Cisco UCS + VMware design diagram, there is no ‘connectivity’ between the two Cisco Fabric Extenders, so a VLAN-tagged packet targeted at another VM hosted on the same hypervisor would have to travel across the whole network up to the core switches and then back to the same hypervisor. With the HP diagram, the packet would have just gone through the edge switch – that is, the Virtual Connect Flex-10 – and been passed to the second Virtual Connect Flex-10 inside the same enclosure to reach the target VM…

    That’s what the diagrams show and again I’m not familiar with Cisco UCS so I might be just wrong here….

    Thx,
    Didier

    • http://www.mseanmcgee.com M. Sean McGee

      Hi Didier,
      Thanks for reading and posting a comment.

      You are correct that in the UCS model, all server communication goes through the Fabric Interconnects (the top-of-rack interconnect device for UCS blade servers). In other words, UCS treats blade servers like rack servers when it comes to connectivity – keep it simple and have the blade servers logically connected to the top-of-rack interconnect device just like rack servers are connected to a top-of-rack switch. The Fabric Extenders in the UCS model are not switches – they’re remote line cards off of the Fabric Interconnects. The great thing about this model is that it’s simple (no blade switches to manage/troubleshoot like in the HP model with Virtual Connect), and it doesn’t matter where two blade servers are in the UCS domain – the latency is the same between every pair of servers. The latency between server 1 and server 2 is the same as between server 1 and server 112: around 5µs between any two servers.

      That said, not all VM-to-VM traffic is required to go through the Fabric Interconnects: if you are using a vSwitch/DVS/Nexus 1000v on the ESX host, traffic between two VMs on the same host is switched locally on that host.

      I hope this answers your question. Please post a follow up if you have any additional questions.

      Best regards,
      -sean

  • jtaylor

    Does the release of FlexFabric change these diagrams at all? Specifically regarding traffic controls in the FlexFabric modules – are there any traffic controls in the updated module?

    • http://www.mseanmcgee.com M. Sean McGee

      Hi JTaylor,

      According to blog comments (link below) by Ken, an HP blade sales guy, it appears that FlexFabric does not offer any additional QoS capabilities over Flex-10 (meaning, no QoS features other than TX rate limiting at the NIC). That’s my assumption, since when David says QoS is needed in FlexFabric, Ken’s response is that “QOS is not required for the vast majority of network connections.”

      http://h30507.www3.hp.com/t5/Eye-on-Blades-Blog-Trends-in/Storage-Networking-the-way-you-want-it/bc-p/82117#M531

      In addition, there is only one mention of QoS (page 12) in the entire whitepaper entitled “HP Virtual Connect FlexFabric Module and VMware vSphere 4” – and the reference is to VMware’s NetIOC, not FlexFabric.
      http://h20195.www2.hp.com/v2/GetPDF.aspx/c02505638.pdf

      In summary, my ASSUMPTION is that FlexFabric provides no additional traffic control (QoS) to allow defining bandwidth minimums within the Virtual Connect fabric for any type of traffic – IP, FC, iSCSI, etc. Personally, I think this is a huge oversight. How is FCoE traffic prioritized over regular Enet traffic on the Virtual Connect stacking links? However, I’ll defer to HP to authoritatively answer your question about their product.

      Thanks for stopping by,
      -sean