Ever Wonder WHY Your VMware Deployment Stalled?
It’s difficult to generalize the reasons for every situation, but we’re willing to venture a guess
By: James Houghton
Sep. 9, 2010 08:15 AM
This past week, the increasingly popular VMworld 2010 event took place at the Moscone Center in San Francisco. The title of the conference was "Virtual Roads, Actual Clouds", but the undercurrent of the show was all about 'VM stall' and, of course, whatever vendor XYZ can do to help you get your VM deployment moving again. While we certainly want to un-stick any optimization effort, it's often prudent to understand WHY something happened before attempting to fix it.
It's difficult to generalize the reasons for every situation, but we're willing to venture a guess: failing to understand the workload.
The workload is not simply an application name, the programming language it's written in, and the hardware configuration it runs on today. Those things are a good start, but they hardly provide enough information to determine the correct optimization approach. And yes, the implication in the preceding sentence is that there are multiple optimization approaches; that (gasp) the ideal target state may not be within a VM.
Before an army of angry VM believers spams us out of existence, let's make something clear: VMware is a phenomenal company with a great set of products that continue to get better with every release and acquisition it makes. But saying it's the solution to every optimization project is akin to the amusing analogy of a carpenter with a single hammer - everything looks like a nail.
To illustrate this point, let's look at some simple examples. VMware is designed to optimize the utilization of a set of servers based on the assumption that none of the workloads running on those servers is capable of consuming an entire server. On the other end of the spectrum, if we look at a ghost of hype-cycles past - grid computing - we see a technology designed to optimize utilization of a set of servers for workloads that consume more than a single server (usually far more). Neither of these statements is an absolute; it's certainly possible to run small or large workloads on either technology. However, the relative benefits of optimizing with the mismatched technology will be far smaller (if there are any at all).
What does it mean to understand the workload? To us it's more important to understand the business that the workload supports than to understand the technical deployment details. There are so many stalled VM deployments because ambitious project teams focused on the technical details and missed that business context. In doing so, those teams optimized for server utilization based on the existing IT environment. That works for the majority of workloads, but creates a problem for the few workloads where a cyclical business pattern or a planned expansion of some aspect of the business causes a significant increase in demand. When the business hits that cycle and IT is unable to respond quickly enough, the VMware program receives a black eye. The pre-optimized environment may well have had the same problem, but all the business executive is going to think about is that "it was working fine until IT moved my application to a VM." And, fair or not, the current state of the IT / business relationship is such that 10 (or more) great IT wins are instantly erased by one mistake, thus stalling the entire program.
So what can IT teams do to avoid this situation? Profile the workload:
Take that knowledge and codify it in a Blueprint, take it to your business consumer and get their signoff, and then generate the additional pages of the Blueprint appropriate for your architects, engineers, and operations teams. Repeat this process for all of the workloads in scope, and as you proceed, begin to cluster workloads by type and business key performance indicators. This will allow you to understand which workloads can coexist nicely in a VM, and which ones should be kept isolated or targeted for alternate optimization approaches. Implement policy-based workload management, which helps optimize resource utilization for workloads with transient peaks and ensures that resources are automatically prioritized by business importance in the event of a severe traffic spike or outage. Finally, be sure to measure both the current state and the post-optimized environment. Use the Blueprint to manage expectations during the process and to empirically document your success.
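To make the clustering step a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Workload` fields, the `placement` and `cluster` helpers, the idea of measuring demand in fractions of one server, and the 0.8 headroom threshold are all hypothetical, not part of any VMware product or of the Blueprint itself. It simply groups workloads by type and business KPI, and flags those whose business-cycle peak suggests isolation or an alternate approach.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical profile record; all fields are illustrative assumptions."""
    name: str
    kind: str           # workload type, e.g. "web" or "batch"
    business_kpi: str   # KPI the business owner signed off on
    avg_demand: float   # average demand, in fractions of one server
    peak_demand: float  # peak demand during the business cycle, same units


def placement(w: Workload, server_capacity: float = 1.0,
              headroom: float = 0.8) -> str:
    """Suggest a target based on peak (not average) demand.

    The thresholds are assumptions chosen for this sketch.
    """
    if w.peak_demand > server_capacity:
        # Needs more than one server at peak: consider an alternate approach.
        return "alternate"
    if w.peak_demand > headroom * server_capacity:
        # Fits on one server but leaves little room for neighbors.
        return "isolate"
    return "consolidate"


def cluster(workloads):
    """Group workload names by (type, business KPI, suggested placement)."""
    groups = defaultdict(list)
    for w in workloads:
        groups[(w.kind, w.business_kpi, placement(w))].append(w.name)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        Workload("payroll", "batch", "month-end close time", 0.1, 0.3),
        Workload("storefront", "web", "checkout latency", 0.2, 0.9),
        Workload("risk-calc", "compute", "overnight completion", 2.0, 6.0),
    ]
    for key, names in cluster(sample).items():
        print(key, names)
```

The sketch keys the decision on peak demand rather than average utilization, since the stalls described above come from sizing for the existing environment and missing the business cycle.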
This approach will unify teams and ensure the best possible environment is built, one that enables the delicate balance of business performance and IT efficiency.