Virtualization and Fault Tolerance
The dream couple who aren’t together enough

Virtualization and fault-tolerant technology are like the would-be ideal couple in a romantic comedy: a match made in heaven who never meet, even though they're constantly in the same place at the same time. That can be a funny conundrum on screen, but in the real IT world, virtualization and fault tolerance need to get together quickly and often. IT organizations that are virtualizing their server infrastructures need both technologies if they're going to build platforms that combine virtualization's efficiency with the continuous availability that enterprise applications require.

Virtualization and fault tolerance are both long-established technologies with roots in the mainframe era. Both have gone through transformations that make them especially relevant in today's IT markets, and each greatly increases the other's value. Full-function fault tolerance - note the qualifier - provides the continuous availability that makes virtualized environments reliable enough to support the most demanding enterprise applications.

Growing interest in continuous availability computing as a complement to virtualization has, inevitably, led vendors to jump on the bandwagon before their opportunity passes. As a consequence, definitions, features and functions get stretched out of shape to mask the shortcomings behind marketers' claims. That can lead IT managers into buying decisions that won't work for them, because they're buying less availability than they realize. Vendors have taken to calling any reliability solution "fault tolerant" when only a few products actually meet the criteria. Before committing to a fault-tolerant continuous availability solution to support a virtualization project, IT managers need reliable definitions of the terms so they know what they're getting. And maybe not getting.

Fault Tolerance or High Availability?
Fault tolerance is the apex of reliability technologies, and the only standards-based means for achieving continuous availability, or near-perfect 99.999 percent uptime (a.k.a. five-nines availability), in continuous, round-the-clock processing. Most of the availability solutions on the market today that are called fault tolerant are not. Many of them provide high availability, which is "four nines" (99.99 percent uptime) or below, but not continuous availability. The difference is important.
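To put the gap between the two levels in concrete terms, the short Python sketch below converts an availability percentage into the downtime it permits over a year. The function name and the 365-day year are assumptions made for illustration; they are not drawn from any standard or product.

```python
# Assumption: a 365-day year, ignoring leap years.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def annual_downtime_minutes(availability_percent: float) -> float:
    """Return the downtime per year permitted at a given availability level."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("four nines", 99.99), ("five nines", 99.999)]:
    minutes = annual_downtime_minutes(pct)
    print(f"{label} ({pct}%): {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, four nines allows roughly 52.56 minutes of downtime a year, while five nines allows roughly 5.26 minutes.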

Unlike fault-tolerant systems, high-availability systems recover from a problem by failing over: switching to a standby system and restarting applications on another server. Server clusters, for example, are high-availability solutions, but they can never be fault tolerant because they allow an interruption in processing during the failover period, which can last anywhere from a few minutes to an hour. Critical applications, such as emergency 911 or financial trading, can't tolerate that much downtime, so a high-availability solution doesn't work for them. High availability therefore ranks one category below fault tolerance in the availability stack. "Failover" and "restart" are never part of the fault tolerance lexicon, except to say they do not apply.
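As a rough illustration of why failover time matters, the sketch below estimates the yearly availability a cluster achieves given a number of failovers and the length of each interruption. The function name and the sample inputs (one failover a year lasting 5 or 60 minutes) are hypothetical, chosen only to span the "few minutes to an hour" range mentioned above.

```python
# Assumption: a 365-day year and no downtime other than failovers.
MINUTES_PER_YEAR = 365 * 24 * 60


def availability_after_failovers(failovers_per_year: int,
                                 minutes_per_failover: float) -> float:
    """Return availability (in percent) if the only downtime is failover time."""
    downtime = failovers_per_year * minutes_per_failover
    return 100 * (1 - downtime / MINUTES_PER_YEAR)


# Hypothetical scenarios: a single failover per year of varying length.
for minutes in (5, 60):
    pct = availability_after_failovers(1, minutes)
    print(f"one {minutes}-minute failover per year -> {pct:.5f}% availability")
```

With these assumed numbers, a single five-minute failover uses nearly the whole five-nines allowance of about 5.26 minutes a year, and a single one-hour failover already exceeds the four-nines allowance of about 52.56 minutes.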

What, then, is fault tolerance? The answer depends on the type of fault tolerance - hardware or software. IT managers have to understand their similarities and differences to choose the best approach for a particular need.

At its most basic, "hardware" fault tolerance is designed to prevent unplanned downtime and data loss. All components are duplicated - not just power supplies or fans - and run in complete synchronization so they appear as one logical server to the operating system and the application. Logic and diagnostic software cross-check every operation. If something is amiss within the server, the diagnostics will identify the problem and, if necessary, remove the broken part from service while the rest of the server and the application continue to run completely unaffected. Although fault-tolerant hardware is often knocked for being pricey, entry-level Intel-based servers can be purchased for less than $15,000 (USD).
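The idea of duplicated components running in step, with diagnostics removing a failed part while work continues, can be sketched as a toy Python model. Everything here is hypothetical: the `Replica` and `LockstepServer` names, and the simplification that a broken replica reports its own failure, are assumptions for illustration and do not describe how any vendor's hardware is built.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Replica:
    """One of the duplicated units (hypothetical model)."""
    name: str
    faulty: bool = False      # simulates a broken component
    in_service: bool = True

    def execute(self, operation: Callable[[int], int], value: int) -> Optional[int]:
        # Assumption: a broken replica fails its self-check, modeled as None.
        if self.faulty:
            return None
        return operation(value)


class LockstepServer:
    """Presents the duplicated replicas as one logical server."""

    def __init__(self, replicas: List[Replica]):
        self.replicas = replicas

    def run(self, operation: Callable[[int], int], value: int) -> int:
        results = []
        for replica in self.replicas:
            if not replica.in_service:
                continue
            result = replica.execute(operation, value)
            if result is None:
                replica.in_service = False  # remove the broken part from service
            else:
                results.append(result)
        if not results:
            raise RuntimeError("no healthy replica left")
        # Cross-check: every healthy replica must produce the same answer.
        if any(r != results[0] for r in results):
            raise RuntimeError("healthy replicas disagree")
        return results[0]


a, b = Replica("A"), Replica("B")
server = LockstepServer([a, b])
print(server.run(lambda x: x * 2, 21))  # 42: both replicas agree
b.faulty = True                          # simulate a component failure
print(server.run(lambda x: x * 2, 21))  # 42: A answers, no failover or restart
print(b.in_service)                      # False: B was taken out of service
```

The point of the model is that the caller sees the same answer before and after the fault; there is no standby to switch to and nothing to restart.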

After a generation of existing only as hardware, fault tolerance for x86 systems is now developing as a software technology. This can muddy the waters. These new software solutions are fault tolerant up to a point. They support continuous availability, but only under certain workloads. They are also unable to harness the full power of virtualization and multiprocessor technologies.

The state of the art today for software fault tolerance is linking two industry-standard x86 servers together with cable and software (or mirroring virtual machines by software across two, preferably three, identical x86 servers) so that they run in virtual lock-step, similar to the way fault-tolerant hardware does, and deliver five-nines uptime. But, unlike with hardware, applications and OSs must be licensed on each physical server.

Software Fault Tolerance: The Good and Bad
There are realities to software fault tolerance that limit its potential in corporate IT. Perhaps the most important of these is that software-based fault tolerance lacks support for symmetric multi-processing (SMP), which means applications cannot scale beyond a single core per server. In a two-socket server powered by quad-core processors, an application running in fault-tolerant mode is restricted to the compute power of just one of the eight server cores. Further, processor manufacturers are engineering virtualization capabilities into powerful new products that will be grossly underutilized in this scenario. Despite assertions that all applications will run in a software-fault-tolerant environment, physical or virtual, many true business-critical and mission-critical applications are simply too demanding to function properly, if at all.
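The single-core restriction is easy to quantify. The sketch below, with a hypothetical helper name, computes the share of a server's cores that an application can use when it is limited to one core, using the two-socket, quad-core example from the paragraph above.

```python
def usable_core_fraction(sockets: int, cores_per_socket: int,
                         usable_cores: int = 1) -> float:
    """Return the fraction of a server's cores available to the application."""
    total_cores = sockets * cores_per_socket
    return usable_cores / total_cores


# Two sockets, quad-core processors, one usable core (the example above).
fraction = usable_core_fraction(sockets=2, cores_per_socket=4)
print(f"usable compute: {fraction:.1%} of the server")  # 12.5%
```

In other words, seven of the eight cores sit idle as far as the protected application is concerned.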

This is not full-function fault tolerance; it's fault tolerance light, appropriate for workgroups or departments, but not for enterprise applications. It's unlikely these technological shortcomings can be overcome any time soon.

By the narrowest of definitions, the new generation of software solutions on the market is fault-tolerant. However, the end product of true fault tolerance is continuous availability at the highest levels of corporate IT. Mission-critical application availability requires more than saying you have fault tolerance. Continuous availability demands a combination of fault-tolerant hardware and software. That combination makes fault-tolerant technology an ideal match for virtualization, providing the continuous availability that makes virtualized environments a versatile, flexible, and economical platform for enterprise applications.

About Denny Lane
Denny Lane is director of product marketing and management at Maynard, Mass.-based Stratus Technologies.

Your Feedback
leonllyu wrote: The combination of virtualization and fault tolerance has already been provided in cloud computing (IaaS, specifically), I guess.