SOASTA's 10,000 Hours in the Cloud
Lessons Learned from SOASTA’s Experiences in Cloud Computing
By: SOASTA Blog
Dec. 16, 2009 02:15 AM
“Practice makes perfect” is an adage heard by everyone from budding pianists to ambitious athletes. In his book Outliers, Malcolm Gladwell focuses on exceptional people who don’t fit into our normal definition of achievers. A common theme in explaining the successes of these individuals is the “10,000 Hour Rule.” Gladwell’s theory suggests that high performance is, in large part, the result of putting in an enormous amount of time, and that it takes about 10,000 hours to gain the experience necessary to achieve greatness.
The rule, according to Gladwell, applies to more than just individuals. He points out that the Beatles performed live in Hamburg, Germany over 1200 times between 1960 and 1964, amassing the hours needed to catapult the band to stardom. Clearly, it applies to companies as well. That’s not to say that talent, intelligence and good timing don’t play a role; but nothing can replace good old-fashioned experience.
Having now executed well over 10,000 hours of testing in the Cloud, SOASTA has experienced many successes and more than a few challenges. This article is intended to share our experiences to help others understand the opportunities and avoid the pitfalls associated with cloud computing. Gartner has positioned cloud computing as being at the “Peak of Inflated Expectations”, and the fall into the “Trough of Disillusionment” can be a steep one, if you don’t know what to expect. According to Gartner it will be two to five years before cloud computing’s mainstream adoption. So how do you get ahead of the curve?
SOASTA has provisioned over 250,000 servers, most in Amazon EC2. Our application, CloudTest™, is a load and performance testing solution designed to deliver real-time, actionable Performance Intelligence that builds confidence in the performance, reliability and scalability of websites and applications. It leaves the baggage of client-server alternatives behind and is extremely well suited to the cloud. In fact, dev/test has been acknowledged as a logical entry point for leveraging cloud computing due, in large part, to the variability of resource requirements. It also improves web application testing quality by making it viable to simulate web-scale user volumes from outside the firewall.
The question isn’t if cloud computing will become mainstream, but when and how? What issues need to be addressed, and how do you make the decision to move to the cloud?
What is the Cloud?
No, we’re not going to create yet another definition for cloud computing. While both vendors and users often spin the definition(s) to match their agendas or points of view, most tend to describe it in terms of services, typically in three “buckets”:
Software-as-a-Service is characterized by providing application capabilities, such as backup, mail, CRM, billing or testing.

Platform-as-a-Service offerings, such as Force.com, Google Apps, Microsoft Azure and Engine Yard, are designed to make it easier to build and deploy applications in a specific multi-tenant environment without having to worry about the underlying compute stack.

Infrastructure-as-a-Service allows companies to take advantage of the compute and storage resources delivered by vendors, such as Amazon, Rackspace and GoGrid, while providing a great deal of flexibility in how those resources are used.
According to a draft definition by the National Institute of Standards and Technology, which is heading up a government-wide cloud computing initiative, Infrastructure-as-a-Service is “the capability provided to the consumer to rent processing, storage, networks and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, deployed applications and possibly select networking components (e.g., firewalls, load balancers).”
SOASTA provides a software-based service, centered on reliability, that tests and measures the software, platform and infrastructure of both cloud-based and on-premise applications. This article focuses primarily on infrastructure, since the bulk of SOASTA’s experience is as a consumer of cloud services.
The diagram below illustrates how SOASTA leverages the cloud to deploy its Global Test Cloud components for on-demand testing.
Public and Private Clouds
With a private cloud, IT has greater control over issues, such as security and performance, that are common concerns about shared resources in the public cloud. Instead of renting the infrastructure provided by others, companies use commercial or open-source products to build their own cloud. Of course, this means they also need to address challenges such as repurposing servers, choosing a virtualization platform, image management and provisioning (including self-service) and capturing data for budgeting and charge back.
Thus, even companies that build their own internal cloud infrastructure will identify applications well suited to deployment in the public cloud, including externally managed hybrid environments like Amazon's Virtual Private Cloud. Most IT shops start with applications that take advantage of the benefits delivered by Infrastructure-as-a-Service. This includes applications that are deployed for short periods of time, have increased use based on business cycles, need to scale on-demand to respond to known or unanticipated traffic spikes, have limited privacy and security concerns, and are simple to deploy.
Deploying Applications in the Cloud
SOASTA CloudTest’s first deployment environment was in Amazon EC2. The requirements for load and performance testing fit almost all of the characteristics above: a need measured in hours, not months or years; an often significant and varying amount of compute resources; and minimal security issues. EC2 was the first to provide a platform that dramatically changed the cost equation for compute resources and delivered an elastic API for speed of deployment.
Because our application depends on the swift provisioning and releasing of servers, we had to quickly identify bad instances and bring up replacements. In addition, because EC2 was initially confined to a single location, it reduced some of the benefits of external testing. For companies deploying business-critical applications in the cloud, a single location might create concerns over failover and disaster recovery. Amazon Web Services has since created geographic options, regions and availability zones, providing very attractive options for failover.
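The identify-and-replace loop described above can be sketched in a few lines. This is a minimal illustration, not SOASTA's actual provisioning code: the provider class and its method names are hypothetical stand-ins for a vendor's elastic API.

```python
import random

# Hypothetical stand-in for a cloud provider's elastic API; the class and
# method names are illustrative, not any vendor's actual interface.
class StubProvider:
    def __init__(self):
        self._next_id = 0

    def launch_instance(self):
        self._next_id += 1
        return f"i-{self._next_id:04d}"

    def is_healthy(self, instance_id):
        # In practice this would run a health check against the new instance;
        # here we simulate the occasional bad instance.
        return random.random() > 0.2

def provision_healthy(provider, count, max_attempts=50):
    """Launch `count` instances, discarding and replacing any that fail
    a health check, up to `max_attempts` total launches."""
    healthy = []
    for _ in range(max_attempts):
        if len(healthy) == count:
            break
        instance = provider.launch_instance()
        if provider.is_healthy(instance):
            healthy.append(instance)
        # An unhealthy instance would be terminated here before retrying.
    return healthy
```

The key point is that replacement is automatic: a bad instance costs one extra API call rather than a manual intervention.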
Speaking of Challenges...
The complexity of packaging and deploying applications in the cloud has been mitigated by automation. Companies such as RightScale, Appistry, 3Tera and others provide tools to package and deploy applications through relatively simple and intuitive user interfaces.
Infrastructure vendors provide tremendous flexibility, but little application awareness. Others provide more platform-oriented capabilities and include value-added features, such as monitoring, auto-scaling, pre-configured stacks and application containers.
Virtualization technology has allowed infrastructure vendors and enterprises to optimize their use of physical resources and is one of the fundamental building blocks that has distinguished the cloud from what was often referred to as “grid” or “utility” computing. The second key building block is the implementation of APIs, popularized by Amazon as elastic APIs. A prerequisite for today’s cloud infrastructure vendors, these interfaces enable the ability to dynamically start, stop, query the status of and manage the available resources programmatically and without the intervention of the infrastructure vendor.
There’s much discussion about portability, lock-in and the need for an open API. A standard elastic API is probably years away, causing many of the added-value deployment vendors (such as those listed above) to do the work for you: build your app using their tools and you don’t have to worry about infrastructure vendor lock-in.
Of course, that raises the question of where lock-in occurs. You're never really locked in if you're willing to do the work to change. That could mean changing your processes for deployment, writing to a new API, repackaging your application or changing the stack. There will always be choices to make, and to minimize the impact of lock-in, you need to determine which layer is going to be easiest to change should you need to switch vendors. Ultimately, the only way vendors should lock you in is by providing superior service.
When working with public cloud vendors, planning certainly helps in meeting large-scale resource requirements. In SOASTA’s case, one engagement may require only two or three servers, while another may need hundreds, or even thousands, of server cores. We often schedule our compute requirements a week or two in advance, which helps our infrastructure vendors respond when we’re provisioning hundreds of servers at a time. In most cases, you’ll know your resource requirements ahead of time. However, since on-demand access to a pool of resources is one of the benefits of the cloud, you’ll want to understand the limits of your vendor to respond to ad hoc requests.
One way to minimize this issue is to spread the risk among multiple vendors. With CloudTest’s distributed architecture, we have the capability to concurrently provision across multiple cloud providers to help mitigate availability issues. Commercial deployment technologies can help reduce the complexity of deploying components of an application to more than one provider if this is a design criterion.
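Spreading provisioning across vendors can be as simple as fanning requests out in parallel. The sketch below assumes a uniform `launch_servers` wrapper per provider; the provider names and the wrapper itself are hypothetical, since each vendor's real elastic API differs.

```python
from concurrent.futures import ThreadPoolExecutor

def launch_servers(provider, count):
    # Placeholder: real code would call this vendor's elastic API and
    # return the identifiers of the instances it started.
    return [f"{provider}-server-{i}" for i in range(count)]

def provision_across_providers(requests):
    """requests: dict mapping provider name -> number of servers wanted.
    Launches against all providers concurrently and returns the results."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(launch_servers, name, n)
                   for name, n in requests.items()}
        return {name: f.result() for name, f in futures.items()}
```

Concurrency matters here: waiting on each vendor serially would multiply startup time, while a failed vendor can be retried against another without blocking the rest.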
Another variable to consider is persistence of the application: for how long and how often does it need to be up and running? This has a particular impact on pricing. Is it only going to be in place for a few weeks, or is it a business-critical application that needs to be available 24/7 for an extended period of time?
Cloud vendors are trying to strike the balance between the traditional offerings of a managed service provider and a more flexible, scalable platform that is available and billed on demand. For example, Amazon now offers reserved instances and other offerings that are a blend of on-demand and monthly rental, while traditional hosting companies, like Rackspace and Savvis, race to provide fast, affordable and easy access to short-term and elastic compute and storage resources to complement their standard hosted offerings. Some, it should be noted, are much further along than others in providing a range of options.
Security is the number one concern when it comes to the public cloud. Testing, particularly load and performance testing, rarely requires the use of or access to real customer data. Since our application is in the cloud for the express purpose of simulating external load to a web application, our experience has not often included governance, compliance, data or other security concerns an organization might have when moving applications to the cloud. As a result, we’ll not cover security in-depth in this article.
There are organizations, such as the Cloud Security Alliance and many commercial and open-source companies that are working to address security concerns. Private and hybrid clouds, VPN access to cloud resources and industry-specific cloud initiatives are just some of the ways to minimize risk.
It’s worth noting that many organizations have become comfortable running their applications and storing their data at managed service providers because they comply with various regulations and auditing statements, such as SAS 70.
Cloud providers have also focused on securing the assets of their customers. For the most part, cloud and managed service providers have far more secure environments than many small- and medium-sized businesses that don't always invest in basic security measures, much less comply with standards such as PCI DSS, the Payment Card Industry Data Security Standard, which is intended to protect cardholder data. As a result, for many organizations, the cloud is actually a more secure environment than their own premises.
The politics and competition within an organization for resources and funding are always a challenge. Managing issues like compliance with corporate standards and protection of data requires careful attention in larger IT shops. In the past, the unavailability of hardware has been a limiting factor in the potential sprawl of application development. With the cloud, the ease-of-use and low cost described earlier make "skunk works" projects more feasible. This means individual businesses or groups within the company can be nimble and responsive, but they can also increase fragmentation and management risks.
It’s not dissimilar to the challenges posed years ago by the proliferation of the PC, or programs like Excel that clearly led to a wide variety of issues. However, we’ve learned to manage these challenges because the value proposition is compelling enough to deal with the governance concerns.
According to Forrester, 74% of app performance problems are reported by end-users to a service/support desk rather than found by infrastructure management. If this is true of applications using discrete resources under the control of the IT department, what happens when the infrastructure is in the cloud? In addition, many of today's web applications are delivered in part by third parties (CDNs, web analytics, news feed providers and other aggregators) and are thus outside IT's direct control.
When talking about comparing cloud vendors, Alan Williamson, editor-in-chief of Cloud Computing Journal, wrote, “The first comparison point is performance. This is a common question asked at our bootcamps, and it is important to understand that you are never going to get the same level of throughput as you would on a bare-metal system. As it’s a virtualized world, you are only sharing resources. You may be lucky and find yourself the only process on a node and get excellent throughput. But it’s very rare. At the other extreme, you may be unlucky and find yourself continually competing for resources due to some ‘noisy neighbor’ that is hogging the CPU/IO. The problem is that you never know who your neighbors are.”
Spec-based comparisons are difficult as even machines that ostensibly have the same specifications may be using different processors, disk subsystems, etc. These differences are further obscured by how some cloud vendors refer to the options they provide. It can be like decoding the difference between venti, grande and tall if you’re not a Starbucks regular. The more specifics you can get from your provider, the better.
Ultimately, the best way to ensure good results is to understand and measure the factors that impact performance. Not all virtualized resources are created equal. While access to elastic resources is good for responding to peak requirements, simply scaling hardware does not solve underlying performance problems or the impact of a poorly designed application. It doesn’t tell you if you have too few load balancers, missing database indexes or improper settings in your firewall.
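Measuring, rather than guessing, means collecting response times under concurrent load and looking at the distribution, not just the average. Below is a minimal sketch of that idea; `fake_request` is a stand-in for a real HTTP call to the system under test, and the whole thing is an illustration of the measurement principle, not a production load-testing tool.

```python
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

def fake_request():
    # Stand-in for an HTTP request to the application under test.
    time.sleep(0.01)  # simulate a ~10 ms response
    return "ok"

def measure(request_fn, users, requests_per_user):
    """Drive `users` concurrent workers, each issuing `requests_per_user`
    requests, and summarize the observed latencies."""
    def one_user(_):
        times = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            request_fn()
            times.append(time.perf_counter() - start)
        return times

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    latencies = sorted(t for user in per_user for t in user)
    return {
        "mean": statistics.mean(latencies),
        "p95": latencies[int(0.95 * (len(latencies) - 1))],
    }
```

Reporting a high percentile alongside the mean is what exposes the "noisy neighbor" effect quoted earlier: a healthy average can hide a long tail of slow responses.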
Just as with infrastructure you own, test and measurement is the only way to confidently optimize your application’s performance in the cloud: an area where SOASTA can help.
Criteria for Selecting a Cloud Provider
Range of Offerings
Quality of Service
SOASTA’s Test Cloud is designed to leverage all of the major infrastructure providers. We have access to multiple locations around the world and can provision servers across geographies in minutes. We bring up hundreds of servers simultaneously and tear them down hours later. And, of course, we endeavor to do so at the lowest possible cost for our customers. While perhaps a bit unusual in its scope, we hope that sharing our 10,000+ hours in the cloud will help you when considering your unique situation.