The In-Memory DBMS Opportunity: IMDB and Performance Enhancement
Why are IMDBs important? Why is In-Memory DBMS so different?

The database has changed. Or at the very least it is changing, possibly quite rapidly. One of the major influences driving this change is the In-memory DBMS (IMDB).

Some argue that the IMDB emerged specifically to meet the needs of embedded systems and, as the name itself suggests, IMDBs manage their data entirely in memory rather than on disk. In-memory systems have blossomed in recent times, evolving from a period when they were used only for caching or in high-speed data systems to a point, now in 2010, where they may form a far more prevalent part of the mainstream IT landscape.

But should we consider an IMDB as nothing more than a traditional database that has been loaded into memory? The answer is no - and it's not just about boosting performance and scalability while keeping a tight rein on storage costs (although those are important too), so let's find out why.

IMDBs are a rather different proposition from traditional databases, as they are inherently less complex. As well as removing disk I/O considerations, IMDBs have fewer moving parts and dependent processes. This means that both RAM and CPU cycles are used more efficiently and overall performance is faster - faster, that is, than simply deploying a traditional DBMS in memory.

In-memory DBMS Technology Defined
Dictionary definitions of in-memory DBMS (IMDB) point out that these systems rely on main memory for data storage and that this contrasts with traditional database management systems as these employ a disk storage mechanism.

IMDBs are generally regarded as faster than disk-optimized databases since their internal optimization algorithms are simpler and execute fewer CPU instructions. Accessing data in memory provides faster and more predictable performance than disk. In applications where response time is critical, an IMDB is often perceived to be the best choice possible. An IMDB usually features a strict memory-based architecture and direct data manipulation.

All of an IMDB's data is stored and manipulated exactly in the form used by the application, removing the overheads associated with caching and translation. Read and write access times are in the region of just a few microseconds. IMDB technology can support real-time data management, application-tier deployment and the ACID properties, which are defined in the ACID section later in this article.

In-Memory Database Systems: 3 Key USPs
I said that things have changed and this is true. Whether you want to call it the balance of power, the economics of computing or quite simply the state of the technology nation, three resounding truisms (or undeniable facts if you prefer) are governing the emergence of IMDBs and they are as follows:

  • Processors have become so much faster over the last decade - while at the same time they have become cheaper too. This is Moore's Law taken to the extreme and we now have the possibility of multi-core processors driving multi-core computing configurations that we could not have envisioned at the turn of the millennium.
  • Along with cheaper processing power, we also enjoy cheaper memory (and 64-bit addressability at that!) so deploying a large in-memory component in any given system is a far more practical reality than ever before.
  • More processing power and memory is great news, but hardware plus hardware equals lots of hardware, and more powerful internal management software must be brought into play here if the sum total of this equation is to equal success.

Why IMDB Is Important
In-memory database technology may be quite a beautiful coming together of both hardware and software development that gives us power advantages, economic advantages, ecological advantages and operational advantages. Significant IT shifts generally only occur when a combination of technologies, usage patterns and market conditions all align at the same time. Ex-Intel CEO Andy Grove would call this a strategic inflexion point, or a game changing technology development if you prefer.

Put simply, today there is a specific need for high-performance DBMSs that can take the data centre to a new level of operational excellence. There will be a trade-off here between increased performance and the wider robustness (or if you prefer 'durability') of the system in question. With specific relevance to Sybase technology, ASE 15.5 does boast in-memory database capability, and there are considerations here if full ASE application compatibility is to be retained.

Why Are IMDBs Important?
"The new economics of computing, which derive from large memory models, 64-bit addressability, fast processors, and cheap memory, make it possible to design core database technology that is far faster and more scalable than was possible when the only option was to base data management on spinning disks," said Carl W. Olofson [1], who performs research and analysis for IDC's Information Management and Data Integration Software service within the Application Development and Deployment research group.

"In-memory databases are managed in main memory instead of on disk, so the disk is relegated to the role of a recovery, rather than data management, platform. This greatly reduces both the dependency on disk storage and the volume of disk storage required, especially when multiple copies of data are used in distributed architectures," added Olofson.

Key Challenges for Datacenters Today
Alongside considerations of cost, much of it taken up by physical storage space and cooling, many of the key challenges for datacenters today revolve around keeping their operations attuned to the ACID requirements, which are covered in depth in the next section. This is not easy.

Writing on IT Today, journalists Marty Ward and Sean Derrington said that "Data center managers are caught between a rock and a hard place. They are expected to do more than ever, including protecting rapidly expanding volumes of data and a growing number of mission-critical applications, managing highly complex and wildly heterogeneous environments, meeting more challenging service level agreements (SLAs), and implementing a variety of emerging 'green' business initiatives."

ACID - Atomicity, Consistency, Isolation and Durability
The ACID acronym, representing Atomicity, Consistency, Isolation and Durability, describes the four fundamental pillars a database must rest on if it is to be thought of as reliable. A database that fails to meet any one of these four goals cannot be considered reliable.

  • Atomicity - this term is used to specify the fact that database modifications must follow an "all or nothing" rule. In turn, this means that each transaction can be said to be "atomic." So if one part of the transaction fails, the entire transaction fails.
  • Consistency - The database must be in a consistent and constant state both before and after the database transaction. No surprise then that this also means that a database transaction must not break the database integrity constraints.
  • Isolation - this factor means that when multiple transactions occur at the same time, they must not interfere with or impact each other's execution. It also follows that the partial results of an incomplete transaction must not be usable by other transactions until the transaction is successfully committed.
  • Durability - Finally, let's come to durability. This factor means that all the database modifications of a transaction will be made permanent even if a system failure occurs after the transaction has been completed. Durability is ensured through the use of database backups and transaction logs that facilitate the restoration of committed transactions in spite of any subsequent software or hardware failures.
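As a rough illustration of the "all or nothing" rule, here is a minimal sketch that uses Python's standard sqlite3 module with its ":memory:" mode. SQLite is only a convenient stand-in here, not one of the IMDB products discussed in this article, and the accounts table, the names and the helper functions are assumptions made up for the example.

```python
import sqlite3


def open_bank():
    """Create an in-memory SQLite database holding two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    with conn:  # commit the initial rows
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)",
            [("alice", 100), ("bob", 50)],
        )
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every statement in it if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint when funds are short,
            # which must also undo the credit that was applied just above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Calling transfer(conn, "alice", "bob", 150) on a fresh database returns False and leaves both balances exactly as they were, even though the credit to bob had already been applied inside the failed transaction.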

Figure 1: Data flow in a typical DBMS. Red lines show data transfer. Grey lines show message path.
[Copyright McObject, 2010]

Figure 1 depicts a typical DBMS scenario that will be familiar to you all. This is of course in some contrast to an in-memory database system. An IMDB has comparatively low data transfer needs because it gives the application a direct reference to the data item in the database. Although this leaves a perceived 'gap' in the transport layer, the data itself is still fully protected because this reference (or pointer if you prefer) only exists as part of the database API - and this ensures its correct use.

What we see at this point is an eradication of the need for multiple data transfers, and the follow-on benefit is more streamlined processing. Removing multiple data copies reduces memory consumption, and the straightforwardness of this design allows for greater reliability.

Why In-Memory DBMS Is So Different
Depending on the workload demands made of it, an IMDB is typically as much as 10 times faster than a disk-based database and much of the reason for this is down to the way the internal storage is architected.

For an insight into why an IMDB system should perform so much faster, we turn again to IDC's Carl W. Olofson for an explanation of what happens in a traditional system so that we can get a proper grip on the inherent differences: "A disk-based database is designed from the ground up to optimize its data management based on an optimal disk layout and I/O minimization strategy. For this reason, data is assigned to preallocated spaces on disk that are mapped to files or, if the access method is direct, to designated segments on designated disk volumes. Often, to minimize disk head movement and contention, the data is spread across many volumes. When data is required for a database operation, the DBMS must identify the database page required, determine where that page is stored on disk, access the appropriate volume, retrieve the data, allocate the data to a buffer, and update a page buffer table."

Olofson goes on to point out that even when a required piece of data is in the buffer memory and its database key (the actual physical location of the data) is already known, the system must dereference the database key, determine that the page is in memory, locate the page and find the referenced location in the buffer where the requested data is located. In contrast to this, when an IMDB looks for data by its location, it simply uses a memory offset to load a pointer and gets the data directly.
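The two access paths Olofson describes can be contrasted in a small sketch. This is a toy model under stated assumptions rather than the internals of any real DBMS: the class names, the tiny PAGE_SIZE and the list standing in for the disk are all invented for illustration.

```python
PAGE_SIZE = 4  # records per page; deliberately tiny for illustration


class DiskStyleStore:
    """Resolves a key via page id -> buffer table -> slot within the page."""

    def __init__(self, records):
        # The "disk" is modelled as a list of fixed-size pages.
        self.disk = [
            records[i:i + PAGE_SIZE] for i in range(0, len(records), PAGE_SIZE)
        ]
        self.buffer_table = {}  # page id -> in-memory copy of that page
        self.page_loads = 0     # how many simulated disk reads were needed

    def get(self, key):
        page_id, slot = divmod(key, PAGE_SIZE)  # dereference the database key
        page = self.buffer_table.get(page_id)   # is the page already in memory?
        if page is None:
            page = list(self.disk[page_id])     # simulated read from disk
            self.buffer_table[page_id] = page   # update the page buffer table
            self.page_loads += 1
        return page[slot]                       # find the location in the buffer


class InMemoryStore:
    """Holds every record in memory and reaches it by offset alone."""

    def __init__(self, records):
        self.records = list(records)

    def get(self, key):
        return self.records[key]  # a single offset into memory, nothing else
```

Even when the page is already buffered, the disk-style path still performs the key split and the buffer table lookup on every access, which is the overhead the in-memory path avoids.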

"When data is updated on a disk-based system, the buffer is updated and marked as 'dirty'. A lock is set to prevent another user from reading the old data from disk until the transaction can be committed," said Olofson. "When the transaction is committed, a record is written to the log. In some systems, the buffers are flushed at this point. In others, the buffer page table is updated to indicate that requests for that data should be referred to the now committed data in buffer, which is eventually written to disk either when the buffer must be flushed to make way for another database page or when the I/O optimization algorithm finds an idle moment to perform the write. In an IMDB, most of these operations do not exist. A user has a temporary image of updated data until it is committed, at which point the temporary image replaces the permanent image," he added.
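The "temporary image replaces the permanent image" idea can be sketched in a few lines. This is a simplified, single-user illustration with hypothetical class names; it leaves out the locking and concurrency control that a real IMDB would need.

```python
class InMemoryTable:
    """Holds the committed ("permanent") image of the data."""

    def __init__(self):
        self._rows = {}

    def begin(self):
        """Start a transaction with its own private temporary image."""
        return Transaction(self)

    def read(self, key):
        return self._rows.get(key)


class Transaction:
    """Collects updates in a temporary image until commit or rollback."""

    def __init__(self, table):
        self._table = table
        self._pending = {}  # temporary image, invisible to other readers

    def write(self, key, value):
        self._pending[key] = value

    def read(self, key):
        # The transaction sees its own uncommitted changes first.
        if key in self._pending:
            return self._pending[key]
        return self._table.read(key)

    def commit(self):
        # The temporary image replaces the permanent one for the touched keys.
        self._table._rows.update(self._pending)
        self._pending = {}

    def rollback(self):
        # Discarding the temporary image leaves the permanent image untouched.
        self._pending = {}
```

Until commit is called, a read on the table itself still returns the old value, while a read through the transaction returns the pending one.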

Coming of Age: Is 2010 The Year of the IMDB?
Is this the year of the IMDB? As I have hinted already, the proposition under examination here is that this technology has been around for a while and is now about to make a deeper impact across the technology landscape.

"It was about 7 years ago when I was first introduced to the concept of in-memory databases. At the time it was a lesser-known database vendor called Times-Ten that offered an in-memory database with blazing performance metrics, hence times ten. It was the perfect answer to solid-state disk drives that could drain an IT budget in a hurry.

"Just recently Sybase announced its Sybase ASE server, in version 15.5, will have an in-memory engine equivalent that will provide the same functionality and manageability as the standard Sybase ASE server. This is a remarkable step, because it provides performance gains transparent to client applications and the database engine will not challenge DBAs to learn new skills. To me this is a win-win situation," says Peter Dobler, President of Dobler Consulting, a database consulting services and application support services firm serving clients in the southeast region of the United States.

Dobler says that the answer lies in the architecture of in-memory databases. They are designed to improve transaction-processing volume for classic OLTP applications. While data warehouses would not necessarily get any benefits from in-memory databases, IMDBs do provide extreme high-speed transaction processing without the need to confirm disk write success. Traditional databases have one thing they have to do to ensure data integrity. They all need to wait for the disk I/O to confirm a write to disk. Database vendors came up with very complex and sophisticated caching techniques to overcome this performance challenge. But they cannot ignore this fundamental requirement. 
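To get a feel for the cost of waiting on confirmed disk writes, the following sketch times a loop that forces every record to storage with os.fsync against a loop that simply appends to an in-memory list. It is a rough measurement harness, not a database benchmark: the function names and the record format are assumptions, and the timings will vary widely with hardware, file system and operating system.

```python
import os
import tempfile
import time


def write_with_fsync(records, path):
    """Write each record and wait for the OS to confirm it reached the device."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        for record in records:
            f.write(record)
            f.flush()             # push Python's buffer down to the OS
            os.fsync(f.fileno())  # block until the OS reports the write is stable
    return time.perf_counter() - start


def write_to_memory(records):
    """Keep each record in an in-memory list with no disk confirmation."""
    start = time.perf_counter()
    store = []
    for record in records:
        store.append(record)
    return store, time.perf_counter() - start


def compare(count=200):
    """Run both write paths over the same records and report what happened."""
    records = [b"order-%06d\n" % i for i in range(count)]
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "transactions.log")
        disk_seconds = write_with_fsync(records, path)
        bytes_on_disk = os.path.getsize(path)
    store, memory_seconds = write_to_memory(records)
    return {
        "records": len(store),
        "bytes_on_disk": bytes_on_disk,
        "disk_seconds": disk_seconds,
        "memory_seconds": memory_seconds,
    }
```

Printing compare() on a given machine shows the two elapsed times side by side; the gap between them is the wait that caching alone cannot remove from a write path that must be confirmed on disk.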

In-memory databases bypass this disk-writing requirement and that's what improves the speed.

"If you think about it, IMDBs are designed for high volume transaction systems like e-commerce shopping carts. In-memory databases are unbeatable when it comes to writing transaction data," says Dobler. "This is fundamentally different to data caching of traditional database engines. Data caching improves read performance, but does nothing to improve write performance. There is a downside to these databases as well; they offer alternatives to performance problems in poorly written applications. Like powerful hardware, in-memory databases have the potential to mask poor application development. We might see an explosion of in-memory database implementations due to this matter."

In-Memory DBMS and Scalability
Disk-based databases will always suffer from the inherent restrictions of their own storage backbones, so they have to rely on sharing techniques to spread data around different volumes when scalability demands are made upon the system as a whole. In this scenario, a disk-based database will typically demand that its data is shared across different volumes (on highly-priced tier 1 storage resources) so that a healthy I/O throughput is achieved.

An IMDB on the other hand will only need to use the disk in the case of a data recovery situation. "It writes the log (if there is one) to disk, and it dumps its memory contents to disk from time to time," says IDC's Olofson. "Because the disk is not involved in database transactions, the data that is dumped to disk can be packed together. As a result, there is no problem with filling a volume right up to the brim. Also, because the disk has no impact on performance, cheap lower-tier storage may be used. In scale-out situations, where clustered databases use partitioned and redundantly stored data, this savings effect is compounded," he added.
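The pattern Olofson describes, with a log on disk plus an occasional dump of memory contents used only for recovery, can be sketched as follows. This is a minimal illustration and not how Sybase ASE or any other product implements it: the RecoverableStore class, the file names and the JSON format are all assumptions, and the log is not forced to disk with fsync.

```python
import json
import os


class RecoverableStore:
    """Serves all reads from memory; touches disk only for log, dump and recovery."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "changes.log")
        self.data = {}
        self._recover()

    def put(self, key, value):
        # Append the change to the log, then apply it to the in-memory data.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps([key, value]) + "\n")
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)  # never reads from disk

    def checkpoint(self):
        """Dump the whole memory image in one packed write, then reset the log."""
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
        os.replace(tmp_path, self.snapshot_path)  # swap the new dump into place
        open(self.log_path, "w", encoding="utf-8").close()  # truncate the log

    def _recover(self):
        # Load the last dump, then replay any changes logged after it.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as log:
                for line in log:
                    key, value = json.loads(line)
                    self.data[key] = value
```

Creating a second RecoverableStore on the same directory simulates a restart: it rebuilds its in-memory state from the last dump and the log entries written since, which is the only moment the disk contents are read.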

But why is tier 1 storage expensive?

The 3 Tiers of Storage
According to Green Pages, a Maine-based consulting and integration company, a business can reduce total storage costs by assigning different data sets to different storage media. The three tiers break down as follows:

Tier 1 Storage - Suitable for business-critical 24x7 databases, file servers, email applications and data warehouses. This is a redundant, cache-based tiered storage model with fast response times, fast data transfer and high availability rates.

Tier 2 Storage - This model is best suited to "seldom-used, non-critical databases" such as historical data held for reference purposes only. It uses less expensive media in storage area networks (SAN), but does not feature 24x7 availability or extensive backup.

Tier 3 Storage - This is used for data that is rarely accessed. It may even feature inexpensive media such as DVD-ROMs and compact discs.

Sybase ASE and IMDB
The IMDB option in Sybase ASE 15.5 enables data virtualization and scaling critical to meeting the needs of high data volume and high concurrent user organizations, whether deployed in public cloud or private data center environments. Unlike other in-memory products, the ASE 15.5 IMDB is fully integrated within ASE, eliminating the need for application changes and providing the flexibility that allows in-memory databases to be configured to meet application requirements.

ASE 15.5 also increases efficiency in the data center with the integration of the ASE Backup Server with Tivoli® Storage Manager (TSM) from IBM. With this support, ASE databases can be backed up on any TSM supported media, providing faster backups and restores with fewer network and storage resources required. This new integration provides a cost effective solution for storage management.

"The database management features offered by Sybase ASE 15.5 coupled with the storage management features offered by IBM TSM provide a powerful solution to overcome the challenges of data protection faced in today's business environment," said Richard Vining, Product Marketing Manager, IBM Tivoli Storage. "We are pleased to provide Sybase ASE with the Ready for Tivoli software designation, which shows customers that the solution meets or exceeds IBM compatibility criteria and successfully integrates with one or more IBM Tivoli Software products."

"In data centers, the IT challenge is to increase efficiency and availability while lowering data center costs. At the same time, application deployments in grid and cloud computing environments are increasing the requirements of transaction processing systems to support large volumes of concurrent users with high transaction rates," said Brian Vink, vice-president, data management products, Sybase. "ASE 15.5 addresses these extreme requirements by delivering increased data throughput and greater concurrent activity while elevating productivity and uptime."

Conclusion
There is a sea change at stake here. A quantum- or even a paradigm-shift if you prefer the term; either way, this is a game changer for sure. Disk storage as we know it may eventually come to an end in the not too distant future. As the functionality required by today's computing environments (even the high performance ones) can increasingly be shouldered more efficiently by a resource that is housed within main memory, the allure of speed, efficiency and storage and operational cost savings is too good not to take notice of.

•   •   •

Resource
1. Renowned analyst Carl W. Olofson of IDC is quoted in the text. His comments were made in an IDC White Paper sponsored by Sybase entitled, Breaking the Disk Barrier with In-Memory DBMS Technology: Sybase Adds a Big Performance Boost for ASE 15.5, Doc # 221540, January 2010. The complete IDC paper can be downloaded from the following URL: http://bit.ly/b8C2dW

About Adrian Bridgwater
Adrian Bridgwater is a freelance journalist and corporate content creation specialist focusing on cross platform software application development as well as all related aspects of software engineering, project management and technology as a whole.
