Recognizing and Eliminating Errors in Multithreaded Java

Errors in multithreaded programs may not be easy to reproduce. The program may deadlock or encounter other thread-related errors under only very specific circumstances, or may behave differently when running different VMs.

If you use multithreading in your client- or server-side Java, you should seriously consider a detection solution for the most common problems with threaded programming, including:

  • Deadlocks
  • Potential Deadlocks
  • Data Races

    Deadlocks
    A deadlock is a situation in which threads are blocked because one or more of them are waiting for access to a resource that will never be freed. The application can never terminate normally because the threads are blocked indefinitely.

    This behavior results from improper use of the synchronized keyword to manage thread interaction with specific objects. The synchronized keyword ensures that only one thread at a time can execute code guarded by a given object's lock. A thread must therefore acquire that object's lock before it can proceed. While the thread holds the lock, any other thread that wants to synchronize on the same object is blocked until the first thread releases the lock.

    As a result, careless use of the synchronized keyword can easily leave two threads waiting for each other to do something.

    A classic example of a deadlock situation is shown in Listing 1. Now consider this situation:

    • One thread (Thread A) calls method1()
    • It then synchronizes on lock_1, but may be preempted at that point.
    • The preemption allows another thread (Thread B) to execute.
    • Thread B calls method2().
    • It then acquires lock_2, and moves on to acquire lock_1, but can't because Thread A has lock_1.
    • Thread B is now blocked, waiting for lock_1 to become available.
    • Thread A can now resume, and tries to acquire lock_2. It can't because Thread B has acquired it already.
    • Thread A and Thread B are blocked. The program deadlocks.
    Of course, most deadlocks won't be quite so obvious simply from reading the source code, especially if you have a large multithreaded program. A good thread analysis tool, like Sitraka's JProbe Threadalyzer, finds deadlocks and points out their location in the source code so that you can fix them.
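The interleaving in the bullet list above can be reproduced with a sketch along these lines. It is an illustration only, not the article's Listing 1: the class name `DeadlockSketch`, the latch that stands in for the unlucky preemption, and the `deadlockObserved()` helper are all assumptions. The JDK's `ThreadMXBean.findMonitorDeadlockedThreads()` is used only to confirm that the two threads are stuck; it is not a statement about how any commercial tool works.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockSketch {
    private final Object lock_1 = new Object();
    private final Object lock_2 = new Object();

    // Illustration aid (assumption): each thread pauses after taking its first
    // lock until the other thread has done the same, imitating the preemption.
    private final CountDownLatch firstLocksHeld = new CountDownLatch(2);

    void method1() {
        synchronized (lock_1) {
            pauseLikeAPreemption();
            synchronized (lock_2) {
                // Never reached: Thread B holds lock_2.
            }
        }
    }

    void method2() {
        synchronized (lock_2) {
            pauseLikeAPreemption();
            synchronized (lock_1) {
                // Never reached: Thread A holds lock_1.
            }
        }
    }

    private void pauseLikeAPreemption() {
        firstLocksHeld.countDown();
        try {
            firstLocksHeld.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Starts Thread A and Thread B and reports whether the JVM sees a monitor deadlock. */
    static boolean deadlockObserved() {
        DeadlockSketch sketch = new DeadlockSketch();
        Thread a = new Thread(sketch::method1, "Thread A");
        Thread b = new Thread(sketch::method2, "Thread B");
        a.setDaemon(true); // daemon threads let the JVM exit although they stay blocked
        b.setDaemon(true);
        a.start();
        b.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] deadlocked = threads.findMonitorDeadlockedThreads();
            if (deadlocked != null && deadlocked.length == 2) {
                return true;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Deadlock observed: " + deadlockObserved());
    }
}
```

Without the latch, the same code would deadlock only on some runs, which is what makes this class of bug hard to reproduce.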

    Potential Deadlocks
    Potential deadlocks are caused by problematic coding styles that might not cause a deadlock in every test execution. For that reason, they are perhaps more dangerous than deadlocks, as they may remain hidden until after the application is deployed. We'll discuss two types of potential deadlocks: Lock Order and Hold While Waiting.

    Lock Order
    Lock order violations can occur when concurrent threads need to hold two locks at the same time. The potential for deadlock develops when one thread holds a lock needed by another. Consider the situation where Threads A and B both need to hold locks 1 and 2 at the same time.

    It is possible that events could unfold as follows:

    • Thread A acquires lock_1.
    • Thread A is preempted and the VM scheduler switches to Thread B.
    • Thread B acquires lock_2.
    • Thread B is preempted and the VM scheduler switches to Thread A.
    • Thread A attempts to acquire lock_2 but is blocked because lock_2 is held by Thread B.
    • The scheduler switches to Thread B.
    • Thread B attempts to acquire lock_1 but is blocked because lock_1 is held by Thread A.
    • Threads A and B are now deadlocked.
    It's important to note that this deadlock might not occur in some situations. The VM scheduler might allow one of the threads to acquire lock_1 and lock_2 in succession, without preempting the thread. In such a case, regular deadlock detection would not report it.

    A fully featured thread analysis tool would track the order in which locks are acquired, and warn of any problematic lock ordering. A lock order analysis feature should issue warnings whenever the observed lock ordering could deadlock under some thread schedule, while deadlock detection should report only actual deadlocks.
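One common way to remove a lock order violation is to make every thread acquire the locks in the same order. The sketch below assumes two hypothetical tasks, `taskA()` and `taskB()`, that both need lock_1 and lock_2; the class name and the shared counter are illustrative and do not come from the article's listings.

```java
class LockOrderSketch {
    private final Object lock_1 = new Object();
    private final Object lock_2 = new Object();
    private int sharedCounter = 0;

    void taskA() {
        synchronized (lock_1) {
            synchronized (lock_2) {
                sharedCounter++;
            }
        }
    }

    void taskB() {
        // Same order as taskA: lock_1 first, then lock_2. Taking lock_2 first
        // here would recreate the lock order violation described above.
        synchronized (lock_1) {
            synchronized (lock_2) {
                sharedCounter++;
            }
        }
    }

    /** Runs both tasks concurrently and returns the final counter value. */
    static int runBoth(int iterations) {
        LockOrderSketch sketch = new LockOrderSketch();
        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) sketch.taskA();
        }, "Thread A");
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) sketch.taskB();
        }, "Thread B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        synchronized (sketch.lock_1) {
            synchronized (sketch.lock_2) {
                return sketch.sharedCounter;
            }
        }
    }
}
```

Because neither thread can hold lock_2 without already holding lock_1, the circular wait from the scenario above cannot form, whatever the scheduler does.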

    Hold While Waiting
    Another type of potential deadlock occurs when a thread holds a lock while waiting for notification from another thread. Consider the example shown in Listing 2.

    This code is problematic in that Consumer can hold the lock on the queue, denying Producer the access it needs. This can occur even if Consumer is waiting for Producer to send notification that another item has been added to the queue. Since Producer can't add items to the queue, and Consumer is waiting on Producer for new items to process, the program is effectively deadlocked.

    Locks held while waiting are only potential deadlocks because events could transpire in such a way that the notifying thread does not need the lock held by the waiting thread. However, such programming practice is risky unless you are absolutely sure that the notifying thread will never need the lock. Locks held while waiting can also cause cascading stalls, where one thread idles while holding a lock needed by another thread, which in turn holds a lock needed by yet another thread, and so on.

    To correct the previous example, modify the Consumer class by moving wait() outside of synchronized(), as follows:

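    public class Consumer
    {
        // Synchronized on the Consumer itself, so wait() is legal here and
        // releases that monitor while the thread waits for notification.
        synchronized void consume() throws InterruptedException
        {
            while (!done) {
                wait();
                // Take the queue lock only after notification arrives, so
                // Producer is never blocked while Consumer is waiting.
                synchronized (queueLock_) {
                    removeItemFromQueueAndProcessIt();
                }
            }
        }
    }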
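A different idiom, shown here as an alternative rather than as the article's own fix, is to wait on the queue's own monitor inside a condition loop. Calling `wait()` on an object releases that object's monitor for the duration of the wait, so the producer is never locked out. The class name `WaitOnQueueSketch`, the `String` payload, and the `demo()` helper are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class WaitOnQueueSketch {
    private final Queue<String> queue = new ArrayDeque<>();

    void produce(String item) {
        synchronized (queue) {
            queue.add(item);
            queue.notifyAll(); // wake any consumer waiting on the queue's monitor
        }
    }

    String consume() throws InterruptedException {
        synchronized (queue) {
            // Re-check the condition after every wake-up; wait() releases the
            // queue's monitor while this thread is waiting.
            while (queue.isEmpty()) {
                queue.wait();
            }
            return queue.remove();
        }
    }

    /** One consumer thread takes three items that the calling thread produces. */
    static List<String> demo() {
        WaitOnQueueSketch sketch = new WaitOnQueueSketch();
        List<String> consumed = new ArrayList<>();
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 3; i++) {
                    consumed.add(sketch.consume());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "Consumer");
        consumer.start();
        for (String item : new String[] {"a", "b", "c"}) {
            sketch.produce(item);
        }
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed;
    }
}
```

Checking `queue.isEmpty()` in a loop also covers the case where the producer adds an item before the consumer starts waiting, since the consumer then never calls `wait()` at all.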

    Data Races
    A data race results from a lack of synchronization or the improper use of synchronization when accessing shared resources such as variables. Data races occur when the developer fails to specify which thread has access to a variable at a given time. In such a case, whichever thread wins the race gets access to the data, with unpredictable results.

    Because threads can be preempted at any time, you can't safely assume that a thread executing at start-up will have accessed the data it needs before other threads begin to run. As well, the order in which threads are executed may differ from one VM to the next, making it impossible to determine a standard succession of events.

    Sometimes, data races may be insignificant in the outcome of the program, but more often than not they can lead to unexpected results that are hard to debug. In short, data races are concurrency problems waiting to rear their ugly heads. A good thread analysis tool will identify any data race it encounters while executing your program, and flag it for you to fix.

    A Benign Data Race
    Not all race conditions are errors. Consider the example in Listing 3. Assuming that getHouse() returns the same house to both threads, you might conclude that a race condition is developing because the BrickLayer is reading from House.foundationReady_ and the FoundationPourer is writing to House.foundationReady_.

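    However, the Java VM specification dictates that boolean values are read and written atomically, meaning that a thread can never observe a partially written value. In this example the flag also changes only once, from false to true, and is never changed back. In that sense this is a benign data race. Note that atomicity alone does not guarantee that BrickLayer ever sees the new value: under the Java memory model, the field should be declared volatile (or accessed under synchronization) for the write to be reliably visible to other threads.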
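A minimal sketch of this flag pattern follows. The names `House`, `foundationReady_`, BrickLayer, and FoundationPourer come from the text; the accessor methods, the `BenignRaceSketch` class, and the `volatile` modifier are assumptions added here. The `volatile` keyword is included so that the write is guaranteed to become visible to the reading thread.

```java
class House {
    // volatile (an addition in this sketch) guarantees that the BrickLayer
    // thread eventually sees the value written by the FoundationPourer thread.
    private volatile boolean foundationReady_ = false;

    boolean isFoundationReady() {
        return foundationReady_;
    }

    void markFoundationReady() {
        foundationReady_ = true; // set once, never reset
    }
}

class BenignRaceSketch {
    /** Returns true once the BrickLayer thread has seen the flag and proceeded. */
    static boolean layBricksAfterFoundation() {
        House house = new House();
        boolean[] bricksLaid = {false};

        Thread brickLayer = new Thread(() -> {
            while (!house.isFoundationReady()) {
                Thread.yield(); // poll the flag without holding any lock
            }
            bricksLaid[0] = true;
        }, "BrickLayer");
        Thread foundationPourer = new Thread(house::markFoundationReady, "FoundationPourer");

        brickLayer.setDaemon(true);
        brickLayer.start();
        foundationPourer.start();
        try {
            foundationPourer.join(5000);
            brickLayer.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return bricksLaid[0];
    }
}
```

The pattern is only safe because the flag moves in one direction; a flag that could be reset to false would need stronger coordination.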

    A Malignant Data Race
    Now, consider the following scenario in Listing 4.

    What happens if a wife and husband simultaneously attempt to deposit money to a joint account, from two different banking machines? Let's call them Alice and Bob. At the beginning of our scenario, their joint account has $100.

    Alice deposits $25. Her banking machine starts to execute deposit(). It gets the current balance ($100), and stores that in a temporary local variable. It then adds $25 to that balance, and the temporary variable holds $125. Then, before it can call setBalance(), the thread scheduler interrupts her thread.

    Bob deposits $50. While Alice's thread is still in limbo, his thread starts to execute deposit(). The getBalance() returns $100 (remember, Alice's thread hasn't written the updated balance yet), and the thread adds $50 to obtain a value (in its temporary local variable) of $150. Then, before it can call setBalance(), Bob's thread is interrupted.

    Alice's thread now resumes, and writes its temporary local variable's contents ($125) to the balance. The banking machine informs Alice that the transaction is complete. Bob's thread resumes, and writes the contents of its temporary local variable ($150) to the balance. The banking machine informs Bob that the transaction is complete.

    Net effect? The account balance ends at $150 instead of the correct $175: the system has lost Alice's deposit.

    Your first instinct might be to protect the Account.balance_ field by making getBalance() and setBalance() synchronized methods. This will not solve the problem. The synchronized keyword will ensure that only one thread can execute getBalance() or setBalance() at a time, but that won't prevent one thread from modifying the balance of an account while the other is halfway through a deposit.
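The Alice and Bob interleaving can be replayed deterministically in a single thread, which makes it easy to see that synchronized accessors do not help. This is a sketch under assumptions: the class names are hypothetical, the balance is held as a whole-dollar `int`, and the two local variables play the role of each thread's temporary value.

```java
class AccountWithSyncAccessors {
    private int balance_;

    AccountWithSyncAccessors(int openingBalance) {
        balance_ = openingBalance;
    }

    synchronized int getBalance() {
        return balance_;
    }

    synchronized void setBalance(int newBalance) {
        balance_ = newBalance;
    }
}

class LostUpdateReplay {
    /** Replays the four steps of the scenario in order and returns the final balance. */
    static int replay() {
        AccountWithSyncAccessors account = new AccountWithSyncAccessors(100);

        int aliceTemp = account.getBalance() + 25; // Alice reads 100, computes 125
        int bobTemp = account.getBalance() + 50;   // Bob reads 100, computes 150

        account.setBalance(aliceTemp);             // balance becomes 125
        account.setBalance(bobTemp);               // balance becomes 150, overwriting Alice's update

        return account.getBalance();
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 150, not 175
    }
}
```

Each accessor call acquires and releases the lock on its own, so nothing stops another thread from running between a read and the matching write.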

    How to Fix the Race
    The key to successful use of the synchronized keyword is to realize that you need to protect entire transactions from interference by other threads, not just single points of data access.

    In our example, the developer must ensure that once a thread has obtained the current balance no other thread can alter that balance until the first thread has finished using that value. This can be accomplished by making deposit() and withdraw() synchronized methods.
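A sketch of that fix, assuming a hypothetical `AccountWithSyncDeposit` class with a whole-dollar `int` balance, might look like the following. The read, the addition, and the write all happen while the thread holds the account's lock, so the whole transaction is protected.

```java
class AccountWithSyncDeposit {
    private int balance_;

    AccountWithSyncDeposit(int openingBalance) {
        balance_ = openingBalance;
    }

    synchronized int getBalance() {
        return balance_;
    }

    // The lock is held across the read, the arithmetic, and the write.
    synchronized void deposit(int amount) {
        balance_ = balance_ + amount;
    }

    synchronized void withdraw(int amount) {
        balance_ = balance_ - amount;
    }

    /** Two threads make 1,000 deposits each; returns the final balance. */
    static int depositConcurrently() {
        AccountWithSyncDeposit account = new AccountWithSyncDeposit(100);
        Thread alice = new Thread(() -> {
            for (int i = 0; i < 1000; i++) account.deposit(25);
        }, "Alice");
        Thread bob = new Thread(() -> {
            for (int i = 0; i < 1000; i++) account.deposit(50);
        }, "Bob");
        alice.start();
        bob.start();
        try {
            alice.join();
            bob.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return account.getBalance();
    }
}
```

With both methods synchronized on the same account object, the final balance is 100 + 1,000 x 25 + 1,000 x 50 = 75,100 on every run.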

    The Synchronized Keyword
    Deadlocks, potential deadlocks, and data races are common multithreading errors made by developers of all levels of experience. The correct use of the synchronized keyword is essential to writing scalable, multithreaded Java code. A good thread analysis tool like Sitraka's JProbe Threadalyzer makes error detection much less laborious and is particularly valuable for finding problems that might not arise in every test execution.

    This article is meant to be an introduction to the most common Java multithreading development errors. For more information on concurrent programming, refer to the References. Christian Jaekl was particularly helpful and I am grateful for his support and advice.

    References

    1. Jaekl, C. (1996). "Event-Predicate Detection in the Debugging of Distributed Applications," University of Waterloo. www.sitraka.com/jaekl96eventpredicate.pdf
    2. Lea, D. (1999). Concurrent Programming in Java: Design Principles and Patterns, 2nd Edition, The Java Series.
    3. Oaks, S., and Wong, H. (1999). Java Threads, 2nd Edition, O'Reilly.
    4. Hartley, S. (1998). Concurrent Programming: The Java Programming Language, Oxford University Press.
      About Mark Dykstra
      Mark Dykstra is Web Content Manager at Sitraka and has been working as a Web developer and technical writer for the past five years.
