|
Comments
Did you read today's front page stories & breaking news?
SYS-CON.TV
|
Product Reviews JFluid: A New Way to Profile Java Applications
JFluid: A New Way to Profile Java Applications
By: Gregg Sporar
Jun. 3, 2004 12:00 AM
Anyone who develops production applications eventually spends some time profiling. JFluid is an experimental new technology for profiling Java code. It was developed at Sun Microsystems Laboratories and can be a handy tool in your profiling toolbox. Your application should run fast and not overconsume valuable resources such as memory. For production applications it's important to "scale well" by running quickly and within a reasonable memory footprint when the workload increases. Profiling tools help identify bottlenecks in your code - sections that contribute the most to execution time and resource consumption. Profiling can reveal unanticipated facts about application performance. In the world of Java application development, all too often performance assumptions tend to be wrong. Until now there have been two ways to profile Java code. You can modify the code by hand or use a profiling tool that leverages the Java Virtual Machine Profiler Interface (JVMPI). The first approach (e.g., inserting calls to System.currentTimeMillis()) is tedious and error prone. The second approach requires learning one of the many JVMPI profiling tools. Some are open source; others are commercial products. JFluid provides a third alternative. Features The JFluid tool is a Swing application that is currently a bit rough around the edges - hopefully it will improve over time. The application to be profiled can be started by the tool. Alternatively, you can attach the tool to a JVM that is already running. Attaching to a running JVM is useful when you want to profile applications that run in a container such as a Web or application server. An important JFluid feature is that no special command-line settings are required when starting the JVM to which you eventually attach. Until the JFluid tool is attached, the application runs at full speed with no profiling overhead. Think about the implication of that. It means that applications can be profiled in their normal deployed environment without advance preparation and without incurring any overhead when profiling is not being done. In other words, JFluid allows profiling to be turned on and off at will. Minimizing overhead is a cornerstone of JFluid's design. Its CPU profiling allows you to profile a subset of the application's methods. By not profiling the rest of the methods you can dramatically reduce profiling overhead. To profile a group of methods, first select one or more methods manually. JFluid treats each selected method as the root of a "call graph". Starting at the root method JFluid determines which methods are called by it, repeating the process recursively until the entire call subgraph is identified. Only the methods that belong to the call subgraph are instrumented and profiled; the rest of your code runs unchanged at full speed. Method selections can be changed arbitrarily while the program is running. This allows you to perform a "drill down" with decreasing overhead, or to successively profile different code areas for which there would be too much overhead if profiled together. Tools that allow only a selection of methods for profiling by name or package are unable to provide this functionality in such an easy-to-use form. JFluid also tracks memory allocations with an eye on minimizing overhead. By default JFluid does not record complete information on every object allocation; instead it does statistical sampling by keeping a counter for each class. It decrements the class's counter when an object is allocated. When the allocation counter reaches zero, JFluid does detailed profiling and resets the counter. Detailed profiling of an allocation consists of capturing a stack trace and monitoring whether the object is still on the heap, in other words the object's liveness. By default the counter is 10 so 10% of object allocations are tracked. In production applications, particularly server applications, the number of objects allocated is so high that the information that is discarded is usually not significantly different from what is retained. The allocation counter can be decreased to improve precision or increased to reduce overhead. For tracked allocations JFluid records the age of each object, where "age" means number of garbage collections the object has survived. It also reports the number of different ages for all tracked allocations of a class; this is called "surviving generations." The surviving generations value is useful for detecting memory leaks caused by objects that are constantly generated, but only partially garbage collected. In other words, the group of objects grows steadily. "Steadily" is the key word: it is not a group of objects that has been generated once within a short period of time in the past and remains fixed since then. Nor is it a group of objects that may grow for quite some time, but then get collected at once. In both of those cases, the number of different ages of objects within a group would be small, no matter how young or old the objects themselves are. It is typically in a leaking object group that the number of different ages grows steadily. JFluid can also tell you where in your code an object was allocated. However, it does not help answer the question of why an object has not been garbage collected. In other words, no display is provided to show which objects are pointing to the suspected leaking object. Sometimes just knowing where an object was allocated is enough of a clue to help you figure out why it has not been garbage collected. When that is not enough, a heap graph would be helpful. We have yet to see what JFluid is going to offer in this area. In addition to the instrumentation of CPU usage and heap allocations, JFluid provides four monitors. These graphs show thread count, current heap size and usage, time spent doing garbage collection, and the count of the number of different ages for all heap objects. The count of the number of different ages is the surviving generations value for the entire heap. Installation To replace the JVM, download the binary files (libjvm.so or jvm.dll, depending on your platform) and put them in place of the corresponding files in your standard JVM. Alternatively, you can download a complete JDK that has already been modified to support JFluid. To install the JFluid tool, unzip the file into any directory. Edit the shell script or batch file that's used to start the JFluid tool so that your JVM and JFluid directories are specified, and you are ready to go. Sample Program I ran JFluid on both a SunBlade 2000 running Solaris 9 and a Pentium3-based PC running Windows 2000. All screenshots in this article are from the PC, which is where I ran the example profiling session. Example Profiling Session To begin a profiling session I chose Run>Attach from the menu. JFluid prompted me for the name of the working directory for the JVM process - with my Tomcat installation this is the directory in which I started Tomcat. After specifying the directory I clicked on the console window that displayed when I started Tomcat; that allowed me to send the JVM a signal by pressing Ctrl-Break on my keyboard. I then returned to JFluid and verified that it connected to the JVM. This process is awkward but painless. The steps for connecting to the JVM running on a Unix system are a bit different; I have some tips on that below. After JFluid attached to the JVM I clicked the Monitors tab to see the graph of heap usage. This graph is always available, even when no detailed instrumentation of memory allocations is being done. My next task was to look into the slow performance of the servlet. Using the Select and add root method> From binary menu entry, I specified my PrimeServlet.doPost() method. With that done, I chose the Instrument> Selected root methods transitively from the menu and I was ready to go. In a browser window I brought up my HTML form, typed 123456 as the maximum value, and then clicked Submit. After a pause my browser displayed results. Meanwhile, JFluid was profiling my code. Clicking on the Profile> Get latest results menu took me to the CPU results tab, which displayed the window in Figure 2. The top line showed that my doPost() method took 409 milliseconds (ms). Of that time, however, only 2.16 ms were actually spent in the code of doPost() itself; the rest of the time was spent in methods called by doPost(). In particular, the java.lang. String.split() method took up a huge amount of time - 243 ms. That's almost twice the amount of time used by PrimeNumbers.getEratosthenes() to generate the prime numbers. Drilling a little deeper, I saw that most of the time in PrimeNumbers getEratosthenes() was actually spent in a method I wrote to convert the answer to a string. Wasteful string processing was hurting performance. In addition to wasting CPU cycles, the code was also wasting heap space. Using the Monitors tab to watch the graph of heap usage I noticed that the heap continued to grow as I used the servlet to request additional prime numbers. Some of that growth was expected, but the servlet was caching answers so requests with a previously requested value should not have caused much additional memory usage. But they did. This is due to a bug I put in on purpose: the cache was broken because the key used to add to the cache was not the same key used to request entries from the cache. So if 123456 is requested four times, there will be four instances of PrimeNumber in the cache, even though only one is needed. The extra instances of PrimeNumber cannot be garbage collected from the heap because the cache holds references to them. To identify this memory leak I selected Object liveness from the Instrument menu. This turned off JFluid's CPU profiling and turned on detailed profiling of object allocations. Since this was a small application, I set the allocation counter to 1 to get complete profiling. To reduce the overhead incurred, I set the stack trace depth that it records to 3. Then I went back to my browser and did four more requests of prime numbers. When I clicked on Profile > Get latest results, the Memory results tab displayed with the list of objects allocated since my selection of Object liveness. The line of interest was this one: 4 live obj. - 4 alloc obj. - 3.8 avg. age - 2 surv. gen. - 4 total alloc obj. - Each time I requested a prime number I used the same value: 123456. Only one live instance of the PrimeNumbers class should exist. There are three extra objects because of the cache bug. After "live objects," the other figures reported are:
Using JFluid on Unix Systems Sun Microsystems Laboratories Conclusion References Reader Feedback: Page 1 of 1
Latest Cloud Developer Stories
Subscribe to the World's Most Powerful Newsletters
Subscribe to Our Rss Feeds & Get Your SYS-CON News Live!
|
SYS-CON Featured Whitepapers
Most Read This Week
Breaking Cloud Computing News
|
|||||||||||||||||||||||||||||||||||||||||||||||||