|
Comments
Did you read today's front page stories & breaking news?
SYS-CON.TV
|
General Java Generalized Enumeration Mechanism for Java
Generalized Enumeration Mechanism for Java
By: Myung Ho Kim
May. 1, 1998 12:00 AM
Summary
Introduction Some programming languages, such as CLU[1], Sather[2] and Icon[3], provide a mechanism called iterator (also known as enumerator) to handle this situation. In these languages, an iterator is invoked like a procedure but, instead of terminating with a result, it has many results which it produces one at a time. The produced items can be used in other parts that specify actions to be performed for each item. Iterators are more important in object-oriented programming than they were in other paradigms because programmers routinely construct abstract data types for collections in this paradigm. The author of the data abstraction must provide an iterator because the representation of the objects of the type is not known to the user.
Java [4] provides an iterator-like interface called Enumeration (java.util.Enumeration). It is used by many classes in the standard Java libraries. In the case of the Vector class, the member that returns an enumeration is elements(). A typical loop using Enumeration to step through the elements of a Vector vec is as follows: The Vector class, its related iterator class and the client program are interrelated, as illustrated in Figure 1. The same technique can be applied to many other sorts of collections as well, if they provide members that return enumerations like the elements() member of the Vector class.
Problems with Java Enumeration It is argued that iterators can be developed for all the common data structures using only cursors [6]. However, it is by no means a simple task to develop cursors for non-linear data structures like trees or graphs [7]. An iterator class for enumerating elements of binary trees will make the points clear (see Listing 1). Since the current node does not provide sufficient information to determine what the next node in an inorder sequence is, a stack is introduced to supplement the missing information. This may be viewed as a non-recursive inorder traversal procedure tailored to meet the Enumeration interface. Even though it is possible, in principle, to remove all the recursions in any algorithm, the burden on the programmer is overwhelming when the algorithm is intertwined with complex control and/or data structures. What's worse is that the resulting program is far less readable than the original recursive one because control information for the enumeration procedure is dispersed at various places of the program. The iterator construct of CLU is quite different from the Java Enumeration. A procedure-like module called an iter represents iterators. Since an iter module can be implemented using arbitrary algorithms, there is no need to explicitly maintain the current state to determine the next element. As a result, it is relatively easy to convert a traversal procedure into an iter module.
Iterator as a Generalized
Enumeration Iterator developers should derive an iterator class from the Iterator class. The members hasMoreElements() and nextElement() are implemented already in the Iterator class and iterate() is the only member that should be implemented further. Iterate() corresponds to the iter module in CLU. It implements the enumeration procedure using arbitrary control structures including recursive calls and infinite loops. Any construct permissible within a plain member is also allowed. Additionally, it may contain many invocations of yieldElement() whose role is to enumerate an object and make it available to clients. Figure 2 illustrates the definition of the Iterator class. The effect of calling an iterator is as follows. The first time an iterator is called via hasMoreElements() and nextElement(), it commences to execute iterate(). It does this until it reaches an invocation of yieldElement(), when it suspends execution and makes the value of expression visible to the corresponding call. Upon subsequent calls, it resumes execution just after yieldElement(), suspending whenever it reaches another yieldElement() until it exits. An implementation of an iterator class for the inorder traversal of a binary tree using the Iterator mechanism is shown in Listing 2. Even if the main use of iterators is to implement enumerations of data types, they are also useful in their own right. This is indicated by the next example (see Listing 3). Sort is a class for iterators that enumerates elements of an array in ascending order using the quicksort algorithm [8]. Once again, the recursive nature of the original algorithm is well reflected by the enumeration procedure. Our final example is a typical producer/consumer problem known as the Grune filter [9]. A filter accepts characters from an input stream and puts them out, replacing all occurrences of aa by b. Another filter replaces all occurrences of bb by c. Input is fed into the first filter, the output of which is fed into the second filer. What is actually observed is the output of the second filter. The filtering process can be programmed as in Listing 4. It may seem quite simple at first, but programming an iterator class like Compact as a cursor is very difficult and error-prone. This is because it has many points of control that should be saved to determine what to enumerate next upon consecutive requests. In the Iterator model, there is no need to save control points explicitly. All that is needed is to invoke yieldElement() at various points. Iterators are first-class objects and can be used in place of ordinary enumerations. This property exhibits maximum flexibility in programming with iterators [3]. Useful iterator-related programming idioms (such as multiple iterators per loop or collection, parameterization of iterators to procedures or other iterators, binding iterators to variables and pipeline programming) can be applied. The Grune filter is a typical example of pipeline programming. A pipeline is established when an iterator, acting like the consumer, embeds another iterator as a producer. In the following example, iterator AAB embeds an iterator for the standard input stream. By the same token, BBC embeds AAB.
Compact AAB = new Compact('a', 'b', It would be fair to say that programming techniques of this kind could also be applied to plain Java enumerations, but they are more strongly supported when iterators of arbitrary control can be developed easily.
Implementation The buffer size chosen for the implementation of Java iterators is zero; thus, the enumeration procedure is activated directly by the clients upon every request for a new object. The buffer mechanism is implemented as a class named Buffer. The basic idea of the implementation is as follows. An iterator and its client are executed in two different threads communicating through a buffer. The request for a new object is ultimately interpreted as an invocation of the get() member, which suspends the thread for the client until new data arrives at the buffer. The enumeration procedure takes control at this point, generating a new object and delivering it to the buffer by making a request for yieldElement(). It then invokes put(), which wakes up the thread for the client. Since there is neither true parallelism nor preemption between these two threads, they behave as if they were co-routines [12]. Figure 3 depicts the overall structure of the proposed implementation. The thread for the iterator is responsible for starting the iterate() member that contains an enumeration procedure written by the iterator developer.
private class IteratorThread The constructor creates and starts a thread for the enumeration procedure. The thread is marked as a daemon thread to allow terminating the enumeration procedure when the client stops requests for further enumeration.
public Iterator() { HasMoreElements() should determine whether or not the enumeration procedure is still pending. This problem is harder to solve than it looks. It is insufficient to simply check the liveness of the thread using isAlive() because there can be a race-condition between the two threads. The solution adopted here is pre-fetching an item delivered by the enumeration procedure. Pre-fetching is achieved by the get() request to the shared buffer. private Buffer data = new Buffer(); The thread issuing that request is not woken up until some other thread (running the enumeration procedure) passes back the control. If the value delivered to the buffer is a meaningful item, it is a clue that the thread for the enumeration procedure is actually running and that the item stored in the buffer is one written by a yieldElement() call. The value null was chosen for the representation of a meaningless item. The iterator thread delivers this value when there is a request for a new item even after the enumeration procedure is finished.
private class IteratorThread Lookahead is a temporary store for the pre-fetched item, which is shared between hasMoreElements() and nextElement(). private Object lookahead = null; It should be a trivial task, by now, to determine whether the enumeration procedure is still pending. All that needs to be done is to check that the pre-fetched value is not null.
public final boolean hasMoreElements() {
NextElement() has only to return the value pre-fetched by the corresponding hasMoreElements() call.
YieldElement() records the value to be enumerated next to the buffer which, in effect, passes the control back to the client.
protected final void yieldElement(Object x) { The implementation has a shortcoming in that every invocation of nextElement() should be preceded by exactly one corresponding invocation of hasMoreElements(). There is also a small problem: the iterator thread tends to stay alive longer than it actually needs to when the iterator is used within a context that requires premature termination. This problem can be resolved if the thread could be terminated (stopped in the Java sense) explicitly by the programmer. A new member stop() is introduced to allow the required intervention.
abstract class Iterator implements Enumeration The implementation of the stop() member is accomplished by terminating the iterator thread if it is still alive.
public final void stop() { Iterator threads that are neither finished nor stopped are maintained until the end of the whole program and then discarded when the main thread exits. Marking the iterator thread as a daemon already ensured this. The complete source code for both the Iterator and the Buffer is shown in Listing 5.
Performance Table 1 provides timings (in milliseconds) for example programs that created up to 500,000 random elements and consumed the same number of elements entirely. The "?" mark represents an OutOfMemoryError exception. Cursors performed better than iterators in every test case. The overhead incurred during the synchronization of buffer operations was the dominant factor of the overall execution time when the enumeration procedures were very simple. When a large amount of computation was required for the enumeration of a single data item, however, iterators could run nearly as fast as cursors, which can be observed near the lower right corner of the table. Even if cursors are desirable for efficiency reasons, it would still be a useful programming strategy to use iterators during the initial development stages and then translate them into cursors later.
Resources
Conclusion Iterators developed using the proposed mechanism are first-class objects and can be used in whichever context an Enumeration makes sense. This property exhibits maximum flexibility in programming with iterators. The implementation is still a prototype but it faithfully realizes all the essential features of the proposed mechanism. Reader Feedback: Page 1 of 1
Latest Cloud Developer Stories
Subscribe to the World's Most Powerful Newsletters
Subscribe to Our Rss Feeds & Get Your SYS-CON News Live!
|
SYS-CON Featured Whitepapers
Most Read This Week
Breaking Cloud Computing News
|
|||||||||||||||||||||||||||||||||||||||||||||||||