Plug in Your Command Processor Now
By: Richard Ross
Dec. 1, 2002 12:00 AM
This article details the implementation of a tool called the Command Processor. This tool takes a Java object and creates a command-line interface to its public methods.
These public methods are essentially your Application Programming Interface (API). During the course of this article we'll get a good look at the java.lang.reflect package and a chance to kick the tires on the Regular Expression package included in the 1.4 JDK. I often find myself with fresh code and no convenient way to try it out. The GUI is not ready or there is no requirement for one. Even writing the argument processing for a main function is often far more work than it's worth. I want to be able to work with my code without modifying the API or writing a throwaway UI. In the long run, all the solutions I've tried were either too much work or required significant modifications to the class. Simply put, I want the Command Processor to create a command-line UI for any given Java class. Here are my requirements, in order of importance.
Cmd:> myvar = createUser "John", "\"fingers\" Doe", 'C', 34
The entire syntax is described in the javadoc comments for the CommandProcessor class. It's very Java-like, but notice that parentheses are not required (they're actually not allowed) around the argument list. This makes it easier to handle argument casting later on. If you've ever tried writing a Lexer/Parser, you'll probably agree that parsing these lines by hand would be fairly difficult. Embedded quotes, for example, can cause a ton of grief to the programmer. Similar tools (DJava, for example) use a full-blown parser generator like Antlr or JavaCC to parse Java strings. I didn't need that level of sophistication, so I used the tools available to me in the 1.4 JDK. I chose to write a simple pseudo lookahead parser and found that the StreamTokenizer class gave me a good head start.
One of the keys to this tool (and requirement number 4) is that it must not be a burden to use. I want to create an instance of the Command Processor, hand it an object, and start working. As much as possible, I want to avoid configuration files and coded dependencies. Introspection comes to the rescue here as it allows us to examine the declared and inherited members (which include fields, constructors, and methods) of a Java class. It also allows us to fulfill requirement number 3, complete decoupling. As you'll see in the code API.java, the Command Processor and the class it processes (the processee, if you will) are completely unrelated. (The source code and Listings 1-5 can be downloaded from www.sys-con.com/java/sourcec.cfm.)
Starting with a given command-line string, I first break it into tokens. Even for our simple language, this quickly becomes a difficult task to do by hand. You end up with a giant decision tree and an awful lot of if/else and switch statements. However, lexical analysis is a well-understood field and Java provides a class that you can use. The java.io.StreamTokenizer class takes a Reader as its input and provides functions for retrieving tokens and setting various parameters. Essentially, all the if/else and switch statements are there. Most of them are in the 230-line nextToken() function! It's not without its quirks, but since James Gosling is listed as the original author, I'll just assume I didn't fully understand it.
Running the code in StreamTokTest.java will give you a good idea of how useful the StreamTokenizer is right off the bat (the output is shown in Listing 1). It handles comments, quoted strings with nested quotes, and chars in single quotes without even setting a parameter. It's not a perfect Java Lexer out of the box, but it wasn't meant to be. The tokens returned by StreamTokenizer have three properties: ttype, sval, and nval. ttype is an int that holds either one of the predefined type constants or the single nonwhitespace character that was read. sval and nval are string and double representations, respectively, of the token and are only valid if the token type was TT_WORD or TT_NUMBER. Notice that the valid Java double 2.01e3 was broken into two tokens, the number 2.01 and the word e3.
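The default behavior described above can be reproduced with a few lines. This is only a sketch, not the article's StreamTokTest.java; the class name DefaultTokenizerDemo and the labels it prints are made up for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: lists the tokens a default StreamTokenizer produces.
class DefaultTokenizerDemo {
    static List<String> describe(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER " + st.nval);
                        break;
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD " + st.sval);
                        break;
                    case '"':
                    case '\'':
                        // For quoted strings, ttype is the quote character itself.
                        out.add("QUOTED " + st.sval);
                        break;
                    default:
                        out.add("CHAR " + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // 2.01e3 comes back as a number followed by the word "e3".
        System.out.println(describe("2.01e3 \"a string\" 'C'"));
    }
}
```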
There are three other things worth noting: a word and a quoted string have distinct types, the single quote is treated the same as the double quote, and the numeric value is not erased between tokens. However, the only thing that really matters to me is that Java numbers don't all parse correctly.
How do I tell the Tokenizer that 2.01e3 is a number or 2L for that matter? I can't. The Tokenizer just keeps adding characters to the token until it finds a character that's not in the current token type. It doesn't know anything about context, so we'll put off numeric identification until we get to the parser. To do that, I'll need the Tokenizer to treat numbers the same way it treats letters. Strangely, the Tokenizer won't let me unset the numeric attributes it has set. I have to clear everything with a call to resetSyntax() and then add everything back into the tokenizer, except I add numbers as word characters. The Tokenizer now returns only TT_WORDs, but that's exactly what I want. Now that I have a sequence of tokens, I can do my parsing.
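A minimal sketch of that reconfiguration follows. The exact character set the Command Processor registers is not shown in this article, so the choices here (letters, digits, underscore, and the period as word characters) are assumptions, and WordOnlyTokenizer is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a StreamTokenizer that never produces TT_NUMBER.
class WordOnlyTokenizer {
    static List<String> tokenize(String line) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(line));
        st.resetSyntax();               // wipe the defaults, including number parsing
        st.wordChars('a', 'z');
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');         // digits are now ordinary word characters
        st.wordChars('_', '_');
        st.wordChars('.', '.');         // assumption: keeps 2.01e3 and object.method whole
        st.whitespaceChars(0, ' ');
        st.quoteChar('"');
        st.quoteChar('\'');

        List<String> tokens = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    tokens.add("WORD:" + st.sval);
                } else if (st.ttype == '"' || st.ttype == '\'') {
                    tokens.add("QUOTED:" + st.sval);
                } else {
                    tokens.add("CHAR:" + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("x = foo 2.01e3, 2L, \"hi\""));
    }
}
```

With this setup, 2.01e3 and 2L each arrive as a single word, and deciding whether a word is a number is left to the parser.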
variable = object.method argument, argument, ...
The variable, object, and arguments are all optional. When the StreamTokenizer gives me the first token, I don't know whether I've been given a variable, object, or method. Before I can decide, I'll need to know something about the next token. There isn't any way to look ahead with the StreamTokenizer, so I'll substitute a caching mechanism.
In my parser, the first call to nextToken() must return a word. I can't assign this word to the sMethodName field because I might have a variable assignment. I need to know if the word is followed by "=", a word, or nothing. I have to cache the current token's string value and then call nextToken() again, examining the value. If I find an equals sign, the cached value belonged to a variable. If not, the first word was either "object.method" or "method" and I'll have to put back the token I just took so it will be processed correctly.
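The cache-and-put-back step might look roughly like the sketch below. It uses StreamTokenizer.pushBack() to return the token; the class name LookaheadSketch, the parseHead method, and its two-element return value are illustrative assumptions rather than the Command Processor's actual parser.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch of one-token lookahead for "variable = object.method args".
class LookaheadSketch {
    /** Returns { variableNameOrNull, methodToken } for the start of a command line. */
    static String[] parseHead(String line) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(line));
        st.resetSyntax();
        st.wordChars('a', 'z');
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');
        st.wordChars('_', '_');
        st.wordChars('.', '.');
        st.whitespaceChars(0, ' ');
        try {
            if (st.nextToken() != StreamTokenizer.TT_WORD) {
                throw new IllegalArgumentException("command must start with a word");
            }
            String cached = st.sval;            // variable or method? not known yet
            if (st.nextToken() == '=') {
                // The cached word was a variable; the method name must follow.
                if (st.nextToken() != StreamTokenizer.TT_WORD) {
                    throw new IllegalArgumentException("expected a method after '='");
                }
                return new String[] { cached, st.sval };
            }
            st.pushBack();                      // leave the token for argument parsing
            return new String[] { null, cached };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] head = parseHead("myvar = createUser 34");
        System.out.println(head[0] + " <- " + head[1]);
    }
}
```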
Arguments and Types
For reasons that will be clear later, I need to turn an argument into a Class type and an object representation. When I ran the arguments through the Tokenizer, I got back a string of data, a string type, or both. In the case of the argument string "(float)12", I received "float" and "12". I'll pass these both to my Argument class and let the class handle all the conversions. If I pass a null type to the Argument class, it will try to match the data to a primitive type.
To do that, I match the data given to me from the Tokenizer against the patterns that define the different types of primitive literals in the Java language. For example, "true" and "false" are the only allowed Boolean literals in Java. If I'm asked to construct an argument with a null type and data = "true", I should be able to easily detect that this is a Boolean argument. To examine the data, I'll use regular expressions as provided in JDK 1.4.
There has been a fair amount of grumbling about the inclusion of the java.util.regex package in J2SE 1.4. Some people claim that since Java Regex packages have been widely available for some time, Sun is just adding unnecessary code bloat to the JDK. Personally, I wouldn't be as likely to use them if they weren't so readily available to me. No matter how you feel, regular expressions are extraordinarily useful things with which every programmer should have more than a passing acquaintance.
There are three basic functions that you use to apply regexes to your strings: find() is used to match a substring in a string, matches() is used to determine whether an entire string matches the regex, and split() splits a string wherever it finds a match (similar to StringTokenizer). Unless you're receiving your regex strings dynamically, you'll want to precompile your regexes and reuse them, as this will greatly speed up your code. Create a Pattern object by calling its static compile() method. Since it's static, you can declare Pattern fields this way.
public Pattern p = Pattern.compile("(true|false)");
My regex in this example is quite simple. It must find either "true" or "false". To apply it to a string, I must first create a Matcher. Then I call the matches() method, which only succeeds if it matches the entire string. So both "tru" and "truly" would fail.
Matcher m = p.matcher( stringData );
I've created regular expressions for most of the primitive literals. I don't need them for string literals or char literals because those are wrapped in "" or ''. If no type has been found for the data, it's checked to see if it's a valid identifier. If so, it will be presumed that the argument passed is a variable that has previously been created in the CommandProcessor. Its type will be discovered just prior to finding the desired method.
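As an illustration of the approach, the sketch below matches a data string against a handful of precompiled patterns. The patterns are deliberately simplified (no hex, octal, or float suffixes), and the names LiteralSketch, guessType, and isIdentifier are hypothetical; the Argument class's real expressions are in the downloadable source.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: guess a primitive type from the text of a literal.
class LiteralSketch {
    private static final Pattern BOOLEAN = Pattern.compile("true|false");
    private static final Pattern LONG = Pattern.compile("[+-]?\\d+[lL]");
    private static final Pattern INT = Pattern.compile("[+-]?\\d+");
    private static final Pattern DOUBLE =
            Pattern.compile("[+-]?\\d+\\.\\d+([eE][+-]?\\d+)?");
    private static final Pattern IDENTIFIER =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    /** Returns the primitive Class for the literal, or null if none matches. */
    static Class<?> guessType(String data) {
        // matches() succeeds only if the whole string fits the pattern.
        if (BOOLEAN.matcher(data).matches()) return Boolean.TYPE;
        if (LONG.matcher(data).matches()) return Long.TYPE;
        if (INT.matcher(data).matches()) return Integer.TYPE;
        if (DOUBLE.matcher(data).matches()) return Double.TYPE;
        return null;
    }

    /** True if the data could name a previously stored variable. */
    static boolean isIdentifier(String data) {
        return guessType(data) == null && IDENTIFIER.matcher(data).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "true", "tru", "34", "2L", "2.01e3", "myvar" }) {
            System.out.println(s + " -> " + guessType(s) + ", identifier=" + isIdentifier(s));
        }
    }
}
```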
To recap, we have now parsed a command line and retrieved a variable name if one was requested, a method name with an optional target object, and a list of arguments. The arguments have one or two pieces of string information. The type can be given by an explicit cast, can be implicit in a literal, or can be implicit from a stored variable. The data is just the string data typed in at the command prompt. The Argument class will have to create an object of the type specified by the argument and "set" it with the given data. It also creates a Class object of the type specified. Why and how Argument does this will be discussed later. First, a little discussion about reflection is in order.
Intro to Reflection
Unsurprisingly, methods, constructors, and fields are all represented by objects in the java.lang.reflect package and stored as arrays of these objects in the Class object. To get an array of methods for your object, get your object's Class object and call the appropriate get method:
Method[] mymethods = myobj.getClass().getMethods();
The same formula works for the fields and constructors. In fact, as I mentioned earlier, I've created a generic toString method utilizing this feature. It's static and I always use it. I simply override my standard toString( ) method with this line:
return Util.toString( this );
My generic toString method takes the given object and introspects all its fields. Listing 2 shows my toString method being called on the Command Processor. Notice that the RegexMethodFilter also gets introspected. This is because it uses the new toString method. The method will output the field name and the value, if it can get it. Since the field is likely protected or private, toString( ) shouldn't be able to get the value, but that's where the AccessibleObject comes in. Methods, Fields, and Constructors all inherit from AccessibleObject. Simply call setAccessible(true) on the object. This is basically there for things like serialization, but it's worth noting that your private variables are only private if you provide a security manager with your application.
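A rough sketch of such a reflective toString is shown below. It is not the article's Util.toString; the names ReflectiveToString, describe, and SamplePoint are hypothetical, and the output format is an assumption.

```java
import java.lang.reflect.Field;

// Hypothetical sketch of a generic, reflection-based toString helper.
class ReflectiveToString {
    static String describe(Object obj) {
        StringBuilder sb = new StringBuilder(obj.getClass().getName()).append('{');
        Field[] fields = obj.getClass().getDeclaredFields();
        for (int i = 0; i < fields.length; i++) {
            Field f = fields[i];
            if (i > 0) sb.append(", ");
            sb.append(f.getName()).append('=');
            try {
                f.setAccessible(true);      // lets us read private and protected fields
                sb.append(f.get(obj));
            } catch (RuntimeException | IllegalAccessException e) {
                // A security manager or module boundary may still refuse access.
                sb.append("<inaccessible>");
            }
        }
        return sb.append('}').toString();
    }
}

// Sample class with private state, used only to exercise the helper.
class SamplePoint {
    private int x = 3;
    private String label = "p";

    @Override
    public String toString() {
        return ReflectiveToString.describe(this);
    }

    public static void main(String[] args) {
        System.out.println(new SamplePoint());
    }
}
```

Only fields declared directly in the object's class are listed here; walking getSuperclass() would be needed to include inherited ones.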
The one thing I should make clear here is the difference between getDeclaredXXXs( ) and getXXXs( ). getDeclaredXXXs gets all the XXXs declared in the class itself, regardless of access modifiers (public, private, or protected), but nothing inherited. getXXXs gets only public items, including public items that are inherited. We'll go into more detail about making calls on these objects later.
To get a list of things to filter, I make a call to one or more of the four "getMethod" members of java.lang.Class. As mentioned before, I have a choice of getting declared or inherited public methods. In addition, I can attempt to find a single method instead of an array of methods. For the Command Processor, this is the most convenient. Any given command line will make clear the name of the method, any arguments, and possibly even the object on which to execute the desired method. The get methods of Class are just the thing for finding a specific method:
public Method getMethod(String name, Class[] parameterTypes)
As noted, this searches only public members of the class. Notice that the second argument is an array of class types. This method will return only methods with the exact signature specified by the name and the parameter types. Calls to getMethod are exactly why we needed the Argument class to provide us with a Class object for its type. getMethod must have the exact argument types or it will fail to find a method.
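The exactness of getMethod is easy to see in a small experiment. In this sketch the Sample class and the hasMethod helper are hypothetical; only Class.getMethod itself is the real API.

```java
// Hypothetical sketch: getMethod needs the exact declared parameter types.
class ExactLookupSketch {
    public static class Sample {
        public int twice(int n) { return 2 * n; }
    }

    static boolean hasMethod(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            type.getMethod(name, parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasMethod(Sample.class, "twice", Integer.TYPE));  // exact match
        System.out.println(hasMethod(Sample.class, "twice", Long.TYPE));     // no widening
        System.out.println(hasMethod(Sample.class, "twice", Integer.class)); // wrapper is a different type
    }
}
```

The compiler would happily resolve twice(5) or apply a widening conversion, but the reflective lookup does none of that, which is why the argument's Class object has to be right before the search begins.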
If a method is returned, I still have to check to see if its use is allowed. It's not a good idea to call methods like wait() and run() from the Command Processor, so they should probably be filtered out. The MethodFilter interface abstracts this functionality. The Command Processor instantiates its internal method filter, called RegexMethodFilter, and all objects will use this filter unless another one is provided. The RegexMethodFilter class adds one essential method to the implementation of MethodFilter, addExpression(). This method adds the given expression to an internal list of regular expressions, each of which will be tested against the given method. If a match is found, the method is rejected. This time, we use the find() method of the Matcher class because we want to match any substring in the method signature.
The Command Processor, for example, does not want to expose the main( ), run( ), or wait( ) methods, so the internal filter will need to exclude them. The patterns "main\(.*\)", "run\(.*\)", and "wait\(.*\)" will reject main(), run(), and wait(), but not maintain(), runtimeTarget(), or waitlist().
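A filter along these lines can be sketched in a few lines. The addExpression name comes from the description above; the accept methods, the class name FilterSketch, and the use of Method.toString() as the signature are assumptions. The patterns are written without embedded spaces because Method.toString() produces none inside the parentheses.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of a regex-based method filter.
class FilterSketch {
    private final List<Pattern> rejected = new ArrayList<>();

    /** Registers a pattern; any signature containing a match is rejected. */
    void addExpression(String regex) {
        rejected.add(Pattern.compile(regex));
    }

    /** Tests a signature string such as the one returned by Method.toString(). */
    boolean accept(String signature) {
        for (Pattern p : rejected) {
            if (p.matcher(signature).find()) {   // find(): a substring match is enough
                return false;
            }
        }
        return true;
    }

    boolean accept(Method m) {
        return accept(m.toString());
    }

    public static void main(String[] args) throws NoSuchMethodException {
        FilterSketch filter = new FilterSketch();
        filter.addExpression("main\\(.*\\)");
        filter.addExpression("run\\(.*\\)");
        filter.addExpression("wait\\(.*\\)");
        System.out.println(filter.accept(Object.class.getMethod("wait")));
        System.out.println(filter.accept(Object.class.getMethod("hashCode")));
    }
}
```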
Running the Command Processor
There is quite a bit of chicanery involved with handling parsed parameters and it's all because of primitive types. The reason, as I mentioned earlier, is that the "getMethod" members of Class all expect a Class[] to describe the arguments of the method you want. The "invoke" member of Method requires an Object[] with correct types and data. Making classes and objects for primitive types is a bit tricky.
Let's examine the process for parsing "(float)21". We see that we have a parameter with the type "float" whose data is "21". I quote them here to remind you that they're still strings. float is a primitive type and there's no facility to turn a primitive type name into a Class object. In a perfect world, you'd be able to dynamically create your primitive the same way you do any other class: Class myClass = Class.forName("java.lang.StringBuffer");. Once you have a class, it's trivial to create an object if it has a no-argument constructor: Object myObj = myClass.newInstance();. Unfortunately, this is not allowed for primitives. There are static class objects available for the primitive types and they must be used here. They are members of the classes that wrap primitives. For int, there's the java.lang.Integer class; use its TYPE field as shown in Listing 3.
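One way to organize that lookup is a static table keyed by type name, falling back to Class.forName for everything else. This is a sketch under that assumption, not a copy of Listing 3; TypeNames and its forName method are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: resolve a type name, primitive or not, to a Class object.
class TypeNames {
    private static final Map<String, Class<?>> PRIMITIVES = new HashMap<>();
    static {
        PRIMITIVES.put("boolean", Boolean.TYPE);
        PRIMITIVES.put("byte", Byte.TYPE);
        PRIMITIVES.put("char", Character.TYPE);
        PRIMITIVES.put("short", Short.TYPE);
        PRIMITIVES.put("int", Integer.TYPE);
        PRIMITIVES.put("long", Long.TYPE);
        PRIMITIVES.put("float", Float.TYPE);
        PRIMITIVES.put("double", Double.TYPE);
    }

    static Class<?> forName(String name) {
        Class<?> primitive = PRIMITIVES.get(name);
        if (primitive != null) {
            return primitive;               // Class.forName("float") would fail
        }
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("unknown type: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(forName("float"));
        System.out.println(forName("java.lang.StringBuffer"));
    }
}
```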
Now I'm faced with a nice long list of if/else statements. I actually used a static hashtable instead, which may be a bit faster and is much more flexible. At this point, I should have enough information in CommandLine to find the named method from a class. In the Command Processor "run" method, I try to get the method from the Command Processor, from the named object (if specified), or from the "target" object defined when the Command Processor was constructed.
If the target wasn't named (i.e., myobj.myMethod), the processor goes first. This way, the processee can't accidentally override the exit command and get you stuck in a loop (experience teaches me yet another hard lesson). Once I have a method, it's a simple matter of invoking the method and providing feedback to the user.
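The search order can be sketched as a simple loop over candidates. Everything named here (ResolveSketch, Processor, Target, resolve) is hypothetical and stands in for the Command Processor's own "run" logic, which is in the downloadable source.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: look for a method on the processor first, then the target.
class ResolveSketch {
    public static class Processor {
        public void exit() { }
    }

    public static class Target {
        public void exit() { }   // would shadow the processor's exit if searched first
        public void work() { }
    }

    static Method resolve(Object processor, Object target,
                          String name, Class<?>... parameterTypes) {
        for (Object candidate : new Object[] { processor, target }) {
            try {
                return candidate.getClass().getMethod(name, parameterTypes);
            } catch (NoSuchMethodException e) {
                // not on this candidate; try the next one
            }
        }
        return null;             // caller reports "no such method" to the user
    }

    public static void main(String[] args) {
        Method m = resolve(new Processor(), new Target(), "exit");
        System.out.println(m.getDeclaringClass().getSimpleName());
    }
}
```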
The second parameter to the invoke method is an array of objects, one for each parameter and each representing an argument. The catch is you can't magically create an object for a primitive type. I really don't even have a primitive type. I have a string representing a primitive type and a Class object. As we've already noticed, I can't create an object of a primitive type via the Class object. I have to determine the type and create an instance of the appropriate wrapper class. For the int type this is java.lang.Integer. Therefore, it looks like I'll have another battery of if/else statements.
Once again, rather than a long if/else block, I use a static hashtable. This time it's a bit more complicated, but I can take advantage of the fact that all primitive wrapper classes have string constructors (except char, which is handled as a special case). Notice that I dynamically search for the string constructor to invoke. The advantage of that strategy is that any object with a string constructor can be instantiated in the same step. For example, one of the API class's methods, test(java.lang.StringBuffer), works automatically with this setup.
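The string-constructor trick can be sketched as follows. The table contents and the names FromStringSketch and create are assumptions. Note also that the wrapper classes' String constructors are deprecated in recent JDKs, so on a modern runtime a valueOf-based lookup may be the safer choice.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: build an argument object from its type and string data.
class FromStringSketch {
    private static final Map<Class<?>, Class<?>> WRAPPERS = new HashMap<>();
    static {
        WRAPPERS.put(Boolean.TYPE, Boolean.class);
        WRAPPERS.put(Byte.TYPE, Byte.class);
        WRAPPERS.put(Character.TYPE, Character.class);
        WRAPPERS.put(Short.TYPE, Short.class);
        WRAPPERS.put(Integer.TYPE, Integer.class);
        WRAPPERS.put(Long.TYPE, Long.class);
        WRAPPERS.put(Float.TYPE, Float.class);
        WRAPPERS.put(Double.TYPE, Double.class);
    }

    static Object create(Class<?> type, String data) {
        // Primitives are swapped for their wrappers; other classes pass through.
        Class<?> target = WRAPPERS.getOrDefault(type, type);
        if (target == Character.class) {
            return Character.valueOf(data.charAt(0));   // Character has no String constructor
        }
        try {
            Constructor<?> ctor = target.getConstructor(String.class);
            return ctor.newInstance(data);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "cannot build " + target.getName() + " from \"" + data + "\"", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(create(Float.TYPE, "21"));
        System.out.println(create(StringBuffer.class, "any class with a String constructor"));
    }
}
```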
As you can see, the method I found was located with a class object of type int (from the static Integer.TYPE), not with a Class object of type Integer. They are different. They have to be able to discriminate between foo(int I) and foo(Integer I). However, when I invoke the method, I use objects of the wrapper types. The JVM will handle the conversion for me, but it's important not to confuse the methods. In essence, you get around strong typing here so be sure you're calling the right method. Again, the full source provides sample code that shows argument conversion at work.
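The distinction between int and Integer at lookup time, and the use of a wrapper object at invocation time, can be seen in a short experiment. The Api class and the call helper below are hypothetical and are not the article's API.java.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: the lookup type picks the overload; invoke takes a wrapper either way.
class InvokeSketch {
    public static class Api {
        public String foo(int i) { return "foo(int) got " + i; }
        public String foo(Integer i) { return "foo(Integer) got " + i; }
    }

    static Object call(Class<?> parameterType, Object argument) {
        try {
            Method m = Api.class.getMethod("foo", parameterType);
            // For the int overload, the Integer argument is unwrapped during invoke.
            return m.invoke(new Api(), argument);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Integer five = Integer.valueOf(5);
        System.out.println(call(Integer.TYPE, five));   // selects foo(int)
        System.out.println(call(Integer.class, five));  // selects foo(Integer)
    }
}
```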
One final note: the invoke method declares three exceptions. Of course, when you dynamically call a method, you can't specify any exceptions thrown by the method, because you don't know what method you'll be calling until runtime. InvocationTargetException wraps the exceptions thrown inside the method. If you want to know which exceptions the method threw, you have to call the new method in Throwable, "getCause". This is part of the enhanced "Chained Exception Facility" in JDK 1.4, which is a standardization of chaining in the Throwable class.
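Unwrapping the original exception might look like the sketch below. The Api class, its fail method, and the causeOfFailure helper are hypothetical.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical sketch: recover the exception thrown inside a reflectively invoked method.
class CauseSketch {
    public static class Api {
        public void fail() {
            throw new IllegalStateException("boom");
        }
    }

    static Throwable causeOfFailure() {
        try {
            Method m = Api.class.getMethod("fail");
            m.invoke(new Api());
            return null;                        // the method completed normally
        } catch (InvocationTargetException e) {
            return e.getCause();                // the exception the method itself threw
        } catch (NoSuchMethodException | IllegalAccessException e) {
            throw new IllegalStateException("lookup or access failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(causeOfFailure());
    }
}
```

A command prompt would typically print the cause rather than the wrapper, since the wrapper's own message says nothing about what went wrong inside the method.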
Kicking My Own Tires
Since I don't want the Argument class to know anything about the variables or even the Command Processor, I can't resolve this in the Argument class. The smart thing to do would be to turn the returned object into an Argument and store arguments in the variable map rather than objects. This is left as an exercise for the reader (I've always wanted to say that).
This code is easily adaptable to an internal console similar to those found in most PC games these days. Anyone who has played a PC game in the last five years has likely seen the drop-down command prompts that are becoming ubiquitous. The console, while not new by any stretch of the imagination, provides a very useful tool for developers and power users. Normally, a console would allow only the getting and setting of parameters and the reloading of configuration files. However, even viewing and setting properties in the running system can be extraordinarily useful.