Programming Neural Networks in Java
By: Jeff Heaton
May. 1, 2002 12:00 AM
Computers can perform many operations a lot faster than humans. However, there are many tasks in which the computer falls considerably short. One such task is the interpretation of graphic information. A preschool child can easily tell the difference between a cat and a dog, but this simple problem confounds today's computers.
In computer science, artificial intelligence attempts to give computers humanlike abilities. One of the primary means of doing so is the neural network, of which the human brain is the ultimate example. The human brain consists of a network of over a billion interconnected neurons. These are individual cells that can process small amounts of information and then activate other neurons to continue the process. Strictly speaking, the term neural network, as it's normally used, is a misnomer: computers only simulate a neural network, so artificial neural network would be the more accurate term. Most publications nevertheless use the shorter term neural network.
This article shows how to construct a neural network in Java, although neural networks can be constructed in almost any programming language. Most publications about neural networks use languages such as C, C++, Lisp, or Prolog, but Java is quite effective as a neural network programming language. This article presents a simple yet practical neural network that can recognize handwritten letters, and describes its implementation in a small sample program. (All sample programs and source code for this article can be downloaded from www.sys-con.com/java/sourcec.cfm.)
java -classpath OCR.jar MainEntry
When the letter-recognition program begins, there's no data loaded initially. A training file must be used that contains the shapes of the letters. An example training file (sample.dat) is preloaded with the 26 capital letters. To see the program work, click the "Load" button, which loads the sample.dat file. Now 26 letter patterns are in memory and the network must be trained. Click the "Begin Training" button; now the network is ready to recognize characters. Draw any capital letter you like and click "Recognize"; the program should now recognize the letter.
Training the Sample Program
Once you've entered all the letters you want, you must now "train" the neural network. Up to this point you've simply provided a training set of letters known as input patterns. With these input patterns, you're now ready to train the network, which could take a lot of time. However, since only one drawing sample per letter is allowed, this process will be completed in a matter of seconds. A small popup will be displayed when training is complete. When you save, only the character patterns are saved. If you load these same patterns later, you must retrain the network.
The rest of this article shows how the example program is constructed and how you can create similar programs. The file MainEntry.java contains the Swing application, which does little more than place the components at their correct locations.
The three areas this article focuses on are downsampling, training, and recognition. Downsampling, an algorithm used to reduce the resolution of the letters being drawn, is used for character recognition and training, so we'll address this topic first.
Downsampling the Image
When you draw an image, the program first draws a box around the boundary of your letter. This allows the program to eliminate all the white space around your letter. This process is done inside the "downsample" method of the Entry.java class. As you draw a character, this character is also drawn onto the "entryImage" instance variable of the Entry object. To crop this image and eventually downsample it, we must grab the bit pattern of the image. This is done using a PixelGrabber class:
int w = entryImage.getWidth(this);
After this code completes, the pixelMap variable, an array of int values, contains the bit pattern of the image. The next step is to crop the image and remove any white space. Cropping is implemented by dragging four imaginary lines from the top, left, bottom, and right sides of the image. These lines stop as soon as they cross an actual pixel, so they snap to the outer edges of the image. The hLineClear and vLineClear methods each accept a parameter that indicates the line to scan and return true if that line is clear. The program calls hLineClear and vLineClear repeatedly until the lines reach the outer edges of the image. The horizontal line method (hLineClear) is shown here.
protected boolean hLineClear(int y)
As you can see, the horizontal line method accepts a y coordinate that specifies the horizontal line to check. The program then loops through each x coordinate on that row, checking for any pixel values. The value of -1 indicates white, so it's ignored. The "findBounds" method uses "hLineClear" and "vLineClear" to calculate the four edges. The beginning of this method is shown here:
protected void findBounds(int w, int h)
You can see how the program calculates the top and bottom lines of the cropping rectangle. To calculate the top line, the program starts at 0 and continues to the bottom of the image. As soon as the first nonclear line is found, the program establishes this as the top of the clipping rectangle. The same process, only in reverse, is carried out to determine the bottom of the image. The processes to determine the left and right boundaries are carried out in the same way.
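The cropping step described above can be sketched as follows. This is a minimal illustration, not the code from Entry.java: the class name CropSketch, the static method signatures, and the row-major layout of pixelMap (index y * w + x) are assumptions made for the example. Only the names hLineClear, vLineClear, findBounds, and pixelMap, and the value -1 for white, come from the article.

```java
import java.util.Arrays;

// Hypothetical sketch of the cropping logic; not the article's Entry class.
class CropSketch {
    static final int WHITE = -1; // opaque white in ARGB (0xFFFFFFFF)

    // Returns true if row y contains only white pixels.
    static boolean hLineClear(int[] pixelMap, int w, int y) {
        for (int x = 0; x < w; x++) {
            if (pixelMap[y * w + x] != WHITE) return false;
        }
        return true;
    }

    // Returns true if column x contains only white pixels.
    static boolean vLineClear(int[] pixelMap, int w, int h, int x) {
        for (int y = 0; y < h; y++) {
            if (pixelMap[y * w + x] != WHITE) return false;
        }
        return true;
    }

    // Returns {left, top, right, bottom} (inclusive), or null for a blank image.
    static int[] findBounds(int[] pixelMap, int w, int h) {
        int top = 0;
        while (top < h && hLineClear(pixelMap, w, top)) top++;
        if (top == h) return null; // nothing was drawn

        int bottom = h - 1;
        while (hLineClear(pixelMap, w, bottom)) bottom--;

        int left = 0;
        while (vLineClear(pixelMap, w, h, left)) left++;

        int right = w - 1;
        while (vLineClear(pixelMap, w, h, right)) right--;

        return new int[] {left, top, right, bottom};
    }

    public static void main(String[] args) {
        int w = 6, h = 5;
        int[] pixelMap = new int[w * h];
        Arrays.fill(pixelMap, WHITE);
        int black = 0xFF000000;
        pixelMap[1 * w + 2] = black; // pixel at x=2, y=1
        pixelMap[3 * w + 4] = black; // pixel at x=4, y=3
        System.out.println(Arrays.toString(findBounds(pixelMap, w, h))); // [2, 1, 4, 3]
    }
}
```

Once the top scan has found a nonclear row, the other three scans are guaranteed to stop inside the image, which is why only the first loop needs a bounds check.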
Now that the image has been cropped, it must be downsampled. This involves taking the image from a larger resolution to a 5x7 resolution. To reduce an image to 5x7, think of an imaginary grid being drawn over the high-resolution image. This divides the image into rectangular sections, five across and seven down. If any pixel in a section is filled, the corresponding pixel in the 5x7 downsampled image is also filled. Most of the work done by this process is accomplished inside the "downSampleQuadrant" method shown here.
protected boolean downSampleQuadrant(int x,int y)
The "downSampleQuadrant" method accepts the section number that should be calculated. First the starting and ending x and y coordinates must be calculated. To calculate the first x coordinate for the specified section, first the "downSampleLeft" is used; this is the left side of the cropping rectangle. Then x is multiplied by "ratioX", the ratio of how many pixels make up each section. This allows us to determine where to place "startX". The starting y position, "startY", is calculated by similar means. Next the program loops through every x and y covered by the specified section. If even one pixel is determined to be filled, the method returns true, which indicates that this section should be considered filled.
The "downSampleQuadrant" method is called in succession for each section in the image. The results are stored in the "SampleData" class, a wrapper class that contains a 5x7 array of Boolean values. It's this structure that forms the input to both training and character recognition.
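As a rough sketch of the downsampling described in this section, the code below lays a 5x7 grid over a cropping rectangle and marks a grid cell as filled if any pixel inside it is nonwhite. The class DownsampleSketch, the method names sectionFilled and downSample, the parameter lists, and the use of a plain boolean[][] in place of SampleData are all assumptions made for illustration; the rounding of section boundaries is also a choice made here and may differ from the article's downSampleQuadrant.

```java
import java.util.Arrays;

// Hypothetical sketch of 5x7 downsampling; not the article's Entry class.
class DownsampleSketch {
    static final int WHITE = -1;
    static final int GRID_W = 5;
    static final int GRID_H = 7;

    // Returns true if any pixel in grid section (x, y) of the cropping
    // rectangle [left..right] x [top..bottom] is not white.
    static boolean sectionFilled(int[] pixelMap, int imageW, int left, int top,
                                 int right, int bottom, int x, int y) {
        double ratioX = (right - left + 1) / (double) GRID_W; // pixels per section, horizontally
        double ratioY = (bottom - top + 1) / (double) GRID_H; // pixels per section, vertically
        int startX = left + (int) (x * ratioX);
        int startY = top + (int) (y * ratioY);
        // Exclusive end coordinates; every section covers at least one pixel.
        int endX = Math.min(right + 1,
                Math.max(startX + 1, left + (int) Math.ceil((x + 1) * ratioX)));
        int endY = Math.min(bottom + 1,
                Math.max(startY + 1, top + (int) Math.ceil((y + 1) * ratioY)));
        for (int yy = startY; yy < endY; yy++) {
            for (int xx = startX; xx < endX; xx++) {
                if (pixelMap[yy * imageW + xx] != WHITE) return true;
            }
        }
        return false;
    }

    // Builds the 7-row by 5-column grid, indexed as grid[y][x].
    static boolean[][] downSample(int[] pixelMap, int imageW, int left, int top,
                                  int right, int bottom) {
        boolean[][] grid = new boolean[GRID_H][GRID_W];
        for (int y = 0; y < GRID_H; y++) {
            for (int x = 0; x < GRID_W; x++) {
                grid[y][x] = sectionFilled(pixelMap, imageW, left, top, right, bottom, x, y);
            }
        }
        return grid;
    }

    public static void main(String[] args) {
        int w = 10, h = 14;
        int[] pixelMap = new int[w * h];
        Arrays.fill(pixelMap, WHITE);
        pixelMap[5 * w + 3] = 0xFF000000; // one black pixel at x=3, y=5
        boolean[][] grid = downSample(pixelMap, w, 0, 0, w - 1, h - 1);
        for (boolean[] row : grid) {
            StringBuilder line = new StringBuilder();
            for (boolean filled : row) line.append(filled ? '#' : '.');
            System.out.println(line);
        }
    }
}
```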
Neural Network Recognition
Through the output neurons, the neural network communicates which letter it thinks the user drew. The number of output neurons always matches the number of unique letter samples that were provided. Since 26 letters were provided in the sample, there will be 26 output neurons. If this program were modified to support multiple samples per letter, there would still be 26 output neurons.
In addition to input and output neurons, there are also connections between the individual neurons. These connections are not all equal. Each is assigned a weight, which is ultimately the only factor that determines what the network will output for a given input pattern. To determine the total number of connections, multiply the number of input neurons by the number of output neurons. A neural network with 26 output neurons and 35 input neurons would have a total of 910 connection weights. The training process is dedicated to finding the correct values for these weights.
The recognition process begins when the user draws a character and then clicks the "Recognize" button. First the letter is downsampled to a 5x7 image. This image must be copied from its two-dimensional array to an array of doubles that will be fed to the input neurons.
entry.downSample();
This code does the conversion. Neurons require floating point input. As a result, the program feeds them the value of 5 for a white pixel and -5 for a black pixel. This array of 35 values is fed to the input neurons by passing the input array to the Kohonen network's "winner" method. This method returns which of the output neurons won, and the result is stored in the "best" integer.
int best = net.winner ( input , normfac , synth ) ;
Knowing the winning neuron is not too helpful because it doesn't show you which letter was recognized. To line up the neurons with their recognized letters, each letter image the network was trained from must be fed into the network and the winning neuron determined. For example, if you were to feed the training image for "J" into the neural network, and the winning neuron were neuron #4, you would know that it's the one that had learned to recognize J's pattern. This is done by calling the "mapNeurons" method, which returns an array of characters. The index of each array element corresponds to the neuron number that recognizes that character.
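The mapping idea can be illustrated with a short sketch. The signature of mapNeurons below is an assumption (the article does not show it), and the network is represented by a hypothetical winnerOf function so the example stays self-contained; the toy "network" in main is invented purely to produce distinct winners.

```java
import java.util.Arrays;
import java.util.function.ToIntFunction;

// Hypothetical sketch of lining up output neurons with letters.
class MapNeuronsSketch {
    // Feeds every training input through winnerOf (a stand-in for the
    // network's winner method) and records which letter each neuron won.
    static char[] mapNeurons(char[] letters, double[][] trainingInputs,
                             int outputCount, ToIntFunction<double[]> winnerOf) {
        char[] map = new char[outputCount];
        Arrays.fill(map, '?'); // neurons that never win stay unmapped
        for (int i = 0; i < letters.length; i++) {
            int best = winnerOf.applyAsInt(trainingInputs[i]);
            map[best] = letters[i];
        }
        return map;
    }

    // Index of the largest element.
    static int argMax(double[] values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        char[] letters = {'A', 'B', 'C'};
        double[][] inputs = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
        // Toy stand-in for a trained network: neuron numbers are deliberately
        // shuffled relative to the order of the letters.
        ToIntFunction<double[]> toyNetwork = in -> (argMax(in) + 1) % 3;

        char[] map = mapNeurons(letters, inputs, 3, toyNetwork);
        int best = toyNetwork.applyAsInt(new double[] {0.0, 0.9, 0.1});
        System.out.println(map[best]); // B
    }
}
```

The point of the shuffle in the toy network is that neuron numbers carry no meaning on their own; only the map built from the training images ties a neuron to a letter.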
Most of the actual work performed by the neural network is done in the winner method. The first thing the winner method does is normalize the inputs and calculate the output values of each output neuron. The output neuron with the largest output value is considered the winner. First the "biggest" variable is set to a very large negative number to indicate there's no winner yet.
biggest = -1.E30;
Each output neuron's output is calculated by taking the dot product of that neuron's weights with the input values. The dot product is calculated by multiplying each input neuron's input value by the weight between that input neuron and the output neuron, and summing the results. These weights were determined during training, which is discussed in the next section. The output is kept, and if it's the largest output so far, that neuron is set as the "winning" neuron.
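A minimal sketch of the winner calculation follows. It is not the KohonenNetwork implementation: the article's winner method takes normfac and synth arguments whose handling is not shown, so this version assumes a plain unit-length normalization instead, and the class name WinnerSketch and the weights[output][input] layout are likewise assumptions.

```java
// Hypothetical sketch of winner selection by dot product.
class WinnerSketch {
    // Scales the input to unit length (assumed normalization scheme).
    static double[] normalize(double[] input) {
        double sumOfSquares = 0.0;
        for (double v : input) sumOfSquares += v * v;
        double length = Math.max(Math.sqrt(sumOfSquares), 1.E-30); // avoid dividing by zero
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) out[i] = input[i] / length;
        return out;
    }

    // Returns the index of the output neuron with the largest output.
    // weights[i][j] is the weight between input neuron j and output neuron i.
    static int winner(double[] input, double[][] weights) {
        double[] in = normalize(input);
        double biggest = -1.E30; // no winner yet
        int win = 0;
        for (int i = 0; i < weights.length; i++) {
            double output = 0.0;
            for (int j = 0; j < in.length; j++) {
                output += in[j] * weights[i][j]; // dot product
            }
            if (output > biggest) {
                biggest = output;
                win = i;
            }
        }
        return win;
    }

    public static void main(String[] args) {
        double[][] weights = {
            {1.0, 0.0, 0.0, 0.0},
            {0.0, 1.0, 0.0, 0.0},
            {0.5, 0.5, 0.5, 0.5}
        };
        double[] input = {0.5, -0.5, -0.5, -0.5};
        System.out.println(winner(input, weights)); // 0
    }
}
```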
As you can see, getting the results from a neural network is a quick process. Actually determining the weights of the neurons is the complex portion of this process. Training the neural network is discussed in the following section.
How the Neural Network Learns
Once the initial random weight matrix is created, the training can begin. First the weight matrix is evaluated to determine what its current error level is. This error is determined by how well the training input (the letters that you created) maps to the output neurons. The error is calculated by the "evaluateErrors" method of the KohonenNetwork class. If the error level is low, say below 10%, the process is complete.
When the user clicks the "Begin Training" button, the training process begins with the following code:
int inputNeuron = MainEntry.DOWNSAMPLE_HEIGHT*
This calculates the number of input and output neurons. First, the number of input neurons is determined from the size of the downsampled image. Since the height is 7 and the width is 5, the number of input neurons will be 35. The number of output neurons matches the number of characters the program has been given.
This is the part of the program that could be modified if you want it to accept and train from more than one sample per letter. For example, if you wanted to accept four samples per letter, you'd have to make sure that the output neuron count remained 26, even though 104 input samples were provided to train with (4 for each of the 26 letters).
Now that the size of the neural network has been determined, the training set and neural network must be constructed. The training set is constructed to hold the correct number of "samples." This will be the 26 letters provided.
TrainingSet set = new TrainingSet(inputNeuron,outputNeuron);
Next, the downsampled input images are copied to the training set; this is repeated for all 26 input patterns.
for ( int t=0;t<letterListModel.
Finally the neural network is constructed and the training set is assigned, so the "learn" method can be called. This will adjust the weight matrix until the network is trained.
net = new KohonenNetwork(inputNeuron,output
The learn method will loop up to an unspecified number of iterations. Because this program only has one sample per output neuron, it's unlikely that it will take more than one iteration. When the number of training samples matches the output neuron count, training occurs very quickly.
n_retry = 0 ;
A method, "evaluateErrors", is called to evaluate how well the current weights are working. This is determined by looking at how well the training data spreads across the output neurons. If many output neurons are activated for the same training pattern, then the weight set is not a good one. An error rate is calculated, based on how well the training sets are spreading across the output neurons.
evaluateErrors ( rate , learnMethod, won ,
Once the error is determined, we must see if it is below the best error we've seen so far. If it is, this error is copied to the best error, and the neuron weights are also preserved.
totalError = bigerr ;
The total number of winning neurons is then calculated, allowing us to determine if no output neurons were activated. In addition, if the error is below the accepted quit error (10%), the training stops.
winners = 0 ;
If there is not an acceptable number of winners, one neuron is forced to win.
if ( (winners <outputNeuronCount) &&
Now that the first weight matrix has been evaluated, it's adjusted based on its error. The adjustment is slight, based on the correction that was calculated when the error was determined. This two-step process of evaluating the error and adjusting the weight matrix is continued until the error falls below 10%.
adjustWeights ( rate , learnMethod , won , bigcorr, correc ) ;
This is the process by which a neural network is trained. The method for adjusting the weights and calculating the error is shown in the KohonenNetwork.java file.
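The evaluate-then-adjust loop can be sketched in a simplified form. This is a plain winner-take-all update written for illustration, not the algorithm in KohonenNetwork.java: the class TrainingSketch, the method signatures, the error measure (length of the difference between a sample and its winner's weights), the averaging of corrections, and the simplified forceWin are all assumptions. Samples and weight rows are assumed to be unit-length vectors.

```java
// Hypothetical, simplified winner-take-all training loop.
class TrainingSketch {
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
        return sum;
    }

    // Scales v to unit length in place.
    static void normalize(double[] v) {
        double length = Math.max(Math.sqrt(dot(v, v)), 1.E-30);
        for (int i = 0; i < v.length; i++) v[i] /= length;
    }

    static int winner(double[] input, double[][] weights) {
        int win = 0;
        double biggest = -1.E30;
        for (int i = 0; i < weights.length; i++) {
            double output = dot(input, weights[i]);
            if (output > biggest) { biggest = output; win = i; }
        }
        return win;
    }

    // Gives the worst-matched sample to the idle neuron that responds to it most.
    static void forceWin(double[][] samples, double[][] weights, int[] won) {
        int worst = 0;
        double lowest = Double.MAX_VALUE;
        for (int t = 0; t < samples.length; t++) {
            double out = dot(samples[t], weights[winner(samples[t], weights)]);
            if (out < lowest) { lowest = out; worst = t; }
        }
        int idle = -1;
        double highest = -Double.MAX_VALUE;
        for (int i = 0; i < weights.length; i++) {
            if (won[i] != 0) continue;
            double out = dot(samples[worst], weights[i]);
            if (out > highest) { highest = out; idle = i; }
        }
        weights[idle] = samples[worst].clone();
    }

    // Repeats evaluate/adjust until the error drops below quitError or
    // maxIterations is reached; returns the error from the last evaluation.
    static double learn(double[][] samples, double[][] weights, double rate,
                        double quitError, int maxIterations) {
        int neurons = weights.length, inputs = weights[0].length;
        double bigerr = Double.MAX_VALUE;
        for (int iter = 0; iter < maxIterations; iter++) {
            int[] won = new int[neurons];
            double[][] correc = new double[neurons][inputs];
            bigerr = 0.0;
            // Step 1: evaluate the error and accumulate corrections.
            for (double[] s : samples) {
                int best = winner(s, weights);
                won[best]++;
                double lengthSquared = 0.0;
                for (int j = 0; j < inputs; j++) {
                    double diff = s[j] - weights[best][j];
                    correc[best][j] += diff;
                    lengthSquared += diff * diff;
                }
                bigerr = Math.max(bigerr, Math.sqrt(lengthSquared));
            }
            if (bigerr < quitError) break;
            int winners = 0;
            for (int count : won) if (count > 0) winners++;
            if (winners < neurons && winners < samples.length) {
                forceWin(samples, weights, won); // too few winners
                continue;
            }
            // Step 2: nudge each winning neuron toward the samples it won.
            for (int i = 0; i < neurons; i++) {
                if (won[i] == 0) continue;
                for (int j = 0; j < inputs; j++) {
                    weights[i][j] += rate * correc[i][j] / won[i];
                }
                normalize(weights[i]);
            }
        }
        return bigerr;
    }

    public static void main(String[] args) {
        double[][] samples = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
        double[][] weights = {{1, 1, 1}, {1, 1, 1}, {1, 1, 1}};
        for (double[] w : weights) normalize(w);
        double error = learn(samples, weights, 0.5, 0.1, 100);
        System.out.println("final error: " + error);
    }
}
```

Starting every neuron with identical weights, as main does, makes all samples pick the same winner at first, which exercises the forced-win path before the regular adjustments take over.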
Neural networks provide an efficient way of performing certain operations that would otherwise be very difficult. Consider how a character recognition program would work without neural networks. You'd likely find yourself writing complex routines that traced outlines, analyzed angles, and did other graphical analysis. Neural networks should be considered anytime complex patterns must be recognized. These patterns don't need to be graphical in nature. Any form of data that can have patterns is a candidate for a neural network solution.