Assignment 5: Eigenspace Digit Recognition
Due 15 December 2007
For this project, we'll do the same task as lab 4, but using an eigenspace recognition technique. The basic idea is to use a training set to create a set of basis images. Then use the basis images to convert the training set into a set of models, where the feature vector for each model is its projection into the eigenspace. Finally, classify new images by projecting them into the eigenspace and comparing them to the models.
Work on this project with your partner from the last lab.
Download the following code files:
- util.c: a set of useful functions
- calcEigenSpace.c: an almost complete program for calculating an eigenspace from a set of images.
- svdcmp.c: an implementation of SVD.
Put util.c and svdcmp.c in your image library and add them to the makefile. You will also want to add prototypes for the functions in util.c and for svdcmp() to your vision.h include file.
- Create a program that reads in an image with a digit in it (use the training set from lab 4), extracts the oriented bounding box of the region, and writes it out to a 40 row by 20 column greyscale (PGM) image. Most of what you need to do is implemented by the functions in util.c. The rest is just thresholding and region segmentation, just as in lab 4.
- Execute the above program on half of the single digit data (5 examples for each digit), which should generate 50 images.
- Finish implementing the calcEigenSpace executable, as noted in the code. The program should take as input the number of eigenvectors to produce and a list of images on the command line and write out the top N eigenvectors and the average image. It is useful to write them out both as binary files and as PGMs that can be visualized. Based on a preliminary analysis, you probably want about 10 eigenvectors to capture about 70% of the data variation.
- Use the executable to create an eigenspace using the 50 single digit images extracted in step 2. You should now have a set of files for the eigenvectors and one file for the average vector.
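To check the "about 10 eigenvectors for about 70% of the variation" estimate on your own data, you can look at the singular values returned by the SVD. The sketch below is only an illustration: the function name countForVariance is hypothetical, and it assumes the singular values come from the mean-subtracted data matrix and are already sorted largest first (sort them yourself if your SVD routine does not guarantee an order). Under those assumptions, the variation captured by each eigenvector is proportional to the square of its singular value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of util.c or calcEigenSpace.c.
 * Returns the smallest count k such that the first k singular values
 * account for at least `fraction` of the total squared singular values.
 * Assumes w[] holds n non-negative values sorted largest first. */
int countForVariance(const double *w, int n, double fraction)
{
    double total = 0.0;
    double running = 0.0;
    int i;

    assert(w != NULL && n >= 0);
    assert(fraction > 0.0 && fraction <= 1.0);

    /* Total variation is proportional to the sum of squared singular values. */
    for (i = 0; i < n; i++)
        total += w[i] * w[i];
    if (total <= 0.0)
        return 0;

    /* Accumulate from the largest value until the target is reached. */
    for (i = 0; i < n; i++) {
        running += w[i] * w[i];
        if (running >= fraction * total)
            return i + 1;
    }
    return n;
}
```

Calling countForVariance(w, 50, 0.7) on the singular values from your 50 training images gives a data-driven value for N that you can compare against the suggested 10.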
Write an executable that takes in a 40 row by 20 column image, the average image, and the eigenvectors. The program should then:
- normalize the image to a length of 1,
- subtract the average vector, and
- project it into the eigenspace by taking the dot product of the difference image and each of the eigenvectors.
The output should be a set of N numbers, where N is the number of eigenvectors. Think of the set of N numbers as a feature vector, similar to the feature vectors from lab 4. You probably want to create a function that takes as input the required information to convert a 40x20 image into a feature vector.
- Using the above executable, create a single model file for each digit (use the same format as the last lab). You can use a single example for each digit, or you can average the feature vectors for several examples of each digit.
- Create a program that reads in a new image of one or more digits, extracts the oriented bounding box from each digit and converts it to a 40x20 image. Then for each digit, the extracted image is converted into an eigenspace feature vector, which is compared with the models to generate a label. You can use any pattern classification method you wish, but start with nearest neighbor.
- Generate a confusion matrix of your results on the testing set, which is the remaining 5 images per digit.
- Get the system running in real time and compare the two recognition systems. If you can profile your code, try to evaluate which of the methods is faster.
- Use a different pattern recognition method (e.g. K-nearest neighbor or a neural network) to classify your data.
- Do some interesting visualizations with your real time system.
- Try a different pre-processing routine. For example, rather than using the original greyscale image, try using the region map so that each pixel is either 255 (in region) or 0 (out of region). It ought to simplify the eigenspace considerably.
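For the nearest neighbor classification and confusion matrix steps described earlier, a sketch like the following may be a useful starting point. The names nearestModel and recordResult and the row-major layout of the model feature vectors are hypothetical, chosen for illustration; your model file format from the last lab determines how the models are actually loaded.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

#define NUM_DIGITS 10

/* Hypothetical helper: return the index of the model closest to `features`.
 *   models  nmodels feature vectors stored row-major: models[m * nfeat + j]
 * Squared Euclidean distance is used; it gives the same ranking as the
 * true distance, so no square root is needed. */
int nearestModel(const double *features, const double *models,
                 int nmodels, int nfeat)
{
    double bestDist = DBL_MAX;
    int best = -1;
    int m, j;

    assert(features != NULL && models != NULL);
    assert(nmodels > 0 && nfeat > 0);

    for (m = 0; m < nmodels; m++) {
        double dist = 0.0;
        for (j = 0; j < nfeat; j++) {
            double d = features[j] - models[m * nfeat + j];
            dist += d * d;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = m;
        }
    }
    return best;
}

/* Hypothetical helper: tally one test result.
 * Rows are the true digit, columns are the label the classifier produced. */
void recordResult(int confusion[NUM_DIGITS][NUM_DIGITS],
                  int trueLabel, int predicted)
{
    assert(trueLabel >= 0 && trueLabel < NUM_DIGITS);
    assert(predicted >= 0 && predicted < NUM_DIGITS);
    confusion[trueLabel][predicted]++;
}
```

With one model per digit stored in digit order, the returned index is the predicted label. Off-diagonal entries in the confusion matrix then show which digits get mistaken for which.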
Follow the writeup instructions to create a web page for your assignment. Send the instructor an email with the code in a zip or tar file along with instructions on how to compile and run it as well as a pointer to the URL for the writeup.