CS 397

Assignment 4: Real-time 2-D Object Recognition

Part 1: due 6 November
Part 2: due 15 November


This project will introduce you to both 2-D object recognition and real-time image processing. The first week will focus on building the recognition system. The second week will focus on the real-time processing aspects. Because of the real-time goal, it is important to think about efficiency and memory management in week 1.


Work on this project with a partner.

Download the data set. The data set is a collection of pictures of the ten objects we will be trying to differentiate. The data consists of ten 320x240 pictures of each of the ten objects plus a set of images that contain multiple objects. For the procedure below, use the first six images of each object for training and the last four images of each object for testing.


The following procedure walks you through a gallery method of pattern recognition. You will use the training set to build a mean feature vector for each object and then compare new inputs to the set of mean feature vectors to find the closest match. The new input then gets the label of the closest match or is labeled as unclassified if the match is not good enough.
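
As a rough illustration of the gallery idea, the sketch below averages training feature vectors into one mean vector per object and labels a new input by its closest mean, rejecting it when even the best match is too far away. The function names, the two made-up objects, their feature values, and the max_distance cutoff are all assumptions chosen for the example; plain Euclidean distance stands in for whatever distance you end up using.

```python
import math

def mean_vector(vectors):
    """Average a list of equal-length feature vectors component by component."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def euclidean(a, b):
    """Plain (unscaled) Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(features, gallery, max_distance):
    """Return the label of the closest mean vector, or 'unclassified'."""
    best_label, best_dist = None, float("inf")
    for label, mean in gallery.items():
        d = euclidean(features, mean)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label if best_dist <= max_distance else "unclassified"

# Hypothetical two-feature training data for two made-up objects.
training = {
    "pen": [[0.20, 5.0], [0.22, 5.2], [0.18, 4.8]],
    "key": [[0.60, 2.0], [0.62, 2.2], [0.58, 1.8]],
}
gallery = {label: mean_vector(vs) for label, vs in training.items()}

near_pen = classify([0.21, 5.1], gallery, max_distance=1.0)   # close to the "pen" mean
far_away = classify([0.90, 9.0], gallery, max_distance=1.0)   # far from both means
print(near_pen, far_away)
```

The rejection cutoff is the part that needs tuning on real data: too small and valid objects are rejected, too large and unknown objects get a label.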

Once you have a working baseline system, design a second system that uses a different classification method. The simplest option is to switch to a k-nearest neighbor classifier and represent each object with multiple gallery examples. A more interesting option would be to build something like a decision tree. Feel free to capture more training images if your method needs them.
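
One possible shape for the k-nearest neighbor variant is sketched below: every training example is kept in the gallery, and a new input takes the majority label among its k closest examples. The function name knn_classify, the (label, vector) layout, and the sample values are assumptions made for the example, and tie-breaking is left to the counter's ordering, which a real system would probably want to define explicitly.

```python
import math
from collections import Counter

def knn_classify(features, examples, k=3):
    """Label an input by majority vote among its k nearest gallery examples.

    examples is a list of (label, feature_vector) pairs.
    """
    by_distance = sorted(examples, key=lambda ex: math.dist(features, ex[1]))
    votes = Counter(label for label, _ in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical gallery: several stored examples per object, not one mean.
examples = [
    ("pen", [0.20, 5.0]), ("pen", [0.22, 5.2]), ("pen", [0.18, 4.8]),
    ("key", [0.60, 2.0]), ("key", [0.62, 2.2]), ("key", [0.58, 1.8]),
]

label = knn_classify([0.25, 4.5], examples, k=3)
print(label)
```

Because every example is compared against the input, the cost grows with the gallery size, which matters once the system has to run in real time.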

Baseline procedure:
  • Write a function that takes the input image and thresholds it into foreground and background. The threshold will likely need to be computed dynamically for each image, as the illumination levels may change over time in the real-time version of the system. The isodata algorithm in greyscale or color ought to work fine. The background will be uniformly white, and all of the objects will be darker. The output of this step should be a binary image.
  • Apply any pre-processing to the binary images that may be useful to close holes or get rid of noise in the background. Your code from assignment 2 may be useful here. The output of this step should be a binary image.
  • Apply a connected components algorithm to the resulting binary image to get a set of regions. Filter the regions by size to get rid of any regions that are unlikely to be objects of interest. Your code from assignment 2 may be useful here. The output of this step is a region map and a set of region attributes (e.g. size and bounding box).
  • Write a function that takes a region map and region id and calculates a set of features for the region. Your feature vector may be any length or form. Moments, region size, and percent of bounding box filled are simple characteristics you could use here. You can also build histograms of color or other characteristics. Shape alone or color alone will not differentiate all of the objects. Your code from assignments 2 and 3 can be useful here.
  • Write a standalone program that takes as input a set of training images that contain only the object of interest and builds a model for that object using all of the code developed above. Use the mean feature vector for the set of training images as the object model. Store the object model as a file that is human-readable as well as easy to parse.
  • Generate a model for each object so you have ten model files. For each feature, calculate its variance and build a variance file so you can use scaled Euclidean distance for comparing feature vectors. If you want to get fancy, save the covariance matrix so you can use Mahalanobis distance.
  • Write a standalone program that takes as input an image and the set of model files, processes the image to obtain a feature vector, and classifies the input using a nearest-neighbor classifier with either scaled Euclidean distance or Mahalanobis distance. If the input is not close enough to any of the models, classify it as "other".
  • Generate a confusion matrix of your results on the testing set.
  • Write a standalone program that uses a different method to classify a new input. K-nearest neighbor is a simple modification that ought to perform better.
  • Generate a confusion matrix of your results on the testing set.
  • Week 2: make your code run in real-time so that you can insert objects in front of the camera and get out a label for each one.
  • Extensions
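
For the thresholding step of the baseline procedure, the following is a minimal greyscale sketch of the isodata idea: start from the mean intensity, split the pixels into a dark group and a light group, move the threshold to the midpoint of the two group means, and repeat until it stops moving. The function names, the tolerance and iteration limit, and the tiny 4x4 test image are assumptions for illustration only.

```python
def isodata_threshold(pixels, tolerance=0.5, max_iter=100):
    """Iterate t = (mean of dark pixels + mean of light pixels) / 2 until stable."""
    t = sum(pixels) / len(pixels)          # initial guess: overall mean
    for _ in range(max_iter):
        dark = [p for p in pixels if p <= t]
        light = [p for p in pixels if p > t]
        if not dark or not light:          # image is uniform; nothing to split
            return t
        new_t = (sum(dark) / len(dark) + sum(light) / len(light)) / 2
        if abs(new_t - t) < tolerance:
            return new_t
        t = new_t
    return t

def to_binary(image, t):
    """Mark darker-than-threshold pixels as foreground (1), the rest as 0."""
    return [[1 if p <= t else 0 for p in row] for row in image]

# Made-up 4x4 greyscale image: a dark 2x2 object on a near-white background.
image = [
    [250, 248, 252, 249],
    [251,  40,  45, 250],
    [249,  42,  38, 247],
    [252, 250, 251, 248],
]
flat = [p for row in image for p in row]
t = isodata_threshold(flat)
binary = to_binary(image, t)
print(t, binary)
```

For 320x240 frames in real time, running the same iteration over a 256-bin histogram instead of the raw pixel list would likely be much cheaper, since each pass then touches 256 bins rather than every pixel.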

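For the variance file and scaled Euclidean distance mentioned in the baseline procedure, one reading is sketched below: compute each feature's variance over the training feature vectors, then divide each squared difference by that variance so that features with large numeric ranges do not dominate. Pooling all training vectors to compute the variance, using population variance, and the small eps guard against zero variance are assumptions of this sketch, as are the function names and sample values.

```python
import math

def feature_variances(vectors):
    """Population variance of each feature over a list of feature vectors."""
    n = len(vectors)
    columns = list(zip(*vectors))
    means = [sum(col) / n for col in columns]
    return [sum((x - m) ** 2 for x in col) / n
            for col, m in zip(columns, means)]

def scaled_euclidean(a, b, variances, eps=1e-12):
    """Euclidean distance with each squared difference divided by the feature variance."""
    return math.sqrt(sum((x - y) ** 2 / max(v, eps)
                         for x, y, v in zip(a, b, variances)))

# Made-up feature vectors: the second feature has a 100x larger range.
vectors = [[1.0, 100.0], [3.0, 300.0], [1.0, 100.0], [3.0, 300.0]]
variances = feature_variances(vectors)
d = scaled_euclidean([1.0, 100.0], [3.0, 300.0], variances)
print(variances, d)
```

After scaling, both features contribute equally to the distance in this example, even though their raw differences are 2 and 200.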

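For the confusion matrix steps, a small sketch of the bookkeeping follows: rows are the true labels, columns are the labels the classifier produced, and each test image adds one count to a cell. The row and column convention, the function name, and the sample labels (including an "other" column for rejected inputs) are assumptions for the example.

```python
def confusion_matrix(true_labels, predicted_labels, labels):
    """Count how often each true label (row) was classified as each label (column)."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix

# Made-up results for four test images of two hypothetical objects.
labels = ["pen", "key", "other"]
truth = ["pen", "pen", "key", "key"]
predicted = ["pen", "key", "key", "other"]

matrix = confusion_matrix(truth, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

Correct classifications land on the diagonal, so off-diagonal cells show directly which objects are being mistaken for which.
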
Follow the writeup instructions to create a web page for your assignment. Send the instructor an email with the code in a zip or tar file along with instructions on how to compile and run it as well as a pointer to the URL for the writeup.