CS 251: Lab #7

Lab 7: Naive Bayes Analysis

Project due Monday night Apr 14, 2014

The purpose of this lab is to give you the opportunity to implement a Naive Bayes classifier. You will do so by adding two functions to your analysis module.


Tasks

  1. Write analysis.naive_bayes_build_classifier

    The role of this function is to compute the parameters we need to classify new points. We assume the data distributions are Gaussian and that the features are independent.

            # Perform the statistical analysis needed for a Naive Bayes classification.
            # mat is the data (N x F matrix), where N is the number of rows in the data set and F is
            # the number of features.
            # class_values is an N x 1 matrix with values in the set 0, 1, ..., Num_Classes-1.
            # It will be cast to an N x 1 matrix of ints if those values happen to be stored
            # as floats.
            # Returns the class_means, class_variances, and class_scales, each of which
            # is a C x F matrix, where C is the number of different class values.
            def naive_bayes_build_classifier( mat, class_values ):
    
  2. Test it with naive_bayes_test1.py, which uses iris_proj7_test.csv. The output Stephanie gets for her code is here.
  3. Write analysis.naive_bayes_classify

    The role of this function is to classify new points using the parameters computed by the previous function.

            # Perform a Naive Bayes classification.
            # mat is the data (N x F matrix), where N is the number of rows in the
            # data set and F is the number of features.
            # class_means, class_variances, and class_scales are all C x F matrices,
            # where C is the number of different class values.
            # Returns predicted_class_vals as an N x 1 matrix of ints, with class
            # values 0, 1, ..., Num_Classes-1.
            def naive_bayes_classify(mat, class_means, class_variances, class_scales):
    
  4. Test it with naive_bayes_test2.py, which also uses iris_proj7_test.csv. The output Stephanie gets for her code is here.
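
To make the two tasks above concrete, here is a minimal sketch of how the functions might fit together. It is not the reference solution, and it rests on several assumptions that the handout does not state: NumPy arrays are used in place of matrices, class_scales is taken to be the Gaussian normalizing factor 1 / sqrt(2 * pi * variance), the variance is the sample variance (ddof=1), every class has at least two rows, and all classes are treated as equally likely because the classify signature takes no priors. Your output may differ from the test files if the course uses other conventions.

```python
import numpy as np


def naive_bayes_build_classifier(mat, class_values):
    """Estimate per-class Gaussian parameters for each feature."""
    data = np.asarray(mat, dtype=float)
    # Flatten the N x 1 labels and cast to int in case they were stored as floats.
    labels = np.asarray(class_values).astype(int).ravel()
    num_classes = labels.max() + 1
    num_features = data.shape[1]

    class_means = np.zeros((num_classes, num_features))
    class_variances = np.zeros((num_classes, num_features))
    for c in range(num_classes):
        rows = data[labels == c]
        class_means[c] = rows.mean(axis=0)
        # Assumption: sample variance (ddof=1); ddof=0 is also a common choice.
        class_variances[c] = rows.var(axis=0, ddof=1)

    # Assumption: class_scales is the Gaussian normalizer 1 / sqrt(2 * pi * var).
    class_scales = 1.0 / np.sqrt(2.0 * np.pi * class_variances)
    return class_means, class_variances, class_scales


def naive_bayes_classify(mat, class_means, class_variances, class_scales):
    """Return an N x 1 array holding the most likely class for each row."""
    data = np.asarray(mat, dtype=float)
    means = np.asarray(class_means, dtype=float)
    variances = np.asarray(class_variances, dtype=float)
    scales = np.asarray(class_scales, dtype=float)

    # Broadcast to N x C x F: one Gaussian density per point, class, and feature.
    diff = data[:, np.newaxis, :] - means[np.newaxis, :, :]
    densities = scales * np.exp(-(diff ** 2) / (2.0 * variances))
    # Independence assumption: multiply the per-feature densities within a class.
    likelihoods = densities.prod(axis=2)
    # Assumption: equal class priors, so the largest likelihood wins.
    return likelihoods.argmax(axis=1).reshape(-1, 1)
```

With many features the product of densities can underflow to zero, so summing log densities instead of multiplying is a common alternative.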

When you are done with the lab exercises, you may start on the rest of the project.