CS 251: Assignment #9

Data Analysis and Visualization

Due May 16, 2011


  1. Pick a data set that can be used to do one of the following.
    1. answer a scientific question,
    2. answer a question with practical significance, or
    3. generate a software tool with a practical application.

    As an example of a scientific question, consider a data set on purple finch banding and recovery (a recovery is when a banded bird is recaptured one or more times). There are observational questions about the mean, median, and range of the distances the birds travel. There are also relationships of interest, such as whether the distance a bird travels is related to the season. This data set is one option to consider.
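
    The observational questions above can be sketched in a few lines of Python. This is only an illustration: the distances, the units, the season labels, and the `summarize` helper are made up here and are not taken from the actual banding data.

```python
import statistics

# Hypothetical recovery distances in kilometers, grouped by season.
# These values are invented for illustration; they are not real banding data.
distances_by_season = {
    "winter": [12.0, 40.0, 8.0, 150.0],
    "summer": [3.0, 5.0, 1.0, 7.0],
}

def summarize(values):
    """Return the mean, median, and range (max minus min) of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
    }

# One summary per season, so the seasons can be compared side by side.
summaries = {season: summarize(d) for season, d in distances_by_season.items()}
for season, summary in summaries.items():
    print(season, summary)
```

    Comparing the per-season summaries is a first look at the relational question; a real analysis would also need to check whether any difference is larger than the variation within a season.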

    As an example of a practical question, consider loads on the Colby network. There are observational questions about the levels of traffic and relational questions about how traffic changes over the course of a day, a week, and a semester.
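
    One way to look at how traffic changes over a period is to group measurements by hour of the day or by day of the week and average each group. The sketch below assumes a hypothetical log of (timestamp, megabytes) samples; the sample values, the timestamp format, and the `mean_by_key` helper are assumptions, not the format of any real Colby network data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (timestamp, megabytes transferred) samples; values are invented.
samples = [
    ("2011-05-02 09:15", 120.0),
    ("2011-05-02 09:45", 80.0),
    ("2011-05-02 23:10", 300.0),
    ("2011-05-03 09:30", 100.0),
    ("2011-05-03 23:50", 500.0),
]

def mean_by_key(samples, key):
    """Group samples by key(timestamp) and return the mean traffic per group."""
    groups = defaultdict(list)
    for stamp, megabytes in samples:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        groups[key(when)].append(megabytes)
    return {k: sum(v) / len(v) for k, v in groups.items()}

# Daily pattern: average traffic for each hour of the day (0-23).
by_hour = mean_by_key(samples, lambda t: t.hour)

# Weekly pattern: average traffic for each weekday (Monday is 0).
by_weekday = mean_by_key(samples, lambda t: t.weekday())

print(by_hour)
print(by_weekday)
```

    The same helper covers a semester-long view by passing a key that returns the week number or the date.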

    As an example of a tool with a practical application, consider the handwritten digits data set from the last assignment. Using that data set as a training set, you could train a software tool to read handwritten digits. There are many other examples of this kind, including data from a flow-cytometry machine in Biology that needs to be classified into categories.
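
    As a rough illustration of what training a tool on a labeled data set can mean, here is a minimal nearest-neighbor classifier. It is only one simple option, not a required method, and the tiny four-value feature vectors stand in for real digit images; the data and the `nearest_neighbor` function are hypothetical.

```python
import math

def nearest_neighbor(train, query):
    """Return the label of the training example closest to query.

    train is a list of (features, label) pairs; closeness is Euclidean distance.
    """
    best_label, best_dist = None, math.inf
    for features, label in train:
        d = math.dist(features, query)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Toy training set: flattened 2x2 "images" with made-up pixel values.
train = [
    ((0.0, 1.0, 0.0, 1.0), 1),  # a vertical stroke, labeled as the digit 1
    ((1.0, 1.0, 1.0, 1.0), 0),  # a filled block, labeled as the digit 0
    ((0.1, 0.9, 0.0, 1.0), 1),  # a slightly noisy vertical stroke
]

predicted = nearest_neighbor(train, (0.0, 0.8, 0.1, 0.9))
print(predicted)
```

    With real data, part of the labeled set should be held out for testing so that the accuracy you report is measured on examples the classifier has not seen.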

  2. After choosing your data set and discussing it with the professor, pick one or two questions you want to answer. Outline a set of visualizations and analyses that will let you answer them. Order them by priority and tailor your analysis and visualization GUI to the tasks. You should feel free to simplify the system you have.

Data Sets


Your writeup should be in the form of a paper, no more than 6 pages in length, that follows the format of a traditional conference paper. You can do your writeup on the wiki or in LaTeX. If you choose LaTeX, ask and I'll give you a style file you can use.

  1. Abstract: 350 words or fewer.
  2. Introduction: a high-level description of the question or application and your approach to answering the question or building the application.
  3. Theory/Methods: a description of theoretical concepts and methods you used.
  4. Experiments/Design: a description of the process you used to answer the questions or the design process you used to build your tool.
  5. Results/Demonstration: a description of your results, graphs, visualizations, or capabilities.
  6. Discussion: a discussion of your results or the utility of your tool.
  7. Conclusions/Summary: any final comments on your results, identifying the most important ones.


Once you have written up your assignment, give the page the label:


Put your code in the COMP/CS251/yourname/private/ folder on fileserver1/Academics in a project9 folder. Make sure the program runs properly and has all of the necessary files, data and otherwise.