CS 251: Assignment #6

Exploring Visualization

Due Thursday, 31 March, 2011

The goal of this week's lab is to create visualizations and run analyses on two data sets from active research projects.

As with the last project, you may work with a partner, if you wish. Note that, if you work with a partner, the standards for extensions will be higher.

Data Sets

We have three data sets for visualization. You need to demonstrate your program on all three. You are free to substitute a data set from an active research project at Colby for one of them, if you wish. All three data sets are available on the Academics server in csv format in the CS251/Data folder.


The main goal this week is to explore how to visualize these data sets. A secondary goal is to enable simple analysis, such as providing means, standard deviations, and ranges in easy-to-use forms.
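
As a minimal sketch of the kind of summary meant here, assuming a column is already available as a list of numbers, Python's standard statistics module can produce these values. The function name summarize and the sample values are hypothetical; they are not part of the DataSet class or the course data.

```python
import statistics


def summarize(values):
    """Return mean, sample standard deviation, and range for one numeric column."""
    values = list(values)
    if not values:
        raise ValueError("cannot summarize an empty column")
    return {
        "mean": statistics.mean(values),
        # stdev needs at least two points; report 0.0 for a single value
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


# Hypothetical day-of-year values, for illustration only
doy = [100, 110, 120, 130, 140]
summary = summarize(doy)
print("mean={mean:.1f} std={std:.1f} range=[{min}, {max}]".format(**summary))
```

Returning a dictionary keeps the values easy to display in a text field or status line of the interface; whether to use the sample or population standard deviation is a choice you should state in your writeup.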

Read through all of the tasks and plan your design before you start writing code.

  1. For both data sets, it will be necessary to use one or more columns to filter which data to display. In the case of bird arrivals, users may wish to filter data by year, species, observation region, or some combination. Users may also wish to view which species have arrival observations within a certain window of dates.

    In the case of the eye-tracking data, a user may wish to view durations filtered by subject, image, coded location, or some combination of those variables.

    Implement the capability for the user to specify which columns to view, which columns to use as a filter, and what the parameters of the filter should be. You may want to begin by working with your DataSet class to create a filteredSelect method and then build the interface to provide the required inputs to the function. Note that the column being viewed may also be the column being used as a filter. For example, a user may want to view the bird arrival DOY distribution for values of DOY between 100 and 150.

    You can assume the user is reasonably friendly and intelligent for this task. To let the user specify the range of values in a column to use as a filter, you can use simple text boxes. Ideally, the filtering capability should apply to 1D, 2D, and 3D plots.

  2. For both data sets, the user probably wants to see multiple 1-D plots overlaid on one another or in the same visual figure (like a multi-plot). For example, the user may want to see multiple bird species plotted as histograms of DOY, or for the eye-tracking data, the user may want to see a histogram of durations for one set of images compared to a second set of images.

    Implement the capability for the user to view multiple 1-D plots within the same figure (the kinds of plots you are making for task 1). A check box is probably the simplest way to let the user choose between adding a new plot to the current figure, either overlaid or alongside the existing plots, and creating a new figure. Restrict this task to plots you are building with matplotlib, such as histograms.

  3. For the last project you implemented 2D or 3D viewing with color to provide a 4th dimension to the visualization.

    This week, add size or shape to the visualization to enable interactive viewing of a 4th (2 spatial + color + size) or a 5th (3 spatial + color + size) dimension. Demonstrate this on the AustraliaCoast data set.

  4. Pick one of the following visualizations and add it to your system. Talk with Prof. Maxwell about the visualization before you start.
    1. For the bird arrivals data, calculate the mean and standard deviation of the DOY for each bird across all years and regions. Then create a new data set where each data point is the name of the bird, its mean DOY, and the standard deviation of its DOY. Then create a 2D plot of all the bird species using the mean and standard deviation values as the axes. See if you can attach text with the bird's name to the data point or have it pop up when the user clicks or moves their mouse over a data point (neither is required, though). Expand the derived data set, and the range of possible visualizations, by calculating DOY mean and standard deviation values by region or by year.
    2. For the bird arrivals data, enable plotting the mean arrival time in one region versus the mean arrival time in a second region across all birds. For example, calculate the mean arrival time in region one for all birds, then calculate the mean arrival time in region two for all birds. Plot the results in a 2D graph. Calculate the mean arrival times for a 3rd region and make it a 3D graph.
    3. For the Eye-Tracking data, create a visualization that plots the fixations on top of the stimulus image (for one subject or multiple subjects). The images are available in the Data directory on the Academics server.
    4. For the Eye-Tracking data, create a heat map that overlays the density of fixations over the stimulus image.
    5. Create a visualization that, over all subjects, shows the location of the first fixation on the foreground object (coded category F). Use color to indicate the position of the fixation in the ordering (e.g. bright green if it's number 1, bright red if it's the last fixation).
    6. Create a visualization that, over all subjects, explores where subjects looked first. See if you can create a plot that captures when they made the first fixation (start time), how long it lasted (duration), and what they were viewing (coded category).
    7. Be creative and develop something you think would be useful for exploring the Bird Arrivals or Eye-Tracking data or demonstrating some aspect of the relationships within it.
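
For task 1, one possible shape for the filtering method is sketched below. It assumes the data are held as a list of dictionaries keyed by column name, which may not match your DataSet class. The function filtered_select, its parameters, and the sample rows are all hypothetical. Each filter is an inclusive (low, high) range on one column, and the column being viewed may also appear as a filter.

```python
def filtered_select(rows, view_columns, filters):
    """Return the view_columns of every row that passes all range filters.

    rows:         list of dicts mapping column name -> value (assumed layout)
    view_columns: list of column names to return
    filters:      dict mapping column name -> (low, high), inclusive bounds
    """
    selected = []
    for row in rows:
        # Keep the row only if every filtered column falls inside its range
        if all(low <= row[col] <= high for col, (low, high) in filters.items()):
            selected.append([row[col] for col in view_columns])
    return selected


# Made-up rows for illustration only; not taken from the course data
rows = [
    {"species": "A", "year": 2001, "DOY": 95},
    {"species": "A", "year": 2002, "DOY": 120},
    {"species": "B", "year": 2002, "DOY": 149},
    {"species": "B", "year": 2003, "DOY": 160},
]

# View species and DOY, filtering on DOY itself (100 to 150) and on year
result = filtered_select(
    rows, ["species", "DOY"], {"DOY": (100, 150), "year": (2002, 2002)}
)
print(result)
```

Setting both bounds to the same value, as with year above, selects a single value, so the same text boxes can serve for categorical columns such as species or subject. The returned rows can be passed directly to whatever plotting code you already have for 1D, 2D, or 3D views.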



For this week's writeup, create a wiki page that shows your visualizations. Explain how you made them, what they mean, and what they show. Your audience is the people who are working on the research.


Once you have written up your assignment, give the page the label:


Put your code in the COMP/CS251 folder on fileserver1/Academics. Please make sure you are organizing your code by project.