Spring 2017

Learning Goals

1. Students understand and can write programs to store and manipulate data and measurements.
2. Students understand and can implement the fundamental concepts of interactive visualization of data.
3. Students understand and can implement common data transformations and statistical analysis.
4. Students understand and can make appropriate use of current machine learning techniques for prediction and knowledge discovery.
5. Students present methods, algorithms, results, and designs in an organized and competently written manner.
6. Students gain experience working with real data from disciplines outside computer science.

Textbooks

Witten, Frank, and Hall, Data Mining: Practical Machine Learning Tools and Techniques, 3rd ed., Morgan Kaufmann, 2011.

Grading

• Weekly Projects: 45%
• Quizzes: 25%
• Homeworks: 5%
• Class Participation: 5%
• Final Exam: 20%

This course covers the analysis and visualization of scientific data. Topics will include data management, basic statistical analysis, data mining techniques, and the fundamental concepts of machine learning. Students will also learn how to visualize data using 2-D and 3-D graphics, focusing on techniques that highlight patterns and relationships. Course projects will use data from active research projects at Colby.

1:
• Visualization
• Coordinate Systems and Transformations
• Homogeneous Coordinates
Tkinter tutorials
2:
• Defining 2D and 3D Viewing Systems
• 2D and 3D Viewing Pipeline
• User Interaction and View Parameters
NumPy tutorials
3:
• Interactive camera control
• Data Representations and Storage
• Defining Views Given a Data Set (Quiz)
Lecture Notes
4:
• Numeric Computation: Computing a Sum
• Normalizing Data
• Representing Data Using Color
Lecture Notes
5:
• Computing Ranges
• Histograms and Bucket Selection
• (Quiz)
Lecture Notes
6:
• Linear Regression
• Multiple Linear Regression
• Covariance Matrix and PCA
Lecture Notes
7:
• SVD and Eigenspaces
• Eigenvalues and Eigenvectors
• Measurement, Variation, and Noise (Quiz)
Lecture Notes

Spring Break
8:
• Clustering
• K-means, ISODATA, Online Clustering
• Estimating the Number of Clusters
Maxwell & Buddemeier, Lecture Notes
9:
• Fuzzy C Means
• Distance Metrics
• Dynamic Time Warp Matching (Quiz)
Lecture Notes
10:
• Pattern Recognition, No Free Lunch / Ugly Duckling Theorems
• NN, KNN, Naive Bayes, Decision Tree Classifiers
• K-D Trees
Moore K-D Trees, Lecture Notes
11:
• Decision Trees
• Regression Trees
• Locally-Weighted Linear Regression (Quiz)
Lecture Notes
12:
• Artificial Neural Networks
• Training ANNs
• Modern ANN Systems
Lecture Notes
13:
• Topics in Bioinformatics
• BLAST Algorithm
• Review (Quiz)
Lecture Notes

Policies

Attendance and Participation

For this course to be truly successful, your presence and participation are important. This course covers material that is new enough that the lectures and materials provided by the professor will be the primary resource for the course. Asking questions in class is an important part of learning. When you have a question, ask it. It is highly probable that one of your classmates has the same question. When you have an opportunity to share your opinion or your answer, please speak up. Your professor wants to hear what you have to say. And, of course, to participate in class you must attend class. If you must miss a class, you are responsible for making up the material covered in that lecture.

The short homework assignments must be turned in on time. No late short assignments will be accepted because we will refer to their solutions in class. The longer programming assignments must be turned in on time for maximal credit. Late projects (programming assignments) will be accepted, but will be given reduced grades. It is better to hand in a mostly functional project on time than to be late with something that might be better.

Computer science, both academically and professionally, is a collaborative discipline. In any collaboration, however, all parties are expected to make their own contributions and to generously credit the contributions of others. In our class, therefore, collaboration on homework and programming assignments is encouraged, but you as an individual are responsible for understanding all the material in the assignment and doing your own work. Always strive to do your best, give generous credit to others, start early, and seek help early from both your professors and classmates.

In addition to the ethical implications of dishonesty, you undermine your ability to learn when you cheat. Honesty, integrity, and personal responsibility are cornerstones of a Colby education and provide the foundation for scholarly inquiry, intellectual discourse, and an open and welcoming campus community. These values are articulated in the Colby Affirmation and are central to this course. Students are expected to demonstrate academic honesty in all aspects of this course.

• If you have had a substantive discussion of any programming project with a classmate, then be sure to cite them in your write-up. If you are unsure of what constitutes "substantive", then ask me or err on the side of caution. As one rule of thumb, if you see more than 10 lines of someone else's code, then you should cite them. You will not be penalized for working together.
• You must not copy answers or code from another student, either by hand or electronically. Another way to think about it: when you discuss an assignment with one another, you should be using a natural language, not a computer language.

Sexual Misconduct/Title IX Statement

Colby College prohibits and will not tolerate sexual misconduct or gender-based discrimination of any kind. Colby is legally obligated to investigate sexual misconduct (including, but not limited to, sexual assault and sexual harassment).

If you wish to speak confidentially about an incident of sexual misconduct, please contact Colby Counseling Services (207-859-4490) or the Director of the Gender and Sexual Diversity Program, Emily Schusterbauer (207-859-4093).

Students should be aware that faculty members are considered responsible employees; as such, if you disclose an incident of sexual misconduct to a faculty member, they have an obligation to report it to Colby's Title IX Coordinator. "Disclosure" may include communication in person, via email/phone/text, or through class assignments.