Lab Exercise 3: Modular Design and Lists
The goal of this week's project is to start building more general functions that may be useful in many contexts. We're going to start using a concept called modular design where we build more complex programs on top of simpler functions.
The purpose of this lab time is to build a set of general functions in a library and then use that library in a separate Python file. This will be the first time you call a function from another Python file that you wrote by importing one file into another. You will also do a little experimentation with lists, which are a data structure for holding sequences of values.
For the project, you'll continue to work with the live buoy data, doing some more sophisticated calculations and analyses, aided by our function library.
Lists are an important data structure in Python. A list is an ordered sequence of values. In Python, the elements of a list can be of any type, including other lists, and they do not all have to be of the same type (that is not the case in every programming language). For this assignment, we'll mostly be dealing with lists of numbers.
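As a small illustration of that flexibility, the sketch below builds one list that mixes types and one list of plain numbers. The variable names and values are made up for this example, and print is written with parentheses so the code runs under either Python 2 or Python 3.

```python
# A list can mix types and can even contain another list.
mixed = [3, 2.5, "buoy", [1, 2]]   # int, float, string, nested list

# Most lists in this lab will hold only numbers, like this one.
numbers = [5, 3, 6, 1, 2]

print(mixed)
print(numbers)
```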
Create a new file in TextWrangler and put the following assignment into it.
a = [5, 3, 6, 1, 2]
In Python, a list is delimited by square brackets, and the elements of the list are separated by commas. To access one element of a list, use a syntax called bracket notation: the name of the list followed by the index of the element in square brackets. The first element of a list has index zero and is specified with the following expression.
a[0]
Add two more lines to your file to print out the 5 and the 1 (the first element in the list and the fourth element in the list).
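Here is a minimal sketch of bracket notation on a different list, so you can check your understanding before editing your own file. The list temps and its values are invented for this example.

```python
temps = [10.5, 11.0, 9.8, 12.1]   # hypothetical readings

first = temps[0]   # index 0 is the first element
third = temps[2]   # index 2 is the third element

print(first)   # 10.5
print(third)   # 9.8
```

Because indexing starts at zero, the index of an element is always one less than its position in the list.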
To add items to a list, you use the append method. That means that you use the name of the list and then add .append() with the item to append inside the parentheses. For example, add the following two lines to your Python code.
a.append(7)
print a
You should now see a 7 at the end of the list.
To change the value of an element of a list, you use an assignment. Use bracket notation to specify in which element of the list to store the new data. For example, the following changes the first element of the list to a 4 and then prints the updated list.
a[0] = 4
print a
Try editing two other locations in the list in your Python file, then print out the list and make sure it did what you think it should have.
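The sketch below puts appending and assignment together on a separate list, so it does not give away the edits you make to a. The list b is an assumption for illustration only.

```python
b = [10, 20, 30]

b.append(40)   # add a new element at the end of the list
b[1] = 25      # replace the second element (index 1)

print(b)       # [10, 25, 30, 40]
```

Note that append makes the list longer, while assignment with bracket notation changes an existing element and leaves the length alone.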
- Building Lists
So far, all of our programs have processed data while we read it from an incoming stream. For this project, we want to store all of the incoming data in a list and then manipulate the data in the list. The concept is straightforward: start with an empty list and as your program reads through the input stream store each value in the list.
Start a new Python file called storedata.py. Put your name, date, and CS 151S at the top of the file. Then copy and paste the following algorithm.
import sys

def main(stdin):
    # assign to mylist the empty list

    # assign to buf the result of calling readline on stdin
    # while buf.strip() is not equal to the empty string
        # append to mylist the result of casting buf to a float
        # assign to buf the result of calling readline on stdin

    # print mylist
    return

if __name__ == "__main__":
    main(sys.stdin)
Download this file and test your program using the following command.
cat 2015-08-01-dosat.csv | python storedata.py
Take a look at the data file and make sure you're happy with the result.
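If you want to check the read-and-append pattern without the data file, one possible sketch is below. It is not the required solution: the function name read_floats and the three sample values are assumptions, and an in-memory io.StringIO object stands in for stdin.

```python
import io

def read_floats(stream):
    """Read one number per line from stream until a blank line or end of input."""
    values = []                       # start with an empty list
    line = stream.readline()
    while line.strip() != "":         # readline returns "" at end of input
        values.append(float(line))    # float() ignores the trailing newline
        line = stream.readline()
    return values

# Stand-in for sys.stdin, holding three made-up readings.
fake_input = io.StringIO(u"21.5\n20.75\n22.0\n")
data = read_floats(fake_input)
print(data)   # [21.5, 20.75, 22.0]
```

Returning the list instead of only printing it makes the function easier to reuse, since the caller can pass the list on to other functions.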
- A Library of Useful Functions
Now we're going to create a file that holds a library of useful functions. In Python, a library, or package, or module, is a file that contains functions. When you import the module into another Python file, you can use those functions. We've already done this by importing the sys package into our programs, which enables us to use the stdin functions. Now we're going to create our own module and then import it into other Python files.
Create a new file called stats.py. Put your name, date, and CS 151S at the top of the file.
Create a new function called sum that takes one argument.
You can assume that argument will be a list of numbers. This
function should add together all of the values in the list and
return the sum.
The algorithm is as follows. Create a variable to hold the sum and initialize it to 0.0 (explicitly make it floating point number). Then loop over the list provided as the function parameter. Inside the loop, add each number to the variable holding the sum. Once the loop is complete, return the sum.
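One way the algorithm above might look in code is sketched here. The parameter name numbers and the variable name total are assumptions; use whatever names you find clear. Defining a function called sum hides Python's built-in sum inside this file, which is fine for this lab.

```python
def sum(numbers):
    """Return the sum of a list of numbers as a float."""
    total = 0.0                   # floating point, so the result is a float
    for value in numbers:
        total = total + value     # add each number to the running total
    return total

print(sum([1, 2, 3, 4]))   # 10.0
```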
To test your function, make a second function called test at the bottom of your stats.py file. The function does not require any arguments. As the first instruction in the test function, assign the list [1, 2, 3, 4] to a variable. For the second instruction assign to a second variable the result of calling sum with the list as the argument. For the third instruction, print out the variable holding the result. Run your program and make sure you get the value 10.0 as an output.
Put the following at the end of your stats.py file to call the test function only when you run the stats.py file directly. The test function will not run when stats.py is imported into another Python file.
if __name__ == "__main__":
    test()
Create another function called mean that computes the mean value of the numbers in the list. The only difference between the mean function and the sum function is that you want to return the sum divided by the number of values in the input list. You can use the len function to get the number of elements in a list. Add a test of your mean function to the test function at the bottom of the file. Run it and make sure you get the right answer (2.5).
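A possible sketch of mean, built on top of sum, follows. The sum function is repeated here only so the example runs on its own; in your stats.py file you would already have it. The sketch assumes the list is not empty, since dividing by a length of zero raises an error.

```python
def sum(numbers):
    """Return the sum of a list of numbers as a float."""
    total = 0.0
    for value in numbers:
        total = total + value
    return total

def mean(numbers):
    """Return the mean of a non-empty list of numbers."""
    # sum returns a float, so this is floating point division
    return sum(numbers) / len(numbers)

print(mean([1, 2, 3, 4]))   # 2.5
```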
Create two more functions, max and min, that
compute the maximum and minimum values in the list. The max
algorithm has the following steps: (1) assign the first value in
the list to a variable (e.g. maxval), (2) loop over the rest of
the list, (3) inside the loop, if the value is greater than
maxval, assign the new value to maxval, (4) once the loop is
complete, return maxval. The min function is identical
except that you update minval only if the value in the list is
less than the smallest value seen so far.
Test your max and min functions by adding the appropriate code to your test function. Make sure the answers you get make sense.
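The four steps could be sketched as follows. As a simplification, this version loops over the whole list rather than only the rest of it; comparing the first value with itself changes nothing. Both functions assume the list has at least one element, and the names hide Python's built-in max and min inside this file.

```python
def max(numbers):
    """Return the largest value in a non-empty list."""
    maxval = numbers[0]           # start with the first value
    for value in numbers:
        if value > maxval:        # found something larger
            maxval = value
    return maxval

def min(numbers):
    """Return the smallest value in a non-empty list."""
    minval = numbers[0]
    for value in numbers:
        if value < minval:        # found something smaller
            minval = value
    return minval

print(max([5, 3, 6, 1, 2]))   # 6
print(min([5, 3, 6, 1, 2]))   # 1
```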
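Create two more functions, variance and stdev,
that compute the variance and standard deviation of the values in
the list, respectively. The variance is defined as follows, where N is the number of values in the list.
$$\mathrm{variance} = \frac{1}{N - 1} \sum_{i=1}^{N} (x_i - \bar{x})^2$$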
You can use the mean function to calculate the mean (x bar), then use a loop over the list of numbers (the x_i) to compute the sum of the squares of the difference between each number and the mean. Note that a summation in a mathematical equation is simply a loop over the list of numbers, summing the calculated value each time through the loop.
The standard deviation is the square root of the variance. You can call the variance function from the standard deviation function and return the square root of the variance. Note how you are building the standard deviation function on top of two layers of other functions: the stdev function calls the variance function, which calls the mean function. This avoids code duplication, which reduces the likelihood of coding errors, and makes your coding task faster and easier. Also, note that you can use math.sqrt to compute the square root (import math first).
Test your variance and standard deviation functions by adding code to your test function. You should get 1.67 for the variance and 1.29 for the standard deviation (rounding to two decimals).
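A sketch of the layered design is shown below. It divides by N - 1, which is the divisor that produces the expected 1.67 and 1.29 for the list [1, 2, 3, 4]. The mean function is repeated only so the example runs on its own, and the variable names are assumptions. The list needs at least two values, or the division fails.

```python
import math

def mean(numbers):
    """Return the mean of a non-empty list of numbers."""
    total = 0.0
    for value in numbers:
        total = total + value
    return total / len(numbers)

def variance(numbers):
    """Return the variance of a list with at least two numbers."""
    xbar = mean(numbers)
    sumsq = 0.0
    for x in numbers:                        # the summation is just a loop
        sumsq = sumsq + (x - xbar) * (x - xbar)
    return sumsq / (len(numbers) - 1)

def stdev(numbers):
    """Return the standard deviation: the square root of the variance."""
    return math.sqrt(variance(numbers))

print(round(variance([1, 2, 3, 4]), 2))   # 1.67
print(round(stdev([1, 2, 3, 4]), 2))      # 1.29
```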
- Using a Library
The last step in the lab is to import your stats library into another file and use the functions. Reopen your storedata.py file. At the top, after the import sys, put the following line of code.
import stats
To call a function in your stats library, all you have to do is put stats. in front of the name of the function you want to use. Calculate the mean of the values in the list you read from stdin. To do this, after the line in the function that prints the list, assign to a variable the result of calling the mean function from your stats library with the list of values as the argument. Then print out the result.
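The module-name prefix works the same way for every module. Since your stats.py file is not available in a standalone example, the sketch below uses Python's standard math module as a stand-in: import the module, then call one of its functions with the module name and a dot in front.

```python
import math

# Call the sqrt function that lives inside the math module.
root = math.sqrt(16.0)
print(root)   # 4.0
```

A call to the mean function in your own library follows the same pattern, with stats in place of math.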
If you run your storedata.py program using the same command as above in section 2, you should get 21.071875 as the answer.
When you are done with the lab exercises, you may begin the project.