Due: , 11:59 pm

The goal of this week's project is to start building more general functions that may be useful in many contexts. We're going to start using a concept called modular design where we build more complex programs on top of simpler functions.

The purpose of this lab time is to build a set of general functions in a library and then use that library in a separate Python file. This will be the first time you call a function from another Python file that you wrote by importing one file into another. You will also do a little experimentation with lists, which are a data structure for holding sequences of values.

For the project, you'll continue to work with the live buoy data, doing some more sophisticated calculations and analyses, aided by our function library.

Lists are an important data structure in Python. A list is an ordered sequence of values. In Python, the values in a list can be of any type, including other lists, and the elements do not all have to be of the same type (that is not necessarily the case in other programming languages). For this assignment, we'll mostly be dealing with lists of numbers.

Create a new file in TextWrangler and put the following assignment into it.

`a = [5, 3, 6, 1, 2]`

In Python, a list is delimited by square brackets, and the elements of
the list are separated by commas. To access one element of a list,
use syntax called bracket-notation. The first element of a list has
index zero. To access the first element of a list named `a`:

`a[0]`

Add two more lines to your file to print out the 5 and the 1 (the first element in the list and the fourth element in the list).

To add items to a list, you use the append method. That means that you
use the name of the list and then add `.append()` with the item to
append inside the parentheses. For example, add the following two
lines to your Python code:

`a.append(7)`

`print(a)`

You should now see a 7 at the end of the list.

To change the value of an element of a list, you use an assignment. Use bracket notation to specify in which element of the list to store the new data. For example, the following changes the first element of the list to a 4 and then prints the updated list.

`a[0] = 4`

`print(a)`

Try editing two other locations in the list in your Python file, then print out the list and make sure it did what you think it should have.
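As a recap, the short sketch below puts indexing, appending, and assignment together. The list `temps` and its values are made up for illustration and are not part of the exercise.

```python
# Hypothetical list for illustration; not the list from the exercise.
temps = [20.5, 21.0, 19.8]

first = temps[0]      # index 0 is the first element
third = temps[2]      # index 2 is the third element

temps.append(22.1)    # add a new value to the end of the list
temps[1] = 21.4       # overwrite the second element

print(first, third)   # 20.5 19.8
print(temps)          # [20.5, 21.4, 19.8, 22.1]
print(len(temps))     # 4
```

Note that the largest valid index is always one less than the length of the list.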

So far, all of our programs have processed data while we read it from an incoming stream. For this project, we want to store all of the incoming data in a list and then manipulate the data in the list. The concept is straightforward: start with an empty list and, as your program reads through the input stream, store each value in the list.

Start a new Python file called `storedata.py`. Put your name, date, and
course and project information at the top of the file. Then copy and paste the following
algorithm, and use the comments to guide you as you fill in the code.

    import sys

    def main(stdin):
        # assign to mylist the empty list []
        # assign to buf the result of calling readline on stdin
        # while buf.strip() is not equal to the empty string
            # append to mylist the result of casting buf to a float
            # assign to buf the result of calling readline on stdin
        # print mylist
        return

    if __name__ == "__main__":
        main(sys.stdin)

Download this file and test your program using the following command.

`cat 2015-08-01-dosat.csv | python3 storedata.py`

Take a look at the data file and make sure you're happy with the result.
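If you want to see the read-and-store pattern run without a data file, here is a minimal sketch. It assumes the input has one number per line, and it uses `io.StringIO` as a stand-in for `sys.stdin`. The names `read_values` and `fake_input` are made up for this illustration.

```python
import io

def read_values(stream):
    """Read one number per line until a blank line or the end of the input."""
    values = []
    line = stream.readline()
    while line.strip() != "":
        values.append(float(line))   # float() ignores the trailing newline
        line = stream.readline()     # returns "" at the end of the input
    return values

# io.StringIO stands in for sys.stdin so the sketch runs on its own.
fake_input = io.StringIO("7.5\n8.25\n6.0\n")
data = read_values(fake_input)
print(data)   # [7.5, 8.25, 6.0]
```

Because `readline` returns an empty string at the end of the input, the same loop condition handles both a blank line and the end of the stream.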

Now we're going to create a file that holds a library of useful
functions. In Python, a library (or "package", or "module") is a file
that contains functions. When you import the module into another
Python file, you can use those functions. We've already done this by
importing the `sys` package into our programs, which enables us to use
the `stdin` functions. Now we're going to create our own module and then
import it into other Python files.

Create a new file called `stats.py`. Put your name, date, and
course and project information at the top of the file.

Create a new function called `sum()` that takes one argument.
You can assume that argument will be a list of numbers. This
function should add together all of the values in the list and
return the sum.

The algorithm is as follows. Create a variable to hold the sum and initialize it to 0.0 (explicitly make it a floating point number). Then loop over the list provided as the function parameter. Inside the loop, add each number to the variable holding the sum. Once the loop is complete, return the sum.
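The accumulator pattern described above might look something like the following sketch. The function name `running_total` is a placeholder chosen for this example; your own function should use the name the lab asks for.

```python
def running_total(values):
    """Add up the numbers in a list using an accumulator variable."""
    total = 0.0              # start at 0.0 so the result is always a float
    for value in values:
        total += value       # add each number to the running total
    return total

print(running_total([1, 2, 3, 4]))   # 10.0
```

Starting the accumulator at 0.0 also means an empty list gives 0.0 rather than an error.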

To test your function, make a second function
called `test()` at the bottom of your `stats.py` file. The
function does not require any arguments. As the first
instruction in the test function, assign the list `[1, 2, 3, 4]`
to a variable. For the second instruction assign to a second
variable the result of calling sum with the list as the
argument. For the third instruction, print out the variable
holding the result. Run your program and make sure you get the
value 10.0 as an output.

Put the following at the end of your `stats.py` file to call the
test function only when you run the `stats.py` file directly. The
test function will not run when `stats.py` is imported into
another Python file.

    if __name__ == "__main__":
        test()

Create another function called `mean()` that computes the
mean value of the numbers in the list. The only difference
between the `mean()` function and the `sum()` function
is that you want to return the sum divided by the number of values
in the input list. You can use the `len()` function to get
the number of elements in a list. Add a test of your mean
function to the test function at the bottom of the file. Run it
and make sure you get the right answer (2.5).
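One way to see how a mean can be built on top of a sum function is sketched below. Both function names (`running_total` and `average`) are placeholders for this example, not the names required by the lab.

```python
def running_total(values):
    """Add up the numbers in a list."""
    total = 0.0
    for value in values:
        total += value
    return total

def average(values):
    """Return the sum of the values divided by how many there are."""
    return running_total(values) / len(values)

print(average([1, 2, 3, 4]))   # 2.5
```

This sketch assumes the list is not empty. An empty list would cause a division by zero, since `len()` would return 0.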

Create two more functions, `max()` and `min()`, that
compute the maximum and minimum values in the list. The
algorithm for `max()` has the following steps:

- assign the first value in the list to a variable (e.g. `maxval`),
- loop over the rest of the list,
- inside the loop, if the value is greater than `maxval`, assign the new value to `maxval`,
- once the loop is complete, return `maxval`.

The `min()` function is identical
except you want to update `minval` only if the value in the list is
less than the smallest value seen so far.

Test your `max()` and `min()` functions by adding the appropriate code
to your test function. Make sure the answers you get make sense.
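The steps above can be sketched as follows. The names `largest` and `smallest` are placeholders for this example, and the slice `values[1:]` is just one possible way to loop over "the rest of the list".

```python
def largest(values):
    """Return the biggest number in a non-empty list."""
    maxval = values[0]             # start with the first value
    for value in values[1:]:       # loop over the rest of the list
        if value > maxval:
            maxval = value         # remember the biggest value seen so far
    return maxval

def smallest(values):
    """Return the smallest number in a non-empty list."""
    minval = values[0]
    for value in values[1:]:
        if value < minval:
            minval = value         # remember the smallest value seen so far
    return minval

print(largest([3, 9, 2, 7]))    # 9
print(smallest([3, 9, 2, 7]))   # 2
```

Both functions assume the list has at least one element, because they read `values[0]` before looping.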

Create two more functions, `variance()` and `stdev()`,
that compute the variance and standard deviation of the values in
the list, respectively. The variance is defined as the following:

$$\sigma^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$$

You can use the `mean()` function to calculate the mean $\bar{x}$ (pronounced "x bar"),
then use a loop over the list of numbers (the $x_i$) to
compute the sum of the squares of the difference between each
number and the mean. Divide that sum by $N - 1$, where $N$ is the
number of values in the list.

Note that a summation in mathematical sigma notation is simply a loop over the list of numbers, summing the calculated value each time through the loop.

The standard deviation is the square root of the variance. You
can call the variance function from the standard deviation
function and return the square root of the variance. Note how
you are building the standard deviation function on top of two
layers of other functions: the stdev function calls the variance
function, which calls the mean function. This avoids code
duplication, which reduces the likelihood of coding errors, and
makes your coding task faster and easier. Also, note that you can
use `math.sqrt()` to compute the square root (`import math` first).

Test your variance and standard deviation functions by adding code to your test function. You should get 1.67 for the variance and 1.29 for the standard deviation (rounding to two decimals).
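The layering of standard deviation on variance on mean can be sketched as below. To keep the example short, this sketch uses Python's built-in `sum()` inside `mean()`, which is an assumption of this example; in your own library you would call the sum function you wrote. It divides by N - 1, which is what gives 1.67 for the test list.

```python
import math

def mean(values):
    """Average of the values (uses Python's built-in sum for brevity)."""
    return sum(values) / len(values)

def variance(values):
    """Sum of squared differences from the mean, divided by N - 1."""
    xbar = mean(values)
    total = 0.0
    for x in values:
        total += (x - xbar) ** 2
    return total / (len(values) - 1)

def stdev(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))

data = [1, 2, 3, 4]
print(round(variance(data), 2))   # 1.67
print(round(stdev(data), 2))      # 1.29
```

This sketch needs at least two values in the list, since dividing by N - 1 fails when N is 1.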

The last step in the lab is to import your stats library into another
file and use the functions. Reopen your `storedata.py` file. At the
top, after the `import sys` line, put the following line of code:

`import stats`

To call a function in your `stats` library, all you have to do is
put `stats.` in front of the name of the function you want to
use. Calculate the mean of the values in the list you read from `stdin`.
To do this, after the line in the function that prints the list,
assign to a variable the result of calling `mean()` from your
`stats` library with the list of values as the argument. Then print out
the result.

If you run your `storedata.py` program using the same `cat` command as above,
you should get 21.071875 as the answer.
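To see the `module.function()` calling style in a form that runs on its own, the sketch below writes a tiny stand-in module to a temporary folder and then imports it. The module name `demo_stats` and the temporary-folder setup are made up for this illustration. In the lab you simply keep `stats.py` in the same folder as `storedata.py`, and none of the setup code is needed.

```python
import os
import sys
import tempfile

# Source code for a tiny stand-in module (hypothetical name: demo_stats).
module_source = '''
def mean(data):
    total = 0.0
    for value in data:
        total += value
    return total / len(data)
'''

# Save the module in a temporary folder and tell Python to look there.
folder = tempfile.mkdtemp()
with open(os.path.join(folder, "demo_stats.py"), "w") as f:
    f.write(module_source)
sys.path.insert(0, folder)

import demo_stats   # the module name is the file name without ".py"

result = demo_stats.mean([2.0, 4.0, 6.0])   # module name, dot, function name
print(result)   # 4.0
```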

When you are done with the lab exercises, you may start on the rest of the project.

© 2018 Caitrin Eaton.