Objectives

The purpose of this project is to give you a chance to break down simple problems into precise steps that solve them. In computer science terms, you will be developing algorithms. In this case you will write your algorithms as sequences of Python commands that calculate a value or otherwise manipulate data.


Tasks

Setting Up

If you haven't already set yourself up for working on the project, then do so now:

  1. Mount your directory on the Personal server.
  2. Open the Terminal and navigate to your Project1 directory on the Personal server.
  3. Open your text editor (referred to hereafter as "TextWrangler"). If you want to look at any of the files you have already created, then open those files.

Working with Numbers

The first part of the project is to continue the example from lab of adding three numbers. You will write six different versions of the code (you have already written one) that become increasingly sophisticated.

Version 2

Run your add3.py file from lab:

python3 add3.py

It should be clear that Python is executing integer math (no decimals in the answer). Note that we did not include any decimals in our code, either, so Python automatically assumed that we wanted it to do integer math.

For version 2, copy your version 1 lines of code and paste them below version 1. Change the first print statement to 'version 2', and then add a decimal point and a zero to each number in the code. Now you are telling Python that each of the numbers is a floating point number, so it should do floating point math.

Run add3.py again and see if it gives you a different answer than version 1.

Testing Your Code

Sometimes your code doesn't work. The first thing to do is to carefully read any error message. It should direct you to the line of the file that failed to work. Some common errors include the following.

Version 3

Copy your version 2 code and paste it at the bottom of the file. Change the comment and first print statement to 'version 3'. Then remove the .0 from all of the numbers except the 3. Leave the 3.0 alone.

Run add3.py again and see what answer it gives you.

The lesson from version 3 is that when Python does a computation, it uses the most flexible representation present in the mathematical expression to hold the result.
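The short sketch below illustrates the lesson from versions 1 through 3. It is not the add3.py file from lab; the variable names are chosen here for illustration only. The last lines are an aside that the text above does not cover: in Python 3 the / operator always produces a float, while // keeps an integer result when both operands are integers.

```python
# Illustration only: variable names are assumptions, not the lab's add3.py.

int_sum = 42 + 21 + 5          # all integers, so the result is an int
float_sum = 42.0 + 21.0 + 5.0  # all floats, so the result is a float
mixed_sum = 42 + 21 + 5.0      # one float is enough to make the result a float

print("integer sum:", int_sum, type(int_sum))
print("float sum:  ", float_sum, type(float_sum))
print("mixed sum:  ", mixed_sum, type(mixed_sum))

# Aside: division in Python 3
true_div = int_sum / 3    # always a float, even with two ints
floor_div = int_sum // 3  # stays an int when both operands are ints
print("68 / 3  =", true_div)
print("68 // 3 =", floor_div)
```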

Version 4

If we wanted to use the program to add three different numbers, how many places do we have to change in the code? Because we have to change every instance of a number in the code, we have to change each number in two different places. From a coding point of view, that is a bad idea. It is inefficient (it takes too much time and effort) and prone to create errors in the code (there are lots of opportunities to type the wrong thing).

To make our code more efficient, start a version 4 at the bottom of the file. Make three assignments, assigning to the variable a the number 42, assigning to variable b the number 21, and assigning to variable c the number 5. The equals sign is the assignment operator and it copies the information on the right side of the assignment to the variable on the left side of the assignment. So to assign to the variable a the number 42 you would write the following code:

a = 42

Another way of describing that statement is to say that a "gets" the value 42.

After you have made the three assignments, you can use the variables a, b, and c in mathematical expressions, just as we used 42, 21, and 5 in the prior versions.

Have Python print out the sum of the three variables. Just change the expression 42 + 21 + 5 in version 1 to a + b + c for version 4. Do the same for the average, using the expression (a + b + c) and then dividing that expression by 3.0.

Make sure your code prints out the same set of values as version 3.
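One possible shape for version 4 is sketched below. The names total and average and the exact print strings are assumptions made for this illustration; your own version may print things differently.

```python
# version 4 (a sketch; 'total', 'average', and the print strings are assumptions)
print("version 4")

# Each number now appears in exactly one place.
a = 42
b = 21
c = 5

total = a + b + c            # 68, an int because a, b, and c are ints
average = (a + b + c) / 3.0  # dividing by 3.0 gives a float

print("sum:", total)
print("average:", average)
```

To add a different set of numbers, only the three assignment lines need to change.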

Version 5

In version 4, if we want to add a different set of three numbers, we still have to edit the code. Version 5 will let us enter the numbers on the terminal when we run the program, letting us add any three numbers without changing the code.

Copy version 4 to the bottom of the file. But instead of assigning numbers to the variables a, b, and c, assign the result of calling the input() function. Your first assignment might look like this:

a = input("Enter first number :")

Modify the assignments for b and c to accept input as well.

The input() function prints out the prompt and then waits for the user to type something and hit the return key. In this example, whatever the user types after the prompt is stored in a variable.

What happens when you run this program? Is Python happy with the last line of code?

The basic problem with this program is that input() returns a string of characters, not a number. Therefore, we have to convert the string into a number before we can add and divide the numbers properly. This is a process called casting. The following assignment converts the string in variable a into an integer and then assigns the integer back to the variable a, overwriting its old contents.

a = int(a)

Modify version 5 so that it converts the three variables to integer values before executing the sum or average expressions.
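To see why the cast is needed, the sketch below uses string literals to stand in for what input() would return, so that it runs without waiting for anyone to type. The names joined and total are assumptions made for this illustration.

```python
# Strings standing in for what input() would return (an assumption for this demo).
a = "42"
b = "21"
c = "5"

# Adding strings glues them together instead of doing arithmetic.
joined = a + b + c
print("a + b + c with strings:", joined)

# Dividing a string by a number is not allowed, so Python raises a TypeError.
try:
    average = (a + b + c) / 3.0
except TypeError as err:
    print("Python complains:", err)

# Cast each string to an integer, overwriting the old contents.
a = int(a)
b = int(b)
c = int(c)

total = a + b + c
print("a + b + c with integers:", total)
print("average:", total / 3.0)
```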

help( )

Note that you can find out more about a function like input() by using the Python help function. If you start Python in a terminal (type python3 and hit return), then you can ask Python for help about functions and types. If you type

help(input)

then Python will give you more information about the function, including the fact that it returns a string. You can type q to get out of the help function. Likewise, if you type

help(int)

then Python gives you a lot of information about the int type, including that it will convert a number or string to an integer. Again, type q to exit the help viewer. To exit the Python interpreter, you can use Ctrl-d, or you can type exit().

Version 6

For version 6, create a new file. Call it three.py. Put your name, a date, and the project at the top in comments. After that, import the sys module by writing the following line of code:

import sys

Importing modules is something we do a lot when using Python, because people have written many modules that do useful things. The sys module gives our program access to, among other things, the input stream coming from the terminal.

Copy version 5 into the new file and call it version 6. Instead of using input() to get information from the Terminal, however, use the function sys.stdin.readline():

a = sys.stdin.readline()

Do the same for variables b and c. You will still need to cast the variables to integers, just as in version 5.

To run version six, we're going to use a file named threenumbers.txt that contains three numbers, each on a different line. Download it into your Project1 directory. You are going to use the cat terminal command to dump the contents of the file and then you will pipe the contents to your three.py program. The terminal command to do this is as follows:

cat threenumbers.txt | python3 three.py

The vertical bar (|) is called a pipe in Unix terms, and it sends the output of one program to the input of the next. In this example, cat dumps the contents of the threenumbers.txt file into the input stream of three.py. When you run the program, you should get the same output as the prior versions. To change which numbers you want to sum, however, you just need to change the input file, not the Python code.

One last change to this version. Make it so that your code can read in floating point numbers (data type float) instead of just integers (data type int). Figure out how to cast something to type float instead of type int.
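The sketch below shows the read-then-cast pattern with float. So that it runs without a pipe, it uses io.StringIO as a stand-in for sys.stdin; the helper name read_float and the sample text are assumptions made for this illustration, not part of the project files.

```python
import io

def read_float(stream):
    """Read one line from the stream and cast it to a float."""
    line = stream.readline()  # includes the trailing newline, e.g. "42\n"
    return float(line)        # float() ignores surrounding whitespace

# Stand-in for piped input (assumption); in three.py you would read sys.stdin.
fake_stdin = io.StringIO("42\n21.5\n5\n")

a = read_float(fake_stdin)
b = read_float(fake_stdin)
c = read_float(fake_stdin)

print("sum:", a + b + c)
print("average:", (a + b + c) / 3.0)
```

Because float() accepts both "42" and "21.5", the same code handles whole numbers and numbers with decimals.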

Unix Tools

For the second main task, you will explore some standard Unix tools for requesting data from a web page and then extracting information from that data stream. The source web page we will be using is the Goldie Buoy Data from Great Pond.

schupflab.colby.edu/buoy/Goldie2016.csv

This page contains a comma-separated value (CSV) file with data every 15 minutes from May through July 30th, 2016. The data include information about the buoy, information about temperature, information about how much chlorophyll is in the water, and information about how much visible light is available.

To see the contents of this web page, use the following Terminal command:

curl schupflab.colby.edu/buoy/Goldie2016.csv

To be able to scroll through the data, pipe the output of curl to the program less. You can type q at any time to exit less.

curl schupflab.colby.edu/buoy/Goldie2016.csv | less

To select data from a particular day, you can pipe the curl output to a program called grep, which searches for lines that contain a particular string. For example, the following finds all of the data from July 4th, 2016:

curl schupflab.colby.edu/buoy/Goldie2016.csv | grep 07/04/2016

In a CSV file, all of the different fields are separated by commas. Another useful Unix command is cut which allows us to chop up lines of text into fields using a specified separator. To find out more about the cut command, you can enter man cut into the terminal, or you can take a look at this overview of cut. Field 11 happens to be the 1m temperature measurement. So we could get the 1m temperature measurement for July 4th, 2016 using the following.

curl schupflab.colby.edu/buoy/Goldie2016.csv | grep 07/04/2016 | cut -d ',' -f 11

Now you have a single stream of numbers coming from the buoy data. The diagram below shows the whole process.

[Figure: Data Pipeline]
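For readers who think more easily in Python, the sketch below mimics what the grep and cut stages do to each line of text. The sample lines are made up for this illustration and do not reflect the real layout or values of the buoy file, which has many more fields. Note that cut numbers fields starting at 1, while Python list indices start at 0.

```python
# Made-up sample lines (NOT real buoy data; the real file has many more fields).
sample_lines = [
    "07/03/2016,23:45,20.1",
    "07/04/2016,00:00,20.4",
    "07/04/2016,00:15,20.3",
]

# Like: grep 07/04/2016  (keep only lines containing the string)
matching = [line for line in sample_lines if "07/04/2016" in line]

# Like: cut -d ',' -f 3  (split on commas and keep the third field)
# cut counts fields from 1, so field 3 is index 2 in Python.
third_fields = [line.split(",")[2] for line in matching]

for value in third_fields:
    print(value)
```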

Try piping the output of the last command to your Python program that adds and averages three values. To do that, type the line above, then add the pipe symbol, then add python3 three.py. Does your answer seem reasonable? It should sum and average the first three numbers from the buoy.

Take a screenshot of the Terminal with the command and output. You should include this in your wiki page write-up as an image to demonstrate the correctness of your code. This is required image 1.

Writing a Function

An important concept in programming is the idea of a function. A function is a set of instructions with a name. A function will sometimes take one or more inputs, and sometimes it will have one or more return values. When a program executes a function, it stops what it is doing, executes the function, and then goes back to where it was in the code.

Right now we're using the expression (a + b + c) twice in our code, when what we want is the sum of three numbers. To explore how to write a function, let's edit our file three.py.

To define a function, we use the keyword def followed by the name of the function. You can call a function whatever you like, so long as it starts with a letter (or an underscore) and contains letters, numbers, or the underscore character _. Go ahead and define a function called sum3() using the line:

def sum3(x, y, z):

The parameters x, y, and z in parentheses tell Python that the function takes three arguments and the colon tells Python to begin a block of code. A block of code must be indented relative to its parent, and the end of the indentation indicates the end of the block.

On the first line of the function, indented relative to the def statement, assign to sum the expression x + y + z. This puts the sum of x, y, and z into the variable sum.

The second and last line of the function is a return statement. In order to use the value that is in the variable sum outside of the function sum3(), we have to return the value, which means moving data. To do that, we need to return sum as the last line of the function. A function ends when it hits a return statement, such as:

return sum

Once you have finished the function, the rest of your code is similar to the prior code. Put a comment after your function that says # main code

Then everywhere you have the expression a + b + c, you can replace it with the function call sum3(a, b, c). The values in a, b, and c will get copied into x, y, and z inside the sum3() function, and their sum will replace the function call in the original expressions.

Test your program. You can pipe the contents of threenumbers.txt to it, or alternatively, you can use the Unix commands to access the buoy data and pipe it to python3 three.py. Is your answer still reasonable? If not, then examine your code carefully to determine what went wrong and fix the problem.
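Put together, the function and the main code might look like the sketch below. To keep the example runnable without piped input, it assigns fixed values to a, b, and c instead of reading them from sys.stdin; that substitution and the print strings are assumptions made for this illustration.

```python
def sum3(x, y, z):
    # Inside the function, x, y, and z hold copies of the values passed in.
    sum = x + y + z
    # Return the result so the caller can use it.
    return sum

# main code
# Fixed values stand in for the numbers read from standard input (assumption).
a = 42.0
b = 21.0
c = 5.0

print("sum:", sum3(a, b, c))
print("average:", sum3(a, b, c) / 3.0)
```

The expression for the sum now lives in one place, so a change to how the sum is computed only has to be made inside sum3().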

Adding All the Numbers

The last coding task is to write a program that adds all of the numbers coming from the standard input and outputs the average. Since we don't know how many numbers there will be, we will have to use a simple loop and keep going until the numbers run out. That also means we have to count how many numbers are in the input.

Create a new file addlots.py. Put your name and a date at the top. Your main program will have three parts: initialization, a loop, and the final calculations and print commands. Each of the following comments corresponds to one line of code.

# import sys

# assign to sum the value 0.0
# assign to count the value 0

# assign to nextval the result of calling sys.stdin.readline()
# while nextval.strip() != '':
  # assign to sum the value of sum plus the result of casting nextval to a float.
  # assign to count the result of count + 1
  # assign to nextval the result of calling sys.stdin.readline()

# print an appropriate string and the value of count
# print an appropriate string and the value of sum / count
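The loop pattern in the outline above can be tried out on its own before wiring it to sys.stdin. The sketch below wraps the same read, accumulate, and count steps in a hypothetical helper named average_of_stream and feeds it an io.StringIO object as a stand-in for standard input; the helper name, the variable name total, and the sample values are all assumptions made for this illustration.

```python
import io

def average_of_stream(stream):
    """Read numbers one per line until a blank line or the end of input."""
    total = 0.0
    count = 0
    nextval = stream.readline()
    # readline() returns '' at the end of input, so strip() is also ''.
    while nextval.strip() != '':
        total = total + float(nextval)
        count = count + 1
        nextval = stream.readline()
    # Note: this divides by zero if the input holds no numbers at all.
    return count, total / count

# Stand-in for piped input (assumption); made-up values, not buoy data.
fake_stdin = io.StringIO("20.0\n22.0\n24.0\n")
n, average = average_of_stream(fake_stdin)

print("N:", n)
print("average:", average)
```

Priming the loop with one readline() before the while statement, and repeating it as the last line of the loop body, is what lets the loop stop as soon as the numbers run out.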
      

Using the final Unix command from above that grabs all of the 1m temperature values from July 4th, pipe that to your program and see what you get. You should get an N of 96 and an average of 22.556. (It might print more decimal places.) Take a screenshot of the Terminal with this output. This is required image 2.

Comparison

The final task is to compare the average temperature at 1m on the 4th day of the month in May, June, and July 2016. This means you should run your program 3 times, once to compute the average temperature at 1m for all times on May 4, 2016, once to compute the average temperature at 1m for all times on June 4, 2016, and once to compute the average temperature at 1m for all times on July 4, 2016. Each program run should output one number. Report all three numbers in your write-up, along with a description of any trend you find and a sentence about whether or not the data make sense.

You can select each date by changing the argument to the grep command in the terminal. Report your results as part of your write-up.

Take a screenshot of the Terminal with this output or copy-paste the text. This is required image 3.

Trouble-shooting

If you have a problem with files looking as though they have just one line (the last one), then check out these instructions for fixing the problem.

Extensions

Each assignment will have a set of suggested extensions. The required tasks constitute about 83% of the assignment (25 points out of 30). So, if you do only the required tasks and do them well you will earn a B+. To earn a higher grade, you need to undertake one or more extensions. The difficulty and quality of the extension or extensions will determine your final grade for the assignment. One complex extension, done well, or 2-3 simple extensions are typical.

The following are a few suggestions on things you can do as extensions to this assignment. You are free to choose other extensions.

Turn in your code

You will turn in your code (all files ending with .py) by putting it in a directory on the Courses server. On the Courses server, you should have access to a directory called CS152, and within that, a directory with your user name. Within this directory is a directory named private. Only you and the professor can read, edit, and write files in that private directory. To hand in your code and other materials, you will create a new directory, such as Project1, and then copy your code into the project directory for that week. Note: This directory will not be available during lab, but will become available during the week before the projects are due.

As with the Personal server, there are two ways to mount the appropriate directory:

Turn in your code by copying your entire Project1 directory from your Personal server to the Courses server. The easiest way to do this is to drag and drop the folder from one Finder (one open to Personal) to another (one open to Courses).

Write about the project on the wiki

In lab, you made a new wiki page for your assignment. Put the label cs152f17project1 in the label field at the bottom of the page, and give the page a meaningful title (e.g. Caitrin's Project 1).

Next, expand on the wiki page you began in lab. In general, your intended audience for your write-up is your peers not in the class. Your goal should be to be able to use it to explain to friends what you accomplished in this project and to give them a sense of how you did it. Follow the outline below.


When you are done with the lab exercises, you may start on the rest of the project.


© 2017 Caitrin Eaton.