Due: , 11:59 pm
The purpose of this project is to give you a chance to break down simple problems into precise steps that solve the problem. In computer science speak, you will be developing algorithms. In this case you will write your algorithms as sequences of Python commands that calculate a value or otherwise manipulate data.
If you haven't already set yourself up for working on the project, then do so now:
The first part of the project is to continue the example from lab of adding three numbers. You will write six different versions of the code (you have already written one) that become increasingly sophisticated.
Run your add3.py file from lab:
It should be clear that Python is executing floating point math; there is a decimal in the result. Note that we did not include any decimals in our code. But, when it saw the division sign, Python automatically assumed that we wanted it to do floating point math, so as not to lose information in case the result of the division was not an integer value.
For version 2, copy your version 1 lines of code and paste them below. Change the first print statement to 'version 2', and then change the division sign to a double slash: //. Now you are telling Python that the result of the division should be an integer. Run add3.py again and see if it gives you a different answer than version 1.
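As a quick illustration of the difference between the two division operators, here is a minimal sketch using the three numbers from this project (42, 21, and 5). The variable names are chosen only for this example and are not part of the lab code.

```python
# Compare true division (/) with floor division (//) on the same sum.
total = 42 + 21 + 5            # integer sum of the three numbers

true_average = total / 3       # / always produces a float in Python 3
floor_average = total // 3     # // rounds down; two ints give an int

print("true division: ", true_average)
print("floor division:", floor_average)
```

Note that // discards the fractional part rather than rounding to the nearest whole number.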
Testing Your Code
Sometimes your code doesn't work. The first thing to do is to carefully read any error message. It should direct you to the line of the file that failed to work. Some common errors include the following.
Tabbing/White Space error: Having inconsistent tabbing. All code should be lined up carefully. The main code should have no spaces or tabs at the beginning of the line. The code "inside" a function definition should be tabbed in once, with all lines tabbed in the same amount.
Note that Python considers tabs and spaces to be different things, which is sometimes hard to debug because all of your code looks correct. If you think your code is correct, then select all of your code (Cmd-a) and then choose Text::Entab in TextWrangler to convert all of your white space to tabs, or Text::Detab to convert all of your white space to spaces. That will generally correct the problem if the error is mismatched white space.
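To see what a mismatched white space error looks like, the following sketch builds a tiny function as a string, with one line indented by a tab and the next by eight spaces, and asks Python to compile it. The function f and the string source are made up for this illustration; in Python 3 this kind of mix is reported as a TabError, which is a kind of IndentationError.

```python
# Line 2 is indented with a tab, line 3 with eight spaces.
# The two lines look aligned in many editors, but Python rejects the mix.
source = "def f():\n\tx = 1\n        return x\n"

try:
    compile(source, "<example>", "exec")
    error = None
except IndentationError as exc:   # TabError is a subclass of IndentationError
    error = exc

print(type(error).__name__, "-", error)
```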
If we wanted to use the program to add a different set of three numbers, in how many places would we have to change the code? Because each number appears in both the sum and the average expressions, we have to change each number in two different places. From a coding point of view, that is a bad idea. It is inefficient (takes too much time and effort) and prone to create errors in the code (lots of opportunities to type the wrong thing).
To make our code more efficient, start a version 3 at the bottom of the file. Make three assignments, assigning to the variable a the number 42, assigning to variable b the number 21, and assigning to variable c the number 5. The equals sign is the assignment operator and it copies the information on the right side of the assignment to the variable on the left side of the assignment. So to assign to the variable a the number 42 you would write the following code:
a = 42
Another way of describing that statement is to say that a "gets" the value 42.
After you have made the three assignments, you can use the variables a, b, and c in mathematical expressions, just like we used 42, 21, and 5 in the prior versions.
Have Python print out the sum of the three variables. Just change the expression 42 + 21 + 5 in version 1 to a + b + c for version 3. Do the same for the average, using (a + b + c) and then dividing that expression by 3.0.
Make sure your code prints out the same set of values as version 1.
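One possible shape for version 3 is sketched below. The exact wording of the print labels is an assumption; use whatever labels your version 1 already prints.

```python
print("version 3")

# Each number now appears in exactly one place.
a = 42
b = 21
c = 5

print("sum:", a + b + c)
print("average:", (a + b + c) / 3.0)
```

To add a different set of numbers, only the three assignment lines need to change.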
In version 3, if we want to add a different set of three numbers, we still have to edit the code. Version 4 will let us enter the numbers on the terminal when we run the program, letting us add any three numbers without changing the code.
Copy version 3 to the bottom of the file. But instead of assigning numbers to the variables a, b, and c, assign the result of calling the input() function. Your first assignment might look like this:
a = input("Enter first number :")
Modify the assignments for b and c to accept input as well.
The input() function prints out the prompt and then waits for the user to type something and hit the return key. In this example, whatever the user types after the prompt is stored in a variable.
What happens when you run this program? Is Python happy with the last line of code?
The basic problem with this program is that input() returns a string of characters, not a number. Therefore, we have to convert the string into a number before we can add and divide the numbers properly. This is a process called casting. The following assignment converts the string in variable a into an integer and then assigns the integer back to the variable a, overwriting its old contents.
a = int(a)
Modify version 4 so that it converts the three variables to integer values before executing the sum or average expressions.
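The sketch below shows why the conversion matters. To keep it runnable without typing anything, the three strings stand in for what input() would return if the user typed 42, 21, and 5; the name joined is used only for this illustration.

```python
# Stand-ins for the strings that input() would return.
a = "42"
b = "21"
c = "5"

# With strings, + glues the characters together instead of adding.
joined = a + b + c
print("before casting:", joined)

# Convert each string to an integer, overwriting the old contents.
a = int(a)
b = int(b)
c = int(c)

print("sum:", a + b + c)
print("average:", (a + b + c) / 3.0)
```

Dividing the joined string by a number would raise a TypeError, which is why the casts must come before the sum and average expressions.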
Note that you can find out more about a function like input() by using the Python help function. If you start Python in a terminal (type python3 and hit return), then you can ask Python for help about functions and types. If you type
help(input)
then Python will give you more information about the function, including the fact that it returns a string. You can type q to get out of the help function. Likewise, if you type
help(int)
then Python gives you a lot of information about the int type, including that it will convert a number or string to an integer. Again, type q to exit the help viewer. To exit the Python interpreter, you can use Ctrl-d, or you can type exit().
For version 5, create a new file. Call it three.py. Put your name, a date, and the project at the top in comments. After that, import the sys package by writing the following line of code:
import sys
Importing packages is something we do a lot when using Python, because there are lots of packages people have written that do useful things. The sys package lets our program read information from the terminal.
Copy version 4 into the new file and call it version 5. Instead of using input() to get information from the Terminal, however, use the function sys.stdin.readline():
a = sys.stdin.readline()
Do the same for variables b and c. You will still need to cast the variables to integers, just as in version 4.
To run version 5, we're going to use a file named threenumbers.txt that contains three numbers, each on a different line. Download it into your project1 directory. You are going to use the cat terminal command to dump the contents of the file and then you will pipe the contents to your three.py program. The terminal command to do this is as follows:
cat threenumbers.txt | python3 three.py
The vertical bar (|) is called a pipe in Unix. It sends the output of one program to the input of the next. In this example, it dumps the contents of the threenumbers.txt file into the input stream of three.py. When you run the program, you should get the same output as the prior versions. To change what numbers you want to sum, however, you just need to change the input file, not the Python code.
One last change to this version. Make it so that your code can read in floating point numbers (data type float) instead of just integers (data type int). Figure out how to cast something to type float instead of type int.
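A minimal sketch of reading three floating point values line by line follows. So that it runs without a pipe, an io.StringIO object stands in for sys.stdin, and the three values are made up for illustration; in three.py you would import sys and call sys.stdin.readline() instead.

```python
import io

# Stand-in for sys.stdin; the values below are invented for this example.
stream = io.StringIO("1.5\n2.25\n3.0\n")

# readline() returns a string ending in a newline; float() ignores
# the surrounding white space when it converts the text.
a = float(stream.readline())
b = float(stream.readline())
c = float(stream.readline())

print("sum:", a + b + c)
print("average:", (a + b + c) / 3.0)
```

Because float() also accepts text such as "42", a program written this way still works on a file of whole numbers.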
For the second main task, you will explore some standard Unix tools for requesting data from a web page and then extracting information from that data stream. The source web page we will be using is the Goldie Buoy Data from Great Pond.
This page contains a comma-separated value (CSV) file with data every 15 minutes from May through July 30th, 2016. The data include information about the buoy, information about temperature, information about how much chlorophyll is in the water, and information about how much visible light is available.
To see the contents of this web page, use the following Terminal command:
curl schupflab.colby.edu/buoy/Goldie2016.csv
To be able to scroll through the data, pipe the output of curl to the program less. You can type q at any time to exit less.
curl schupflab.colby.edu/buoy/Goldie2016.csv | less
To select data from a particular date, you can pipe the curl output to a program called grep, which searches for lines that contain a particular string. For example, the following finds all of the data from July 4th, 2016:
curl schupflab.colby.edu/buoy/Goldie2016.csv | grep 07/04/2016
In a CSV file, all of the different fields are separated by commas. Another useful Unix command is cut, which allows us to chop up lines of text into fields using a specified separator. To find out more about the cut command, you can enter man cut into the terminal, or you can take a look at this overview of cut. Field 11 happens to be the 1m temperature measurement. So we could get the 1m temperature measurement for July 4th, 2016 using the following.
curl schupflab.colby.edu/buoy/Goldie2016.csv | grep 07/04/2016 | cut -d ',' -f 11
Now you have a single stream of numbers coming from the buoy data. The diagram below shows the whole process.
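For readers who find it easier to think in Python, the grep and cut stages can be loosely imitated as follows. This is only an analogy: the rows are made up for illustration and have three fields rather than the many fields in the real buoy file, and the names lines, matching, field_number, and values are hypothetical.

```python
# Made-up CSV rows for illustration only; the real buoy file has more fields.
lines = [
    "07/03/2016,23:45,21.9",
    "07/04/2016,00:00,22.1",
    "07/04/2016,00:15,22.0",
    "07/05/2016,00:00,22.4",
]

# Like `grep 07/04/2016`: keep only the lines containing the search string.
matching = [line for line in lines if "07/04/2016" in line]

# Like `cut -d ',' -f 3`: split on commas and keep one field.
# cut counts fields starting from 1, while Python lists start at 0.
field_number = 3
values = [line.split(",")[field_number - 1] for line in matching]

for value in values:
    print(value)
```

As with the Unix pipeline, the result is a stream of text values, one per line, that still has to be converted to numbers before any arithmetic.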
Try piping the output of the last command to your Python program that adds and averages three values. To do that, type the line above, then add the pipe symbol, then add python3 three.py. Does your answer seem reasonable? It should sum and average the first three numbers from the buoy.
Take a screenshot of the Terminal with the command and output. You should include this in your wiki page write-up as an image to demonstrate the correctness of your code. This is required image 1.
Writing a function: An important concept in programming is the idea of a function. A function is a set of instructions with a name. A function will sometimes take one or more inputs, and sometimes it will have one or more return values. When a program executes a function it stops what it is doing, executes the function, then goes back to where it was in the code.
Right now we're using the expression (a + b + c) twice in our code, when what we want is the sum of three numbers. To explore how to write a function, let's edit our file three.py.
To define a function, we use the keyword def followed by the name of the function. You can call a function whatever you like, so long as it starts with a letter (or an underscore) and contains only letters, numbers, or underscores (_). Go ahead and define a function called sum3() using the line:
def sum3(x, y, z):
The parameters x, y, and z in parentheses tell Python that the function takes three arguments and the colon tells Python to begin a block of code. A block of code must be indented relative to its parent, and the end of the indentation indicates the end of the block.
On the first line of the function, indented relative to the def statement, assign the expression x + y + z to a variable named sigma. This puts the sum of x, y, and z into the variable sigma.
The second and last line of the function is a return statement. In order to use the value that is in the variable sigma outside of the function sum3(), we have to return the value, which means moving data. To do that, we need to return sigma as the last line of the function. A function ends when it hits a return statement, such as:
    return sigma
Once you have finished the function, the rest of your code is similar to the prior code. Put a comment after your function that says
# main code
Then everywhere you have the expression a + b + c, you can replace it with the function call sum3(a, b, c). The values in a, b, and c will get copied into x, y, and z inside the sum3() function, and their sum will replace the function call in the original expressions.
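Putting the pieces together, the function and its use in the main code might look like the sketch below. The fixed values for a, b, and c and the print labels are assumptions that keep the example runnable on its own; in three.py the values come from sys.stdin.readline() as before.

```python
def sum3(x, y, z):
    # Add the three parameters and hand the result back to the caller.
    sigma = x + y + z
    return sigma

# main code
a = 42
b = 21
c = 5

# Each call is replaced by the value that sum3() returns.
print("sum:", sum3(a, b, c))
print("average:", sum3(a, b, c) / 3.0)
```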
Test your program. You can pipe the contents of threenumbers.txt to it, or alternatively, you can use the Unix commands to access the buoy data and pipe it to python3 three.py. Is your answer still reasonable? If not, then examine your code carefully to determine what went wrong and fix the problem.
The last coding task is to write a program that adds all of the numbers coming from the standard input and outputs the average. Since we don't know how many numbers there will be, we will have to use a simple loop and keep going until the numbers run out. That also means we have to count how many numbers are in the input.
Create a new file addlots.py. Put your name and a date at the top. Your main program will have three parts: initialization, a loop, and the final calculations and print commands. Each of the following comments corresponds to one line of code.
# import sys
# assign to sigma the value 0.0
# assign to count the value 0
# assign to nextval the result of calling sys.stdin.readline()
# while nextval.strip() != '':
    # assign to sigma the value of sigma plus the result of casting nextval to a float
    # assign to count the result of count + 1
    # assign to nextval the result of calling sys.stdin.readline()
# print an appropriate string and the value of count
# print an appropriate string and the value of sigma / count
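The loop pattern in the outline above can be sketched as follows. To make the sketch runnable without a pipe, the loop is wrapped in a hypothetical helper named average_stream() and fed an io.StringIO object holding made-up values; neither the helper nor the values are part of the assignment, where the same statements sit directly in the main code and read from sys.stdin.

```python
import io

def average_stream(stream):
    """Return (count, average) for a stream holding one number per line."""
    sigma = 0.0
    count = 0
    nextval = stream.readline()
    # readline() returns '' at end of input, so an empty (or blank)
    # line ends the loop.
    while nextval.strip() != '':
        sigma = sigma + float(nextval)
        count = count + 1
        nextval = stream.readline()
    return count, sigma / count

# Made-up values standing in for piped input.
sample = io.StringIO("22.0\n22.5\n23.0\n23.5\n")
count, average = average_stream(sample)
print("N:", count)
print("average:", average)
```

Note that this sketch divides by count without checking it, so an empty input would raise a ZeroDivisionError.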
Using the final Unix command from above that grabs all of the 1m temperature values from July 4th, pipe that to your program and see what you get. You should get an N of 96 and an average of 22.556. (It might print more decimal places.) Take a screenshot of the Terminal with this output. This is required image 2.
The final task is to compare the average temperature at 1m on the 4th day of the month in May, June, and July 2016. This means you should run your program 3 times, once to compute the average temperature at 1m for all times on May 4, 2016, once to compute the average temperature at 1m for all times on June 4, 2016, and once to compute the average temperature at 1m for all times on July 4, 2016. Each program run should output one number. Report all three numbers in your write-up, along with a description of any trend you find and a sentence about whether or not the data make sense.
Report your results as part of your writeup. To select a different date for each run, change the argument to the grep command in the terminal.
Take a screenshot of the Terminal with this output or copy-paste the text. This is required image 3.
If you have a problem with files looking as though they have just one line (the last one), then check out these instructions for fixing the problem.
Each assignment will have a set of suggested extensions. The required tasks and writeup constitute about 83% of the assignment (25 points out of 30). So, if you do only the required tasks and writeup -- and do them well -- you will earn a B+. To earn a higher grade, you need to undertake one or more extensions. The difficulty and quality of the extension or extensions will determine your final grade for the assignment. One complex extension, done well, or 2-3 simple extensions are typical.
The following are a few examples of things you can do as extensions to this assignment. Please do not view these as a list of extra required tasks. These are not the most important extensions, or the hardest, or the easiest, and completing any or all of them will not guarantee all 5/5 extension points. These are only samples to help you start thinking of the unlimited ways in which you could extend this project. In fact, there is a good chance that we will be more interested in and impressed by your project if you design your own, original extensions.
Ultimately, it is the motivation underlying your chosen extensions, the creativity of your choices, and the quality of your implementations that earn high marks for extensions. A key factor here is the description of your extensions in your writeup, which should help us appreciate your motivations, creativity, and implementation.
See which month of 2016 was the warmest on average.
You can also access data from a second buoy that has all of the 2017 data for May through November. Maybe some feature of this dataset could be compared with the corresponding feature in the 2016 dataset. For example, the 1m temperature is in field 9 in this data. LEA buoy: http://schupflab.colby.edu/buoy/3100_iSIC.csv
You will turn in your code (all files ending with .py) by putting it in a directory in the Courses server. On the Courses server, you should have access to a directory called CS152, and within that, a directory with your user name. Within this directory is a directory named private. Files that you put into that private directory you can edit, read, and write, and the professor can edit, read, and write, but no one else. To hand in your code and other materials, you will create a new directory, such as project1, and then copy your code into the project directory for that week. Note: This directory will not be available during lab, but will become available during the week before the projects are due.
As with the Personal server, there are two ways to mount the appropriate directory:
Option 1: Load the root server directory and navigate to your directory.
You can mount the Colby fileserver root directory by going to the Finder and typing Cmd-k, or selecting 'Connect To Server...' from the Go menu. It will bring up a dialog box, into which you want to enter the following:
Then click on the CS152 directory, and then your hand-in directory (it will have your username as its name).
Option 2: Mount your directory directly.
You can mount your personal directory explicitly using the following path in the 'Connect To Server...' dialog.
Turn in your code by copying your entire project1 directory from your Personal server to the Courses server. The easiest way to do this is to drag and drop the folder from one Finder (one open to Personal) to another (one open to Courses).
In lab, you made a new wiki page for your assignment. Put the label cs152s18project1 in the label field on the bottom of the page. But give the page a meaningful title (e.g. Caitrin's Project 1).
Next, expand on the wiki page you began in lab. In general, your intended audience for your write-up is your peers not in the class. Your goal should be to be able to use it to explain to friends what you accomplished in this project and to give them a sense of how you did it. Follow the outline below.
cs152s18project1. Make sure it is there. This is how we find your writeup for grading.
© 2018 Caitrin Eaton.