Project 3: Calculating Thermoclines
For this project you will be writing a library of useful functions as well as a set of more general programs for computing simple statistics of a data stream. The final task will involve computing the depth of the thermocline on Great Pond. The thermocline is the depth at which there is the largest difference in water density, with a layer of denser water below and a layer of less dense water above.
- If you haven't already set yourself up for working
on the project, then do so now.
- Mount your directory on the Personal server.
- Open the Terminal and navigate to your Project3 directory on the Personal server.
- Open TextWrangler. If you want to look at any of the files you have already created, then open those files.
If you did not do so in lab, complete your stats functions in
stats.py so that you have functions for computing each of max, min,
sum, mean, variance, and standard deviation for a list of single
values. Make sure each function, except max and min, computes and
returns a floating point value. You can assume the input is a list
of integers or floats.
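As a rough illustration of the float-returning requirement, here is a minimal sketch of three of the functions. It assumes the sample form of the variance (dividing by n - 1); the assignment does not say which form is expected, so check your lab notes before adopting it. The function names are placeholders and may differ from the ones you used in lab.

```python
import math

def mean(values):
    """Return the arithmetic mean of a non-empty list as a float."""
    return sum(values) / float(len(values))

def variance(values):
    """Return the sample variance as a float.

    Assumption: divides by n - 1 (sample variance), so the list
    needs at least two values.
    """
    m = mean(values)
    squared_diffs = [(v - m) ** 2 for v in values]
    return sum(squared_diffs) / float(len(values) - 1)

def stdev(values):
    """Return the standard deviation (square root of the variance)."""
    return math.sqrt(variance(values))

print(mean([1, 2, 3, 4]))
print(variance([1, 2, 3, 4]))
print(stdev([1, 2, 3, 4]))
```

Wrapping the length in float() keeps the result a floating point value even when the list holds only integers.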
Add to your stats functions a function that converts Celsius to Fahrenheit. The function should take in a single parameter--temperature in Celsius--and return a single value--temperature in Fahrenheit. Call the function celsius2fahrenheit.
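The conversion uses the standard relation F = C * 9/5 + 32. A minimal sketch of the function follows; the print line at the end is only one possible way to show a temperature in both systems.

```python
def celsius2fahrenheit(celsius):
    """Convert a temperature in Celsius to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0

# One possible way to display a value in both systems
temp_c = 20.0
print("%.2f C = %.2f F" % (temp_c, celsius2fahrenheit(temp_c)))
```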
Write a Python program, computeStats.py, that reads a single stream of
numbers, stores them in a list, then uses the library of functions you
wrote in the lab to calculate the min, max, mean, and standard
deviation and prints the values to the terminal.
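The reading step might look like the sketch below. It assumes one number per line and Python 3, and it uses io.StringIO as a stand-in for stdin so the example runs on its own; in computeStats.py you would pass sys.stdin instead. The helper name read_numbers is hypothetical.

```python
import io

def read_numbers(stream):
    """Read one number per line from a stream, skipping blank lines."""
    values = []
    for line in stream:
        text = line.strip()
        if text:
            values.append(float(text))
    return values

# Stand-in for stdin; the numbers here are made up for the example
sample = io.StringIO("24.47\n23.95\n\n19.92\n")
values = read_numbers(sample)
print(values)
```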
Use your program, plus appropriate grep and cut commands, to compute these values from the LEA buoy data for temperature at 1m, temperature at 5m, and one other variable of your choice for the first week of June (i.e. June 1-7, 2016). Repeat the exercise for the first week of July and then for the first week of August 2016. Report these values in your report. Be sure to put the whole command for running your program in comments at the top of your Python program.
When computing the max, min, and mean of temperature variables, use the Celsius to Fahrenheit function to display the temperatures in both systems. You can choose how to implement this.
The next task will be to write a program that computes the depth of
the thermocline on Great Pond. The thermocline is the depth at
which the water density changes most quickly, creating a layer of
colder, denser water below a layer of warmer water that tend not to
mix. Overall, you will write a function that computes the density
of water given a list of temperatures, a function that computes the
depth of the maximum change in density, and then a top-level
function that reads in a stream of data and sets up the
- Create a new file, thermocline.py. Put your name, date, and class at the top, along with a comment indicating what the program will do (compute the thermocline).
Write a function density that takes in one
parameter, temps, which is a list of temperatures. The
function should first create a new empty list to hold density
values, rhos. Then, it should loop over
the temps list and for each temperature value t compute
the density using the following equation.
rho = 1000 * (1 - (t + 288.9414) * (t - 3.9863)**2 / (508929.2*(t + 68.12963)))
It should then append the computed density to the rhos list. Finally, it should return the list of densities (rhos).
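Following the steps just described, the function could be sketched as below. This is one straightforward reading of the description, not the only valid layout.

```python
def density(temps):
    """Return a list of water densities, one per temperature in Celsius."""
    rhos = []
    for t in temps:
        # density equation given in the project description
        rho = 1000 * (1 - (t + 288.9414) * (t - 3.9863)**2 / (508929.2 * (t + 68.12963)))
        rhos.append(rho)
    return rhos

# Print each temperature next to its density, two decimal places each
for t, rho in zip([24.47, 8.82], density([24.47, 8.82])):
    print("%.2f -> %.2f" % (t, rho))
```

One quick sanity check: the equation gives exactly 1000 at t = 3.9863, because the squared term vanishes there, and the density falls off on either side of that temperature.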
Test your function using this test file. It should print out the following if your density function is working correctly.
24.47 -> 997.21
23.95 -> 997.34
24.41 -> 997.22
23.81 -> 997.37
19.92 -> 998.25
16.88 -> 998.82
14.06 -> 999.26
11.56 -> 999.57
9.82 -> 999.74
9.13 -> 999.80
8.82 -> 999.82
The next step is to add a function to thermocline.py that
computes the derivative of density with respect to depth, or how
fast the density is changing as you get deeper. The function
will take in two lists: one is the set of temperatures, the
other is the set of corresponding depths. The function will
return one value: the depth of the maximum change in density. The algorithm below gives the
    def thermocline_depth( temps, depths ):
        # assign to rhos the result of calling the density function with temps as the argument
        # assign to drho_dz the empty list

        # calculate the first derivative of density
        # loop for one less than the length of rhos
            # append to drho_dz the quantity rhos[i+1] minus rhos[i] divided by the quantity depths[i+1] minus depths[i]
            # optional step: print out temps[i], rhos[i], and drho_dz[i]

        # assign to max_drho_dz the value -1.0
        # assign the maxindex the value -1
        # loop for the length of drho_dz (loop variable i)
            # if drho_dz[i] is greater than max_drho_dz
                # assign to max_drho_dz the value drho_dz[i]
                # assign to maxindex the value i

        # assign to thermoDepth the average of depths[maxindex] and depths[maxindex+1]

        return thermoDepth
Test your thermocline_depth function using this test file. It should return a depth of 6.0m. The maximum change in density at that depth is 0.44; you do not need to report this number, but knowing it may help you debug if you run into problems.
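To see the finite-difference and maximum-search steps in isolation, the sketch below applies them to a density list directly rather than to temperatures. The helper name depth_of_max_gradient and the sample numbers are made up for illustration; they are not taken from the test file.

```python
def depth_of_max_gradient(rhos, depths):
    """Return the depth midway between the two samples with the
    largest density increase per meter (hypothetical helper)."""
    # first derivative of density with respect to depth
    drho_dz = []
    for i in range(len(rhos) - 1):
        drho_dz.append((rhos[i + 1] - rhos[i]) / (depths[i + 1] - depths[i]))

    # linear search for the largest derivative
    max_drho_dz = -1.0
    maxindex = -1
    for i in range(len(drho_dz)):
        if drho_dz[i] > max_drho_dz:
            max_drho_dz = drho_dz[i]
            maxindex = i

    # midpoint of the interval where the change is largest
    return (depths[maxindex] + depths[maxindex + 1]) / 2.0

# Made-up profile: the biggest jump is between 3m and 5m
print(depth_of_max_gradient([997.0, 997.1, 998.0, 998.2], [1, 3, 5, 7]))
```

With this made-up profile the per-meter changes are roughly 0.05, 0.45, and 0.10, so the function reports the midpoint of the second interval.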
The final step is to write the main function that reads in data
from the buoy file through stdin, extracts all of the
temperature fields in order, computes the thermocline_depth and
prints it out. The program should also keep track of the
minimum and maximum depth of the thermocline over the range of
measurements provided and print out those values at the end
along with when those minimum and maximum events occurred.
The algorithm given below is not line by line. Each comment will correspond to one or more lines of Python.
    import sys

    def main(stdin):

        # these are the fields corresponding to the temperatures in order by depth
        # note the 0-indexing
        fields = [8, 11, 14, 17, 20, 23, 26]

        # these are the depth values for each temperature measurement
        depths = [ 1, 3, 5, 7, 9, 11, 13 ]

        # create variables to hold the max depth, min depth, the datetime
        # of the max depth and the datetime of the min depth. Give them
        # reasonable initial values (a small value for max depth, a large
        # value for min depth, and empty strings for the datetime variables).

        # assign buf the first line of stdin and then start the standard while loop until buf is empty
            # split buf on commas and assign it to words
            # assign to datetime the value in words
            # assign to temps the empty list
            # loop over the number of items in depths (loop variable i)
                # append to temps the result of casting words[ fields[i] ] to a float
            # assign to depth the result of calling thermocline_depth with temps and depths as arguments
            # test if depth is greater than maxdepth and update maxdepth and maxtime if it is
            # test if depth is less than mindepth and update mindepth and mintime if it is
            # print out the datetime value and the depth value, separated by commas
            # update buf with the next readline from stdin

        # print out the minimum and maximum thermocline depth and the corresponding date/time

    if __name__ == "__main__":
        main(sys.stdin)
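The field-extraction step is easy to get wrong by one because of the 0-indexing, so here is a small sketch of it on its own. The line it parses is synthetic: each field after the first simply holds its own index, which makes it easy to see which fields are picked out. It is not real buoy data, and the helper name extract_temps is hypothetical.

```python
def extract_temps(line, fields):
    """Split a comma-separated line and return the chosen fields as floats."""
    words = line.strip().split(",")
    temps = []
    for i in range(len(fields)):
        temps.append(float(words[fields[i]]))
    return temps

fields = [8, 11, 14, 17, 20, 23, 26]

# Synthetic line: a date/time string followed by the numbers 1..26,
# so the field at index k contains the value k
synthetic = ",".join(["6/2/2016 1:00"] + [str(k) for k in range(1, 27)])

temps = extract_temps(synthetic, fields)
print(temps)
```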
Test your program using the following command.
curl http://schupflab.labs.keyes.colby.edu/buoy/3100_iSIC.csv | grep '6/2/2016' | grep -e ' 1:00' -e ' 2:00' | python thermocline.py
The thermocline depth at 1:00am should be 2m, and at 2:00am it should be 4m.
For the final task, compute the thermocline depth on hourly
intervals (i.e. you should grep for the data points that are "on the hour") for the month of June and save the results to a file. For this exercise, we want just the dates and thermocline depths, so be sure to comment out any additional print statements (such as those printing the minimum and maximum). Use
grep to select the lines you need and pipe them to thermocline.py
to compute the thermocline values. Direct the output to a new file, 2016-06-thermo.csv.
This should be a file with two columns: datetime and thermocline depth.
Use the cat command to direct the contents of 2016-06-thermo.csv to stdout, pipe that to cut to get the second field, and then pipe that to your computeStats.py file to get the min, max, mean, and stdev of the thermocline (which is in units of meters) for the month of June.
Include this final output in your writeup.
Each assignment will have a set of suggested extensions. The required tasks constitute about 85% of the assignment, and if you do only the required tasks and do them well you will earn a B+. To earn a higher grade, you need to undertake one or more extensions. The difficulty and quality of the extension or extensions will determine your final grade for the assignment. One complex extension, done well, or 2-3 simple extensions are typical.
- Use the tools you have built to examine the trajectory of other variables, such as DOSat, over time. For example, identify when DOSat at 20m drops below 10% for the first time and rises above 10% for the last time.
Write a more general program for taking data at one sampling rate, for
example 5min intervals, and converting it to an hourly sampling
rate. Here is how to approach it:
Given a stream of data from stdin, the data is formatted such that the first field is a date/time field and the remaining data consists of comma separated numbers. The output of the program should be a stream of data to stdout with the date/time field first, followed by the same number of fields as the input as comma separated numbers. The numbers should be the hourly averages on the top of the hour.
To accomplish this task, you will need to figure out how many numbers are on each line and then build a list of lists, with one sublist to hold each column of numbers between hour intervals. As it loops, it stores the values from each column of data into their corresponding sublist. When the algorithm hits the top of the hour, it calculates the mean of each sublist and prints them out along with the date/time field.
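A minimal sketch of that list-of-lists approach is shown below. It assumes the date/time field ends in ":00" at the top of the hour (for example "6/2/2016 1:00"), formats the means to three decimal places, and silently drops any samples left over after the last full hour. Those choices, the function name hourly_means, and the sample lines are all assumptions made for the example.

```python
def hourly_means(lines):
    """Average each numeric column between top-of-hour lines."""
    columns = None   # one sublist per numeric column
    output = []
    for line in lines:
        words = line.strip().split(",")
        if len(words) < 2:
            continue  # skip blank or malformed lines
        stamp = words[0]
        numbers = [float(w) for w in words[1:]]
        if columns is None:
            columns = [[] for _ in numbers]
        # store each value in the sublist for its column
        for col, value in zip(columns, numbers):
            col.append(value)
        # assumed top-of-hour test: the timestamp ends in ":00"
        if stamp.strip().endswith(":00"):
            means = [sum(col) / len(col) for col in columns]
            output.append(",".join([stamp] + ["%.3f" % m for m in means]))
            columns = [[] for _ in numbers]
    return output

# Made-up 5-minute samples with two numeric columns
sample = [
    "6/2/2016 0:50,1.0,10.0",
    "6/2/2016 0:55,2.0,20.0",
    "6/2/2016 1:00,3.0,30.0",
    "6/2/2016 1:05,5.0,50.0",
]
for row in hourly_means(sample):
    print(row)
```

In a full program the function would be fed sys.stdin and each returned row printed to stdout.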
- Generalize your convert-to-hour program to let you specify the interval you want to average. A simpler alternative is to write a different function such as convert2day that computes the average for each day in the input stream.
- Look at the relationship between mid-day oxygen, above-water light availability, and fluorescence at the 2m sensor. Test the hypothesis that algal activity (measured by fluorescence) is higher on sunnier days.
- Write a general function that takes in date/time and another variable. Have your function compute the max and min for each day in the input stream.
- Using a tool of your choice, create graphs/plots of data that is an output of running your own code.
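For the daily max and min extension listed above, one possible shape is sketched here. It assumes the date/time string has the date before the first space (as in "6/2/2016 1:00") and that the input is already a list of (datetime, value) pairs. The function name daily_extremes and the sample records are made up.

```python
def daily_extremes(records):
    """Return (day, min, max) tuples in the order days first appear."""
    order = []      # days in first-seen order
    extremes = {}   # day -> [min, max]
    for stamp, value in records:
        day = stamp.split()[0]  # assumed: date comes before the first space
        if day not in extremes:
            extremes[day] = [value, value]
            order.append(day)
        else:
            low, high = extremes[day]
            extremes[day] = [min(low, value), max(high, value)]
    return [(day, extremes[day][0], extremes[day][1]) for day in order]

# Made-up records for two days
records = [
    ("6/1/2016 0:00", 3.0),
    ("6/1/2016 1:00", 5.0),
    ("6/2/2016 0:00", 4.0),
]
for day, low, high in daily_extremes(records):
    print("%s,%.2f,%.2f" % (day, low, high))
```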
Write-up and Hand-in
Turn in your code by putting it into your private hand-in directory on the Courses server. All files should be organized in a folder titled "Project 3" and you should include only those files necessary to run the program. We will grade all files turned in, so please do not turn in old, non-working, versions of files.
Make a new wiki page for your assignment. Put the label cs152s17project3 in the label field on the bottom of the page. But give the page a meaningful title (e.g. Milo's Project 3).
In general, your intended audience for your write-up is your peers not in the class. Your goal should be to be able to use it to explain to friends what you accomplished in this project and to give them a sense of how you did it. Follow the outline below.
- A brief summary of the task, in your own words. This should be no more than a few sentences. Give the reader context and identify the key purpose of the assignment.
- A description of your solution to each task, including any text output or images you created. This should be a description of the form and functionality of your final code. For this project, you should also include a description of the programming process, and refer to your first image. You may want to incorporate code snippets in your description to point out relevant features. Note any unique computational solutions you developed. Code snippets should be small segments of code--usually less than a whole function--that demonstrate a particular concept. If you find yourself including more than 5-10 lines of code, it's probably not a snippet.
- A description of any extensions you undertook, including text output or images demonstrating those extensions. If you added any modules, functions, or other design components, note their structure and the algorithms you used.
- A brief description (1-3 sentences) of what you learned. Think about the answer to this question in terms of the stated purpose of the project. What are some specific things you had to learn or discover in order to complete the project?
- A list of people you worked with, including TAs and professors. Include in that list anyone whose code you may have seen, such as those of friends who have taken the course in a previous semester.
- Double-check the label. When you created the page, you should have added the label cs152s17project3. Make sure it is there.