Project 3: Calculating Thermoclines
For this project you will be building a library of useful functions as well as a more general program that will be useful for computing statistics of data in a CSV file.
The final task will involve computing the depth of the thermocline on Great Pond and plotting it for the month of July. The thermocline is the depth at which there is the largest difference in water density, with a layer of denser water below and a layer of less dense water above.
- Set up your workspace
If you have not already done so, make a new project 3 directory. Open a terminal and cd to the directory.
- Write your library of useful statistical functions
In the stats.py file you started in lab, write the following four functions. Each function should have a single parameter, which should be a list of numbers. The function should loop over the numbers in the list, compute the given statistic, and return it.
- mean(data) - computes the mean of the list of data.
- min(data) - computes the min of the list of data.
- max(data) - computes the max of the list of data.
- variance(data) - computes the variance of the list of data.
+ (more detail)
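To compute the mean, sum the values in the list and then divide by the length of the list. The length of a list is returned by the len() function, so if you have a list named data, the number of elements is len(data).
To compute the variance, use the formula for the sample variance, where N is the number of values and $\bar{x}$ is the mean:
$$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$$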
First compute the mean (you can use your mean function), then compute the sum of the squared differences between each data point and the mean using a for loop. Finally, divide by N-1.
Use your test function in stats.py to check that each function is working correctly. In your report, indicate whether these four functions worked properly and, briefly, how you made that determination.
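One way to make that determination is to compare your results against an independent implementation. The sketch below is only an illustration under assumptions: the helper name check_against_library and the sample data are hypothetical, and the mean and variance shown here are stand-ins for your own versions. It relies on the standard library's statistics.variance, which also divides by N-1.

```python
import statistics


def mean(data):
    """Return the arithmetic mean of a list of numbers."""
    total = 0.0
    for x in data:
        total += x
    return total / len(data)


def variance(data):
    """Return the sample variance (divides by N-1) of a list of numbers."""
    m = mean(data)
    sum_sq = 0.0
    for x in data:
        sum_sq += (x - m) ** 2
    return sum_sq / (len(data) - 1)


def check_against_library(data, tol=1e-9):
    """Hypothetical helper: compare the functions above with the statistics module."""
    mean_ok = abs(mean(data) - statistics.mean(data)) < tol
    var_ok = abs(variance(data) - statistics.variance(data)) < tol
    return mean_ok and var_ok


# Hand-checkable sample: mean is 4, squared differences sum to 8, so variance is 8/3.
sample = [2, 4, 4, 6]
print(mean(sample), variance(sample), check_against_library(sample))
```

A small list whose answer you can work out by hand, as in the sample above, is usually the easiest first test.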
- Write a program to compute statistics of a column of data
Using your analyze.py file from the lab, update it so that it computes the sum, mean, variance, max, and min statistics for the selected column of data from the specified file and prints the statistics to the terminal.
Calling analyze.py with the hurricanes.csv file and column 1 should produce the following statistics.
    sum : 103.00
    mean: 7.36
    var : 12.55
    min : 2.00
    max : 15.00
Demonstrate that your program works with a column from your extracted Goldie-MLRC July data file from project 2.
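The following is a rough sketch of the column-statistics step, not the required structure of analyze.py. It assumes the file has one header line and that column numbers are 0-indexed; the function names read_column, summarize, and report are hypothetical, and the built-in sum, min, and max stand in for the functions in your stats.py. The rows in the example are made up for illustration and are not taken from hurricanes.csv.

```python
def read_column(lines, col, has_header=True):
    """Return the values in 0-indexed column `col` as a list of floats."""
    rows = list(lines)
    if has_header:
        rows = rows[1:]  # assumption: exactly one header line
    values = []
    for row in rows:
        words = row.strip().split(",")
        values.append(float(words[col]))
    return values


def summarize(values):
    """Compute the five statistics for a list of numbers."""
    n = len(values)
    total = sum(values)
    avg = total / n
    var = sum((v - avg) ** 2 for v in values) / (n - 1)
    return {"sum": total, "mean": avg, "var": var,
            "min": min(values), "max": max(values)}


def report(stats):
    """Format the statistics one per line, e.g. 'sum : 16.00'."""
    names = ("sum", "mean", "var", "min", "max")
    return "\n".join(f"{name:<4}: {stats[name]:.2f}" for name in names)


# Made-up rows standing in for a CSV file (header plus four data lines).
example_lines = ["year,count\n", "2001,2\n", "2002,4\n", "2003,4\n", "2004,6\n"]
print(report(summarize(read_column(example_lines, 1))))
```

Because read_column accepts any iterable of lines, an open file object can be passed in place of the example list.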
- Calculate the thermocline depth in Great Pond for July 2019
The next task is to write a program that computes the depth of the thermocline on Great Pond for each day in July. The thermocline is the depth at which the water density changes most quickly. It separates a layer of colder, denser water below from a layer of warmer water above, and the two layers tend not to mix. Overall, you will write a function that computes the density of water given a list of temperatures, a function that computes the depth of the maximum change in density, and then a top-level function that reads in the data file and guides the computation.
Create a new file, thermocline.py. Put your name, date, and class at the top, along with a comment indicating what the program will do (compute the thermocline).
- Convert temperatures to densities
Write a function to convert a list of temperatures to a list of densities
Write a function density that takes in one parameter, temps, which is a list of temperatures. The function should first create a new empty list, rhos, to hold the density values. Then, it should loop over the temps list and, for each temperature value t, compute the density using the following equation.
rho = 1000 * (1 - (t + 288.9414) * (t - 3.9863)**2 / (508929.2*(t + 68.12963)))
It should then append the computed density to the rhos list. Finally, it should return the list of densities.
Test your function using this test file. It should print out the following if your density function is working correctly.
    24.47 -> 997.21
    23.95 -> 997.34
    24.41 -> 997.22
    23.81 -> 997.37
    19.92 -> 998.25
    16.88 -> 998.82
    14.06 -> 999.26
    11.56 -> 999.57
    9.82 -> 999.74
    9.13 -> 999.80
    8.82 -> 999.82
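As a minimal sketch of the steps described above, the function below applies the density equation to each temperature in turn. The helper format_pairs is hypothetical and is only there to print values in the same "temperature -> density" form as the expected output; the provided test file may be organized differently.

```python
def density(temps):
    """Convert a list of water temperatures to a list of densities."""
    rhos = []
    for t in temps:
        rho = 1000 * (1 - (t + 288.9414) * (t - 3.9863)**2
                      / (508929.2 * (t + 68.12963)))
        rhos.append(rho)
    return rhos


def format_pairs(temps):
    """Hypothetical helper: pair each temperature with its density as text."""
    return [f"{t:.2f} -> {rho:.2f}" for t, rho in zip(temps, density(temps))]


# Two of the temperatures from the expected output listed above.
for line in format_pairs([24.47, 8.82]):
    print(line)
```

For the two temperatures shown, the printed values should agree with the corresponding entries in the expected output (997.21 and 999.82).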
- Compute the derivative of the densities
The next step is to add a function to thermocline.py that computes the derivative of density with respect to depth, or how fast the density is changing as you get deeper. The function will take in two lists: one is the set of temperatures, the other is the set of corresponding depths. The function will return one value: the depth of the maximum change in density. The algorithm below gives the function.
    def thermocline_depth( temps, depths ):
        # assign to rhos the result of calling the density function with temps as the argument
        # assign to drho_dz the empty list
        # loop for one less than the length of rhos
            # append to drho_dz the quantity rhos[i+1] minus rhos[i] divided by the quantity depths[i+1] minus depths[i]
            # optional step: print out temps[i], rhos[i], and drho_dz[i]
        # assign to max_drho_dz the value -1.0
        # assign the maxindex the value -1
        # loop for the length of drho_dz (loop variable i)
            # if drho_dz[i] is greater than max_drho_dz
                # assign to max_drho_dz the value drho_dz[i]
                # assign to maxindex the value i
        # assign to thermoDepth the average of depths[maxindex] and depths[maxindex+1]
        return thermoDepth
Test your thermocline_depth function using this test file. It should return a depth of 6.0 m. The maximum change in density at that depth is 0.44; you do not need to report this value, but knowing it may help you debug if you run into problems.
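The algorithm above could be translated into Python roughly as follows. Treat this as one possible reading of the comments, not the only acceptable solution. The temperatures and depths in the example are made up for illustration and are not the contents of the provided test file, so the result here differs from the 6.0 m expected from that file.

```python
def density(temps):
    """Convert a list of water temperatures to a list of densities."""
    rhos = []
    for t in temps:
        rho = 1000 * (1 - (t + 288.9414) * (t - 3.9863)**2
                      / (508929.2 * (t + 68.12963)))
        rhos.append(rho)
    return rhos


def thermocline_depth(temps, depths):
    """Return the depth midway between the two readings with the largest density change."""
    rhos = density(temps)

    # Change in density per unit depth between each pair of adjacent readings.
    drho_dz = []
    for i in range(len(rhos) - 1):
        drho_dz.append((rhos[i + 1] - rhos[i]) / (depths[i + 1] - depths[i]))

    # Find the index of the largest change.
    max_drho_dz = -1.0
    maxindex = -1
    for i in range(len(drho_dz)):
        if drho_dz[i] > max_drho_dz:
            max_drho_dz = drho_dz[i]
            maxindex = i

    return (depths[maxindex] + depths[maxindex + 1]) / 2


# Made-up profile: the sharpest temperature drop is between 7 m and 9 m.
example_temps = [24.0, 23.8, 23.0, 20.0, 12.0]
example_depths = [1, 3, 5, 7, 9]
print(thermocline_depth(example_temps, example_depths))
```

In this made-up profile the largest density change falls between the readings at 7 m and 9 m, so the function returns their average, 8.0.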
- Compute the thermocline for each day in July
The final step is to write the main function that reads in data from the buoy file, extracts all of the temperature fields in order, computes the thermocline_depth and either prints the day and thermocline_depth value or saves them to a CSV file.
You can use this Goldie data file for this task. The file includes all of the data fields for the month of July and a single header line. The field indexes for the depths (m) [1, 3, 5, 7, 9, 11, 13, 15] are [10, 11, 16, 17, 15, 14, 13, 12]. You may want to double-check the field numbers before starting by looking at the header line.
+ (more detail)
The algorithm given below is not strictly line by line. Each comment will correspond to one or more lines of Python.
    def main():
        # these are the fields corresponding to the temperatures in order by depth
        # note they use 0-indexing
        fields = [10, 11, 16, 17, 15, 14, 13, 12]

        # these are the depth values for each temperature measurement
        depths = [ 1, 3, 5, 7, 9, 11, 13, 15 ]

        # open the data file and read past the header line
        # assign to day the value 0
        # for each line in the file
            # split the line on commas and assign it to words
            # if the time is about noon (12:03:00 PM)
                # add one to the day variable
                # assign to temps the empty list
                # loop over the number of items in depths (loop variable i)
                    # append to temps the result of casting words[ fields[i] ] to a float
                # assign to thermo_depth the result of calling thermocline_depth with temps and depths as arguments
                # print (or save to a file) the day of the month and thermo_depth separated by a comma

        return

    if __name__ == "__main__":
        main()
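The two per-line steps in the algorithm above, recognizing the noon reading and pulling out the temperatures in depth order, can be sketched on their own. This is an illustration under assumptions: the helper names is_noon_reading and extract_temps are hypothetical, the noon check only assumes that the text "12:03:00 PM" appears somewhere in the line, and the example line is made up with each field set to its own index so the reordering is easy to see.

```python
# Field indexes (0-indexed) for the temperatures at depths 1, 3, ..., 15 m.
FIELDS = [10, 11, 16, 17, 15, 14, 13, 12]


def is_noon_reading(line, stamp="12:03:00 PM"):
    """Return True if the line contains the noon time stamp."""
    return stamp in line


def extract_temps(line, fields=FIELDS):
    """Split a CSV line on commas and return the temperature fields as floats."""
    words = line.strip().split(",")
    temps = []
    for i in range(len(fields)):
        temps.append(float(words[fields[i]]))
    return temps


# Made-up line: a time stamp followed by fields whose values equal their indexes.
example_line = ",".join(["7/1/2019 12:03:00 PM"] + [str(float(i)) for i in range(1, 18)])

if is_noon_reading(example_line):
    print(extract_temps(example_line))
```

Because each made-up field holds its own index, the printed list shows the fields arriving in the order given by FIELDS, which is the order of increasing depth.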
Run your program and create a plot of the results with day on the x-axis and thermocline depth on the y-axis. Include this plot in your report.
- What is a command-line argument, and why are they useful?
- What is the difference between = and == in Python?
- Within a function, how would you control the number of times a for loop executes using a function parameter? Give an example.
- Look up a woman statistician and give a one sentence description of a contribution they made.
Extensions are your opportunity to customize your project, learn something else of interest to you, and improve your grade. The following are some suggested extensions, but you are free to choose your own. Be sure to describe any extensions you complete in your report.
- Write functions in your stats.py file to compute more types of statistics.
- Use your code to compute statistics on a data set of your own choosing.
- Compare different times or time periods in the Goldie data.
- Add more command-line control options, such as specifying what time of day to compute the thermocline.
- Explore how the thermocline changes and why. What are the min and max thermocline values for July? What if you graph wind direction and thermocline together, is there a relationship?
- Automate the process of making a graph from data.
Submit your code
Turn in your code (all files ending with .py) by putting it in a directory on the Courses server. On the Courses server, you should have access to a directory called CS152, and within that, a directory with your user name. Within this directory is a directory named Private. You and the professor can edit, read, and write files that you put into the Private directory, but no one else can. To hand in your code and other materials, create a new directory, such as project3, and then copy your code into the project directory for that week. Please submit only code that you want to be graded.
When submitting your code, double check the following.
- Is your name at the top of each code file?
- Does every function have a comment or docstring specifying what it does?
- Is your handin project directory inside your Private folder on Courses?
Write your project report
For CS 152 please use Google Docs to write your report. Create a new doc for each project. Start the doc with a title and your name. Attach the doc to your project on Google classroom. Make sure you click submit when you are done. The graders cannot provide feedback unless you click submit.
Your intended audience for your report is your peers not in the class. From week to week you can assume your audience has read your prior reports. Your goal should be to be able to use it to explain to friends what you accomplished in this project and to give them a sense of how you did it.
Your project report should contain the following elements. Please include a header for each section.
A brief summary of the project, in your own words. This should be no more than a few sentences. Give the reader context and identify the key purpose of the assignment. An abstract should define the project's key lecture concepts in your own words for a general, non-CS audience. It should also describe the program's context and output, highlighting a couple of important algorithmic and/or scientific details.
Writing an effective abstract is an important skill. Consider the following questions while writing it.
- Does it describe the CS concepts of the project (e.g. writing well-organized and efficient code)?
- Does it describe the specific project application (e.g. extracting data)?
- Does it describe your solution or how it was developed (e.g. what code did you write)?
- Does it describe the results or outputs (e.g. did your code work as expected and what did the results tell you)?
- Is it concise?
- Are all of the terms well-defined?
- Does it read logically and in the proper order?
The method section should describe in clear sentences (without pasting any code) at least one example of your own computational thinking that helped you complete your project. This could involve illustrating how a key lecture concept was applied to creating an image, how you solved a challenging problem, or explaining an algorithmic feature that is essential to your program as well as why it is so essential. The explanation should be suitable for a general audience who does not know Python.
Your methods section should be at most one or two paragraphs.
Present your results in a clear manner using human-friendly images or graphs labeled with captions and interpreted for a general audience such as your peers not in the course. Explain, for a general, non-CS audience, what your output means and whether it makes sense.
A description of any extensions you undertook, including text output or images demonstrating those extensions. If you added any modules, functions, or other design components, note their structure and the algorithms you used.
- Follow-up questions
The answers to any follow-up questions (there will be 3-4 for each project).
Draw connections between lecture concepts utilized in this project and real-world problems that interest you. How else could these concepts apply to our everyday lives? What are some specific things you had to learn or discover in order to complete the project?
Identify your collaborators, including TAs and professors. Include in that list anyone whose code you may have seen, such as those of friends who have taken the course in a previous semester. Cite any other sources, imported libraries, or tutorials you used to complete the project.