CS 152: Project 3

Spring 2020

Project 3: Calculating Thermoclines

For this project you will be building a library of useful functions as well as a more general program that will be useful for computing statistics of data in a CSV file.

The final task will involve computing the depth of the thermocline on Great Pond and plotting it for the month of July. The thermocline is the depth at which there is the largest difference in water density, with a layer of denser water below and a layer of less dense water above.


  1. Set up your workspace

    If you have not already done so, make a new project 3 directory. Open a terminal and cd to the directory.

  2. Write your library of useful statistical functions

    In the stats.py file you started in lab, write the following four functions. Each function should have a single parameter, which should be a list of numbers. The function should loop over the numbers in the list, compute the given statistic, and return it.

    1. mean(data) - computes the mean of the list of data.
    2. min(data) - computes the min of the list of data.
    3. max(data) - computes the max of the list of data.
    4. variance(data) - computes the variance of the list of data.

    To compute the mean, sum the values in the list and then divide by the length of the list. The length of a list is returned by the len() function, so if you have a list named data, the number of elements is len(data).

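    To compute the variance, use the formula:

      variance = (sum of (x - mean)**2 over every x in data) / (N - 1)

    Here N is the number of data points, len(data). First compute the mean (you can use your mean function), then compute the sum of the squared differences between each data point and the mean using a for loop. Finally, divide by N-1.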
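
The two descriptions above can be sketched as follows. This is only one possible layout, not the required solution; the sample list is made up for illustration and does not come from any course data file.

```python
def mean(data):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = 0.0
    for x in data:
        total += x
    return total / len(data)


def variance(data):
    """Return the sample variance (divides by N - 1); needs at least two values."""
    avg = mean(data)
    sum_sq = 0.0
    for x in data:
        sum_sq += (x - avg) ** 2  # squared difference from the mean
    return sum_sq / (len(data) - 1)


sample = [1, 2, 3, 4, 5]  # made-up values for illustration only
print(mean(sample), variance(sample))
```

For this sample the mean is 3.0 and the squared differences sum to 10, so the variance is 10 / 4 = 2.5.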

    Use your test function in stats.py to check that each function is working correctly. In your report, indicate whether these four functions worked properly and, briefly, how you made that determination.

  3. Write a program to compute statistics of a column of data

    Using your analyze.py file from the lab, update it so that it computes the sum, mean, variance, max, and min statistics for the selected column of data from the specified file and prints the statistics to the terminal.

    Calling analyze.py with the hurricanes.csv file and column 1 should produce the following statistics.

    sum :  103.00
    mean:    7.36
    var :   12.55
    min :    2.00
    max :   15.00
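
One way to reproduce the column alignment shown above is a small formatting helper. The name format_stat is hypothetical (it is not part of the lab code), and the field widths are inferred from the sample output rather than specified by the assignment.

```python
def format_stat(label, value):
    """Format one statistic: label padded to 4 characters, value 8 wide with 2 decimals."""
    return f"{label:<4}:{value:8.2f}"


# The values below are copied from the expected hurricanes.csv output above.
print(format_stat("sum", 103.0))
print(format_stat("mean", 7.36))
```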

    Demonstrate that your program works with a column from your extracted Goldie-MLRC July data file from project 2.

  4. Calculate the thermocline depth in Great Pond for July 2019

    The next task will be to write a program that computes the depth of the thermocline on Great Pond for each day in July. The thermocline is the depth at which the water density changes most quickly, creating a layer of colder, denser water below a layer of warmer water; the two layers tend not to mix. Overall, you will write a function that computes the density of water given a list of temperatures, a function that computes the depth of the maximum change in density, and then a top-level function that reads in the data file and guides the computation.

    1. Setup

      Create a new file, thermocline.py. Put your name, date, and class at the top, along with a comment indicating what the program will do (compute the thermocline).

    2. Convert temperatures to densities

      Write a function to convert a list of temperatures to a list of densities.

      Write a function density that takes one parameter, temps, which is a list of temperatures (in degrees Celsius). The function should first create a new empty list, rhos, to hold the density values. Then it should loop over the temps list and, for each temperature value t, compute the density (in kg/m^3) using the following equation.

      rho = 1000 * (1 - (t + 288.9414) * (t - 3.9863)**2 / (508929.2*(t + 68.12963)))

      It should then append the computed density to rhos. Finally, after the loop, it should return rhos.
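
A minimal sketch of the steps just described, using the equation exactly as given. The two spot-check temperatures are taken from the expected test output listed in this step; everything else follows the description, but treat it as one possible arrangement rather than the required one.

```python
def density(temps):
    """Convert a list of water temperatures to a list of densities."""
    rhos = []
    for t in temps:
        rho = 1000 * (1 - (t + 288.9414) * (t - 3.9863)**2 / (508929.2 * (t + 68.12963)))
        rhos.append(rho)
    return rhos


# Spot check against two lines of the expected test output.
check = [24.47, 8.82]
for t, rho in zip(check, density(check)):
    print(f"{t:.2f} -> {rho:.2f}")
```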

      Test your function using this test file. It should print out the following if your density function is working correctly.

      24.47 -> 997.21
      23.95 -> 997.34
      24.41 -> 997.22
      23.81 -> 997.37
      19.92 -> 998.25
      16.88 -> 998.82
      14.06 -> 999.26
      11.56 -> 999.57
      9.82 -> 999.74
      9.13 -> 999.80
      8.82 -> 999.82

    3. Compute the derivative of the densities

      The next step is to add a function to thermocline.py that computes the derivative of density with respect to depth, or how fast the density is changing as you get deeper. The function will take in two lists: one is the set of temperatures, the other is the set of corresponding depths. The function will return one value: the depth of the maximum change in density. The algorithm below gives the function.

      def thermocline_depth( temps, depths ):
          # assign to rhos the result of calling the density function with temps as the argument
          # assign to drho_dz the empty list
          # loop for one less than the length of rhos
              # append to drho_dz the quantity rhos[i+1] minus rhos[i] divided by the quantity depths[i+1] minus depths[i]
              # optional step: print out temps[i], rhos[i], and drho_dz[i]
          # assign to max_drho_dz the value -1.0
          # assign to maxindex the value -1
          # loop for the length of drho_dz (loop variable i)
              # if drho_dz[i] is greater than max_drho_dz
                  # assign to max_drho_dz the value drho_dz[i]
                  # assign to maxindex the value i
          # assign to thermoDepth the average of depths[maxindex] and depths[maxindex+1]
          return thermoDepth
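
The algorithm above translates almost line by line into Python. The sketch below is one such translation, with the density function repeated so the block runs on its own. The temperatures and depths in the example are made up for illustration; they are not the contents of the test file.

```python
def density(temps):
    """Convert a list of water temperatures to a list of densities."""
    rhos = []
    for t in temps:
        rhos.append(1000 * (1 - (t + 288.9414) * (t - 3.9863)**2 / (508929.2 * (t + 68.12963))))
    return rhos


def thermocline_depth(temps, depths):
    """Return the depth midway between the two readings with the largest density change."""
    rhos = density(temps)
    drho_dz = []
    for i in range(len(rhos) - 1):
        # change in density per unit of depth between neighboring readings
        drho_dz.append((rhos[i + 1] - rhos[i]) / (depths[i + 1] - depths[i]))
    max_drho_dz = -1.0
    maxindex = -1
    for i in range(len(drho_dz)):
        if drho_dz[i] > max_drho_dz:
            max_drho_dz = drho_dz[i]
            maxindex = i
    return (depths[maxindex] + depths[maxindex + 1]) / 2


# Made-up profile: the sharpest temperature drop is between 3 m and 5 m.
print(thermocline_depth([24.0, 23.0, 15.0, 10.0], [1, 3, 5, 7]))
```

In the made-up profile the largest density change lies between the readings at 3 m and 5 m, so the function returns their average, 4.0.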

      Test your thermocline_depth function using this test file. It should return a depth of 6.0 m. The maximum change in density at that depth is 0.44; you do not need to report this value, but knowing it may help you debug if you run into problems.

    4. Compute the thermocline for each day in July

      The final step is to write the main function that reads in data from the buoy file, extracts all of the temperature fields in order, computes the thermocline_depth and either prints the day and thermocline_depth value or saves them to a CSV file.

      You can use this Goldie data file for this task. The file includes all of the data fields for the month of July and a single header line. The field indexes for the depths (m) [1, 3, 5, 7, 9, 11, 13, 15] are [10, 11, 16, 17, 15, 14, 13, 12]. You may want to double-check the field numbers by looking at the header line before starting.
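
A quick way to check field numbers is to print each header name next to its 0-based index. The helper name indexed_header and the example header string are hypothetical; the real Goldie header has different column names.

```python
def indexed_header(header_line):
    """Pair each column name in a CSV header line with its 0-based index."""
    names = header_line.strip().split(",")
    return list(enumerate(names))


# Hypothetical header used only to show the output shape.
example = "Date,Time,Temp 1m"
for index, name in indexed_header(example):
    print(index, name)
```

To use it on the actual file, pass in the first line read from the data file.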

      The algorithm given below is not strictly line by line. Each comment will correspond to one or more lines of Python.

      def main():
          # these are the fields corresponding to the temperatures in order by depth
          # note they use 0-indexing 
          fields = [10, 11, 16, 17, 15, 14, 13, 12]
          # these are the depth values for each temperature measurement
          depths = [ 1, 3, 5, 7, 9, 11, 13, 15 ]
          # open the data file and read past the header line
          # assign to day the value 0
          # for each line in the file
              # split the line on commas and assign it to words
              # if the time is about noon (12:03:00 PM)
                  # add one to the day variable
                  # assign to temps the empty list
                  # loop over the number of items in depths (loop variable i)
                      # append to temps the result of casting words[ fields[i] ] to a float
                  # assign to thermo_depth the result of calling thermocline_depth with temps and depths as arguments
                  # print (or save to a file) the day of the month and thermo_depth separated by a comma
      if __name__ == "__main__":
          main()
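
The row-handling part of main can be sketched separately from the file handling. The function name noon_temps is hypothetical. Finding noon rows by searching the whole line for the text "12:03:00 PM" is also an assumption, since the assignment does not say which field holds the time. The example rows are made up so that field k simply contains the number k.

```python
FIELDS = [10, 11, 16, 17, 15, 14, 13, 12]  # temperature fields in order by depth (0-indexed)


def noon_temps(lines, noon_marker="12:03:00 PM"):
    """Return one list of temperatures (ordered by depth) for each noon row."""
    daily = []
    for line in lines[1:]:  # skip the header line
        if noon_marker not in line:  # assumption: the time text appears somewhere in the row
            continue
        words = line.strip().split(",")
        temps = []
        for i in range(len(FIELDS)):
            temps.append(float(words[FIELDS[i]]))
        daily.append(temps)
    return daily


# Made-up rows: field 0 is a timestamp, field k (k >= 1) holds the number k.
header = ",".join(f"col{k}" for k in range(18))
noon_row = ",".join(["7/1/2019 12:03:00 PM"] + [str(k) for k in range(1, 18)])
other_row = ",".join(["7/1/2019 12:18:00 PM"] + [str(k) for k in range(1, 18)])
print(noon_temps([header, noon_row, other_row]))
```

Each inner list can then be passed to thermocline_depth together with the depths list, and the position of the list (starting from 1) gives the day of the month.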

    Run your program and create a plot of the results with day on the x-axis and thermocline depth on the y-axis. Include this plot in your report.

Follow-up Questions

  1. What are command-line arguments, and why are they useful?
  2. What is the difference between = and == in Python?
  3. Within a function, how would you control the number of times a for loop executes using a function parameter? Give an example.
  4. Look up a woman statistician and give a one-sentence description of a contribution she made.


Extensions

Extensions are your opportunity to customize your project, learn something else of interest to you, and improve your grade. The following are some suggested extensions, but you are free to choose your own. Be sure to describe any extensions you complete in your report.

Submit your code

Turn in your code (all files ending with .py) by putting it in a directory on the Courses server. On the Courses server, you should have access to a directory called CS152 and, within that, a directory with your user name. Within this directory is a directory named Private. You and the professor can read, write, and edit files in the Private directory, but no one else can. To hand in your code and other materials, create a new directory inside Private, such as project3, and copy your code into it. Please submit only code that you want to be graded.

When submitting your code, double check the following.

  1. Is your name at the top of each code file?
  2. Does every function have a comment or docstring specifying what it does?
  3. Is your handin project directory inside your Private folder on Courses?

Write your project report

For CS 152, please use Google Docs to write your report. Create a new doc for each project. Start the doc with a title and your name. Attach the doc to your project on Google Classroom. Make sure you click submit when you are done. The graders cannot provide feedback unless you click submit.

The intended audience for your report is your peers who are not in the class. From week to week you can assume your audience has read your prior reports. Your goal should be a report you could use to explain to friends what you accomplished in this project and to give them a sense of how you did it.

Your project report should contain the following elements. Please include a header for each section.