Project 6: Optimizing the Simulation
This is the second project on elephant population simulation. In the first project, you developed the overall simulation and used it to figure out a single parameter: the percentage of female elephants to dart each year. This week we're going to explore how to optimize one or more parameters of a simulation automatically. This will involve a little bit of restructuring of your elephant simulation code and then the development of an optimizer that will run the simulation many times in order to figure out parameters that best achieve a specific outcome.
- Write a function to optimize the percent darted
Create a new file called optimize.py. Have the file import the sys, elephant, and random packages. Then create a function called optimize with the following definition.
# Executes a search to bring the result of the function optfunc to zero.
# min: minimum parameter value to search
# max: maximum parameter value to search
# optfunc: function to optimize
# parameters: optional parameter list to pass to optfunc
# tolerance: how close to zero to get before terminating the search
# maxIterations: how many iterations to run before terminating the search
# verbose: whether to print lots of information or not
def optimize( min, max, optfunc, parameters = None, tolerance = 0.001, maxIterations = 20, verbose=False ):
The optimize function is very similar to the binary search function you wrote in lab.
- Start by assigning to a variable done the value False.
- Start a loop that continues while done is equal to False.
- Inside the loop, assign to testValue the average of max and min. This should be a floating-point calculation, not an integer one. If verbose is True, print out testValue.
- Assign to result the return value of calling optfunc with testValue and parameters as the arguments. If verbose is True, print out the result value.
- If the result is positive, assign to max the value of testValue. Else if the result is negative, assign to min the value of testValue. Else, assign to done the value True.
- If max - min is less than the tolerance value, then assign to done the value True.
- Decrement maxIterations. If maxIterations is less than or equal to zero, then set done to True.
- Outside the loop, return testValue.
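The steps above can be sketched as follows. This is one possible reading of the list rather than a reference solution. The demonstration function squareMinusTwo and the variable root are assumptions added for illustration, and print is called with parentheses so the sketch also runs under Python 3.

```python
def optimize(min, max, optfunc, parameters=None, tolerance=0.001,
             maxIterations=20, verbose=False):
    """Binary search for the parameter value that brings optfunc to zero.

    Note: min and max shadow Python's built-ins inside this function,
    following the signature given in the text.
    """
    done = False
    while not done:
        # midpoint of the current interval (2.0 forces float division)
        testValue = (max + min) / 2.0
        if verbose:
            print(testValue)

        result = optfunc(testValue, parameters)
        if verbose:
            print(result)

        if result > 0:
            max = testValue      # zero crossing lies in the lower half
        elif result < 0:
            min = testValue      # zero crossing lies in the upper half
        else:
            done = True          # exact hit

        # stop once the interval is narrower than the tolerance
        if max - min < tolerance:
            done = True

        # stop after a fixed number of passes
        maxIterations -= 1
        if maxIterations <= 0:
            done = True

    return testValue


# Hypothetical check, not part of the assignment:
# x*x - 2 is negative below sqrt(2) and positive above it.
def squareMinusTwo(x, pars):
    return x * x - 2.0

root = optimize(0.0, 2.0, squareMinusTwo, tolerance=0.0001)
print(root)
```

Because the interval halves on every pass, a starting width of 2.0 drops below a tolerance of 0.0001 after 15 passes, so root ends up within about 0.0001 of the square root of 2 before the 20-iteration limit is reached.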
To test your optimize function, copy the following code and run optimize.py. As noted in the comments, try making tolerance smaller and see if it matches more digits in the target value.
# a function that returns x - target
def target(x, pars):
    return x - 0.73542618

# Tests the binary search using a simple target function.
# Try changing the tolerance to see how that affects the search.
def testTarget():
    res = optimize( 0.0, 1.0, target, tolerance = 0.01, verbose=True)
    print(res)
    return

if __name__ == "__main__":
    testTarget()
- Test the optimize function with your elephantSim
The next step is to test the optimize function with your elephantSim function. Create a testEsim function (similar to testTarget above) that calls optimize with a min value of 0.0, a max value of 0.5, and passes it elephant.elephantSim as the target function. You probably want to set verbose=True as well. As with the testTarget function, assign the return value to a variable and then print the variable. At the bottom of your code, change testTarget() to testEsim() then run optimize.py. Does your optimize function find a value close to 0.43 for the percent darted?
- Automate varying a simulation parameter
The next step is to automate the process of evaluating the effects of changing a simulation parameter across a range of values. This function will let us discover, for example, the effect on the dart percentage of changing the calfSurvival rate from 80% to 90% in steps of 1%. The function definition is given below.
# Evaluates the effects of the selected parameter on the dart percentage
# whichParameter: the index of the parameter to test
# testmin: the minimum value to test
# testmax: the maximum value to test
# teststep: the step between parameter values to test
# defaults: default parameters to use (default value of None)
def evalParameterEffect( whichParameter, testmin, testmax, teststep, defaults=None, verbose=False ):
    # if defaults is None, assign to simParameters the result of calling elephant.defaultParameters.
    # else, assign to simParameters a copy of defaults (e.g. simParameters = defaults[:])

    # create an empty list (e.g. results) to hold the results

    if verbose:
        print("Evaluating parameter %d from %.3f to %.3f with step %.3f" % (whichParameter, testmin, testmax, teststep))

    # assign to t the value testmin
    # while t is less than testmax
        # assign to the whichParameter element of simParameters (e.g. simParameters[whichParameter]) the value t
        # assign to percDart the result of calling optimize with the appropriate arguments, including simParameters
        # append to results the tuple (t, percDart)
        if verbose:
            print("%8.3f \t%8.3f" % (t, percDart))
        # increment t by the value teststep

    if verbose:
        print("Terminating")

    # return the list of results
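The skeleton asks for a copy of defaults rather than the list itself. A minimal sketch of why, using a made-up three-element list in place of the real parameter list (the values and the names alias and copied are illustrative assumptions):

```python
# Hypothetical stand-in for the list returned by elephant.defaultParameters();
# the values are made up for illustration.
defaults = [3.1, 0.0, 0.85]

alias = defaults       # a second name for the SAME list object
copied = defaults[:]   # a new list holding the same elements

alias[0] = 3.4         # visible through defaults as well
copied[2] = 0.90       # defaults[2] is unaffected

print(defaults)        # the first element changed, the last did not
print(copied)
```

Inside evalParameterEffect, overwriting simParameters[whichParameter] on every pass would otherwise change the caller's list, so later sweeps that reuse that list would start from altered defaults.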
Test your evalParameterEffect function by modifying the top-level code at the bottom of your file to be the following.
if __name__ == "__main__":
    evalParameterEffect( elephant.IDXProbAdultSurvival, 0.98, 1.0, 0.001, verbose=True )
What does this do? What should you expect the output to be?
- Evaluate the effects of varying many parameters
Your final task is to run the following parameter sweeps and show the effect of each one on the dart percentage. Make a table or graph (or both) for each case. These five items should go in your report.
- Vary the adult survival probability from 0.98 to 1.0 in steps of 0.001.
- Vary the calf survival probability from 0.80 to 0.90 in steps of 0.01.
- Vary the senior survival probability from 0.1 to 0.5 in steps of 0.05.
- Vary the calving interval from 3.0 to 3.4 in steps of 0.05.
- Vary the max age from 56 to 66 in steps of 2.
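Each sweep above steps a floating-point value, and repeatedly adding a step such as 0.001 accumulates rounding error, so a loop of the form "while t is less than testmax" can run one pass more or fewer than expected near the end point. One way around this, sketched below, is to compute the number of steps once and derive each value from its index. The helper name sweepValues and its inclusive flag are assumptions for illustration, not part of the assignment.

```python
def sweepValues(testmin, testmax, teststep, inclusive=False):
    """Return the values testmin, testmin + teststep, ... up to testmax.

    The end point testmax is left out unless inclusive is True, matching
    a loop that runs while t is less than testmax.
    """
    # round() absorbs the small error in the floating-point division
    numSteps = int(round((testmax - testmin) / teststep))
    if inclusive:
        numSteps += 1
    # each value comes from one multiplication, so error does not build up
    return [testmin + i * teststep for i in range(numSteps)]


calfValues = sweepValues(0.80, 0.90, 0.01)
ageValues = sweepValues(56, 66, 2, inclusive=True)
print(len(calfValues))
print(ageValues)
```

Whether a sweep should include its upper end point is a choice to make deliberately and state in your report; the flag makes that choice explicit instead of leaving it to rounding.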
While not required, your goal should be to automate this process as much as possible, for example by writing the output data to a CSV file or plotting the results using matplotlib. While debugging the automation process, you might want to reduce the carrying capacity even further, to 200 or 500 elephants. Only run with 1000 elephants once everything works properly.
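As one possible way to automate the output, the sketch below writes a list of (parameter value, dart percentage) tuples, the form returned by evalParameterEffect, to a CSV file with a header line using the standard csv module. The function name writeResultsCSV, the column labels, the output file name, and the sample numbers are placeholders, not real simulation results. The newline argument to open assumes Python 3.

```python
import csv
import os
import tempfile


def writeResultsCSV(filename, results, parameterName):
    """Write (value, percDart) tuples to filename, one row per tuple."""
    # newline='' lets the csv module control line endings (Python 3)
    with open(filename, 'w', newline='') as fp:
        writer = csv.writer(fp)
        writer.writerow([parameterName, 'percDart'])  # header line
        for value, percDart in results:
            writer.writerow([value, percDart])


# Placeholder data for illustration only, NOT output from the simulation.
sampleResults = [(0.80, 0.1), (0.81, 0.2), (0.82, 0.3)]
outputPath = os.path.join(tempfile.gettempdir(), 'calf_survival_sweep.csv')
writeResultsCSV(outputPath, sampleResults, 'calfSurvival')
```

A file written this way opens directly in a spreadsheet program, which is one route to the tables and graphs requested for the report.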
Follow-up questions
- What does an import statement do?
- What is binary search?
- Why is binary search faster than a linear search (e.g. going page by page to find a word in a dictionary)?
- What national park (of any country) would you most like to visit?
Extensions
Extensions are your opportunity to customize your project, learn something else of interest to you, and improve your grade. The following are some suggested extensions, but you are free to choose your own. Be sure to describe any extensions you complete in your report.
- Automate the graphing process using matplotlib or another graphing package of your choice.
- Have your program write out proper CSV files with a header line and appropriate commas for the full process.
- How much variation is there in the average total population for a 200-year elephant simulation across different runs? How stable is the estimate generated by doing 5 simulation runs? Calculating the standard deviation of the population sizes is a reasonable indicator of spread.
- Enable the user to control your top level program with optional flags. For example, -par CarryingCapacity would specify that the program should evaluate carrying capacity, and -min 3500 would specify that it should start the evaluation at 3500.
- Check out the os package (import os). What could you do with the os.system function to automate your simulations?
- Uber-extension: explore varying two parameters simultaneously. For example, if you are evaluating maximum age from 56 to 64, combine it with a set of values for senior survival rates. If you have five values for maximum age and five values for senior survival rates, there would be 25 unique parameter combinations. The plot of the probability of darting for a stable population would be a 3D graph with horizontal axes for the two parameter values and a vertical axis indicating the probability of darting.
- Does the carrying capacity have a significant effect on the probability of darting? Is there a carrying capacity below which the probability of darting goes to zero?
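For the optional-flag extension, a minimal sketch using the standard argparse module is shown below. The flags -par and -min come from the example in the list above. The -max and -step flags, the choice of float values, and the function name parseArgs are assumptions added for illustration.

```python
import argparse


def parseArgs(argv):
    """Parse flags such as: -par CarryingCapacity -min 3500"""
    parser = argparse.ArgumentParser(
        description='Sweep one simulation parameter.')
    parser.add_argument('-par', default=None,
                        help='name of the parameter to evaluate')
    parser.add_argument('-min', type=float, default=None,
                        help='value at which to start the evaluation')
    # -max and -step are hypothetical additions, not named in the text
    parser.add_argument('-max', type=float, default=None,
                        help='value at which to stop the evaluation')
    parser.add_argument('-step', type=float, default=None,
                        help='step between tested values')
    return parser.parse_args(argv)


args = parseArgs(['-par', 'CarryingCapacity', '-min', '3500'])
print(args.par)
print(args.min)
```

In a real program the list passed in would be sys.argv[1:]. Mapping the parameter name to an index such as elephant.IDXProbAdultSurvival is left to your own code.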
Submit your code
Turn in your code (all files ending with .py) by putting it in a directory on the Courses server. On the Courses server, you should have access to a directory called CS152, and within that, a directory with your user name. Within this directory is a directory named Private. Files that you put into the Private directory can be edited, read, and written by you and by the professor, but by no one else. To hand in your code and other materials, create a new directory, such as project6, and then copy your code into the project directory for that week. Please submit only code that you want to be graded.
When submitting your code, double check the following.
- Is your name at the top of each code file?
- Does every function have a comment or docstring specifying what it does?
- Is your handin project directory inside your Private folder on Courses?
Write your project report
For CS 152, please use Google Docs to write your report. Create a new doc for each project. Start the doc with a title and your name. Attach the doc to your project on Google Classroom. Make sure you click submit when you are done. The graders cannot provide feedback unless you click submit.
Your intended audience for your report is your peers not in the class. From week to week you can assume your audience has read your prior reports. Your goal should be to be able to use it to explain to friends what you accomplished in this project and to give them a sense of how you did it.
Your project report should contain the following elements.
- An abstract: a brief summary of the project, in your own words. This should be no more than a few sentences. Give the reader context and identify the key purpose of the assignment.
Writing an effective abstract is an important skill. Consider the following questions while writing it.
- Does it describe the CS concepts of the project (e.g. writing well-organized and efficient code)?
- Does it describe the specific project application?
- Does it describe your solution or how it was developed (e.g. what code did you write)?
- Does it describe the results or outputs (e.g. did your code work as expected)?
- Is it concise?
- Are all of the terms well-defined?
- Does it read logically and in the proper order?
- A description of your solution to the tasks, including any text output or images you created (including the required tables or graphs mentioned above). This should be a description of the form and functionality of your final code. Note any unique computational solutions you developed or any insights you gained from your code's output.
- A description of any extensions you undertook, including text output or images demonstrating those extensions. If you added any modules, functions, or other design components, note their structure and the algorithms you used.
- The answers to any follow-up questions (there will be 3-4 for each project).
- A brief description (1-3 sentences) of what you learned. Think about the answer to this question in terms of the stated purpose of the project. What are some specific things you had to learn or discover in order to complete the project?
- A list of people you worked with, including TAs and professors. Include in that list anyone whose code you may have seen, such as those of friends who have taken the course in a previous semester.
Thanks to Cathy Collins for the project idea and documentation. The original project concept and idea came from Therese Donovan, University of Vermont.