Fall 2018

Basic Graphics

The purpose of this project is to give you practice with creating different types of plots using the basic R plotting capabilities.


Project tasks

  1. Open RStudio or a text editor and create a new R file called project2.R. Put your name, Project 2, and the date at the top of the file. No credit will be given for an R file without your name at the top of the file. Create a project2 directory and save your file and R project there. Set your working directory to the source file directory.
  2. The General Social Survey is the longest-running survey to support social indicator research, with data going back to 1972. Here is the 1991 data, and here is the 2016 data, both in SPSS format. Download the 1991 and 2016 data and save them in your project2 working directory.
  3. In RStudio, go to the Packages tab and check the foreign package, which lets you read SPSS files. In your project2.R source file, assign to the variable gss91 the result of calling the read.spss function with the filename as the first argument and the additional arguments to.data.frame = TRUE, use.value.labels = FALSE, trim_values = TRUE.
  4. After reading the file, have your code print the result of calling names(gss91). These are the names of all of the columns of data.
  5. Assign to gss16 the result of using read.spss to read the GSS2016.sav file. Except for the filename, use the same arguments as when reading the 1991 file.
  6. Using the hist function, make a histogram of the WRKYEARS field (# of years worked since age 16) of the 1991 data. It's good to assign the result of calling the hist function to a variable so you can access information about the histogram (like the breaks or density values) later on.
    • Set the x-axis label and main title to appropriate values.
    • Fill the bars with a color you find pleasing, and have the histogram include a label for each bar. (You may need to set the y-axis limits in order to avoid cropping one of the labels.)
    • Print out the histogram breaks and density vector.
    • At the end, add the following code to pause execution.
        readline( "Press enter to continue" )
      (You will have to hit the Enter/Return key in the console to continue execution.)
  7. Make a second histogram of a column of your choice (e.g. AGE works well) and make it look nice. You can find the meaning of each header in the GSS here. After you create your second histogram, add another readline call to pause execution.
  8. Using the plot function, create a 2-D scatter plot of AGE (x-axis) versus WRKYEARS, the number of years worked (y-axis), using the 1991 data. Set an appropriate x-label, y-label, and main title. Modify the color of the points in the plot.
  9. Make a 2-D scatter plot of two variables of your choice from the 2016 data. Make it look nice.
  10. Assign to a variable (e.g. tab91) the result of using the table function on the CONEDUC field of the 1991 data. Repeat with a second variable (e.g. tab16) using the 2016 data. Note that you can convert the table values to percentages using the following expressions.
    tab91perc <- 100 * tab91 / sum(tab91)
    tab16perc <- 100 * tab16 / sum(tab16)
  11. Using the barplot function, create a plot for the 1991 table for CONEDUC. See if you can set up the category names under the bars in the plot. Give the plot a useful title and y label. Repeat for 2016. Do you notice any differences?
  12. Make two bar plots of a variable of your choice, comparing the 1991 and 2016 data.
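
The plotting calls named in the tasks above (hist, plot, table, barplot) can be tried out on a small example before you use the real data. The sketch below is only an illustration under stated assumptions: the data frame demo is synthetic and stands in for gss91, and the bar labels are placeholders, so check the GSS codebook for the real category names. It is not a complete project script.

```r
# Synthetic stand-in for the GSS data (an assumption, for illustration only).
# In the project, gss91 comes from read.spss() instead.
set.seed(1)
age <- sample(18:89, 200, replace = TRUE)
demo <- data.frame(
  AGE      = age,
  WRKYEARS = pmax(0, age - 16 - sample(0:10, 200, replace = TRUE)),
  CONEDUC  = sample(1:3, 200, replace = TRUE)
)

# Compute the bar counts first (no plot) so ylim can leave room for labels.
counts <- hist(demo$WRKYEARS, plot = FALSE)$counts

# Histogram with a count label on each bar; keep the result in a variable.
h <- hist(demo$WRKYEARS,
          main   = "Years worked since age 16 (synthetic data)",
          xlab   = "Years worked",
          col    = "steelblue",
          labels = TRUE,
          ylim   = c(0, 1.2 * max(counts)))
print(h$breaks)   # bin boundaries
print(h$density)  # density value of each bin

# Scatter plot with axis labels, a title, and colored points.
plot(demo$AGE, demo$WRKYEARS,
     xlab = "Age (years)",
     ylab = "Years worked",
     main = "Age versus years worked (synthetic data)",
     col  = "darkorange",
     pch  = 19)

# Frequency table converted to percentages, then drawn as a bar plot.
tab     <- table(demo$CONEDUC)
tabperc <- 100 * tab / sum(tab)
barplot(tabperc,
        names.arg = c("Category 1", "Category 2", "Category 3"),  # placeholder labels
        ylab      = "Percent of respondents",
        main      = "CONEDUC responses (synthetic data)",
        col       = "gray70")
```

Computing the counts with plot = FALSE first is one way to choose a y limit that leaves room for the bar labels; setting ylim by hand after looking at the plot works just as well.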

Report

Answer the following questions. Submit your answers as a plain text file or PDF in your handin directory. Put your name at the top of the file. No credit will be given for any other format or for a file without a name.


Handin

Create a project2 folder inside the Private folder in your Courses directory. Put your project2.R file and your text/PDF file with your answers into the project2 folder. They should be the only documents in the project2 folder.