Fall 2018

Reading and Accessing Data

The purpose of this project is to give you practice with reading data and working with variables in R.


Project tasks

  1. Open RStudio or a text editor and create a new R file called project1.R.
  2. Put your name, Project 1, and the date at the top of the file. No credit will be given for an R file without your name at the top of the file. Create a project1 working directory and save your file there. Then select the menu option Session::Set Working Directory::To Source File Location.
  3. Assign to the variable valuesA the numbers from 1 to 10 using the slice (colon) notation. Assign to a second variable valuesB the numbers from 11 to 20, also using the slice notation. Assign to a third variable valuesC the numbers from 21 to 30 using the c() function. The arguments to c() should be the numbers 21 to 30, separated by commas.
  4. Print out the results of adding valuesA and valuesB, subtracting valuesB from valuesA, multiplying valuesA by valuesB, and dividing valuesA by valuesB. Repeat with valuesA and valuesC.
  5. Assign to the variable results a double vector of length four. You can use the function double(). Its argument is the length of the vector you want R to create.
  6. Using the sum or prod function, assign to results[1] the sum of valuesA plus valuesB, assign to results[2] the sum of valuesA minus valuesB, assign to results[3] the product of the elements of valuesA, and assign to results[4] the sum of valuesA times valuesB. Have R print out results, which should be the values 210 -100 3628800 935.
  7. Download the comma-separated values (CSV) test file test1.csv and save it in the same directory as your project1.R file. In your project1.R file, assign to the variable df the result of calling the read.csv function with the argument test1.csv. Be sure to put quotes around the filename in the function call. Note that you may need to use Session::Set Working Directory::To Source File Location so that RStudio looks in the right place for the file.

    Run your source file and make sure that df shows up in your workspace as 6 observations of 3 variables.

  8. Have your code print the names of the columns of the data frame by printing the result of calling the names function with df as the argument.
  9. Have your code print out the first column of data using three methods.
    1. Using the $ notation and the name of the first column.
    2. Using the [] notation and the index of the first column.
    3. Using the [[]] notation and the index of the first column.

    Which notation returns the result as a vector? (Hint: a vector should look like what happens when you print valuesA.)

  10. Have your code print out the value in the third row and second column of the data.
  11. Have your code print out the first row of data using the expression df[1,]. Have your code print out the second row of data the same way. Then use the as.numeric() function to convert the row of data to a vector before printing it.
  12. Have your code print out the sum and mean of each column of data. Note that you will want to use the $ or [[]] notation to access the column as a vector.
  13. Create or find an Excel or CSV file with at least two columns of numeric data. If it is an Excel file, save it as a CSV file. Write code to read in the file and print out the mean of each of the first two columns of numeric data in the CSV file.
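
As a reference for the notation used in the tasks above, here is a minimal sketch on made-up data. The names x, y, out, and demo, and the contents of the temporary CSV file, are assumptions for illustration only; they are not the assignment's variables or the contents of test1.csv.

```r
# Illustrative values only; these are not the assignment's variables.
x <- 1:3                # colon notation: the integers 1, 2, 3
y <- c(4, 5, 6)         # c() combines its arguments into a vector

print(x + y)            # arithmetic on vectors works element by element
print(x * y)

out <- double(2)        # numeric vector of length 2, initially all zeros
out[1] <- sum(x + y)    # 5 + 7 + 9 = 21
out[2] <- prod(x)       # 1 * 2 * 3 = 6
print(out)

# Write a small hypothetical CSV to a temporary file so the sketch
# runs without any downloaded data.
path <- tempfile(fileext = ".csv")
writeLines(c("height,weight", "1.5,10", "2.5,20", "3.5,30"), path)
demo <- read.csv(path)  # the filename argument is a character string

print(names(demo))      # column names of the data frame

# Three ways to reach the first column; class() shows what each returns.
col_dollar <- demo$height
col_single <- demo[1]
col_double <- demo[[1]]
print(class(col_dollar))
print(class(col_single))
print(class(col_double))

print(demo[2, 1])       # value in row 2, column 1

row_vec <- as.numeric(demo[1, ])  # first row converted to a numeric vector
print(row_vec)

print(sum(demo[[2]]))   # sum of the second column
print(mean(demo[[2]]))  # mean of the second column
```

Comparing the three class() results with how a plain vector such as x prints is one way to check which access notation you want before calling sum() or mean().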

Report

Answer the following questions. Submit your answers as a plain text file or PDF in your handin directory. Put your name at the top of the file. No credit will be given for any other format or for a file without a name.


Handin

Create a project1 folder inside the Private folder in your Courses directory. Put your project1.R file and your text/PDF file with your answers into the project1 folder. They should be the only documents in the project1 folder.