 Fall 2018

## Reading and Accessing Data

The purpose of this project is to give you practice with reading data and working with variables in R.

1. Open RStudio or a text editor and create a new R file called project1.R.
2. Put your name, Project 1, and the date at the top of the file. No credit will be given for an R file without your name at the top of the file. Create a project1 working directory and save your file there. Then select the menu option Session > Set Working Directory > To Source File Location.
3. Assign to the variable `valuesA` the numbers from 1 to 10 using slice notation (the colon operator). Assign to a second variable `valuesB` the numbers from 11 to 20, also using slice notation. Assign to a third variable `valuesC` the numbers from 21 to 30 using the `c()` function. The arguments to `c()` should be the numbers 21 to 30 separated by commas.
4. Print out the results of adding, subtracting, multiplying, and dividing `valuesA` by `valuesB`. Repeat with `valuesA` and `valuesC`.
5. Assign to the variable `results` a double vector of length four. You can use the function `double()`; its argument is the length of the vector you want R to create.
6. Using the `sum` or `prod` function, fill in the four elements of `results` in order: assign to the first element the sum of `valuesA` plus `valuesB`, to the second element the sum of `valuesA` minus `valuesB`, to the third element the product of the elements of `valuesA`, and to the fourth element the sum of `valuesA` times `valuesB`. Have R print out `results`, which should be the values 210 -100 3628800 935.
7. Download the comma-separated values (CSV) test file test1.csv and save it in the same directory as your project1.R file. In your project1.R file, assign to the variable `df` the result of calling the `read.csv` function with the argument test1.csv. Be sure to put quotes around the filename in the function call. Note that you may need to use Session > Set Working Directory > To Source File Location for RStudio to look in the right place for the file.

Run your source file and make sure that df shows up in your workspace as 6 observations of 3 variables.
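As a warm-up for steps 3 to 6, here is a minimal sketch of the colon operator, `c()`, element-wise arithmetic, and filling a preallocated vector by index. The names `a`, `b`, `d`, and `out` and all of the values are illustrative assumptions, not the ones the project asks for.

```r
# Illustrative values only; the project uses different names and ranges.
a <- 1:5                      # colon operator: the integers 1 through 5
b <- 6:10                     # the integers 6 through 10
d <- c(11, 12, 13, 14, 15)    # the same kind of vector, built with c()

# Arithmetic on vectors of equal length works element by element.
print(a + b)
print(a / b)
print(a * d)

# Preallocate a double vector of length 2 (filled with zeros),
# then assign to each element by its index.
out <- double(2)
out[1] <- sum(a + b)          # sum of the element-wise sums
out[2] <- prod(a)             # product of the elements of a
print(out)
```

Because `out` is created first and then filled one element at a time, each assignment needs an index such as `out[1]`; assigning to `out` without an index would replace the whole vector.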

8. Have your code print the names of the columns of the data frame by printing the result of calling the `names` function with `df` as the argument.
9. Have your code print out the first column of data using three methods.
    1. Using the `$` notation and the name of the first column.
    2. Using the `[]` notation and the index of the first column.
    3. Using the `[[]]` notation and the index of the first column.

Which notation returns the result as a vector? (Hint: a vector should look like what happens when you print valuesA.)
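The three access notations can be tried on a throwaway file before using them on test1.csv. The sketch below writes its own small CSV to a temporary location so that it runs anywhere; the file contents, the column names `height` and `weight`, and the variable names are all made up for illustration.

```r
# Build a small throwaway CSV so the sketch runs on its own.
# The file contents and column names are invented for this example.
path <- tempfile(fileext = ".csv")
writeLines(c("height,weight", "1.5,10", "2.5,20", "3.5,30"), path)

demo <- read.csv(path)     # the filename is passed as a character string
print(names(demo))         # the column names
print(dim(demo))           # number of rows, then number of columns

by_name   <- demo$height   # $ with the column name
by_single <- demo[1]       # [] with the column index
by_double <- demo[[1]]     # [[]] with the column index

# class() reports what kind of object each notation returned.
print(class(by_name))
print(class(by_single))
print(class(by_double))
```

Comparing the three `class()` results, and how each object prints, is one way to check your answer to the question above.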

10. Have your code print out the value in the third row and second column of the data.
11. Have your code print out the first row of data using the expression `df[1,]`. Have your code print out the second row of data the same way. Then use the `as.numeric()` function to convert the row of data to a vector before printing it.
12. Have your code print out the sum and mean of each column of data. Note that you will want to use the `$` or `[[]]` notation to access the column as a vector.
13. Create or find an Excel or CSV file with at least two columns of numeric data. If it is an Excel file, save it as a CSV file. Write code to read in the file and print out the mean of each of the first two columns of numeric data in the CSV file.
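Steps 10 to 13 combine row and column indexing with `sum` and `mean`. The following is a rough sketch on a made-up data frame; `scores`, `quiz`, and `exam` are hypothetical names standing in for whatever `read.csv` returns for your own file.

```r
# Made-up data frame standing in for a file read with read.csv().
scores <- data.frame(quiz = c(8, 9, 10), exam = c(70, 80, 90))

print(scores[3, 2])        # a single value: row 3, column 2

row1 <- scores[1, ]        # one row, still stored as a data frame
print(row1)
print(as.numeric(row1))    # the same values as a plain numeric vector

# Walk over the column names and summarize each column as a vector.
for (col in names(scores)) {
  values <- scores[[col]]  # [[ ]] returns the column as a vector
  cat(col, "sum:", sum(values), "mean:", mean(values), "\n")
}
```

The loop is only one way to cover every column; calling `sum` and `mean` on each column by name works just as well when there are only a few columns.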

### Report

Answer the following questions. Submit your answers as a plain text file or PDF in your handin directory. Put your name at the top of the file. No credit will be given for any other format or for a file without a name.

• What is a variable?
• What is a vector?
• What is a data frame?
• What is a CSV file?
• What are the steps you have to take to read a CSV file and compute the mean of the first column of data?

### Handin

Create a project1 folder inside the Private folder in your Courses directory. Put your project1.R file and your text/PDF file with your answers into the project1 folder. They should be the only documents in the project1 folder.