-
Notifications
You must be signed in to change notification settings - Fork 6
Getting data into R
We can import Comma Separated Values (CSV) files into R very easily. These files can be generated by Microsoft Excel, Apple Numbers and Google Sheets usually through a File -> Export process.
Once a sheet has been exported, it can be imported into R:
data <- read.csv(LOCATION_OF_FILE, header=TRUE)
This will create a variable called data in the global namespace with your sheet. Columns can then be referenced by:
data$column_name
We will need to teach people to install the googlesheets library for importing data from Google Sheets.
install.packages("dplyr")
install.packages("googlesheets")
Once the package has been installed, load it into your R script:
library("dplyr"
library("googlesheets")
Now list all the sheets available (local and ones shared to your account as shown https://docs.google.com/spreadsheets/) using
gs_ls()
On the first run, it will ask you to authorise the package 'googlespreadsheet' to view your account by copying a link from the console and pasting it into a browser. Once you have authorised it, you will receive a code from your browser which you can then paste into the console. A list of all your sheets will then be printed in the console.
To access the sheet:
sheet <- gs_title(YOUR_TITLE)
Where YOUR_TITLE is the name of the sheet as shown in the gs_ls command. gs_title will authorise the use of that sheet.
To get access to the worksheet:
data <- sheet %>% gs_read(ws = NAME_OF_WORKSHEET)
data variable now contains a dataframe much like importing from CSV. You can access columns by:
data$column_name
- to lower
- ??data types
-
head(),tail() stem()summary()
https://ramnathv.github.io/pycon2014-r/explore/README.html
Types of data:
- Numeric
- Integers
- Strings
- Date/Time objects
I'm going to use the above link to focus a 20 minute tutorial on 3 common mistakes:
- Column headers are values, not variable names
- Multiple variables are stored in one column
- Variables are stored in both rows and columns
There are 2 other instances of untidy data, but I won't delve too much into it.
I've already written a little powerpoint presentation that I have presented to the SHOs regarding tidy data so it'll be an extension of that.
- Example with Income and Religion table
Please contact Steve Harris if you have any questions.