Getting data into R

Steve Harris edited this page Mar 18, 2016 · 12 revisions

Tidying data in R

Getting the data in

We will need to teach people to install the googlesheets package, which is used to import data from Google Sheets.

  • Import from Google Sheets
  • Import from .csv and/or .xls
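As a rough sketch of the .csv route, the example below uses only base R. The file contents and column names are invented for illustration. The commented-out lines suggest how the Google Sheets and .xls routes might look, assuming the googlesheets and readxl packages are installed, that you have authenticated with Google, and that a sheet or file with the hypothetical name exists.

```r
# Write a tiny, made-up CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("id,age,sex",
    "1,34,female",
    "2,51,male",
    "3,47,female"),
  csv_path
)

# Read it back; stringsAsFactors = FALSE keeps text columns as character
patients <- read.csv(csv_path, stringsAsFactors = FALSE)
print(patients)

# Not run: Google Sheets (hypothetical sheet title, needs authentication)
# library(googlesheets)
# sheet <- gs_title("my-sheet-title")
# patients <- gs_read(sheet)

# Not run: Excel (hypothetical file path)
# library(readxl)
# patients <- read_excel("my-file.xls")
```

Whichever route is used, the result is a data frame, so everything that follows works the same way.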

Checking that the data are what was expected

  • head(), tail()
  • stem()
  • summary()
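The checks listed above can be tried on any data frame. This sketch uses mtcars, a dataset that ships with R, purely as a stand-in for the imported data.

```r
# mtcars is a built-in dataset, used here in place of real imported data
data(mtcars)

# First and last few rows: a quick look for misread headers or trailing junk
first_rows <- head(mtcars, 3)
last_rows <- tail(mtcars, 3)
print(first_rows)
print(last_rows)

# Stem-and-leaf plot: a text-only view of the distribution of one numeric column
stem(mtcars$mpg)

# Minimum, quartiles, mean and maximum for one column
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# summary() also works on the whole data frame, one block per column
print(summary(mtcars))
```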

https://ramnathv.github.io/pycon2014-r/explore/README.html

Beginners

Types of data:

  • Numeric
  • Integers
  • Strings (the character type in R)
  • Date/Time objects
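One way to show these types to beginners is to create a value of each and ask R what it is with class(). The variable names and values below are invented for illustration.

```r
# One made-up value of each type
weight_kg <- 72.5                                  # numeric (double precision)
n_visits  <- 3L                                    # integer (note the L suffix)
ward_name <- "Ward A"                              # string, i.e. character
admitted  <- as.Date("2016-03-18")                 # date
seen_at   <- as.POSIXct("2016-03-18 09:30:00", tz = "UTC")  # date-time

# class() reports the type R has assigned
print(class(weight_kg))   # "numeric"
print(class(n_visits))    # "integer"
print(class(ward_name))   # "character"
print(class(admitted))    # "Date"
print(class(seen_at))     # "POSIXct" "POSIXt"

# Numbers that arrive as text need converting before they can be used in arithmetic
age_text <- "42"
age <- as.integer(age_text)
print(age + 1L)
```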

Improvers

I'm going to use the above link as the basis for a 20-minute tutorial focused on 3 common mistakes:

  1. Column headers are values, not variable names
  2. Multiple variables are stored in one column
  3. Variables are stored in both rows and columns

There are 2 other kinds of untidy data, but I won't go into them in much detail.

I've already written a short PowerPoint presentation on tidy data, which I have presented to the SHOs, so this will be an extension of that.

Column headers are values, not variable names

  • Example with Income and Religion table
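A possible way to demonstrate this, assuming the tidyr package (version 1.0.0 or later) is installed. The counts and the income band names are made up and are not taken from the real table; the point is only that the income bands start out as column headers and end up as values in a single column.

```r
library(tidyr)

# Made-up counts: every column after `religion` is an income band,
# so the headers are values of an `income` variable, not variables themselves
income_wide <- data.frame(
  religion = c("Agnostic", "Atheist", "Buddhist"),
  low      = c(10, 20, 30),
  middle   = c(40, 50, 60),
  high     = c(70, 80, 90)
)

# Gather the income columns into two new columns: the band and its count
income_long <- pivot_longer(
  income_wide,
  cols      = -religion,   # everything except religion
  names_to  = "income",
  values_to = "count"
)
print(income_long)
```

The tidy version has one row per religion and income band, which makes it straightforward to filter, group or plot by income.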

Multiple variables are stored in one column
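A minimal sketch of this problem, again assuming tidyr is installed. The data frame, its column names and its values are hypothetical: a single sex_age column holds both sex and age group, joined by an underscore.

```r
library(tidyr)

# Made-up data: `sex_age` mixes two variables in one column
counts_mixed <- data.frame(
  country = c("AA", "AA", "BB", "BB"),
  sex_age = c("m_0-14", "f_0-14", "m_15-24", "f_15-24"),
  n       = c(1, 2, 3, 4)
)

# Split the combined column at the underscore into two separate variables
counts_tidy <- separate(
  counts_mixed,
  col  = sex_age,
  into = c("sex", "age_group"),
  sep  = "_"
)
print(counts_tidy)
```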

Variables are stored in both rows and columns
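This case usually needs two steps. The sketch below assumes tidyr (1.0.0 or later) and uses invented weather-style readings: the days are spread across columns (d1, d2), while the measured variables (tmax, tmin) are stacked in rows under an element column.

```r
library(tidyr)

# Made-up readings: days are in columns, measured variables are in rows
weather <- data.frame(
  station = c("S1", "S1"),
  element = c("tmax", "tmin"),
  d1      = c(20, 10),
  d2      = c(22, 12)
)

# Step 1: move the day columns into rows
weather_long <- pivot_longer(
  weather,
  cols      = c(d1, d2),
  names_to  = "day",
  values_to = "value"
)

# Step 2: move the stacked variables out into their own columns
weather_tidy <- pivot_wider(
  weather_long,
  names_from  = element,
  values_from = value
)
print(weather_tidy)
```

The result has one row per station and day, with tmax and tmin as separate columns.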
