The script "run_analysis.R" analyzes data collected from the accelerometers of the Samsung Galaxy S smartphone.
More information about the data: http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones
The data used by the script: https://d396qusza40orc.cloudfront.net/getdata%2Fprojectfiles%2FUCI%20HAR%20Dataset.zip
The assignment documentation can be found at: https://class.coursera.org/getdata-011/human_grading/view/courses/973498/assessments/3/submissions
This project consists of the following files:
- README.md: this README file, which describes the repository.
- run_analysis.R: the R-script that runs the analysis.
- CodeBook.md: Markdown file that describes the functions and variables used in the run_analysis.R script.
- run_analysis.txt: output file written by the script run_analysis.R.
The script run_analysis.R performs the following steps:
- Merges the training and the test sets to create one data set. The file "train/X_train.txt" is read with the function "read.table" and stored in the variable train. The files "train/y_train.txt" and "train/subject_train.txt" are bound to it as new columns with the function cbind. The test set is built the same way from the test files and stored in the variable test. The two sets train and test are combined with the function rbind.
- Extracts only the measurements on the mean and standard deviation for each measurement. The "features.txt" file is read into the variable features with the function "read.table". The two columns of the features data.frame are renamed to "column" and "name". A new data.frame "selectedfeatures" is created by keeping only the features whose name contains the text "mean()" or "std()". The data in ttdata is then filtered to the columns listed in selectedfeatures.
- Uses descriptive activity names to name the activities in the data set. Reads the activity labels with the function "read.table" into a variable "activities". Converts the activity column in the data.frame "ttdata" to a factor using the labels in the "activities" variable.
- Appropriately labels the data set with descriptive variable names. Sets the column names of the data.frame "ttdata" to the names of the selected features, together with "activity" and "subject".
- From the data set in step 4, creates a second, independent tidy data set with the average of each variable for each activity and each subject. Aggregates the data with the function "aggregate", grouped by the two columns "subject" and "activity", and applies the function mean to all the other columns.
- Writes the result to a file. Writes the resulting table to the file "run_analysis.txt".
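
The steps above can be sketched on a small in-memory example. This is an illustration under stated assumptions, not the code of run_analysis.R itself: the toy values, the helper `make_set`, the columns `id` and `label` of `activities`, the variables `tidy` and `out_file`, and the `row.names = FALSE` argument are all hypothetical, and the data frames are built by hand where the script would call "read.table" on the dataset files.

```r
# Toy stand-ins for features.txt and the activity labels (hypothetical values).
features <- data.frame(
  column = 1:4,
  name = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
           "tBodyAcc-meanFreq()-X", "tBodyAcc-max()-X"),
  stringsAsFactors = FALSE
)
activities <- data.frame(id = 1:2, label = c("WALKING", "SITTING"),
                         stringsAsFactors = FALSE)

# Hypothetical helper: measurements first, then activity id and subject id,
# bound on as extra columns with cbind.
make_set <- function(x, activity, subject) {
  cbind(as.data.frame(x), activity = activity, subject = subject)
}
train <- make_set(matrix(1:8,  nrow = 2, byrow = TRUE), c(1, 2), c(1, 1))
test  <- make_set(matrix(9:16, nrow = 2, byrow = TRUE), c(1, 2), c(1, 1))

# Merge the training and test sets.
ttdata <- rbind(train, test)

# Keep only features whose name contains the literal text "mean()" or "std()".
# fixed = TRUE stops the parentheses from being read as a regular expression.
keep <- grepl("mean()", features$name, fixed = TRUE) |
        grepl("std()",  features$name, fixed = TRUE)
selectedfeatures <- features[keep, ]
ttdata <- ttdata[, c(selectedfeatures$column, ncol(ttdata) - 1, ncol(ttdata))]

# Replace activity ids with descriptive labels.
ttdata$activity <- factor(ttdata$activity,
                          levels = activities$id, labels = activities$label)

# Label the columns with the feature names plus "activity" and "subject".
names(ttdata) <- c(selectedfeatures$name, "activity", "subject")

# Average every measurement per subject and activity.
tidy <- aggregate(ttdata[, selectedfeatures$name],
                  by = list(subject = ttdata$subject,
                            activity = ttdata$activity),
                  FUN = mean)

# Write the result; a temporary directory is used here to avoid
# overwriting a real run_analysis.txt.
out_file <- file.path(tempdir(), "run_analysis.txt")
write.table(tidy, file = out_file, row.names = FALSE)
```

Note that "tBodyAcc-meanFreq()-X" is dropped in this sketch, because its name contains "meanFreq()" and not the literal text "mean()".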
Roy de Groot