Accidents_of_Toronto_From-_2015_to_2018

title: "Project1" author: "Dhruv Sojitra" date: "202-10-28" output: pdf_document

Importing Datasets

data2015 <- read.csv("2015_KSI.csv")
data2016 <- read.csv("2016_KSI.csv")
data2017 <- read.csv("2017_KSI.csv")
data2018 <- read.csv("2018_KSI.csv")

Merging all datasets into one

Viewing Column Names

colnames(data2015)

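Making all column names the same. The names are assigned by position, so this assumes every file has the same columns in the same order as the 2015 file.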

colnames(data2016) <- colnames(data2015)
colnames(data2017) <- colnames(data2015)
colnames(data2018) <- colnames(data2015)
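Because the renaming above is positional, it can be worth checking that the layouts really match before overwriting the names. The following is a minimal sketch on toy data frames; `a`, `b`, and their columns are hypothetical stand-ins, not the real KSI files.

```r
# Toy stand-ins for two yearly files (hypothetical columns and values).
a <- data.frame(YEAR = 2015, DISTRICT = "Toronto East York", FATAL_NO = 1,
                stringsAsFactors = FALSE)
b <- data.frame(Year = 2016, District = "North York", Fatal_No = 2,
                stringsAsFactors = FALSE)

# Compare the names case-insensitively and position by position.
same_layout <- ncol(a) == ncol(b) &&
  all(toupper(colnames(a)) == toupper(colnames(b)))

if (same_layout) {
  # Safe to copy the reference names across.
  colnames(b) <- colnames(a)
} else {
  stop("Column layouts differ; align the columns before renaming.")
}
```

If the check fails, the columns would need to be reordered or mapped by name rather than overwritten.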

Merging all data sets

dataMerged <- rbind(data2015, data2016, data2017, data2018)
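If more yearly files are added later, the same merge can be written over a list of data frames. This is only a sketch on toy data; the `yearly` list and its columns are hypothetical.

```r
# Toy yearly tables with identical columns (hypothetical values).
yearly <- list(
  data.frame(YEAR = 2015, FATAL_NO = 1),
  data.frame(YEAR = 2016, FATAL_NO = 2),
  data.frame(YEAR = 2017, FATAL_NO = 3)
)

# Stack every table in the list with a single rbind call.
merged <- do.call(rbind, yearly)

# Sanity check: no rows were lost or duplicated by the merge.
stopifnot(nrow(merged) == sum(sapply(yearly, nrow)))
```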

Viewing Data

str(dataMerged)

Creating sub table for Question 1

# Keep only the columns needed for the questions below
dataset_for_fatality <- dataMerged[, c("YEAR", "VEHICLES_IN_STREET", "DISTRICT", "NEIGHBOURHOOD", "FATAL_NO")]

Generating a table of fatalities over the 4 years (2015 to 2018) in each neighbourhood of Toronto.

Here we use the sqldf library, which lets us run SQL queries against a data frame. The GROUP BY clause groups the rows by neighbourhood, and the SUM aggregate function adds up the FATAL_NO values within each neighbourhood.

library(sqldf)
# String literals use single quotes in SQL; double quotes are for identifiers
datares <- sqldf("SELECT DISTRICT, NEIGHBOURHOOD, sum(FATAL_NO) AS Fatalities FROM dataset_for_fatality WHERE DISTRICT LIKE 'Toronto%' GROUP BY NEIGHBOURHOOD")
head(datares,15)
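For readers without sqldf installed, the same filter, group, and sum steps can be approximated in base R. This is a sketch on a toy data frame, not the query used above; `toy` and its values are hypothetical.

```r
# Toy data with the same column names as dataset_for_fatality (hypothetical values).
toy <- data.frame(
  DISTRICT = c("Toronto East York", "Toronto East York",
               "Toronto East York", "North York"),
  NEIGHBOURHOOD = c("A", "A", "B", "C"),
  FATAL_NO = c(1, 2, 5, 4),
  stringsAsFactors = FALSE
)

# Rough equivalent of WHERE DISTRICT LIKE 'Toronto%'.
# Note: startsWith() is case-sensitive, while SQLite's LIKE is not by default.
toronto <- toy[startsWith(toy$DISTRICT, "Toronto"), ]

# Equivalent of GROUP BY NEIGHBOURHOOD with sum(FATAL_NO).
fatalities <- aggregate(FATAL_NO ~ NEIGHBOURHOOD, data = toronto, FUN = sum)
names(fatalities)[2] <- "Fatalities"
```

Here `fatalities` has one row per neighbourhood that passed the filter, with the summed column renamed to match the SQL alias.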

Moving on to Question 2 ...

Generating a table of the total number of vehicles in the street during accidents, for each district, over the 4 years.

# '<Null>' is a text placeholder in the data, so it is compared as a string literal
datares2 <- sqldf("SELECT DISTRICT, sum(VEHICLES_IN_STREET) AS \"Vehicles in Street\" FROM dataset_for_fatality WHERE DISTRICT NOT LIKE '<Null>' GROUP BY DISTRICT")
show(datares2)

The table above shows, for each district, the total number of vehicles in the street during accidents over the 4 years.

Moving on to Question 3...

Generating a table of the top 5 neighbourhoods with the highest average number of vehicles in the street.

# Order by the aggregate itself; ordering by the bare column would not sort by the average
datares3 <- sqldf('SELECT NEIGHBOURHOOD, avg(VEHICLES_IN_STREET) AS "Vehicles in Street" FROM dataset_for_fatality GROUP BY NEIGHBOURHOOD ORDER BY avg(VEHICLES_IN_STREET) DESC LIMIT 5')
show(datares3)
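The "average per group, then take the top N" pattern can also be sketched in base R, which makes the ordering step explicit. The toy data below is hypothetical and uses a top 2 instead of a top 5 to keep it short.

```r
# Toy data (hypothetical values); group means are A = 2, B = 6, C = 4.
toy <- data.frame(
  NEIGHBOURHOOD = c("A", "A", "B", "B", "C"),
  VEHICLES_IN_STREET = c(1, 3, 5, 7, 4),
  stringsAsFactors = FALSE
)

# Equivalent of GROUP BY NEIGHBOURHOOD with avg(VEHICLES_IN_STREET).
avg <- aggregate(VEHICLES_IN_STREET ~ NEIGHBOURHOOD, data = toy, FUN = mean)

# Equivalent of ORDER BY avg(...) DESC LIMIT 2: sort on the averaged column.
top <- head(avg[order(avg$VEHICLES_IN_STREET, decreasing = TRUE), ], 2)
```

The sort is applied to the averaged column, which mirrors ordering by the aggregate in SQL.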
