This repository contains multiple data analysis projects implemented in R. The projects focus on extracting insights from real-world datasets using data visualization and statistical techniques.
- Big Data Analysis Using R.R → Main R script containing analysis for multiple datasets.
- Big Data Analysis Using R.docx → Documentation of the analysis and results.
- README.md → Project overview (this file).
- Dataset: vgsales.csv
- Objective: Identify the sales distribution across genres and find the highest-selling game genre.
- Techniques Used:
  - Pie chart visualization of global sales by genre
  - Percentage contribution analysis
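
The genre analysis above could be sketched roughly as follows. This is a minimal illustration, not the code from the main script: it uses a small made-up data frame in place of vgsales.csv, and the column names `Genre` and `Global_Sales` are assumptions about that file's layout.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for vgsales.csv; the values are invented for illustration.
# With the real file this would be something like read.csv("vgsales.csv"),
# assuming it has Genre and Global_Sales columns.
vgsales <- data.frame(
  Genre = c("Action", "Sports", "Action", "Shooter", "Sports", "Puzzle"),
  Global_Sales = c(10, 6, 5, 4, 3, 2)
)

# Total sales per genre and each genre's percentage share of the overall total
genre_sales <- vgsales %>%
  group_by(Genre) %>%
  summarise(total_sales = sum(Global_Sales), .groups = "drop") %>%
  mutate(pct = round(100 * total_sales / sum(total_sales), 1)) %>%
  arrange(desc(total_sales))

# After sorting, the first row is the highest-selling genre
top_genre <- genre_sales$Genre[1]

# A pie chart in ggplot2 is a stacked bar drawn in polar coordinates
pie_plot <- ggplot(genre_sales, aes(x = "", y = total_sales, fill = Genre)) +
  geom_col(width = 1) +
  coord_polar(theta = "y") +
  geom_text(aes(label = paste0(pct, "%")),
            position = position_stack(vjust = 0.5)) +
  labs(title = "Global sales by genre", x = NULL, y = NULL) +
  theme_void()

print(pie_plot)
```

Computing the percentages in the summary table first keeps the chart labels and the "highest-selling genre" answer consistent with each other.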
- Dataset: SeoulBikeData.csv
- Objectives:
  - Analyze seasonal and monthly demand for bike rentals
  - Study relationships with weather conditions
- Techniques Used:
  - Line charts for bike count and temperature trends
  - Pie chart of bike rentals by season
  - Scatter plots for bike count vs temperature and bike count vs rainfall
  - Bar charts for holiday vs non-holiday usage
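
As a rough sketch of the seasonal, monthly, and weather analysis listed above, the example below aggregates rentals and plots bike count against temperature. It runs on a small invented data frame rather than SeoulBikeData.csv; the simplified column names (`date`, `bike_count`, `temperature`, `season`) and the day/month/year date format are assumptions, so the real file's headers would need to be mapped accordingly.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Toy stand-in for SeoulBikeData.csv; values and column names are
# illustrative assumptions, not taken from the real dataset.
bikes <- data.frame(
  date = c("01/12/2017", "15/01/2018", "10/04/2018",
           "20/07/2018", "05/08/2018", "12/10/2018"),
  bike_count = c(200, 150, 700, 1200, 1100, 800),
  temperature = c(-2, -5, 13, 28, 30, 15),
  season = c("Winter", "Winter", "Spring", "Summer", "Summer", "Autumn")
)

# Parse day/month/year strings and derive a month label for grouping
bikes <- bikes %>%
  mutate(date = dmy(date),
         month_label = month(date, label = TRUE))

# Seasonal and monthly demand
season_totals <- bikes %>%
  group_by(season) %>%
  summarise(total_rentals = sum(bike_count), .groups = "drop")

monthly_totals <- bikes %>%
  group_by(month_label) %>%
  summarise(total_rentals = sum(bike_count), .groups = "drop")

# Strength of the linear relationship between temperature and rentals
temp_cor <- cor(bikes$temperature, bikes$bike_count)

# Scatter plot with a fitted straight line
scatter_plot <- ggplot(bikes, aes(x = temperature, y = bike_count)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(title = "Bike rentals vs temperature",
       x = "Temperature", y = "Rented bikes")

print(scatter_plot)
```

The same pattern (swap `temperature` for a rainfall column) would cover the bike count vs rainfall scatter plot.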
- Datasets:
  - CallVoiceQualityExperience-2018-April.csv
  - CallVoiceQuality_Data_2018_May.csv
- Objective: Evaluate call quality across states, operators, network types, and conditions (indoor/outdoor/travelling).
- Techniques Used:
  - Bar charts for operator quality ratings, state-wise performance, and network type analysis
  - Heatmap of state vs network type ratings
  - Horizontal bar charts for call drop categories
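
One plausible way to build the state vs network type heatmap is sketched below. It combines two small invented data frames standing in for the April and May files; the column names `State`, `Network`, and `Rating` are hypothetical and would need to match the real CSV headers.

```r
library(dplyr)
library(ggplot2)
library(reshape2)

# Toy stand-ins for the April and May call quality files.
# Column names and values are assumptions made for illustration only.
april <- data.frame(
  State = c("Delhi", "Delhi", "Kerala"),
  Network = c("4G", "3G", "4G"),
  Rating = c(4, 2, 5)
)
may <- data.frame(
  State = c("Delhi", "Kerala", "Kerala"),
  Network = c("4G", "3G", "4G"),
  Rating = c(2, 3, 3)
)

# Stack both months into one table
calls <- bind_rows(april, may)

# Mean rating for every state / network type combination
mean_ratings <- calls %>%
  group_by(State, Network) %>%
  summarise(mean_rating = mean(Rating), .groups = "drop")

# Optional wide view (one column per network type) via reshape2
ratings_wide <- dcast(mean_ratings, State ~ Network, value.var = "mean_rating")

# Heatmap: one tile per combination, coloured by mean rating
heatmap_plot <- ggplot(mean_ratings,
                       aes(x = Network, y = State, fill = mean_rating)) +
  geom_tile(color = "white") +
  geom_text(aes(label = round(mean_rating, 1))) +
  scale_fill_gradient(low = "lightyellow", high = "darkgreen") +
  labs(title = "Mean call quality rating (State x Network Type)",
       x = "Network type", y = "State", fill = "Mean rating")

print(heatmap_plot)
```

Keeping the data in long form for `geom_tile()` and using `dcast()` only for the tabular view avoids reshaping back and forth.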
- R Language
- Libraries:
  - ggplot2 → Data visualization
  - dplyr → Data manipulation
  - lubridate → Date handling
  - reshape2 → Data reshaping
- Clone the repository:
  git clone https://github.com/Mahak0747/Big-Data-Analysis.git
  cd Big-Data-Analysis
- Open Big Data Analysis Using R.R in RStudio, or run it from the R console.
- Make sure the datasets (vgsales.csv, SeoulBikeData.csv, CallVoiceQualityExperience-2018-April.csv, CallVoiceQuality_Data_2018_May.csv) are in your working directory. Update the file paths in the script if needed.
- Install the required libraries if they are not already installed:
  install.packages(c("ggplot2", "dplyr", "lubridate", "reshape2"))
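
The setup steps above can also be checked from within R before running the main script. The snippet below is an optional sketch, not part of the repository: it installs only the packages that are missing and reports any dataset files that are not in the working directory. The CRAN mirror URL is an assumption; any mirror works.

```r
# Packages the analysis relies on
required_packages <- c("ggplot2", "dplyr", "lubridate", "reshape2")

# Install only what is not already present
missing_packages <- setdiff(required_packages, rownames(installed.packages()))
if (length(missing_packages) > 0) {
  # Setting repos explicitly lets this run non-interactively (e.g. via Rscript)
  install.packages(missing_packages, repos = "https://cloud.r-project.org")
}

# Dataset files the script expects in the working directory
data_files <- c(
  "vgsales.csv",
  "SeoulBikeData.csv",
  "CallVoiceQualityExperience-2018-April.csv",
  "CallVoiceQuality_Data_2018_May.csv"
)

# Report anything that cannot be found
missing_files <- data_files[!file.exists(data_files)]
if (length(missing_files) > 0) {
  message("Not found in ", getwd(), ": ",
          paste(missing_files, collapse = ", "))
} else {
  message("All dataset files found.")
}
```

If files are reported missing, either move them into the directory shown or call `setwd()` with the folder that contains them.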
- Video game sales by genre
- Bike rentals vs temperature
- Call quality ratings across states
- Heatmap of call quality (State × Network Type)
CSE (AI) Student
Interests: Data Analysis, Machine Learning, Game Development