Exploratory Data Analysis (EDA) of global video game sales to uncover trends across genres, platforms, and regions using Python data tools.
This project explores the Video Game Sales with Ratings dataset from Kaggle to answer key analytical questions such as:
- Which game genres and platforms have dominated sales over time?
- How do critic and user ratings relate to global sales?
- Which publishers achieved the highest commercial success?
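The first and third questions above come down to grouping sales by a category and summing. The following is a minimal sketch on a few made-up rows, not the notebook's actual code. It assumes the CSV uses the column names `Genre`, `Platform`, `Publisher`, `Year_of_Release`, and `Global_Sales`; the helper names `total_sales_by` and `sales_over_time` are hypothetical.

```python
import pandas as pd

# Toy rows for illustration only: the values are made up, not taken from the dataset.
# The column names are assumed to match the Kaggle CSV.
toy = pd.DataFrame({
    "Genre": ["Sports", "Action", "Sports", "Shooter", "Action"],
    "Platform": ["Wii", "PS3", "Wii", "X360", "PS3"],
    "Publisher": ["A", "B", "A", "C", "B"],
    "Year_of_Release": [2006, 2008, 2009, 2010, 2010],
    "Global_Sales": [10.0, 4.0, 6.0, 5.0, 3.0],
})


def total_sales_by(df, column):
    """Sum global sales per category, largest first (hypothetical helper)."""
    return df.groupby(column)["Global_Sales"].sum().sort_values(ascending=False)


def sales_over_time(df, column):
    """Build a year x category table of summed global sales (hypothetical helper)."""
    return df.pivot_table(
        index="Year_of_Release",
        columns=column,
        values="Global_Sales",
        aggfunc="sum",
        fill_value=0,
    )


genre_totals = total_sales_by(toy, "Genre")
publisher_totals = total_sales_by(toy, "Publisher")
genre_by_year = sales_over_time(toy, "Genre")

print(genre_totals)
print(genre_by_year)
```

Passing `"Platform"` instead of `"Genre"` gives the same views per platform.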
Tools used:
- Python 3.11
- Pandas, NumPy – data cleaning and transformation
- Seaborn, Matplotlib – static visualization
- Plotly – interactive charts
- Jupyter Notebook – exploration workflow
- Kaggle API – automated data acquisition
Project structure:
video-game-sales-analysis/
├── data/ # Dataset (downloaded via Kaggle API)
├── figures/ # Saved plots and visualizations
├── notebooks/ # Jupyter notebooks for analysis
│ └── Video_Game_Sales_EDA.ipynb
├── environment.yml # Conda environment file
├── requirements.txt # Pip dependencies
├── .gitignore
└── README.md
Set up the environment with Conda:
conda env create -f environment.yml
conda activate vg-sales-env
Or with pip in a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
pip install -r requirements.txt
To use the Kaggle command below, you must first set up your Kaggle API credentials:
- Go to your Kaggle Account Settings, or create an account if you don't have one.
- From your profile dashboard, click Settings, scroll to API, and click Create New API Token. This downloads a file named kaggle.json.
- Move the file to the hidden .kaggle folder in your home directory:
mkdir -p ~/.kaggle
mv ~/Downloads/kaggle.json ~/.kaggle/
chmod 600 ~/.kaggle/kaggle.json
~/.kaggle/ is where the Kaggle CLI looks for your credentials. chmod 600 restricts the key file so that only your user can read and write it.
- Verify installation:
kaggle --version
If you see a version number (e.g., Kaggle API 1.7.4.5), your setup is correct.
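The credential steps above can also be checked from Python. The sketch below is an optional sanity check, not part of the Kaggle CLI: the function name `check_kaggle_credentials` is hypothetical, and it assumes the token file is a JSON object with `username` and `key` fields. It confirms only that the file exists, is private, and is well-formed, not that Kaggle accepts the key.

```python
import json
import os
import stat
from pathlib import Path


def check_kaggle_credentials(path=None):
    """Return a list of problems with the credentials file (empty if none found)."""
    path = Path(path) if path is not None else Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        return [f"{path} not found"]

    problems = []

    # POSIX permission bits are not meaningful on Windows, so skip the check there.
    if os.name != "nt":
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & 0o077:  # any group/other permission bit is set
            problems.append(f"{path} has mode {oct(mode)}; run chmod 600 on it")

    try:
        data = json.loads(path.read_text())
    except json.JSONDecodeError:
        return problems + [f"{path} is not valid JSON"]

    # Assumed token layout: a JSON object with non-empty "username" and "key".
    for field in ("username", "key"):
        if not isinstance(data, dict) or not data.get(field):
            problems.append(f"{path} is missing the '{field}' field")
    return problems


if __name__ == "__main__":
    issues = check_kaggle_credentials()
    print("Credentials file looks fine" if not issues else "\n".join(issues))
```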
Once configured, run this in a terminal (or in a notebook cell, prefixed with !):
kaggle datasets download -d rush4ratio/video-game-sales-with-ratings -p data/ --unzip
This command automatically creates the data/ folder (if missing) and extracts the CSV file for analysis.
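Once the download finishes, the CSV can be loaded with pandas. Since this README does not name the extracted file, the sketch below looks for whatever CSV is in `data/` rather than hard-coding a filename. The function name `load_sales_csv` is hypothetical.

```python
from pathlib import Path

import pandas as pd


def load_sales_csv(data_dir="data"):
    """Load the first CSV found in data_dir into a DataFrame (hypothetical helper)."""
    csv_files = sorted(Path(data_dir).glob("*.csv"))
    if not csv_files:
        raise FileNotFoundError(
            f"No CSV file found in {data_dir}/; run the kaggle download command first"
        )
    # If several CSVs are present, the alphabetically first one is used.
    return pd.read_csv(csv_files[0])


if __name__ == "__main__":
    df = load_sales_csv()
    print(df.shape)
    df.info()
```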
Notebook workflow:
- Import & Setup – Load libraries and set styles
- Data Overview – Inspect structure, missing values, and summary stats
- Cleaning & Fixes – Handle missing data and inconsistent values
- Feature Engineering – Create features like game_age
- Univariate Analysis – Examine distributions of sales, genres, and scores
- Bivariate Analysis – Explore relationships (e.g., critic score vs global sales)
- Summary & Conclusions – Highlight insights and findings
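The cleaning, feature engineering, and bivariate steps in this workflow could look roughly like the sketch below, which runs on a few made-up rows instead of the real file. Several things here are assumptions, not taken from the notebook: the column names, that `User_Score` may contain non-numeric placeholders such as "tbd", that `game_age` means a reference year minus the release year, and the hypothetical `REFERENCE_YEAR` value.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration only: the values are made up.
raw = pd.DataFrame({
    "Name": ["Game A", "Game B", "Game C", "Game D"],
    "Year_of_Release": [2006.0, 2010.0, np.nan, 2014.0],
    "Global_Sales": [8.0, 2.0, 1.0, 4.0],
    "Critic_Score": [90.0, 60.0, 55.0, 75.0],
    "User_Score": ["8.5", "tbd", "6.0", "7.0"],
})

REFERENCE_YEAR = 2016  # assumption: the year the ages are measured against


def clean(df):
    """Drop rows without a release year and make the score columns numeric."""
    return df.dropna(subset=["Year_of_Release"]).assign(
        Year_of_Release=lambda d: d["Year_of_Release"].astype(int),
        # Non-numeric placeholders such as "tbd" become NaN.
        User_Score=lambda d: pd.to_numeric(d["User_Score"], errors="coerce"),
    )


def add_game_age(df, reference_year=REFERENCE_YEAR):
    """Add game_age as reference_year minus release year (assumed definition)."""
    return df.assign(game_age=reference_year - df["Year_of_Release"])


cleaned = add_game_age(clean(raw))

# Bivariate step: Pearson correlation between critic score and global sales.
critic_vs_sales = cleaned["Critic_Score"].corr(cleaned["Global_Sales"])

print(cleaned[["Name", "game_age", "User_Score"]])
print(f"Critic score vs global sales (Pearson r): {critic_vs_sales:.2f}")
```

Dropping rows with a missing year is only one option; the notebook may impute or keep them instead. Sales figures are often heavily skewed, so a rank-based correlation may be worth comparing against the Pearson value.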
Learning goals:
- Apply a complete EDA workflow from setup to visualization
- Use the Kaggle API for reproducible dataset acquisition
- Document findings clearly with Markdown and visual summaries
- Strengthen portfolio presentation for data analysis roles
Dataset provided under Kaggle’s data-sharing license.
Project notebooks and analysis are released under the MIT License.