Exploratory Data Analysis of New York City Airbnb listings, focusing on pricing, availability, host behavior, and neighborhood patterns across the five boroughs.
This project uses the NYC Airbnb Open Data dataset from Kaggle to answer those questions. The notebook follows an end-to-end workflow, from setup and data cleaning through feature engineering and analysis to a summary of insights, and is intended as a portfolio project.
Tools:
- Python 3.11
- Pandas, NumPy
- Matplotlib, Seaborn
- Jupyter Notebook
- Conda environment
nyc-airbnb-analysis/
├── data/
├── figures/
├── notebooks/
│ └── NYC_Airbnb_EDA.ipynb
├── environment.yml
├── requirements.txt
├── .gitignore
└── README.md
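Set up the environment with either Conda or venv; you only need one.
Option 1, Conda:
conda env create -f environment.yml
conda activate nyc-airbnb-env
Option 2, venv and pip (macOS/Linux shell):
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt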
Download from Kaggle:
https://www.kaggle.com/datasets/airbnb/new-york-city
Place the CSV inside the data/ folder.
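As a minimal sketch of loading the file once it is in place: this README does not state the CSV's file name, so the example below simply picks the first `.csv` it finds in the given folder. `find_csv` and `load_listings` are hypothetical helpers written for illustration, not functions from the notebook.

```python
from pathlib import Path

import pandas as pd


def find_csv(data_dir: Path) -> Path:
    """Return the first CSV file in data_dir (hypothetical helper)."""
    csv_files = sorted(data_dir.glob("*.csv"))
    if not csv_files:
        raise FileNotFoundError(f"No CSV file found in {data_dir}")
    return csv_files[0]


def load_listings(data_dir: Path) -> pd.DataFrame:
    """Load the listings CSV found in data_dir into a DataFrame."""
    return pd.read_csv(find_csv(data_dir))
```

From a notebook stored in notebooks/, this might be called as `df = load_listings(Path("../data"))`, assuming Jupyter is started from that folder; adjust the relative path otherwise.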
Notebook sections:
- Import & Setup
- Data Overview
- Cleaning & Fixes
- Feature Engineering
- Univariate Analysis
- Bivariate Analysis
- Summary & Insights
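The cleaning and feature engineering steps listed above might look roughly like the sketch below. The specific rules (dropping zero-price rows, filling missing review rates with 0) and the derived columns are assumptions chosen for illustration, not a record of what the notebook does. The column names are assumed to match the Kaggle CSV, and the rows are made-up toy values.

```python
import numpy as np
import pandas as pd

# Toy rows made up for illustration; they are NOT taken from the real dataset.
# Column names are assumed to match the Kaggle CSV.
raw = pd.DataFrame(
    {
        "neighbourhood_group": ["Manhattan", "Brooklyn", "Queens", "Brooklyn"],
        "price": [200, 0, 80, 120],
        "reviews_per_month": [1.5, 0.2, np.nan, 2.0],
        "last_review": ["2019-05-01", "2019-01-10", None, "2019-06-20"],
        "availability_365": [365, 0, 120, 30],
    }
)


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Cleaning step (assumed rules): drop zero prices, fill missing review rates."""
    out = df[df["price"] > 0].copy()
    # A listing with no reviews has no monthly review rate; treat it as 0.
    out["reviews_per_month"] = out["reviews_per_month"].fillna(0.0)
    # Parse dates; missing values become NaT.
    out["last_review"] = pd.to_datetime(out["last_review"])
    return out


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Feature engineering step (assumed features, for illustration only)."""
    out = df.copy()
    # log1p compresses the long right tail typical of price data.
    out["log_price"] = np.log1p(out["price"])
    out["is_available_year_round"] = out["availability_365"] == 365
    return out


listings = add_features(clean(raw))
print(listings.head())
```

Keeping each step as a small function that returns a new DataFrame makes it easy to rerun the notebook from the top without mutating the raw data.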
Key findings:
- Manhattan has the highest prices of the five boroughs.
- Brooklyn has a large number of listings at moderate prices.
- Lower-priced listings tend to have more guest reviews.
- Host activity varies sharply with the number of listings a host manages.
- Availability shows seasonal and behavioral patterns.
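Observations like the ones above can be checked with a few grouped summaries. The sketch below shows one possible way to do that. It assumes the column names of the Kaggle CSV (`neighbourhood_group`, `price`, `number_of_reviews`, `calculated_host_listings_count`) and runs on made-up toy rows, so its numbers say nothing about the real data.

```python
import pandas as pd

# Toy rows made up for illustration; NOT real results from the dataset.
df = pd.DataFrame(
    {
        "neighbourhood_group": [
            "Manhattan", "Manhattan", "Brooklyn", "Brooklyn", "Brooklyn", "Queens",
        ],
        "price": [250, 180, 120, 90, 110, 70],
        "number_of_reviews": [5, 12, 30, 45, 25, 60],
        "calculated_host_listings_count": [8, 1, 1, 2, 1, 1],
    }
)

# Price level and listing volume per borough.
by_borough = df.groupby("neighbourhood_group")["price"].agg(
    median_price="median", listings="count"
)

# Rank correlation between price and review count
# (Spearman, computed as the Pearson correlation of the ranks).
price_review_corr = df["price"].rank().corr(df["number_of_reviews"].rank())

# Compare hosts with a single listing against hosts with several.
df["multi_listing_host"] = df["calculated_host_listings_count"] > 1
reviews_by_host_type = df.groupby("multi_listing_host")["number_of_reviews"].mean()

print(by_borough)
print(price_review_corr)
print(reviews_by_host_type)
```

The median is used instead of the mean because a handful of very expensive listings can distort average prices. A negative rank correlation would be consistent with cheaper listings having more reviews, but it does not show that low prices cause them.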
Project code is released under the MIT License.
The dataset is subject to the terms set by Kaggle and Inside Airbnb.