Datathon-2025: Ski Resort Visitor Forecasting

This repository contains a data science project for the Inter-University Datathon 2025, focusing on predicting ski resort visitor numbers using historical visitation data and climate factors.

🎯 Project Overview

The project aims to forecast weekly visitor numbers for Australian ski resorts for the 2026 ski season using:

  • Historical visitation data (2021-2025) across multiple ski resorts
  • Climate data including temperature, rainfall/snowfall, freeze days, and sub-zero nights
  • Machine learning techniques: time series forecasting with Prophet and DARTS, and gradient boosting with XGBoost

📊 Key Components

Core Analysis Files

  • Datathon2025.ipynb — Main analysis notebook featuring data preprocessing, feature engineering, Prophet time series forecasting, and visitor prediction models
  • EDA.ipynb — Exploratory data analysis with comprehensive visualizations and statistical insights
  • data/2025 Allianz Datathon Dataset.xlsx — Primary dataset containing visitation and climate data

Workshop Materials

  • notebooks/Copy_of_DS3_EDA.ipynb — Educational EDA notebook from workshop series
  • workshops/DataSoc_2025_Interuni_Workshop_DARTS — Time series forecasting techniques with the DARTS library
  • workshops/DataSoc_2025_Interuni_Workshop_XGBoost — XGBoost ensemble learning methods
  • docs/ — Competition case brief, information packs, and workshop slides

Workshop recordings

🛠️ Technical Approach

Data Processing Pipeline

  1. Data Loading — Import visitation and climate data from Excel sheets
  2. Data Cleaning — Handle missing values, outliers, and data type conversions
  3. Feature Engineering — Create climate-based indicators (freeze days, temperature comfort scores)
  4. Resort Mapping — Match climate stations to ski resort locations
  5. Time Series Preparation — Structure data for weekly forecasting
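
The pipeline steps above can be sketched in pandas. This is a minimal illustration rather than the notebook's actual code: the column names (`date`, `station`, `max_temp_c`, `min_temp_c`, `rainfall_mm`), the station-to-resort mapping, and the threshold definitions of a freeze day and a sub-zero night are all assumptions made for the example, and the demo frame stands in for the Excel data.

```python
import pandas as pd


def weekly_climate_features(daily: pd.DataFrame, station_to_resort: dict) -> pd.DataFrame:
    """Aggregate daily station readings into weekly climate indicators.

    Assumes hypothetical columns: date, station, max_temp_c, min_temp_c, rainfall_mm.
    """
    df = daily.copy()

    # Data cleaning: enforce types; unparseable readings become NaN.
    df["date"] = pd.to_datetime(df["date"])
    for col in ["max_temp_c", "min_temp_c", "rainfall_mm"]:
        df[col] = pd.to_numeric(df[col], errors="coerce")

    # Feature engineering (assumed definitions, not taken from the notebook):
    # a freeze day has a daily maximum at or below 0 °C,
    # a sub-zero night has a daily minimum below 0 °C.
    # A missing reading compares as False, so it is not counted.
    df["freeze_day"] = (df["max_temp_c"] <= 0).astype(int)
    df["sub_zero_night"] = (df["min_temp_c"] < 0).astype(int)

    # Time series preparation: one row per station per week (weeks end on Sunday).
    weekly = (
        df.groupby(["station", pd.Grouper(key="date", freq="W-SUN")])
        .agg(
            mean_max_temp_c=("max_temp_c", "mean"),
            total_rainfall_mm=("rainfall_mm", "sum"),
            freeze_days=("freeze_day", "sum"),
            sub_zero_nights=("sub_zero_night", "sum"),
        )
        .reset_index()
        .rename(columns={"date": "week_ending"})
    )

    # Resort mapping: attach the resort served by each climate station.
    weekly["resort"] = weekly["station"].map(station_to_resort)
    return weekly


# Synthetic two-week sample standing in for the real climate sheet.
demo = pd.DataFrame({
    "date": pd.date_range("2024-07-01", periods=14, freq="D"),
    "station": "station_a",
    "max_temp_c": [-1, 2, 3, -2, 0, 4, 5, 1, 1, 1, 1, 1, 1, 1],
    "min_temp_c": [-5, -1, 0, -6, -3, 1, 2, -1, -1, 0, 0, 0, 0, 0],
    "rainfall_mm": [0, 5, 0, 10, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
})
weekly = weekly_climate_features(demo, {"station_a": "Resort A"})
print(weekly)
```

For the real dataset, `pd.read_excel(path, sheet_name=None)` returns a dictionary of DataFrames keyed by sheet name, which is one way to load the visitation and climate sheets together. The sheet names themselves are not listed in this README.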

Machine Learning Models

  • Prophet — Facebook's time series forecasting tool for seasonal visitor patterns
  • DARTS — Time series forecasting library offering both classical and deep learning models
  • XGBoost — Gradient boosting for capturing complex feature interactions
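
As a rough sketch of how weekly data might be shaped for these models: Prophet requires a frame with a `ds` (timestamp) column and a `y` (target) column, while a tree model such as XGBoost typically needs the series turned into tabular features, for example lagged values. The helper names and the `week_ending` and `visitors` columns below are hypothetical, and the Prophet call is imported lazily so the rest of the example runs without it installed.

```python
import pandas as pd


def to_prophet_frame(weekly: pd.DataFrame, date_col: str, target_col: str) -> pd.DataFrame:
    """Rename to the ds/y columns Prophet expects and sort chronologically."""
    out = weekly[[date_col, target_col]].rename(columns={date_col: "ds", target_col: "y"})
    out["ds"] = pd.to_datetime(out["ds"])
    return out.sort_values("ds").reset_index(drop=True)


def add_lag_features(frame: pd.DataFrame, target_col: str, lags=(1, 2)) -> pd.DataFrame:
    """Add lagged copies of the target so a tabular model can see recent history."""
    out = frame.copy()
    for lag in lags:
        out[f"{target_col}_lag{lag}"] = out[target_col].shift(lag)
    # The first max(lags) rows have no full history, so they are dropped.
    return out.dropna().reset_index(drop=True)


def forecast_with_prophet(history: pd.DataFrame, periods: int) -> pd.DataFrame:
    """Fit a default Prophet model and forecast `periods` weeks ahead."""
    from prophet import Prophet  # optional dependency, imported only when called

    model = Prophet()
    model.fit(history)
    future = model.make_future_dataframe(periods=periods, freq="W")
    return model.predict(future)[["ds", "yhat"]]


# Hypothetical weekly visitor counts, deliberately out of order.
demo = pd.DataFrame({
    "week_ending": ["2024-07-14", "2024-07-07", "2024-07-21", "2024-07-28"],
    "visitors": [120, 100, 150, 130],
})
prophet_df = to_prophet_frame(demo, "week_ending", "visitors")
lagged = add_lag_features(prophet_df, "y", lags=(1, 2))
print(lagged)
```

The lagged frame can be passed to any regressor with a scikit-learn style `fit`/`predict` interface. When evaluating, splitting by time rather than at random avoids leaking future weeks into training.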

Key Features

  • Snow Reliability Score (SRS) — Composite metric combining snowfall, freeze days, and thaw risk
  • Weather Comfort Score (WCS) — Temperature-based visitor comfort index
  • Historical Patterns — Seasonal trends and year-over-year visitor growth
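
The exact formulas for SRS and WCS are not given in this README, so the following is only one plausible construction. It assumes equal weights over min-max normalized inputs for SRS and a linear falloff around an ideal temperature for WCS. The column names (`snowfall_cm`, `freeze_days`, `thaw_days`) and the `ideal_c` and `tolerance_c` parameters are illustrative assumptions.

```python
import pandas as pd


def min_max(series: pd.Series) -> pd.Series:
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    span = series.max() - series.min()
    if span == 0:
        return pd.Series(0.0, index=series.index)
    return (series - series.min()) / span


def snow_reliability_score(df: pd.DataFrame) -> pd.Series:
    """Hypothetical SRS: more snowfall and freeze days raise it, thaw days lower it."""
    return (
        min_max(df["snowfall_cm"])
        + min_max(df["freeze_days"])
        + (1 - min_max(df["thaw_days"]))
    ) / 3  # equal weights are an assumption


def weather_comfort_score(mean_temp_c: pd.Series, ideal_c: float = -2.0,
                          tolerance_c: float = 10.0) -> pd.Series:
    """Hypothetical WCS: 1 at the ideal temperature, falling linearly to 0."""
    distance = (mean_temp_c - ideal_c).abs()
    return (1 - distance / tolerance_c).clip(lower=0.0)


# Three made-up weeks, from poor to good snow conditions.
demo = pd.DataFrame({
    "snowfall_cm": [0, 10, 20],
    "freeze_days": [0, 3, 6],
    "thaw_days": [6, 3, 0],
    "mean_temp_c": [-2.0, 3.0, 12.0],
})
demo["srs"] = snow_reliability_score(demo)
demo["wcs"] = weather_comfort_score(demo["mean_temp_c"])
print(demo[["srs", "wcs"]])
```

Normalizing each input before combining keeps any single unit (centimetres versus day counts) from dominating the composite. The weights would normally be tuned or justified against the visitation data.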

📈 Results & Insights

The analysis reveals strong correlations between climate factors and ski resort visitation, with models achieving robust forecasting performance for the 2026 ski season across multiple Australian ski resorts.

🚀 How to use these resources

  1. Start with the dataset (data/2025 Allianz Datathon Dataset.xlsx) and case reveal document to understand the problem scope
  2. Explore the data using EDA.ipynb for comprehensive visualizations and statistical insights
  3. Run the main analysis with Datathon2025.ipynb to see the complete modeling pipeline
  4. Learn techniques from workshop materials covering DARTS and XGBoost implementations
  5. Reference documentation in docs/ for competition guidelines and methodology

🏆 Competition Context

This project was developed for the Inter-University Datathon 2025, sponsored by Allianz, focusing on practical applications of data science in the tourism and recreation industry.


Repository maintained by: keyurgohel1
