ashiq-km/Ames-Housing-Dataset

Ames Housing Price Prediction 🏠📈

This repository contains an end-to-end regression project using the
Ames Housing dataset from Kaggle.

The goal is to predict house sale prices by applying:

  • classical data preprocessing techniques
  • feature engineering
  • a neural network built with the TensorFlow Keras Functional API

This project is primarily for learning and experimentation, not leaderboard chasing.


📌 Dataset

  • Source: Kaggle competition "House Prices: Advanced Regression Techniques"
  • Target variable: SalePrice
  • Data includes:
    • numerical features
    • categorical features
    • missing values
    • skewed distributions

⚠️ Dataset files are not included in this repository due to Kaggle’s licensing rules.


🗂️ Project Structure

.
├── data/ # Local dataset (ignored by git)
├── notebooks/ # EDA & experiments
├── src/ # Preprocessing & model code
├── models/ # Saved models (optional)
├── README.md
├── .gitignore
└── requirements.txt

🔁 Workflow / Steps Followed

1️⃣ Data Acquisition

  • Download dataset using Kaggle CLI
  • Store raw data locally under data/
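As a minimal sketch of this step, the helper below reads the training file once it has been downloaded. The file name `train.csv`, the competition slug in the comment, and the function `load_training_data` are assumptions made for illustration; they are not taken from this repository's code.

```python
from pathlib import Path

import pandas as pd

# Assumption: the competition archive has been downloaded and unzipped into
# data/, for example with the Kaggle CLI:
#   kaggle competitions download -c house-prices-advanced-regression-techniques -p data/
DATA_DIR = Path("data")


def load_training_data(data_dir: Path = DATA_DIR, filename: str = "train.csv") -> pd.DataFrame:
    """Read the training CSV, failing with a clear message if it is missing."""
    csv_path = Path(data_dir) / filename
    if not csv_path.exists():
        raise FileNotFoundError(
            f"{csv_path} not found; download the dataset into {data_dir}/ first."
        )
    return pd.read_csv(csv_path)
```

Calling `load_training_data()` from the repository root would then return the raw data as a DataFrame, provided the file exists locally.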

2️⃣ Exploratory Data Analysis (EDA)

  • Understand feature distributions
  • Analyze target variable (SalePrice)
  • Identify:
    • missing values
    • skewness
    • outliers
  • Visualizations:
    • histograms
    • correlation heatmaps
    • pair plots (selected features vs. the target)
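The checks above can be sketched with plain pandas. The helpers `summarize_features` and `iqr_outlier_mask` and the tiny synthetic frame are hypothetical stand-ins for illustration; the 1.5 × IQR rule is one common outlier heuristic, not necessarily the one used in the notebooks.

```python
import numpy as np
import pandas as pd


def summarize_features(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column share of missing values and skewness (numeric columns only)."""
    summary = pd.DataFrame({
        "missing_frac": df.isna().mean(),
        "skew": df.select_dtypes(include="number").skew(),
    })
    return summary.sort_values("missing_frac", ascending=False)


def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Tiny synthetic frame standing in for the real data (all values are made up).
demo = pd.DataFrame({
    "SalePrice": [100_000, 120_000, 130_000, 150_000, 900_000],
    "LotFrontage": [60.0, np.nan, 80.0, 70.0, np.nan],
    "Neighborhood": ["A", "B", "A", None, "B"],
})

print(summarize_features(demo))
print(iqr_outlier_mask(demo["SalePrice"]))
```

A strongly positive skew on `SalePrice` is the usual signal that a log transform is worth trying in the preprocessing step.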

3️⃣ Data Preprocessing

  • Handle missing values
  • Encode categorical variables
  • Scale numerical features
  • Log-transform skewed variables
  • Train–test split (imputers, encoders, and scalers are fit on the training split only, to avoid leakage)
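One possible way to wire these steps together is a scikit-learn `ColumnTransformer`. This is a sketch under stated assumptions: the column names, the median and most-frequent imputation strategies, and the helper `build_preprocessor` are illustrative choices, and the data is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(numeric_cols, categorical_cols) -> ColumnTransformer:
    """Impute + scale numeric columns, impute + one-hot encode categorical ones."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer(
        [("num", numeric, numeric_cols), ("cat", categorical, categorical_cols)],
        sparse_threshold=0.0,  # always return a dense array, convenient for Keras
    )


# Synthetic stand-in for the real data (values are made up).
rng = np.random.default_rng(0)
n = 40
demo = pd.DataFrame({
    "GrLivArea": rng.normal(1500, 300, n),
    "LotFrontage": np.where(rng.random(n) < 0.2, np.nan, rng.normal(70, 10, n)),
    "Neighborhood": rng.choice(["A", "B", "C"], n),
})
demo["SalePrice"] = 100 * demo["GrLivArea"] + rng.normal(0, 5000, n)

X = demo.drop(columns="SalePrice")
y = np.log1p(demo["SalePrice"])  # log-transform the skewed target

# Split first, then fit the preprocessing on the training rows only.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

preprocessor = build_preprocessor(["GrLivArea", "LotFrontage"], ["Neighborhood"])
X_train_t = preprocessor.fit_transform(X_train)
X_test_t = preprocessor.transform(X_test)
print(X_train_t.shape, X_test_t.shape)
```

Fitting on the training split and only transforming the test split keeps test-set statistics out of the scaler and imputer.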

4️⃣ Model Building (Keras Functional API)

  • Define input layers explicitly
  • Build dense neural network with:
    • multiple hidden layers
    • ReLU activations
    • output layer for regression
  • Compile model with:
    • optimizer (Adam)
    • loss function (MSE / MAE)
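The architecture described above might look roughly like the following. The layer sizes, the learning rate, and the function name `build_regressor` are assumptions for illustration and may differ from the code in `src/`.

```python
from tensorflow import keras


def build_regressor(n_features: int, hidden_units=(64, 32), learning_rate: float = 1e-3) -> keras.Model:
    """Dense regression network defined with the Keras Functional API."""
    # Explicit input layer: one vector of preprocessed features per house.
    inputs = keras.Input(shape=(n_features,), name="features")

    x = inputs
    for i, units in enumerate(hidden_units):
        x = keras.layers.Dense(units, activation="relu", name=f"hidden_{i}")(x)

    # Single linear unit: the predicted (possibly log-transformed) sale price.
    outputs = keras.layers.Dense(1, name="sale_price")(x)

    model = keras.Model(inputs=inputs, outputs=outputs, name="ames_regressor")
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="mse",
        metrics=["mae"],
    )
    return model


model = build_regressor(n_features=10)
model.summary()
```

Using MSE as the loss and tracking MAE as a metric covers both options listed above; swapping `loss="mae"` is a one-line change.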

5️⃣ Model Training

  • Train on training data
  • Validate on hold-out set
  • Monitor:
    • training loss
    • validation loss
  • Tune:
    • number of layers
    • neurons
    • learning rate
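A rough sketch of a training run that tracks both losses is shown below, using synthetic data so it runs on its own. The `EarlyStopping` callback, the 80/20 validation split, and all hyperparameter values are assumptions added for illustration; the source does not say which were used.

```python
import numpy as np
from tensorflow import keras

# Synthetic regression data standing in for the preprocessed features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)).astype("float32")
true_weights = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = (X @ true_weights + rng.normal(scale=0.1, size=200)).astype("float32").reshape(-1, 1)

inputs = keras.Input(shape=(5,))
x = keras.layers.Dense(16, activation="relu")(inputs)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-2), loss="mse")

# Stop when validation loss has not improved for 5 epochs and keep the best weights.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)

history = model.fit(
    X, y,
    validation_split=0.2,  # last 20% of rows are held out for validation
    epochs=50,
    batch_size=32,
    callbacks=[early_stop],
    verbose=0,
)

# history.history holds one value per epoch for each monitored quantity.
for epoch, (train_loss, val_loss) in enumerate(
    zip(history.history["loss"], history.history["val_loss"]), start=1
):
    if epoch % 10 == 0:
        print(f"epoch {epoch:3d}  loss={train_loss:.4f}  val_loss={val_loss:.4f}")
```

A validation loss that rises while the training loss keeps falling is the usual sign of overfitting, which is where the tuning knobs listed above come in.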

6️⃣ Evaluation

  • Evaluate model using:
    • RMSE
    • MAE
  • Compare with baseline ML models (optional)
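For reference, both metrics can be computed directly with NumPy. The prices below are made-up numbers, and the mean predictor is only a hypothetical baseline to show the comparison; it is not one of the baseline models from this project.

```python
import numpy as np


def rmse(y_true, y_pred) -> float:
    """Root mean squared error: penalizes large errors more heavily."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred) -> float:
    """Mean absolute error: average size of the error in target units."""
    y_true, y_pred = np.asarray(y_true, dtype=float), np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# Made-up prices for illustration only.
y_true = np.array([200_000.0, 150_000.0, 300_000.0, 250_000.0])
y_pred = np.array([210_000.0, 140_000.0, 280_000.0, 250_000.0])

# Trivial baseline: always predict the mean price.
y_baseline = np.full_like(y_true, y_true.mean())

# If the model was trained on log1p(SalePrice), apply np.expm1 to the
# predictions first so the metrics are in price units.
print("model    RMSE:", rmse(y_true, y_pred), "MAE:", mae(y_true, y_pred))
print("baseline RMSE:", rmse(y_true, y_baseline), "MAE:", mae(y_true, y_baseline))
```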

7️⃣ Experimentation & Improvements

  • Feature selection
  • Regularization (Dropout / L2)
  • Hyperparameter tuning
  • Architecture experiments
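As one illustration of the regularization options, the sketch below adds an L2 weight penalty and Dropout to each hidden layer. The function name `build_regularized_regressor` and the default values for `dropout_rate` and `l2_strength` are assumptions, not tuned settings from this project.

```python
from tensorflow import keras


def build_regularized_regressor(
    n_features: int,
    hidden_units=(64, 32),
    dropout_rate: float = 0.2,
    l2_strength: float = 1e-4,
) -> keras.Model:
    """Dense regressor with an L2 weight penalty and Dropout after each hidden layer."""
    inputs = keras.Input(shape=(n_features,))
    x = inputs
    for units in hidden_units:
        x = keras.layers.Dense(
            units,
            activation="relu",
            kernel_regularizer=keras.regularizers.L2(l2_strength),
        )(x)
        # Dropout is active only during training; it is a no-op at inference time.
        x = keras.layers.Dropout(dropout_rate)(x)
    outputs = keras.layers.Dense(1)(x)

    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model


model = build_regularized_regressor(n_features=10)
model.summary()
```

Exposing the rates as arguments makes it straightforward to sweep them during hyperparameter tuning.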

🧠 Why Functional API?

  • More flexible than Sequential
  • Explicit control over:
    • inputs
    • outputs
    • complex architectures
  • Commonly used for models with multiple inputs, multiple outputs, or shared layers
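To make the flexibility concrete, here is a sketch of a two-input model that `Sequential` cannot express: numeric features go in directly while an integer-coded categorical feature passes through an embedding. The input names, the 8 numeric features, and the 25 category levels are hypothetical values chosen for illustration.

```python
from tensorflow import keras

# Hypothetical setup: 8 numeric features plus one integer-coded categorical
# feature with 25 distinct levels (ids 0..24).
numeric_in = keras.Input(shape=(8,), name="numeric")
category_in = keras.Input(shape=(1,), dtype="int32", name="neighborhood_id")

# Learn a 4-dimensional vector per category, then flatten (batch, 1, 4) -> (batch, 4).
embedded = keras.layers.Embedding(input_dim=25, output_dim=4)(category_in)
embedded = keras.layers.Flatten()(embedded)

# Merge the two branches and continue with a shared dense head.
x = keras.layers.Concatenate()([numeric_in, embedded])
x = keras.layers.Dense(32, activation="relu")(x)
price = keras.layers.Dense(1, name="sale_price")(x)

model = keras.Model(inputs=[numeric_in, category_in], outputs=price)
model.compile(optimizer="adam", loss="mse")
model.summary()
```

The same pattern extends to several categorical inputs, each with its own embedding branch.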

🛠️ Tech Stack

  • Python
  • NumPy, Pandas
  • Matplotlib, Seaborn
  • Scikit-learn
  • TensorFlow / Keras
  • Kaggle API
  • Git & GitHub

🚫 Notes

  • data/ directory is ignored using .gitignore
  • Kaggle datasets should not be committed
  • This repo focuses on learning and clarity, not competition ranking

📜 License

This project is licensed under the Apache License 2.0.
See the LICENSE file for details.


✍️ Author

Ashiq KM
Learning-focused ML & Deep Learning experiments 🚀
