Allstate Claims Severity
Data Exploration --> data exploration.ipynb
Single Model:
Data Preprocessing for single model --> Data Precessing_single model.ipynb
4 outputs:
training dataset features: "train_x.csv"
training target attribute (loss): "train_label.csv"
testing dataset features: "test_x.csv"
test dataset IDs: "test_id.csv" // used for the Kaggle submission
These files are the preprocessed data used as input for the single model.
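For reference, a minimal sketch of how these files might be consumed; the XGBoost model and its parameters are illustrative assumptions, not the tuned setup from the notebook:

```python
# Sketch: load the single-model files and produce a Kaggle submission.
# The model choice and parameters below are assumptions for illustration.
import pandas as pd
import xgboost as xgb

train_x = pd.read_csv("train_x.csv")
train_label = pd.read_csv("train_label.csv")
test_x = pd.read_csv("test_x.csv")
test_id = pd.read_csv("test_id.csv")

model = xgb.XGBRegressor(n_estimators=500, learning_rate=0.05)
model.fit(train_x, train_label.values.ravel())

submission = pd.DataFrame({"id": test_id.values.ravel(),
                           "loss": model.predict(test_x)})
submission.to_csv("submission.csv", index=False)
```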
Data Preprocessing for stacking model --> Dataset_stacking.ipynb
4 outputs:
Layer1 training dataset: "train_layer1_x.csv"
Layer1 training labels: "train_layer1_label.csv"
Layer2 training dataset: "train_layer2_x.csv"
Layer2 training labels: "train_layer2_label.csv"
I am constructing the pipeline for stacking; until it is ready, you need to run the stacking manually.
For example, if you have 4 models in the first layer, these 4 models must first be trained on the Layer1 training dataset produced by Dataset_stacking.ipynb. Then prepare the data for the second layer: feed the Layer2 training dataset to the first layer's models; their outputs become the training data for the second layer's model, as in the sketch below.
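A minimal sketch of this manual procedure, assuming scikit-learn-style regressors; the base models and parameters here are illustrative assumptions, not the ones actually used:

```python
# Sketch of manual two-layer stacking with the files from Dataset_stacking.ipynb.
# The base models below are placeholders chosen for illustration.
import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.linear_model import Ridge

layer1_x = pd.read_csv("train_layer1_x.csv")
layer1_y = pd.read_csv("train_layer1_label.csv").values.ravel()
layer2_x = pd.read_csv("train_layer2_x.csv")
layer2_y = pd.read_csv("train_layer2_label.csv").values.ravel()

# First layer: train each base model on the Layer1 dataset.
base_models = [
    xgb.XGBRegressor(n_estimators=300),
    xgb.XGBRegressor(n_estimators=300, max_depth=8),
    Ridge(alpha=1.0),
    Ridge(alpha=10.0),
]
for m in base_models:
    m.fit(layer1_x, layer1_y)

# Second layer: the base models' predictions on the Layer2 dataset
# become the features for the second-layer model.
meta_features = np.column_stack([m.predict(layer2_x) for m in base_models])
meta_model = Ridge()
meta_model.fit(meta_features, layer2_y)
```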
Note: currently, splitting the whole dataset into separate training sets for the two layers is not the best idea (out-of-fold predictions are the standard alternative), but due to time limits it is the fastest one. I will complete this part later for future use.
Note: although I first used four different models in my first stacking layer, I found you get better results if you use similar models in the first layer, for example two NNs and two XGBoosts.
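As an illustration of this note, the base-model list in the sketch above could be made more homogeneous; MLPRegressor stands in for the neural nets here and is an assumption, not the actual networks used:

```python
# Illustrative first layer of two NNs and two XGBoosts, per the note above.
from sklearn.neural_network import MLPRegressor
import xgboost as xgb

base_models = [
    MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500),
    MLPRegressor(hidden_layer_sizes=(128,), max_iter=500),
    xgb.XGBRegressor(n_estimators=300, max_depth=6),
    xgb.XGBRegressor(n_estimators=500, max_depth=10),
]
```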