MichelFaloughi/cis5200-project

Hi guys, I hope you're well.

Here's a quick guide to stuff we'll do:

I need every one of you to make an account here: https://cds.climate.copernicus.eu/ and retrieve your API key. Once you do that, you will have to create a .cdsapirc file in your home directory containing the API URL (https://cds.climate.copernicus.eu/api) and your key,

so that you'll be able to query data from the Copernicus Climate Data Store (CDS), not to be confused with the Copernicus Climate Change Service (C3S). To see how to query data from the CDS, see the playground.ipynb notebook, where I'm trying a bunch of stuff.


Questions:

  • How do we keep the compute from getting super expensive? Should we focus on a smaller, specific geography?
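On the compute question: one cheap option is to request only a small region from the CDS rather than the whole globe. This is a sketch using the CDS request's "area" keyword ([North, West, South, East] in degrees); the box below is an arbitrary example, not a decision:

```python
# Subset the ERA5 request to a small region to keep downloads and compute
# cheap. "area" is [North, West, South, East]; this box is arbitrary.
request_subset = {
    "product_type": "reanalysis",
    "variable": "10m_u_component_of_wind",
    "year": "2023", "month": "01", "day": "01", "time": "12:00",
    "area": [45, -80, 38, -70],  # N, W, S, E (northeastern US, for example)
    "format": "netcdf",
}
north, west, south, east = request_subset["area"]
print((north - south) * (east - west))  # rough box size in square degrees
```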

Concerns:

  • Make sure one method is novel: a non-standard training loss, transfer learning, regularization, or similar
  • Cite packages
  • I'm scared we won't get any meaningful results... Could we still get a full grade? Wind speed prediction might be too chaotic...

TO DO:

  • Check the literature, the current state of the art, etc
  • Pre-process data.
  • Build 5 models
  • Build common evaluation metrics on which to test each model, plus figures, e.g. a 5-subplot figure per model showing losses, results, etc.

Maybe as a step one, I can build an RNN from start to finish, including data preprocessing and post-processing, then reuse that pipeline for the 4 other models.
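The preprocessing step shared by all models could look something like this: turn the hourly series into (past-24h window, next-hour target) pairs. This is a sketch with a hypothetical helper name and a toy series, not code from the repo:

```python
import numpy as np

def make_windows(series: np.ndarray, lookback: int = 24):
    """Turn an hourly 1-D series into (past-24h, next-hour) training pairs.
    Hypothetical helper, not from the repo."""
    X, y = [], []
    for t in range(lookback, len(series)):
        X.append(series[t - lookback:t])  # the 24 hours before t
        y.append(series[t])               # the value at hour t
    return np.asarray(X), np.asarray(y)

# toy hourly wind-speed series
wind = np.sin(np.linspace(0, 20, 200))
X, y = make_windows(wind)
print(X.shape, y.shape)  # (176, 24) (176,)
```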


TO DO: Define the task formally: "We predict 10m wind speed using the past 24h of ERA5 data, at a 1-hour horizon"? Maybe do autoregressive rollouts?
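If we go the rollout route, the idea is to feed each 1-hour prediction back into the input window to forecast several hours ahead. A minimal sketch, assuming any model is a callable mapping a 24-value window to the next value (the interface is hypothetical):

```python
import numpy as np

def rollout(model, history: np.ndarray, steps: int) -> np.ndarray:
    """Autoregressive rollout: feed each 1-hour prediction back into the
    input window to forecast `steps` hours ahead."""
    window = list(history[-24:])
    preds = []
    for _ in range(steps):
        yhat = model(np.asarray(window))
        preds.append(yhat)
        window = window[1:] + [yhat]  # slide the window forward by one hour
    return np.asarray(preds)

# toy "model": persistence (always predicts the last observed value)
persistence = lambda w: w[-1]
hist = np.arange(48, dtype=float)
print(rollout(persistence, hist, 6))  # six identical persistence forecasts
```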

Data: TO DO:

  • What we could do later is build a global grid at a coarser resolution; it doesn't have to include ALL the data points that ERA5 has.
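To give a feel for the savings: ERA5's native single-level grid is 0.25 degrees, and taking every 4th point yields a 1-degree global grid with roughly 16x fewer cells. A sketch, assuming lat/lon axes laid out like ERA5's:

```python
import numpy as np

# ERA5-like 0.25-degree global grid, subsampled to 1 degree by slicing.
lats = np.arange(90, -90.25, -0.25)   # 721 points, 90 .. -90
lons = np.arange(0, 360, 0.25)        # 1440 points, 0 .. 359.75
coarse_lats = lats[::4]               # 181 points at 1-degree spacing
coarse_lons = lons[::4]               # 360 points
print(len(lats) * len(lons), len(coarse_lats) * len(coarse_lons))
```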

Eval: TO DO:

  • Figure out what evaluation metrics you want per model
  • Create a standardized evaluation 'score card' for all models
  • Then write some code to show the results side by side
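A possible shape for that scorecard: compute the same metrics for every model on a shared test set, so results are directly comparable. The metric choices here (RMSE, MAE) are a suggestion, not final:

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def scorecard(models: dict, y_true: np.ndarray) -> dict:
    """models maps model name -> predictions on the shared test set."""
    return {name: {"rmse": rmse(y_true, pred), "mae": mae(y_true, pred)}
            for name, pred in models.items()}

# toy example with two fake models' predictions
y = np.array([1.0, 2.0, 3.0])
card = scorecard({"persistence": np.array([1.0, 1.0, 2.0]),
                  "linear": np.array([1.0, 2.0, 3.0])}, y)
print(card)
```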

Models: TO DO:

  • Figure out what 5 models we want (RNN ? Ensemble ?)
  • Figure out what the baseline model is

Hence, the next steps are:

  • Define the task formally in your notebook: “We predict next-hour 10m wind speed using past 24h ERA5 variables at location X.”
  • Build the dataset (single DataFrame used by all models).
  • Implement Baseline (Persistence) and evaluate it.
  • Implement Linear Regression + Random Forest + XGBoost → plug into shared eval.
  • Implement MLP.
  • (If time) Implement LSTM or 1D CNN as the 5th model.
  • Create the scorecard table + 1–2 plots.
  • Mirror this into Overleaf sections (Models + Evaluation).
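The persistence baseline in step 3 above is tiny: the forecast for the next hour is simply the current value, and every learned model has to beat it to be worth reporting. A sketch on toy data:

```python
import numpy as np

def persistence_forecast(wind: np.ndarray) -> np.ndarray:
    """Predict wind[t+1] as wind[t]; returns predictions aligned with wind[1:]."""
    return wind[:-1]

# toy wind-speed series (m/s)
wind = np.array([3.0, 3.5, 2.8, 4.1, 3.9])
y_true = wind[1:]
y_pred = persistence_forecast(wind)
rmse = float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
print(y_pred, rmse)
```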

Latest:

  • Make a requirements.txt file listing the dependencies (dask, cdsapi, etc.).
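A possible starting point (this package list is a guess at what the project uses; add or remove entries and pin versions as needed):

```
cdsapi
xarray
netcdf4
dask
numpy
pandas
scikit-learn
matplotlib
```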
