Exported a large dataset of 11.3 million rows of flight data from 1993 and 2003 from Harvard Dataverse, ingested it with SnowSQL, and created the database in Snowflake, which served as the data warehouse. Analyzed the data with SQL queries and prepared a cleaned table for visualization in Power BI, where a dashboard was built. Connected Snowpark to the Snowflake database for aggregations and further analysis, then converted the data to pandas to train two machine learning models, LightGBM and XGBoost. Evaluated both models with several classification metrics: XGBoost performed slightly better, but both models showed only moderate performance.
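The description does not say which classification metrics were used, so the following is only a minimal sketch of how two classifiers could be compared on the same labels. It assumes a binary target (for example, 1 = delayed, 0 = on time) and the common choices of accuracy, precision, recall, and F1. The function name `classification_metrics` and the toy label lists are hypothetical and are not taken from the repository.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels and predictions, not results from the project.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a_pred = [1, 0, 0, 1, 0, 1, 1, 0]
model_b_pred = [1, 0, 1, 1, 0, 1, 1, 0]

metrics_a = classification_metrics(y_true, model_a_pred)
metrics_b = classification_metrics(y_true, model_b_pred)

for name, metrics in (("model_a", metrics_a), ("model_b", metrics_b)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

Reporting several metrics side by side is useful when the classes are imbalanced, because accuracy alone can look acceptable while recall on the minority class stays low.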
dsxyash/Airline_Performance