README.md (12 additions, 7 deletions)
@@ -17,7 +17,7 @@ For an in-depth understanding of **Kedro**, consider exploring the official documentation
 ## 🎯 Project Goals
 
 The objectives were:
-- Make the code in a Notebook **production-ready** and **easily deployable**.
+- Transition code from Jupyter Notebooks to a **production-ready**, **easily deployable** format.
 - Allow **easy** addition of models and their performance graphs in the pipeline.
 - Adopt the Kedro framework to produce **reproducible**, **modular**, and **scalable workflows**.
@@ -104,15 +104,15 @@ Kedro-Energy-Forecasting/
 First, **Clone the Repository** to download a copy of the code onto your local machine, and before diving into transforming **raw data** into a **trained pickle Machine Learning model**, please note:
 
-## 🔴 Important Preparation Steps
+#### 🔴 Important Preparation Steps
 Before you begin, please follow these preliminary steps to ensure a smooth setup:
 
-- **Clear Existing Data Directories**: If you're planning to run the pipeline, we recommend removing these directories if they currently exist: `data/02_processed`, `data/03_training_data`, `data/04_reporting`, and `data/05_model_output`. They will be recreated or updated once the pipeline runs. These directories are tracked in version control to provide you with a glimpse of the **expected outputs**.
+- **Clear Existing Data Directories**: If you're planning to run the pipeline, I recommend removing these directories if they exist: `data/02_processed`, `data/03_training_data`, `data/04_reporting`, and `data/05_model_output` (leave only `data/01_raw` in the `data` folder). They will be recreated or updated once the pipeline runs. These directories are tracked in version control to provide you with a glimpse of the **expected outputs**.
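For reference, the cleanup step described above can be done in one command from the repository root (the directory names are taken from this README; `data/01_raw` is left untouched):

```shell
# Remove pipeline-generated directories; they are recreated on the next run.
rm -rf data/02_processed data/03_training_data data/04_reporting data/05_model_output
```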
 - **Makefile Usage**: To utilize the Makefile for running commands, you must have `make` installed on your system. Follow the instructions in the [installation guide](https://sp21.datastructur.es/materials/guides/make-install.html) to set it up.
 
-Here is an example of the available targets:
+Here is an example of the available targets (type `make` in the command line to list them):
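As a hedged sketch of what such targets look like (the names `prep-doc`, `prep-dev`, and `viz` are mentioned in this README, but the actual recipes live in the repository's Makefile, so treat this excerpt as hypothetical):

```makefile
# Hypothetical excerpt -- see the repository's Makefile for the real targets
prep-doc:
	pip install -r docker-requirements.txt   # production dependencies

prep-dev:
	pip install -r dev-requirements.txt      # development dependencies

viz:
	kedro viz run                            # launch Kedro Viz
```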
@@ -121,10 +121,10 @@ Here is an example of the available targets:
 - **Running the Kedro Pipeline**:
   - For **production** environments, initialize your setup by executing `make prep-doc` or using `pip install -r docker-requirements.txt` to install the production dependencies.
-  - For a **development** environment, where you may want to use **Kedro Viz**, work with **Jupyter notebooks**, or test everything thoroughly, run `make prep-dev` or `pip install -r dev-requirements.txt` to install the development dependencies.
+  - For a **development** environment, where you may want to use **Kedro Viz**, work with **Jupyter notebooks**, or test everything thoroughly, run `make prep-dev` or `pip install -r dev-requirements.txt` to install all the development dependencies.
 
-### Standard Method (Conda / venv) 🌿
+### 🌿 Standard Method (Conda / venv)
 
 Adopt this method if you prefer a traditional Python development environment setup using Conda or venv.
@@ -138,7 +138,7 @@ Adopt this method if you prefer a traditional Python development environment setup
 5. **(Optional) Explore with Kedro Viz**: To visually explore your pipeline's structure and data flows, initiate Kedro Viz using `make viz` or `kedro viz run`.
 
-### Docker Method 🐳
+### 🐳 Docker Method
 
 Prefer this method for a containerized approach, ensuring a consistent development environment across different machines. Ensure Docker is operational on your system before you begin.
@@ -150,6 +150,11 @@ Prefer this method for a containerized approach, ensuring a consistent development environment
 For additional assistance or to explore more command options, refer to the **Makefile** or consult `kedro --help`.
 
+## 🌌 Next Steps?
+
+With our **Kedro Pipeline** 🏗 now capable of efficiently transforming **raw data** 🔄 into **trained models** 🤖, and the introduction of a Dockerized environment 🐳 for our code, the next phase involves _advancing beyond the current repository scope_ 🚀 to `orchestrate data updates automatically` using tools like **Databricks**, **Airflow**, or **Azure Data Factory**. This progression allows for the seamless integration of fresh data into our models.
+
+Moreover, implementing `experiment tracking and versioning` with **MLflow** 📊 or leveraging **Kedro Viz**'s versioning capabilities 📈 will significantly enhance the project's management and reproducibility. These steps are pivotal for maintaining a clean machine learning workflow that not only achieves our goal of simplifying model training processes 🛠 but also ensures our system remains dynamic and scalable with **minimal effort**.
+
 ## 🌐 Let's Connect!
 
 You can connect with me on **LinkedIn** or check out my **GitHub repositories**: