
Commit 4b5b7fa: "Development"

1 parent d253567

File tree: 2 files changed, +4 / -21 lines

README.md

Lines changed: 3 additions & 20 deletions
```diff
@@ -175,29 +175,12 @@ Secrets in GitHub should look exactly like below. The secrets are case sensitive
 
 ---
 ---
-
 
-# Repo Guidance
+## Running Pipelines
 
-## Databricks as Infrastructure
-<details close>
-<summary>Click Dropdown... </summary>
+- The end-to-end machine learning pipeline is pre-configured in the "Workflows" section in Databricks. This utilises a Job Cluster, which automatically uploads the necessary dependencies contained within a Python wheel file.
 
-<br>
-There are many ways that a User may create Databricks Jobs, Notebooks, Clusters, Secret Scopes etc. <br>
-<br>
-For example, they may interact with the Databricks API/CLI by using: <br>
-<br>
-i. VS Code on their local machine, <br>
-ii. the Databricks GUI online; or <br>
-iii. a YAML Pipeline deployment on a DevOps Agent (e.g. GitHub Actions or Azure DevOps etc). <br>
-<br>
-
-The programmatic way in which the first two scenarios allow us to interact with the Databricks API is akin to "Continuous **Development**", as opposed to "Continuous **Deployment**". The former is strong on flexibility; however, it is somewhat weak on governance, accountability and reproducibility. <br>
-
-In a nutshell, Continuous **Development** _is a partly manual process where developers can deploy any changes to customers by simply clicking a button, while continuous **Deployment** emphasizes automating the entire process_.
-
-</details>
+- If you wish to run the machine learning scripts from a Notebook instead, first upload the dependencies (automatic upload is in development): navigate to the Python wheel file contained within the dist/ folder and manually upload it to the cluster on which you wish to run the Notebook.
 
 ---
 ---
```
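For the manual path described in the new second bullet, a minimal sketch of the usual flow (the DBFS path and wheel filename here are hypothetical placeholders, not taken from this repo) is to copy the built wheel out of dist/ to DBFS and install it from the first cell of the Notebook:

```python
%pip install /dbfs/FileStore/wheels/nyc_taxi-0.1.0-py3-none-any.whl
# Placeholder path and filename above; run this cell before any project
# imports, since the notebook-scoped install only affects code run afterwards.
```

Alternatively, the wheel can be attached through the cluster UI (Libraries > Install new > Python Whl), which matches the "manually upload" wording in the bullet.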

mlOps/modelOps/data_science/nyc_taxi/train_register.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -157,7 +157,7 @@ def __init__(self, spark: SparkSession, experiment_name: str, namespace: str, wo
         self.track_in_azure_ml = False
         self.namespace = namespace
         self.ws = workspace
-        self.model_folder = "outputs"
+        self.model_folder = "cached_models"
         self.dbutils = SparkRunner().get_dbutils()
 
 
```
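For context, `model_folder` names the local directory where model artifacts are staged before being logged. The snippet below is only a generic sketch of that staging pattern under assumed MLflow usage; the `LinearRegression` stand-in and the mlflow calls are illustrative, not this repo's actual `train_register.py` logic:

```python
import os

import mlflow
from sklearn.linear_model import LinearRegression

# Illustrative only: stage a trained model under the configured folder
# ("cached_models" after this commit, previously "outputs"), then log the
# whole directory as run artifacts so a later registration step can find it.
model_folder = "cached_models"
os.makedirs(model_folder, exist_ok=True)

model = LinearRegression().fit([[0.0], [1.0]], [0.0, 1.0])

with mlflow.start_run():
    mlflow.sklearn.save_model(model, path=os.path.join(model_folder, "model"))
    mlflow.log_artifacts(model_folder)
```

The commit itself is purely a path rename: anything reading `self.model_folder` downstream picks up "cached_models" automatically, but any hard-coded references to "outputs" elsewhere would need the same update.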
