
Commit c9d4b1b

Demonstrate working with private wheel packages (#47)
There is no native support for private repositories and this offers a workaround.
1 parent e5e0512 commit c9d4b1b

File tree

7 files changed

+136
-0
lines changed


knowledge_base/private_wheel_packages/.gitignore

Whitespace-only changes.
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
# Private wheel packages

This example demonstrates how to use a private wheel package from a job in a Databricks Asset Bundle.

If you are using notebooks, you can use the approach documented in [Notebook-scoped Python libraries][doc] to install
wheels from a private repository in a notebook. If you are not using notebooks, you can use the workaround documented here.

[doc]: https://docs.databricks.com/en/libraries/notebooks-python-libraries.html#install-a-private-package-with-credentials-managed-by-databricks-secrets-with-pip

## Prerequisites

* Databricks CLI v0.235.0 or above
* Python 3.10 or above

## Usage

You can refer to private wheel files from job libraries or serverless environments by downloading the wheel
and making it part of your Databricks Asset Bundle deployment.

To emulate this in this example, we download a wheel from PyPI, include it in the deployment, and refer to it from the job configuration.

## Downloading a wheel

First, download the wheel to the `dist` directory:

```shell
pip download -d dist cowsay==6.1
```
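The job configurations below reference this wheel with the glob `cowsay-6.1-*.whl` rather than an exact filename. A minimal stdlib sketch of how that glob behaves (the candidate filenames are illustrative):

```python
# Sketch: which downloaded wheel filenames the glob used in the job
# configuration would match. The candidate names are examples only;
# `pip download -d dist cowsay==6.1` produces the first of them.
from fnmatch import fnmatch

candidates = [
    "cowsay-6.1-py3-none-any.whl",  # pinned version: matched
    "cowsay-6.0-py3-none-any.whl",  # older version: not matched
]
matched = [name for name in candidates if fnmatch(name, "cowsay-6.1-*.whl")]
print(matched)  # ['cowsay-6.1-py3-none-any.whl']
```

Pinning the name and version in the glob while leaving the tag suffix open means the reference keeps working regardless of which build tags the downloaded wheel carries.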
## Deploying the example

Next, update the `host` field under `workspace` in `databricks.yml` to the Databricks workspace you wish to deploy to.

Run `databricks bundle deploy` to upload the wheel and deploy the jobs.

Run `databricks bundle run` to run either job.

Example output:

```
$ databricks bundle run
Run URL: https://...

2024-11-27 13:23:01 "[dev pieter_noordhuis] Example to demonstrate using a private wheel package on serverless" TERMINATED SUCCESS
  _____________
| Hello, world! |
  =============
             \
              \
                ^__^
                (oo)\_______
                (__)\       )\/\
                    ||----w |
                    ||     ||
```
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
bundle:
  name: private_wheel_packages

include:
  - resources/*.yml

workspace:
  # host: https://myworkspace.cloud.databricks.com

targets:
  dev:
    default: true
    mode: development
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
*
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
resources:
  jobs:
    cluster:
      name: Example to demonstrate using a private wheel package on a job cluster

      tasks:
        - task_key: task
          job_cluster_key: default

          spark_python_task:
            python_file: ../src/main.py

          # Relative local path reference to the private wheel package.
          # This reference alone is sufficient to upload it on deployment.
          #
          # It is uploaded to `${workspace.artifact_path}`, which can point to
          # a workspace directory (under `${workspace.bundle_root}`, by default),
          # or a Unity Catalog Volume.
          #
          # This value is automatically rewritten to the fully qualified remote path.
          libraries:
            - whl: ../dist/cowsay-6.1-*.whl

      job_clusters:
        - job_cluster_key: default
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
            data_security_mode: SINGLE_USER
            num_workers: 0
            spark_conf:
              "spark.databricks.cluster.profile": "singleNode"
              "spark.master": "local[*]"
            custom_tags:
              "ResourceClass": "SingleNode"
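The comments in the configuration above describe how a bundle-relative wheel path is rewritten to a fully qualified remote path under `${workspace.artifact_path}`. A hypothetical sketch of that mapping (the `.internal` subdirectory and the example artifact path are assumptions for illustration, not the CLI's guaranteed layout):

```python
# Hypothetical sketch of the path rewrite described in the comments above:
# a local relative wheel reference becomes a path under the bundle's
# artifact path. The ".internal" subdirectory is an assumption.
import posixpath

def rewrite_library_path(local_whl: str, artifact_path: str) -> str:
    """Map a bundle-relative wheel path to an assumed uploaded remote path."""
    filename = posixpath.basename(local_whl)
    return posixpath.join(artifact_path, ".internal", filename)

remote = rewrite_library_path(
    "../dist/cowsay-6.1-py3-none-any.whl",
    "/Workspace/Users/someone@example.com/.bundle/private_wheel_packages/dev/artifacts",
)
print(remote)
```

The point is only that the YAML keeps a portable relative path while the deployed job configuration receives an absolute workspace (or Volume) path.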
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
resources:
  jobs:
    serverless:
      name: Example to demonstrate using a private wheel package on serverless

      tasks:
        - task_key: task
          environment_key: default

          spark_python_task:
            python_file: ../src/main.py

      environments:
        - environment_key: default

          spec:
            client: "1"
            dependencies:
              # Relative local path reference to the private wheel package.
              # This reference alone is sufficient to upload it on deployment.
              #
              # It is uploaded to `${workspace.artifact_path}`, which can point to
              # a workspace directory (under `${workspace.bundle_root}`, by default),
              # or a Unity Catalog Volume.
              #
              # This value is automatically rewritten to the fully qualified remote path.
              - ../dist/cowsay-6.1-*.whl
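Both job configurations leave the wheel's tag suffix to a glob because wheel filenames encode build-specific tags. A minimal stdlib sketch of the PEP 427 naming convention (the helper name is illustrative and handles only the common five-component case, ignoring optional build tags):

```python
# Sketch: wheel filenames follow PEP 427:
#   {distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl
# The tag triple varies by build, which is why the configuration pins
# only the distribution and version and globs the rest.
def parse_wheel_filename(filename: str) -> dict:
    """Split a five-component wheel filename into its PEP 427 fields."""
    stem = filename.removesuffix(".whl")
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }

info = parse_wheel_filename("cowsay-6.1-py3-none-any.whl")
print(info["name"], info["version"])  # cowsay 6.1
```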
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
# By successfully importing the previously downloaded wheel, we demonstrate that it is possible to
# include an arbitrary wheel file in your bundle directory, deploy it, and use it from within your code.
import cowsay

if __name__ == '__main__':
    cowsay.cow('Hello, world!')
