# pydabs

The `pydabs` project was generated using the default template.

* `src/`: Python source code for this project.
  * `src/pydabs/`: Shared Python code that can be used by jobs and pipelines.
* `resources/`: Resource configurations (jobs, pipelines, etc.).
* `tests/`: Unit tests for the shared Python code.
* `fixtures/`: Fixtures for data sets (primarily used for testing).


## Getting started

Choose how you want to work on this project:

(a) Directly in your Databricks workspace, see
    https://docs.databricks.com/dev-tools/bundles/workspace.

(b) Locally with an IDE like Cursor or VS Code, see
    https://docs.databricks.com/dev-tools/vscode-ext.html.

(c) With command line tools, see
    https://docs.databricks.com/dev-tools/cli/databricks-cli.html.

If you're developing with an IDE, install this project's dependencies using uv:

* Make sure you have the uv package manager installed.
  It's an alternative to tools like pip: https://docs.astral.sh/uv/getting-started/installation/.
* Run `uv sync --dev` to install the project's dependencies; a quick sanity check follows below.
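
Assuming the template's default package name (`pydabs`) is unchanged, you can
confirm the environment resolves:

```
$ uv run python -c "import pydabs"   # exits silently when the install is healthy
```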


## Using this project with the CLI

The Databricks workspace and IDE extensions provide a graphical interface for working
with this project. It's also possible to interact with it directly using the CLI:

1. Authenticate to your Databricks workspace, if you have not done so already:
   ```
   $ databricks configure
   ```
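
   Alternatively, on workspaces configured for OAuth you can log in with
   `databricks auth login` (the host below is a placeholder for your
   workspace URL):
   ```
   # opens a browser to complete OAuth login
   $ databricks auth login --host https://<your-workspace-url>
   ```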

2. To deploy a development copy of this project, type:
   ```
   $ databricks bundle deploy --target dev
   ```
   (Note that "dev" is the default target, so the `--target` parameter
   is optional here.)

   This deploys everything that's defined for this project.
   For example, the default template would deploy a pipeline called
   `[dev yourname] pydabs_etl` to your workspace.
   You can find that resource by opening your workspace and clicking on **Jobs & Pipelines**.
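
   At any point, you can check the bundle configuration for errors without
   deploying anything:
   ```
   $ databricks bundle validate --target dev
   ```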

3. Similarly, to deploy a production copy, type:
   ```
   $ databricks bundle deploy --target prod
   ```
   Note that the default template includes a job that runs the pipeline every day
   (defined in resources/sample_job.job.yml). The schedule
   is paused when deploying in development mode (see
   https://docs.databricks.com/dev-tools/bundles/deployment-modes.html).
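
   After deploying, you can list the resources in this bundle and their
   deployment state for a given target:
   ```
   $ databricks bundle summary --target prod
   ```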

4. To run a job or pipeline, use the "run" command:
   ```
   $ databricks bundle run
   ```
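
   You can also run a specific resource by its key. For example, assuming the
   job defined in resources/sample_job.job.yml keeps its template key
   `sample_job`:
   ```
   $ databricks bundle run sample_job --target dev
   ```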

5. Finally, to run tests locally, use `pytest`:
   ```
   $ uv run pytest
   ```
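
   A couple of common variations, using standard pytest flags:
   ```
   # verbose output for just the unit tests
   $ uv run pytest tests/ -v

   # run only tests whose names match an expression
   $ uv run pytest -k "main"
   ```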