This repo contains an end-to-end example implementation of a Digital Twin representing a ball bearing manufacturing process. The approach outlined here is intended to be reusable in many other scenarios.
Key capabilities of the solution include:
- Ingestion of IoT telemetry into a bronze Delta table via Zerobus
- Mapping of sensor readings to timestamped RDF triples with Lakeflow Declarative Pipelines and the spark-r2r library
- Low-latency serving of the latest sensor values from Lakebase
- A Databricks App that displays the twin model as an interactive graph

To guide you through the process, the solution is organized as a series of notebooks:
This notebook is where all the settings for the whole accelerator are configured; make sure you adjust them to suit your workspace.
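The exact parameter names live in the configuration notebook itself; the cell below is only a sketch of what such a configuration cell typically looks like, with illustrative catalog, schema, and table names that you would replace with your own.

```python
# Illustrative configuration values; replace with names that suit your workspace.
CATALOG = "digital_twin"                              # Unity Catalog catalog
SCHEMA = "ball_bearing"                               # schema that holds all tables
BRONZE_TABLE = f"{CATALOG}.{SCHEMA}.sensor_bronze"    # raw telemetry landing table
TRIPLES_TABLE = f"{CATALOG}.{SCHEMA}.sensor_triples"  # RDF triples produced by the pipeline

# Expose the values as widgets so downstream notebooks and jobs can override them.
dbutils.widgets.text("catalog", CATALOG)
dbutils.widgets.text("schema", SCHEMA)
```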
We first need to define a table where Zerobus can store the telemetry received from the IoT devices. This notebook shows how to prepare the table and will also generate sample data if you do not yet have access to Zerobus.
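As a rough sketch (the column names and table name here are assumptions, not the accelerator's actual schema), preparing the bronze table and generating a batch of sample telemetry might look like this:

```python
from pyspark.sql import functions as F

bronze_table = "digital_twin.ball_bearing.sensor_bronze"  # illustrative name

# Create the bronze Delta table that Zerobus will write into.
spark.sql(f"""
    CREATE TABLE IF NOT EXISTS {bronze_table} (
        sensor_id   STRING,
        metric      STRING,
        value       DOUBLE,
        event_time  TIMESTAMP
    )
""")

# If you do not yet have access to Zerobus, generate a small batch of sample readings.
sample = spark.range(100).select(
    F.concat(F.lit("bearing-line-"), (F.col("id") % 5).cast("string")).alias("sensor_id"),
    F.lit("vibration_mm_s").alias("metric"),
    (F.rand() * 10).alias("value"),
    F.current_timestamp().alias("event_time"),
)
sample.write.mode("append").saveAsTable(bronze_table)
```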
With the bronze table created, we set up the Zerobus endpoint and connect it to that table. This notebook shows how data can be written to the Zerobus API, although in a real deployment this data would come from the IoT devices themselves.
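The Zerobus ingest SDK is in preview and its exact API is not reproduced here. The sketch below only illustrates the shape of the flow: a client is configured with the workspace URL, the Zerobus ingest URI, the target table, and a personal access token, and telemetry records are handed to it one by one. `ZerobusStandIn` and its `send` method are hypothetical placeholders, not the real SDK.

```python
import json
import os
import time

class ZerobusStandIn:
    """Hypothetical placeholder for the Zerobus ingest client (not the real SDK)."""

    def __init__(self, workspace_url: str, ingest_uri: str, table: str, token: str):
        self.workspace_url = workspace_url
        self.ingest_uri = ingest_uri
        self.table = table
        self.token = token

    def send(self, record: dict) -> None:
        # The real client streams the record to the Zerobus endpoint,
        # which lands it in the bronze Delta table.
        print(f"would send to {self.table} via {self.ingest_uri}: {json.dumps(record)}")

client = ZerobusStandIn(
    workspace_url=os.environ["DATABRICKS_HOST"],
    ingest_uri=os.environ["ZEROBUS_INGEST_URI"],
    table="digital_twin.ball_bearing.sensor_bronze",  # illustrative name
    token=os.environ["DATABRICKS_TOKEN"],
)

# In production these records would be emitted by the IoT devices themselves.
client.send({
    "sensor_id": "bearing-line-0",
    "metric": "vibration_mm_s",
    "value": 3.2,
    "event_time": time.time(),
})
```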
To convert the incoming sensor data into timestamped RDF triples compatible with the twin model (which is also defined in RDF), we use Lakeflow Declarative Pipelines together with the spark-r2r library to perform the mapping. The result is a Delta Lake table that is ready to be used by the app.
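The mapping itself is driven by spark-r2r inside the Declarative Pipeline; as an illustration of the output shape only (this is plain PySpark, not the spark-r2r API, and the namespace and table names are assumptions), the timestamped triples derived from the bronze readings look roughly like this:

```python
from pyspark.sql import functions as F

bronze_table = "digital_twin.ball_bearing.sensor_bronze"   # illustrative name
ns = "http://example.org/twin/"                            # illustrative namespace

# One timestamped triple per sensor reading: (subject, predicate, object, timestamp).
triples = spark.table(bronze_table).select(
    F.concat(F.lit(ns + "sensor/"), F.col("sensor_id")).alias("subject"),
    F.concat(F.lit(ns + "measures/"), F.col("metric")).alias("predicate"),
    F.col("value").cast("string").alias("object"),
    F.col("event_time").alias("timestamp"),
)
triples.write.mode("append").saveAsTable("digital_twin.ball_bearing.sensor_triples")
```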
To provide a more responsive experience for users, we also serve the latest sensor data from Lakebase. By setting up a synced table, the platform ensures that the most recent value from each sensor (based on its timestamp) is always available.
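Because Lakebase speaks the Postgres wire protocol, the synced table can be read with any Postgres client. The snippet below is a sketch under assumed names (the `sensor_latest` table, its columns, and the environment variables are all illustrative); connection details come from your Lakebase database instance.

```python
import os

import psycopg2

# Connection details are specific to your Lakebase instance; names here are illustrative.
conn = psycopg2.connect(
    host=os.environ["LAKEBASE_HOST"],
    dbname=os.environ["LAKEBASE_DB"],
    user=os.environ["LAKEBASE_USER"],
    password=os.environ["LAKEBASE_TOKEN"],
    sslmode="require",
)

# The synced table keeps only the most recent row per sensor, so a plain lookup is enough.
with conn, conn.cursor() as cur:
    cur.execute(
        "SELECT sensor_id, metric, value, event_time FROM sensor_latest WHERE sensor_id = %s",
        ("bearing-line-0",),
    )
    print(cur.fetchone())
```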
Finally, we set up a Databricks App that serves the triple data and displays the twin model as an interactive graph. This notebook configures the app and grants it access to the required tables.
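As a rough sketch of how such an app might expose the triples to its front end (the framework choice, endpoint path, environment variables, and table name are all assumptions; the actual app in this repo may be structured differently), a minimal Flask handler backed by the Databricks SQL connector could look like this:

```python
import os

from databricks import sql  # databricks-sql-connector
from flask import Flask, jsonify

app = Flask(__name__)

def fetch_triples(limit: int = 500):
    # Read the triples table through a SQL warehouse the app has been granted access to.
    with sql.connect(
        server_hostname=os.environ["DATABRICKS_SERVER_HOSTNAME"],
        http_path=os.environ["DATABRICKS_HTTP_PATH"],
        access_token=os.environ["DATABRICKS_TOKEN"],
    ) as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT subject, predicate, object "
            f"FROM digital_twin.ball_bearing.sensor_triples LIMIT {limit}"
        )
        return [dict(zip(("subject", "predicate", "object"), row)) for row in cur.fetchall()]

@app.route("/api/triples")
def triples():
    # The front end turns these triples into the nodes and edges of the interactive graph.
    return jsonify(fetch_triples())

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8000)))
```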
When you are finished using this solution accelerator or just want a clean slate, this notebook will remove all the resources created along the way.
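A sketch of the kind of cleanup the notebook performs, limited here to dropping the tables (the actual notebook removes all the resources created along the way; table names are illustrative):

```python
# Drop the tables created by the accelerator; names come from the configuration notebook.
for table in [
    "digital_twin.ball_bearing.sensor_triples",
    "digital_twin.ball_bearing.sensor_bronze",
]:
    spark.sql(f"DROP TABLE IF EXISTS {table}")
```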
© 2025 Databricks, Inc. All rights reserved. The source in this notebook is provided subject to the Databricks License [https://databricks.com/db-license-source]. All included or referenced third party libraries are subject to the licenses set forth below.
To run this accelerator, clone this repo into a Databricks workspace. Attach the RUNME notebook to any cluster running DBR 11.0 or later, and execute the notebook via Run-All. A multi-step job describing the accelerator pipeline will be created, and a link to it will be provided. Execute the multi-step job to see how the pipeline runs.
The job configuration is written in the RUNME notebook in JSON format. The cost associated with running the accelerator is the user's responsibility.