This repository includes code samples that use the DBSQL REST API.
To set up the Spreadsheet application, follow these steps:

- Create a new Google Spreadsheet file called `Orders`.
- Inside, create a sheet called `Orders`.
- Navigate to `Extensions -> Apps Script` and rename the script to `Orders`.
- In the Apps Script editor, create four files with the names and content of the files located in the `spreadsheet` folder: `Interface.gs`, `DBSQL.gs`, `sidebar.html`, and `info.html`.
- Inside the `DBSQL.gs` file, provide the connectivity parameters for the Databricks server: `HOST`, `WAREHOUSE_ID`, and `AUTH_TOKEN`.
- Save the files.
- Go back to your Spreadsheet and refresh it.
- You will now see a `Managed Orders` menu next to the `Extensions` menu.
- Select `Managed Orders -> Show Monthly Orders` and accept the authorization request.
- To configure the permissions in your sheet, select `Project Settings -> Show "appsscript.json" manifest file in editor`. This will show a new file in the Apps Script workspace where you can edit the grants. See more details.
- You are now all set. Run the predefined queries using the entries in the `Managed Orders` menu.
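Under the hood, `DBSQL.gs` calls the Databricks SQL Statement Execution REST API using the three connectivity parameters above. A rough Python sketch of the same synchronous call is below; the host, warehouse ID, and token values are placeholders you must fill in, and the example query name is illustrative:

```python
import json
import urllib.request

# Connectivity parameters -- same names as in DBSQL.gs (placeholder values).
HOST = "your-workspace.cloud.databricks.com"
WAREHOUSE_ID = "your-warehouse-id"
AUTH_TOKEN = "your-personal-access-token"


def build_statement_request(statement, warehouse_id, wait_timeout="30s"):
    """Build the JSON body for POST /api/2.0/sql/statements (sync mode)."""
    return {
        "statement": statement,
        "warehouse_id": warehouse_id,
        "wait_timeout": wait_timeout,  # wait up to 30s for the result inline
    }


def execute_statement(statement):
    """Execute a SQL statement synchronously and return the parsed JSON response."""
    req = urllib.request.Request(
        f"https://{HOST}/api/2.0/sql/statements",
        data=json.dumps(build_statement_request(statement, WAREHOUSE_ID)).encode(),
        headers={
            "Authorization": f"Bearer {AUTH_TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example (requires valid credentials):
#   response = execute_statement("SELECT * FROM orders LIMIT 10")
#   print(response["status"]["state"])
```

With `wait_timeout` greater than zero the API waits for the statement and, for small results, returns the data inline in the same response.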
The Postman collection consists of two JSON files: `Databricks Environment.postman_environment.json` and `Databricks SQL Execution API.postman_collection.json`. The former defines the environment variables needed to establish a connection with the Databricks server: `HOST`, `WAREHOUSE_ID`, and `AUTH_TOKEN`. The latter includes several API requests using the sync and async flows.
To use the Postman collection, follow the steps below:

- Open Postman, go to `My Workspace`, and click `Import` to add the two files.
- Go to `Environments` and select `Databricks Environment` to update the connection parameters. Make sure to save the changes.
- Select `Databricks Environment` from the environment selector.
- Run one of the API requests for executing statements.
- Use the GET requests to poll for status and fetch chunks.
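The async flow exercised by these requests can be sketched in Python: submit the statement with `wait_timeout` set to `0s` to get a `statement_id` back immediately, then poll the GET endpoint until the statement reaches a terminal state. The host, warehouse ID, and token values are placeholders:

```python
import json
import time
import urllib.request

# Placeholder connection parameters -- mirror the Postman environment variables.
HOST = "your-workspace.cloud.databricks.com"
WAREHOUSE_ID = "your-warehouse-id"
AUTH_TOKEN = "your-personal-access-token"

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELED", "CLOSED"}


def is_terminal(state):
    """True once a statement has finished (successfully or not)."""
    return state in TERMINAL_STATES


def _request(method, path, payload=None):
    """Authenticated call against the Databricks REST API."""
    req = urllib.request.Request(
        f"https://{HOST}{path}",
        data=json.dumps(payload).encode() if payload is not None else None,
        method=method,
        headers={
            "Authorization": f"Bearer {AUTH_TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_async(statement, poll_interval_s=2.0, max_polls=30):
    """Submit a statement asynchronously and poll until it reaches a terminal state."""
    submitted = _request("POST", "/api/2.0/sql/statements", {
        "statement": statement,
        "warehouse_id": WAREHOUSE_ID,
        "wait_timeout": "0s",  # async: return a statement_id immediately
    })
    statement_id = submitted["statement_id"]
    for _ in range(max_polls):
        response = _request("GET", f"/api/2.0/sql/statements/{statement_id}")
        if is_terminal(response["status"]["state"]):
            return response
        time.sleep(poll_interval_s)
    raise TimeoutError(f"statement {statement_id} did not finish in time")
```

This mirrors the collection's structure: one POST to execute, repeated GETs to poll; result chunks are then fetched with further GET requests once the state is `SUCCEEDED`.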
The Python script executes a statement in asynchronous mode with disposition and format set to EXTERNAL_LINKS and ARROW_STREAM, respectively. Then, it retrieves each chunk using its presigned URL, deserializes the Arrow bytes, and converts the result into a pandas DataFrame.