106 changes: 106 additions & 0 deletions aibi-embedding/README.md
@@ -0,0 +1,106 @@
# Embedded Analytics with Databricks AI/BI

## :bulb: Demo Overview

This demo showcases how to embed Databricks AI/BI Dashboards and Genie into your application, providing users with real-time insights without leaving your platform.

For this demo, let's imagine a fictional company called Brickstore, where brick suppliers sell bricks to customers on the Brickstore web platform. Brickstore embeds AI/BI dashboards and Genie into its platform, allowing users to analyze sales data without leaving for another site!

:no_good: Disclaimer: Please read the following before proceeding.
1. This demo uses Databricks Apps for simplicity but can be deployed on any modern web server.
2. This application is for demonstration purposes only; consider adding security measures and error handling for production use.
3. This repo is not meant to be a "click-and-deploy" solution. Instead, use the code in this repo as reference as you build your own embedded analytics application.

## :star2: AI/BI Embedding

AI/BI features two complementary capabilities: Dashboards and Genie. Dashboards provide a low-code experience that helps analysts quickly build highly interactive data visualizations for their business teams using natural language. Genie lets business users converse with their data to ask questions and self-serve their own analytics. Both of these capabilities can be embedded into external applications.

### 1. Dashboard Embedding

![Example](images/dashboard_embedding_example.png)
With dashboard embedding, you can seamlessly embed a dashboard into your application, as shown in the example above. There are two ways to embed dashboards with AI/BI.

**Option 1: iframe embedding (a.k.a Copy Embed Code)**

- Generally Available!
- This is the simplest way to embed an AI/BI dashboard (see [here](https://docs.databricks.com/en/dashboards/embed.html#embed-a-dashboard) for instructions).
- Best-suited for internal use cases
  - Users are already Databricks users in your account
- Low-code/no-code approach, but comes with limitations (e.g., users need to sign in to Databricks to see the dashboard)
- Not showcased in this demo

**Option 2: Token-Based Embedding (a.k.a App-Delegated Authentication Embedding)**

- Private Preview as of March 2025 (reach out to your Databricks Account Team)
- Best-suited for external use cases
  - Users are not Databricks users in your account (i.e. your end-customers)
- More complex, requires backend and frontend implementation via code
- Allows passing parameters for Row-Level Security (RLS).
- For sample code, refer to:
  - **app.py**
  - backend/**dashboard_embedding.py**
  - frontend/src/pages/**Analytics.js**
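
The backend half of token-based embedding can be sketched roughly as follows. This is a minimal illustration modeled on `backend/dashboard_embedding.py` in this repo, not an official client: the scopes and the `custom_claim` format are copied from that file, while `build_token_request` and the example values are hypothetical names used only here. Because the feature is in Private Preview, the request shape may change.

```python
# Scopes copied from backend/dashboard_embedding.py in this demo.
OAUTH_SCOPES = (
    "dashboards.query-execution dashboards.lakeview-embedded:read "
    "sql.redash-config:read settings:read"
)


def build_token_request(host, client_id, client_secret,
                        external_data, external_viewer_id, dashboard_id):
    """Return (url, form_payload) for the M2M OAuth token request.

    Hypothetical helper: it only assembles the request, so it can be
    inspected or tested without contacting a Databricks workspace.
    """
    url = f"https://{host}/oidc/v1/token"
    payload = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": OAUTH_SCOPES,
        # The custom claim carries the values the dashboard's SQL can
        # filter on for Row-Level Security (format taken from this demo).
        "custom_claim": (
            f"urn:aibi:external_data:{external_data}:"
            f"{external_viewer_id}:{dashboard_id}"
        ),
    }
    return url, payload
```

In the demo, the equivalent payload is sent with `requests.post(url, data=payload)` and the `access_token` field of the JSON response is returned to the front-end.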

#### Dashboard Embedding Architecture Diagram
![Architecture](images/dashboard_embedding_architecture.gif)

### 2. Genie Embedding
![Example](images/genie_embedding_example.png)
Before the Genie Conversational APIs became available, you could only interact with Genie through the Databricks UI. Now you can use the Genie Conversational APIs to build your own Genie experience in your application, as shown above.

**Genie Conversational APIs**

- Public Preview as of March 2025
- Surfaces AI/BI Genie through a REST endpoint
- For sample code, refer to:
  - **app.py**
  - backend/**genie_embedding.py**
  - frontend/src/pages/**Genie.js**
  - frontend/src/components/**ChatMessage.js**
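
A conversation over the REST endpoint can be sketched roughly as below. This is a hedged illustration, not the code in `backend/genie_embedding.py` (which is not shown here): the endpoint paths follow the public Genie API reference as understood at the time of writing, the set of terminal statuses is an assumption, and every function name is hypothetical. The HTTP call is injected as `get_json` so the polling logic can be exercised without a workspace.

```python
import time

GENIE_BASE = "/api/2.0/genie/spaces"
# Assumed terminal states; check the Genie API reference for the full list.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def start_conversation_request(host, space_id, question):
    """Build (url, json_body) for opening a new Genie conversation."""
    url = f"https://{host}{GENIE_BASE}/{space_id}/start-conversation"
    return url, {"content": question}


def follow_up_request(host, space_id, conversation_id, question):
    """Build (url, json_body) for a follow-up message in a conversation."""
    url = (f"https://{host}{GENIE_BASE}/{space_id}"
           f"/conversations/{conversation_id}/messages")
    return url, {"content": question}


def message_url(host, space_id, conversation_id, message_id):
    """URL used to fetch the current state of a single message."""
    return (f"https://{host}{GENIE_BASE}/{space_id}"
            f"/conversations/{conversation_id}/messages/{message_id}")


def poll_message(get_json, url, interval_s=2.0, max_attempts=30,
                 sleep=time.sleep):
    """Fetch the message until it reaches a terminal status.

    `get_json` is any callable mapping a URL to the decoded JSON dict,
    for example a thin wrapper around an authenticated GET request.
    """
    for _ in range(max_attempts):
        message = get_json(url)
        if message.get("status") in TERMINAL_STATUSES:
            return message
        sleep(interval_s)
    raise TimeoutError(f"Genie message not finished after {max_attempts} polls")
```

With the `requests` library, each built request could be sent as `requests.post(url, json=body, headers={"Authorization": f"Bearer {token}"})`, where `token` is the Databricks OAuth token obtained by the backend.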

#### API Flow
![Architecture](images/genie_api_flow.gif)

#### Genie Embedding Architecture Diagram
![Architecture](images/genie_api_architecture.gif)

## :rocket: Learn More

- AI/BI
  - [Databricks AI/BI Product Page](https://www.databricks.com/product/ai-bi)
- Dashboard Embedding
  - [iframe Embedding Documentation](https://docs.databricks.com/aws/en/dashboards/embed)
- Genie Conversational APIs
  - [Genie API Documentation](https://docs.databricks.com/api/workspace/genie)
  - [Genie API Blog](https://www.databricks.com/blog/genie-conversation-apis-public-preview)


## :closed_lock_with_key: License

© 2025 Databricks, Inc. All rights reserved. The source in this project is provided subject to the Databricks License [https://databricks.com/db-license-source]. All included or referenced third party libraries are subject to the licenses set forth below.

| library | description | license | source |
|----------------------------------------|-------------------------|------------|-----------------------------------------------------|
| **Flask** | Lightweight WSGI web application framework | BSD 3-Clause | https://github.com/pallets/flask |
| **Flask-SQLAlchemy** | Extension for Flask that adds support for SQLAlchemy | MIT | https://github.com/pallets/flask-sqlalchemy |
| **requests** | Library for making HTTP requests | Apache 2.0 | https://github.com/psf/requests |
| **pandas** | Library for data manipulation and analysis | BSD 3-Clause | https://github.com/pandas-dev/pandas |
| **Flask-JWT-Extended** | Extension for handling JSON Web Tokens in Flask | MIT | https://github.com/vimalloc/flask-jwt-extended |
| **Flask-CORS** | Extension for handling Cross-Origin Resource Sharing in Flask | MIT | https://github.com/corydolphin/flask-cors |
| **python-dotenv** | Library for loading environment variables from .env files | MIT | https://github.com/theskumar/python-dotenv |
| **axios** | Promise based HTTP client for the browser and node.js | MIT | https://github.com/axios/axios |
| **@emotion/react** | Library for building robust, customizable, and performant UI components with React | MIT | https://github.com/emotion-js/emotion |
| **@emotion/styled** | Library for building robust, customizable, and performant UI components with React | MIT | https://github.com/emotion-js/emotion |
| **@fontsource/roboto** | Font package for Roboto | Apache-2.0 | https://github.com/fontsource/fontsource |
| **@mui/icons-material** | Material Design icons for React | MIT | https://github.com/mui-org/material-ui |
| **@mui/material** | Material Design components for React | MIT | https://github.com/mui-org/material-ui |
| **react** | JavaScript library for building user interfaces | MIT | https://github.com/facebook/react |
| **react-dom** | React package for working with the DOM | MIT | https://github.com/facebook/react |
| **react-router-dom** | DOM bindings for React Router | MIT | https://github.com/remix-run/react-router |
| **react-scripts** | Scripts and configuration used by Create React App | MIT | https://github.com/facebook/create-react-app |
| **cra-template** | Template for Create React App | MIT | https://github.com/facebook/create-react-app |
| **@databricks/aibi-client** | Databricks AI & BI Client | MIT | https://github.com/databricks/aibi-client |
---

Databricks support doesn't cover this content. For questions or bugs, please open a GitHub issue and the team will help on a best-effort basis.
128 changes: 128 additions & 0 deletions aibi-embedding/app.py
@@ -0,0 +1,128 @@
import pandas as pd
from flask import Flask, jsonify, send_from_directory, session, request, redirect, url_for
from flask_jwt_extended import JWTManager, create_access_token, jwt_required, get_jwt_identity, get_jwt
from flask_sqlalchemy import SQLAlchemy
from flask_cors import CORS
import logging
import secrets
from backend.models import db, User
from backend.dashboard_embedding import get_dashboard_embedding_oauth_token
from backend.genie_embedding import get_databricks_oauth_token, get_genie_space_id, new_genie_conversation, continue_genie_conversation
from dotenv import load_dotenv
import os

log = logging.getLogger('werkzeug')
log.setLevel(logging.ERROR)

app = Flask(__name__, static_folder='frontend/build', static_url_path='/')
CORS(app, resources={r"/api/*": {"origins": ["https://aibi-embedding-demo-984752964297111.11.azure.databricksapps.com", "http://localhost:3000"]}})

# Configuration
app.config['SECRET_KEY'] = secrets.token_hex(32)
app.config["JWT_SECRET_KEY"] = secrets.token_hex(32)
app.config['JWT_TOKEN_LOCATION'] = ['headers']

# Dummy Database for storing user information
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///db.sqlite'

# JWT Initialization
jwt = JWTManager(app)

# Pull Environment Variables
load_dotenv()
databricks_host = os.environ['DATABRICKS_HOST']
databricks_client_id = os.environ['DATABRICKS_CLIENT_ID']
databricks_client_secret = os.environ['DATABRICKS_CLIENT_SECRET']

db.init_app(app)
with app.app_context():
    db.create_all()

@app.route('/')
@app.route('/login')
@app.route('/home')
@app.route('/genie')
@app.route('/analytics')
def serve():
    return send_from_directory(app.static_folder, 'index.html')

# This is the endpoint the front-end will hit to verify login information
@app.route('/api/login', methods=['POST'])
def login():
    data = request.get_json()
    email = data['email']
    password = data['password']
    # Log only the email; never write passwords to logs
    print('Received login request for:', email)

    user = User.query.filter_by(email=email).first()

    if user and user.password == password:
        access_token = create_access_token(identity=user.id)
        databricks_token = get_databricks_oauth_token()
        return jsonify({
            'message': 'Login Success',
            'databricks_token': databricks_token,
            'first_name': user.first_name,
            'last_name': user.last_name,
            'email': email,
            'access_token': access_token,
            'company': user.company
        })
    else:
        return jsonify({'message': 'Login Failed'}), 401

@app.route('/api/data', methods=['GET'])
def get_data():
    return jsonify({"message": "Hello from Flask!"})


@app.route('/api/dashboard/config', methods=['POST'])
def dashboard_config():
    return jsonify({
        "instance_url": "https://" + os.environ['DATABRICKS_HOST'],
        "workspace_id": os.environ['DATABRICKS_WORKSPACE_ID'],
        "dashboard_id": os.environ['DATABRICKS_DASHBOARD_ID']
    })

# This is the endpoint the front-end will hit to embed the dashboard
@app.route('/api/dashboard/get_token', methods=['POST'])
def dashboard_get_token():
    # Unwrap the payload sent from the front-end. This includes the external data value, the external viewer ID, and the dashboard name
    data = request.get_json()

    # Inputs sent from the front-end
    external_data = data['external_data']
    external_viewer_id = data['external_viewer_id']
    dashboard_name = data['dashboard_name']

    # Call get_dashboard_embedding_oauth_token from 'backend/dashboard_embedding.py'
    response = get_dashboard_embedding_oauth_token(
        external_data=external_data,
        external_viewer_id=external_viewer_id,
        dashboard_name=dashboard_name,
    )
    return response

# This is the endpoint the front-end will hit to start a new conversation
@app.route('/api/genie/start_conversation', methods=['POST'])
def genie_start_conversation():
    # Unwrap the payload sent from the front-end. This includes the user's question as well as additional information about the user
    data = request.get_json()

    # Inputs sent from the front-end
    initial_message = data['question']
    databricks_token = data['databricks_token']
    user_company = data['user_company']
    databricks_genie_space_id = get_genie_space_id(user_company)

    # Call the new_genie_conversation from 'backend/genie_embedding.py'
    response = new_genie_conversation(
        space_id=databricks_genie_space_id,
        content=initial_message,
        token=databricks_token,
        databricks_host=databricks_host,
    )
    return response

# This is the endpoint the front-end will hit to continue an existing conversation
@app.route('/api/genie/continue_conversation', methods=['POST'])
def genie_continue_conversation():
    # Unwrap the payload sent from the front-end. This includes the user's question as well as additional information about the user
    data = request.get_json()

    # Inputs sent from the front-end
    followup_message = data['question']
    conversation_id = data['conversation_id']
    databricks_token = data['databricks_token']
    user_company = data['user_company']
    databricks_genie_space_id = get_genie_space_id(user_company)

    # Call the continue_genie_conversation from 'backend/genie_embedding.py'
    response = continue_genie_conversation(
        space_id=databricks_genie_space_id,
        content=followup_message,
        conversation_id=conversation_id,
        token=databricks_token,
        databricks_host=databricks_host,
    )
    return response

if __name__ == '__main__':
    app.run(debug=True)
12 changes: 12 additions & 0 deletions aibi-embedding/app.yaml
@@ -0,0 +1,12 @@
command:
  - gunicorn
  - app:app
  - -w
  - 4
env:
  - name: DATABRICKS_GENIE_SPACE_ID_APEX
    value: ''
  - name: DATABRICKS_GENIE_SPACE_ID_SYNERGY
    value: ''
  - name: DATABRICKS_DASHBOARD_ID
    value: ''
41 changes: 41 additions & 0 deletions aibi-embedding/backend/dashboard_embedding.py
@@ -0,0 +1,41 @@
import os
import requests
from dotenv import load_dotenv

# Function to retrieve minted OAuth Token from Databricks server
def get_dashboard_embedding_oauth_token(external_data, external_viewer_id, dashboard_name):
    # Pull Environment Variables from .env file
    load_dotenv()

    databricks_host = os.environ['DATABRICKS_HOST']
    databricks_client_id = os.environ['DATABRICKS_CLIENT_ID']
    databricks_client_secret = os.environ['DATABRICKS_CLIENT_SECRET']
    if dashboard_name == "defects":
        dashboard_id = os.environ['DATABRICKS_DASHBOARD_ID']
    else:
        # Fail early instead of hitting a NameError on dashboard_id below
        raise ValueError(f"Unknown dashboard name: {dashboard_name!r}")

    # These are additional parameters when making the OAuth request
    # 1. The OAuth scope limits the amount of access granted to an access token, ensuring scoped access
    # 2. The custom claim will be used to filter data in the SQL statement for Row-Level Security
    oauth_scopes = "dashboards.query-execution dashboards.lakeview-embedded:read sql.redash-config:read settings:read"
    custom_claim = f'urn:aibi:external_data:{external_data}:{external_viewer_id}:{dashboard_id}'

    # Make M2M OAuth Request to Databricks server to get a Databricks Access Token
    token_url = f"https://{databricks_host}/oidc/v1/token"

    payload = {
        "grant_type": "client_credentials",
        "client_id": databricks_client_id,
        "client_secret": databricks_client_secret,
        "scope": oauth_scopes,
        "custom_claim": custom_claim
    }

    response = requests.post(token_url, data=payload)
    # Surface HTTP errors (e.g. bad credentials) instead of failing on a missing key below
    response.raise_for_status()

    # No need to worry about token expiration, since the dashboard embedding library will automatically reissue expired tokens
    token_data = response.json()

    # Extract the access token from the response
    access_token = token_data["access_token"]

    return access_token