
Commit 50af506

NEP-18548 Add Dashboard Config Loader
1 parent bc524a5

File tree: 3 files changed, +284 −0 lines changed

Lines changed: 146 additions & 0 deletions

# Transferring Dashboards across Environments

This document describes how to transfer dashboards across Superset hosting environments, with the end goal of automating the process via an API call.

## Background

A common practice is to set up infrastructure that deploys multiple Superset environments. For example, a simple setup might be:
- A local development environment for testing version upgrades and exploring features
- A staging Superset environment for testing in a production-like setting
- A production Superset environment that requires a higher level of stability and uptime

In the above example, the staging Superset environment typically holds connections to staging databases, while the production Superset environment holds connections to the production databases.

Provided the database schema structure is exactly the same across the local development, staging, and production databases, dashboards can be replicated and transferred across Superset hosting environments.

The process does require some manual editing of the exported YAML files before importing them into the target environment, as well as some understanding of the underlying dashboard export structure and of how the object UUIDs work and relate to each other, especially in the context of databases and datasets.

## Dashboard Export/Import within the Same Environment

This is a fairly straightforward process.

There are multiple methods for exporting a dashboard (an API sketch follows this list):
- Export from the dashboard list page in the GUI
- Export via the Superset API
- Export via the Superset CLI
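
Purely as an illustration of the API method, the sketch below downloads a dashboard export zip using Ruby's stdlib. The `GET /api/v1/dashboard/export/` endpoint is Superset's standard export route; the host, access token, and dashboard id (18) are placeholder assumptions.

```
# Minimal sketch: export dashboard 18 as a zip via the Superset API.
# SUPERSET_HOST and ACCESS_TOKEN are placeholders for your environment.
require 'net/http'
require 'uri'

host  = ENV.fetch('SUPERSET_HOST')  # e.g. https://superset-staging.example.com
token = ENV.fetch('ACCESS_TOKEN')   # JWT from /api/v1/security/login

uri = URI("#{host}/api/v1/dashboard/export/?q=!(18)")  # q is a rison-encoded list of ids
response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
  request = Net::HTTP::Get.new(uri)
  request['Authorization'] = "Bearer #{token}"
  http.request(request)
end

File.binwrite('dashboard_18_export.zip', response.body) if response.is_a?(Net::HTTPSuccess)
```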

Each export method results in a zip file containing a set of YAML files, as per the listing below, which is an export of a customized version of the Birth Names dashboard from the default example dashboards.

Test fixture is: https://github.com/rdytech/superset-client/blob/develop/spec/fixtures/dashboard_18_export_20240322.zip

```
└── dashboard_export_20240321T214117
    ├── charts
    │   ├── Boy_Name_Cloud_53920.yaml
    │   ├── Names_Sorted_by_Num_in_California_53929.yaml
    │   ├── Number_of_Girls_53930.yaml
    │   ├── Pivot_Table_53931.yaml
    │   └── Top_10_Girl_Name_Share_53921.yaml
    ├── dashboards
    │   └── Birth_Names_18.yaml
    ├── databases
    │   └── examples.yaml
    ├── datasets
    │   └── examples
    │       └── birth_names.yaml
    └── metadata.yaml
```

Each of the above YAML files holds UUID values for the primary object and any related objects.

- Database YAMLs hold the database connection string as well as a UUID for the database
- Dataset YAMLs have their own UUID as well as a reference to the database UUID
- Chart YAMLs have their own UUID as well as a reference to their dataset UUID

Example of the database YAML file:

```
cat databases/examples.yaml
database_name: examples
sqlalchemy_uri: postgresql+psycopg2://superset:XXXXXXXXXX@superset-host:5432/superset
cache_timeout: null
expose_in_sqllab: true
allow_run_async: true
allow_ctas: true
allow_cvas: true
allow_dml: true
allow_file_upload: true
extra:
  metadata_params: {}
  engine_params: {}
  metadata_cache_timeout: {}
  schemas_allowed_for_file_upload:
  - examples
  allows_virtual_table_explore: true
uuid: a2dc77af-e654-49bb-b321-40f6b559a1ee
version: 1.0.0
```

If we grep databases/examples.yaml, we can see the UUID of the database.

```
grep -r uuid databases/
databases/examples.yaml:uuid: a2dc77af-e654-49bb-b321-40f6b559a1ee
```

Looking at the UUID values in the datasets, you will see both the dataset UUID and the reference to the database UUID.

```
grep -r uuid datasets
datasets/examples/birth_names.yaml:uuid: 283f5023-0814-40f6-b12d-96f6a86b984f
datasets/examples/birth_names.yaml:database_uuid: a2dc77af-e654-49bb-b321-40f6b559a1ee
```
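
The same linkage can be checked programmatically. A minimal sketch using Ruby's stdlib YAML, assuming the working directory is the unzipped export shown above:

```
# Confirm the dataset references the database from this export.
require 'yaml'

database = YAML.load_file('databases/examples.yaml')
dataset  = YAML.load_file('datasets/examples/birth_names.yaml')

puts dataset['database_uuid'] == database['uuid']  # => true for a consistent export
```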

If the above dashboard zip file `dashboard_18_export_20240322.zip` were imported as is into the same Superset environment it was exported from, all UUIDs would already exist in Superset, so the existing objects would be found and updated with the imported zip data.

If the above zip file were imported as is into a different target Superset environment, the import would fail, as there would be no matching database UUID entry in that target environment.

**Key Point:** When importing a dashboard into a different Superset environment than the one it was exported from, the database configuration in the zip export must already exist in the target environment, and all datasets must reference that database config's UUID.

## Migrate a Dashboard to a Different Superset Environment

With the above knowledge, we can now think about how to migrate dashboards between Superset environments.

Each Superset object is given a UUID. Within the exported dashboard files, we are primarily concerned with:
- Replacing the staging database configuration with the production configuration
- Updating all staging datasets to point to the new production database UUID

Given a request to 'transfer' a dashboard across to a different environment, say staging to production, how would we then proceed?

Provided the staging and production databases have structurally identical schemas, it follows from the above discussion on UUIDs that importing a staging dashboard export into the production environment requires the following steps (a code sketch follows below):

1. Export the staging dashboard and unzip it
2. Note the staging database UUIDs in the `databases/` directory
3. Get a copy of the production database YAML configuration file
4. In the exported dashboard files, replace the staging database YAML with the production YAML
5. In the dataset YAML files, replace all instances of the previously noted staging database UUID with the new production UUID
6. Zip the files and import them into the production environment

The process above assumes that whoever is migrating the dashboard has a copy of the target database YAML file, so that in steps 3 and 4 the staging database YAML can be replaced with the production one.
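
A minimal Ruby sketch of steps 2 to 5, assuming the export is already unzipped and a copy of the production database YAML is available locally; all paths here are hypothetical:

```
# Point a staging dashboard export at the production database (steps 2-5).
require 'yaml'
require 'fileutils'

export_dir      = 'dashboard_export_20240321T214117'  # unzipped staging export (step 1)
production_yaml = 'production/examples.yaml'          # copy of the target database config (step 3)

# Step 2: note the staging database UUID
staging_db_file = Dir.glob(File.join(export_dir, 'databases', '*.yaml')).first
staging_uuid    = YAML.load_file(staging_db_file)['uuid']

# Steps 3 and 4: swap in the production database YAML (keeping the original filename)
production_uuid = YAML.load_file(production_yaml)['uuid']
FileUtils.cp(production_yaml, staging_db_file)

# Step 5: point every dataset at the production database UUID
Dir.glob(File.join(export_dir, 'datasets', '**', '*.yaml')).each do |file|
  File.write(file, File.read(file).gsub(staging_uuid, production_uuid))
end

# Step 6: re-zip export_dir and import it into production (see Requirements below)
```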

## Requirements

The overall process requires the following:
- The source dashboard zip file
- The target Superset environment's database YAML file
- The ability to copy and manipulate the source dashboard zip file
- The ability to import via the API into the target Superset environment (see the sketch below)
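
For the last point, a rough sketch of the import call using Ruby's stdlib. The `POST /api/v1/dashboard/import/` endpoint, with the zip sent as the `formData` field, is Superset's standard import route; the host and token are placeholders, and depending on configuration a CSRF token and an `overwrite` flag may also be required.

```
# Minimal sketch: import a dashboard zip via the Superset API.
require 'net/http'
require 'uri'

host  = ENV.fetch('SUPERSET_HOST')  # target environment, e.g. production
token = ENV.fetch('ACCESS_TOKEN')

uri = URI("#{host}/api/v1/dashboard/import/")
File.open('dashboard_18_export_20240322.zip', 'rb') do |zip|
  request = Net::HTTP::Post.new(uri)
  request['Authorization'] = "Bearer #{token}"
  request.set_form([['formData', zip]], 'multipart/form-data')
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  puts "#{response.code} #{response.body}"
end
```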

## Gotchas!

Migrating a dashboard for the first time to a new target environment, database, and schema will result in:
- Creating a new dashboard with the UUID from the import zip
- Creating a new set of charts with their UUIDs from the import zip
- Creating a new set of datasets with their UUIDs from the import zip

Migrating the same dashboard a second time to the same target environment and database, but a different schema, will NOT create a new dashboard.

Instead, it will attempt to update the same dashboard, as the UUID for the dashboard has not changed. It will also NOT change any of the datasets to the new schema. This appears to be a limitation of the import process, and it may lead to some confusing results.

## References

Some helpful references relating to cross-environment workflows:
- [Managing Content Across Workspaces](https://docs.preset.io/docs/managing-content-across-workspaces)
- [Superset Slack AI Explanation](https://apache-superset.slack.com/archives/C072KSLBTC1/p1722382347022689)

Lines changed: 68 additions & 0 deletions

# Given a dashboard export zip, unzip it and load all of its YAML files.

require 'superset/file_utilities'
require 'yaml'
require 'securerandom' # for SecureRandom.uuid below
require 'ostruct'      # for the OpenStruct-based DashboardConfig
require 'active_support/core_ext/hash/keys' # for Hash#deep_symbolize_keys

module Superset
  module Services
    class DashboardLoader
      include FileUtilities

      TMP_PATH = '/tmp/superset_dashboard_imports'.freeze

      attr_reader :dashboard_export_zip

      def initialize(dashboard_export_zip:)
        @dashboard_export_zip = dashboard_export_zip
      end

      # Unzip the export into a unique tmp directory and return the parsed config.
      def perform
        unzip_source_file
        dashboard_config
      end

      def dashboard_config
        @dashboard_config ||= DashboardConfig.new(
          dashboard_export_zip: dashboard_export_zip,
          tmp_uniq_dashboard_path: tmp_uniq_dashboard_path).config
      end

      private

      def unzip_source_file
        @extracted_files = unzip_file(dashboard_export_zip, tmp_uniq_dashboard_path)
      end

      # Unique path per run so concurrent loads cannot clash.
      def tmp_uniq_dashboard_path
        @tmp_uniq_dashboard_path ||= File.join(TMP_PATH, uuid)
      end

      def uuid
        SecureRandom.uuid
      end

      class DashboardConfig < ::OpenStruct
        # Group the export's YAML files by object type, as { filename:, content: } pairs.
        def config
          {
            dashboards: load_yamls_for('dashboards'),
            databases: load_yamls_for('databases'),
            datasets: load_yamls_for('datasets'),
            charts: load_yamls_for('charts'),
            metadata: load_yamls_for('metadata.yaml', pattern_suffix: nil),
          }
        end

        def load_yamls_for(object_path, pattern_suffix: '**/*.yaml')
          # metadata.yaml is a single file, so it carries no glob suffix.
          pattern = File.join([tmp_uniq_dashboard_path, '**', object_path, pattern_suffix].compact)
          Dir.glob(pattern).map do |file|
            { filename: file, content: load_yaml_and_symbolize_keys(file) } if File.file?(file)
          end.compact
        end

        def load_yaml_and_symbolize_keys(path)
          yaml = YAML.load_file(path)
          yaml.deep_symbolize_keys
        end
      end
    end
  end
end
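
For illustration, a hypothetical usage of the loader against the fixture zip referenced in the doc above; the example return values are taken from the spec below.

loader = Superset::Services::DashboardLoader.new(
  dashboard_export_zip: 'spec/fixtures/dashboard_18_export_20240322.zip')

config = loader.perform                            # unzips under /tmp/superset_dashboard_imports and parses every YAML
config.keys                                        # => [:dashboards, :databases, :datasets, :charts, :metadata]
config[:datasets].first[:content][:database_uuid]  # => "a2dc77af-e654-49bb-b321-40f6b559a1ee"
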
Lines changed: 70 additions & 0 deletions

require 'superset/services/dashboard_loader'

RSpec.describe Superset::Services::DashboardLoader do
  let(:loader) { described_class.new(dashboard_export_zip: dashboard_export_zip) }
  let(:dashboard_export_zip) { 'spec/fixtures/dashboard_18_export_20240322.zip' }

  describe '#perform' do
    before { loader.perform }

    it 'populates dashboard_config with the filename and content of each object' do
      expect(loader.dashboard_config.keys).to contain_exactly(:dashboards, :datasets, :databases, :charts, :metadata)
      loader.dashboard_config.keys.each do |object|
        expect(loader.dashboard_config[object]).to all(include(:filename, :content))
      end
    end

    it 'loads the metadata yaml file' do
      metadata = loader.dashboard_config[:metadata].first
      expect(File.basename(metadata[:filename])).to eq('metadata.yaml')
      expect(metadata[:content].keys).to contain_exactly(:version, :type, :timestamp)
    end

    it 'loads the dashboards yaml' do
      dashboards = loader.dashboard_config[:dashboards]
      expect(dashboards.size).to eq(1)
      expect(File.basename(dashboards.first[:filename])).to eq('Birth_Names_18.yaml')
      expect(dashboards.first[:content].keys).to match_array(
        [:dashboard_title, :description, :css, :slug, :certified_by, :certification_details, :published,
         :uuid, :position, :metadata, :version])
      expect(dashboards.first[:content][:dashboard_title]).to eq('Birth Names')
    end

    it 'loads the databases yaml' do
      databases = loader.dashboard_config[:databases]
      expect(databases.size).to eq(1)
      expect(File.basename(databases.first[:filename])).to eq('examples.yaml')
      expect(databases.first[:content].keys).to match_array(
        [:allow_ctas, :allow_cvas, :allow_dml, :allow_file_upload, :allow_run_async, :cache_timeout,
         :database_name, :expose_in_sqllab, :extra, :sqlalchemy_uri, :uuid, :version])
      expect(databases.first[:content][:database_name]).to eq('examples')
    end

    it 'loads the datasets yaml' do
      datasets = loader.dashboard_config[:datasets]
      expect(datasets.size).to eq(1)
      expect(File.basename(datasets.first[:filename])).to eq('birth_names.yaml')
      expect(datasets.first[:content].keys).to match_array([
        :table_name, :main_dttm_col, :description, :default_endpoint, :offset, :cache_timeout, :schema, :sql,
        :params, :template_params, :filter_select_enabled, :fetch_values_predicate, :extra, :normalize_columns,
        :always_filter_main_dttm, :uuid, :metrics, :columns, :version, :database_uuid])
      expect(datasets.first[:content][:table_name]).to eq('birth_names')
    end

    it 'loads the charts yaml' do
      charts = loader.dashboard_config[:charts]
      expect(charts.size).to eq(5)
      expect(charts.map { |c| File.basename(c[:filename]) }).to match_array([
        "Boy_Name_Cloud_53920.yaml",
        "Names_Sorted_by_Num_in_California_53929.yaml",
        "Number_of_Girls_53930.yaml",
        "Pivot_Table_53931.yaml",
        "Top_10_Girl_Name_Share_53921.yaml"])
      expect(charts.first[:content].keys).to match_array([
        :cache_timeout, :certification_details, :certified_by, :dataset_uuid, :description, :params, :query_context,
        :slice_name, :uuid, :version, :viz_type])
      expect(charts.first[:content][:slice_name]).to eq('Boy Name Cloud')
    end
  end
end
