Commit a908704

Added dashboard widget with table count by storage and format (#852)
1 parent e11494c commit a908704

4 files changed

+52
-4
lines changed

docs/assessment.md

Lines changed: 9 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -25,15 +25,22 @@ Total number of identified storage locations based on scanning Hive Metastore ta
 # Assessment Widgets
 Assessment widgets query tables in the $inventory database and summarize or detail out findings.

-The second row of the report starts with "Readiness" and "Assessment Summary"
-<img width="1235" alt="image" src="https://github.com/databrickslabs/ucx/assets/1122251/c68194c4-4b09-4c8d-b61f-ebf57b7106c7">
+The second row of the report starts with "Readiness", "Assessment Summary", "Table counts by storage" and "Table counts by schema and format".
+
+<img width="1512" alt="image" src="https://github.com/databrickslabs/ucx/assets/106815134/41904d8a-c746-4191-be08-2e9e2090935d">

 ## Readiness
 This is a rough summary of the workspace readiness to run Unity Catalog governed workloads. Each line item is the number of compatible items divided by the total items in the class, expressed as a percentage.

 ## Assessment Summary
 This is a summary count, per finding type, of all of the findings identified during the assessment workflow. The assessment summary will help identify areas that need focus (e.g. tables on DBFS or clusters that need DBR upgrades).

+## Table counts by storage
+This is a summary count of Hive Metastore tables per storage type: DBFS Root, DBFS Mount, and Cloud Storage (referred to as External). It also gives a summary count of tables using storage types that are unsupported in Unity Catalog (such as WASB or ADL in Azure). Tables created using Databricks Demo Datasets are also counted here.
+
+## Table counts by schema and format
+This is a summary count of Hive Metastore (HMS) tables by format (Delta and Non Delta) for each HMS schema.
+
 The third row continues with "Database Summary"
 <img width="1220" alt="image" src="https://github.com/databrickslabs/ucx/assets/1122251/28742e33-d3e3-4eb8-832f-1edd34999fa2">

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
+-- viz type=table, name=Table counts by storage, columns=Storage,count
+-- widget title=Table counts by storage, row=2, col=2, size_x=2, size_y=5
+SELECT storage, COUNT(*) count
+FROM (
+  SELECT
+    CASE
+      WHEN STARTSWITH(location, "dbfs:/mnt") THEN "DBFS MOUNT"
+      WHEN STARTSWITH(location, "/dbfs/mnt") THEN "DBFS MOUNT"
+      WHEN STARTSWITH(location, "dbfs:/databricks-datasets") THEN "Databricks Demo Dataset"
+      WHEN STARTSWITH(location, "/dbfs/databricks-datasets") THEN "Databricks Demo Dataset"
+      WHEN STARTSWITH(location, "dbfs:/") THEN "DBFS ROOT"
+      WHEN STARTSWITH(location, "/dbfs/") THEN "DBFS ROOT"
+      WHEN STARTSWITH(location, "wasb") THEN "UNSUPPORTED"
+      WHEN STARTSWITH(location, "adl") THEN "UNSUPPORTED"
+      ELSE "EXTERNAL"
+    END AS storage
+  FROM $inventory.tables)
+GROUP BY storage
+ORDER BY storage;
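
Reviewer note: to check the branch order of the CASE expression in this file, here is a minimal Python sketch of the same prefix rules. It is an illustration only and not part of the commit; the names `classify_storage` and `count_by_storage` and the sample locations are hypothetical. The point it shows is that the mount and demo-dataset prefixes have to be tested before the generic `dbfs:/` prefix, otherwise every mount would be counted as DBFS root.

```python
from collections import Counter

# Prefix rules copied from the CASE expression, in the same order.
# Order matters: "dbfs:/mnt" must be checked before the broader "dbfs:/".
_RULES = [
    ("dbfs:/mnt", "DBFS MOUNT"),
    ("/dbfs/mnt", "DBFS MOUNT"),
    ("dbfs:/databricks-datasets", "Databricks Demo Dataset"),
    ("/dbfs/databricks-datasets", "Databricks Demo Dataset"),
    ("dbfs:/", "DBFS ROOT"),
    ("/dbfs/", "DBFS ROOT"),
    ("wasb", "UNSUPPORTED"),
    ("adl", "UNSUPPORTED"),
]


def classify_storage(location):
    """Hypothetical helper: map a table location to a storage label."""
    if location is None:
        # Assumption: a NULL location matches no WHEN branch in SQL,
        # so the row falls through to ELSE.
        return "EXTERNAL"
    for prefix, label in _RULES:
        if location.startswith(prefix):
            return label
    return "EXTERNAL"


def count_by_storage(locations):
    """Rough equivalent of GROUP BY storage with COUNT(*), sorted by label."""
    return sorted(Counter(classify_storage(loc) for loc in locations).items())
```

With these rules, a location such as `wasbs://...` also lands in "UNSUPPORTED" because it starts with `wasb`, while `abfss://...` and `s3://...` fall through to "EXTERNAL".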
Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
+-- viz type=table, name=Table counts by schema and format, columns=schema,format,count
+-- widget title=Table counts by schema and format, row=2, col=4, size_x=2, size_y=5
+SELECT
+  schema,
+  format,
+  COUNT(*) count
+FROM
+  (
+    SELECT
+      `database` AS schema,
+      IF(table_format = 'DELTA', "Delta", "Non Delta") AS format
+    FROM
+      $inventory.tables
+  )
+WHERE
+  format IS NOT NULL
+GROUP BY
+  schema,
+  format
+ORDER BY
+  schema,
+  format;
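
Reviewer note: the grouping in this query can be sketched in Python as below. This is a hedged illustration, not part of the commit; `format_label`, `count_by_schema_and_format` and the `(database, table_format)` row shape are hypothetical. It also makes one behaviour visible: when `table_format` is NULL, the comparison is NULL and `IF` takes the else branch, so such rows are labelled "Non Delta" and the `format IS NOT NULL` filter does not appear to exclude anything.

```python
from collections import Counter


def format_label(table_format):
    """Mirror IF(table_format = 'DELTA', "Delta", "Non Delta").

    A missing (None) table_format does not equal 'DELTA', so the row is
    labelled "Non Delta" rather than dropped.
    """
    return "Delta" if table_format == "DELTA" else "Non Delta"


def count_by_schema_and_format(rows):
    """rows: iterable of (database, table_format) pairs (hypothetical shape).

    Assumes database names are non-null strings so they can be sorted.
    """
    counts = Counter((db, format_label(fmt)) for db, fmt in rows)
    # Sort by schema, then format, like the ORDER BY clause.
    return [(db, fmt, n) for (db, fmt), n in sorted(counts.items())]
```

The comparison is case-sensitive in this sketch, matching a plain string equality check: a value of `delta` in lower case would be counted as "Non Delta".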
Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
 -- viz type=table, name=Assessment Summary, search_by=failure, columns=failure,count
--- widget title=Assessment Summary, row=2, col=2, size_x=4, size_y=11
+-- widget title=Assessment Summary, row=1, col=2, size_x=4, size_y=6
 WITH raw AS (
   SELECT EXPLODE(FROM_JSON(failures, 'array<string>')) AS failure FROM $inventory.objects WHERE failures <> '[]'
 )
 SELECT failure as `item`, COUNT(*) AS count FROM raw GROUP BY failure
-ORDER BY count DESC, failure DESC
+ORDER BY count DESC, failure DESC
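
Reviewer note: for readers unfamiliar with the `EXPLODE(FROM_JSON(...))` pattern in this query, the following Python sketch shows the same idea under stated assumptions. It is not part of the commit; `assessment_summary` is a hypothetical name, and the input is assumed to be the `failures` column as a list of JSON array strings, one per inventory object.

```python
import json
from collections import Counter


def assessment_summary(failures_column):
    """Count each failure string across all objects.

    failures_column: iterable of JSON array strings (or None), one per object.
    """
    counts = Counter()
    for raw in failures_column:
        # WHERE failures <> '[]' also drops NULLs, since NULL <> '[]' is not true.
        if raw is None or raw == "[]":
            continue
        # FROM_JSON parses the array; EXPLODE yields one row per element.
        counts.update(json.loads(raw))
    # ORDER BY count DESC, failure DESC
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
```

Each object can contribute several findings, so the totals count findings rather than objects.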
