Commit f1fccf9

sql: optimize query to populate crdb_internal.table_row_statistics
The query to populate the `crdb_internal.table_row_statistics` virtual table has been rewritten to avoid scanning the `system.table_statistics` table twice.

Before this commit the query plan was:

    • group (hash)
    │ group by: tableID
    │
    └── • hash join
        │ actual row count: 515,822
        │ equality: (tableID, createdAt) = (tableID, max)
        │
        ├── • scan
        │     table: table_statistics@primary
        │     spans: FULL SCAN
        │
        └── • group (streaming)
            │ group by: tableID
            │ ordered: +"tableID"
            │
            └── • scan
                  table: table_statistics@primary
                  spans: FULL SCAN

Now it is:

    • distinct
    │ distinct on: tableID
    │ order key: tableID
    │
    └── • sort
        │ order: +"tableID",-"createdAt",-"rowCount"
        │ already ordered: +"tableID"
        │
        └── • scan
              table: table_statistics@primary
              spans: FULL SCAN

The `crdb_internal.table_row_statistics` table is used to populate the `estimated_row_count` column in the output of `SHOW TABLES`. In pathological cases where there are many rows in `system.table_statistics`, this new query makes `SHOW TABLES` significantly faster. In a test of mine with ~1.1 million rows in `system.table_statistics`, set up with [this gist](https://gist.github.com/mgartner/b72b39901be0d942d5a026054e688a8c), I observed the latency of `SHOW TABLES` drop by ~50% with the new query, from ~720ms to ~350ms.

Informs #143438

Release note (performance improvement): `SHOW TABLES` is now faster, especially in cases where there are many tables, both live and previously dropped.
1 parent 05d4ecf commit f1fccf9

1 file changed

+5
-9
lines changed

pkg/sql/crdb_internal.go

Lines changed: 5 additions & 9 deletions
@@ -768,15 +768,11 @@ CREATE TABLE crdb_internal.table_row_statistics (
 // contention on the stats table. We pass a nil transaction so that the AS
 // OF clause can be independent of any outer query.
 query := fmt.Sprintf(`
-SELECT s."tableID", max(s."rowCount")
-FROM system.table_statistics AS s
-JOIN (
-SELECT "tableID", max("createdAt") AS last_dt
-FROM system.table_statistics
-GROUP BY "tableID"
-) AS l ON l."tableID" = s."tableID" AND l.last_dt = s."createdAt"
-AS OF SYSTEM TIME '%s'
-GROUP BY s."tableID"`, statsAsOfTimeClusterMode.String(&p.ExecCfg().Settings.SV))
+SELECT DISTINCT ON ("tableID") "tableID", "rowCount"
+FROM system.table_statistics
+AS OF SYSTEM TIME '%s'
+ORDER BY "tableID", "createdAt" DESC, "rowCount" DESC`,
+statsAsOfTimeClusterMode.String(&p.ExecCfg().Settings.SV))
 statRows, err := p.ExtendedEvalContext().ExecCfg.InternalDB.Executor().QueryBufferedEx(
 ctx, "crdb-internal-statistics-table", nil,
 sessiondata.NodeUserSessionDataOverride,
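
To illustrate why the single-pass `DISTINCT ON` query returns the same rows as the old join, here is a minimal standalone Go sketch of the selection rule: for each table ID, keep the row with the latest `createdAt`, breaking ties by the largest `rowCount`. It is not code from this commit. The `tableStat` type and the `latestRowCounts` and `sampleStats` functions are hypothetical names invented for the example, and the struct models only the three columns the query reads.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// tableStat is a hypothetical, simplified stand-in for one row of
// system.table_statistics; only the columns the query reads are modeled.
type tableStat struct {
	TableID   int64
	CreatedAt time.Time
	RowCount  int64
}

// latestRowCounts mirrors the semantics of
//
//	SELECT DISTINCT ON ("tableID") "tableID", "rowCount"
//	ORDER BY "tableID", "createdAt" DESC, "rowCount" DESC
//
// by sorting once and keeping the first row seen for each table ID.
func latestRowCounts(stats []tableStat) map[int64]int64 {
	// Copy so the caller's slice is not reordered.
	sorted := append([]tableStat(nil), stats...)
	sort.Slice(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.TableID != b.TableID {
			return a.TableID < b.TableID
		}
		if !a.CreatedAt.Equal(b.CreatedAt) {
			return a.CreatedAt.After(b.CreatedAt) // newest first
		}
		return a.RowCount > b.RowCount // largest count first on ties
	})

	out := make(map[int64]int64)
	for _, s := range sorted {
		if _, seen := out[s.TableID]; !seen {
			out[s.TableID] = s.RowCount
		}
	}
	return out
}

// sampleStats returns made-up illustration data: table 1 has an older
// statistic and two statistics sharing the newest timestamp.
func sampleStats() []tableStat {
	t0 := time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC)
	return []tableStat{
		{TableID: 1, CreatedAt: t0, RowCount: 10},
		{TableID: 1, CreatedAt: t0.Add(time.Hour), RowCount: 20},
		{TableID: 1, CreatedAt: t0.Add(time.Hour), RowCount: 30},
		{TableID: 2, CreatedAt: t0, RowCount: 5},
	}
}

func main() {
	counts := latestRowCounts(sampleStats())
	fmt.Println(counts[1], counts[2]) // prints: 30 5
}
```

The old query reached the same result in two steps: compute `max("createdAt")` per table, join back to find the rows at that timestamp, then take `max("rowCount")` among them. Sorting once and taking the first row per table folds both steps into a single scan, which is what the new plan shows.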
