| title | summary |
|---|---|
使用 EXPLAIN 解读执行计划 |
通过示例了解如何使用 EXPLAIN 分析执行计划。 |
SQL 是一种声明性语言,因此用户无法根据 SQL 语句直接判断一条查询的执行是否有效率。用户首先要使用 EXPLAIN 语句查看当前的执行计划。
以 bikeshare 数据库示例(英文) 中的一个 SQL 语句为例,该语句统计了 2017 年 7 月 1 日的行程次数:
{{< copyable "sql" >}}
EXPLAIN SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';+------------------------------+----------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+------------------------------+----------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------+
| StreamAgg_20 | 1.00 | root | | funcs:count(Column#13)->Column#11 |
| └─TableReader_21 | 1.00 | root | | data:StreamAgg_9 |
| └─StreamAgg_9 | 1.00 | cop[tikv] | | funcs:count(1)->Column#13 |
| └─Selection_19 | 250.00 | cop[tikv] | | ge(bikeshare.trips.start_date, 2017-07-01 00:00:00.000000), le(bikeshare.trips.start_date, 2017-07-01 23:59:59.000000) |
| └─TableFullScan_18 | 10000.00 | cop[tikv] | table:trips | keep order:false, stats:pseudo |
+------------------------------+----------+-----------+---------------+------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)以上是该查询的执行计划结果。从 └─TableFullScan_18 算子开始向上看,查询的执行过程如下(非最佳执行计划):
-
Coprocessor (TiKV) 读取整张
trips表的数据,作为一次TableFullScan操作,再将读取到的数据传递给Selection_19算子。Selection_19算子仍在 TiKV 内。 -
Selection_19算子根据谓词WHERE start_date BETWEEN ..进行数据过滤。预计大约有 250 行数据满足该过滤条件(基于统计信息以及算子的执行逻辑估算而来)。└─TableFullScan_18算子显示stats:pseudo,表示该表没有实际统计信息,执行ANALYZE TABLE trips收集统计信息后,预计的估算的数字会更加准确。 -
COUNT函数随后应用于满足过滤条件的行,这一过程也是在 TiKV (cop[tikv]) 中的StreamAgg_9算子内完成的。TiKV coprocessor 能执行一些 MySQL 内置函数,COUNT是其中之一。 -
StreamAgg_9算子执行的结果会被传递给TableReader_21算子(位于 TiDB 进程中,即root任务)。执行计划中,TableReader_21算子的estRows为1,表示该算子将从每个访问的 TiKV Region 接收一行数据。这一请求过程的详情,可参阅EXPLAIN ANALYZE。 -
StreamAgg_20算子随后对└─TableReader_21算子传来的每行数据计算COUNT函数的结果。StreamAgg_20是根算子,会将结果返回给客户端。
注意:
要查看 TiDB 中某张表的 Region 信息,可执行
SHOW TABLE REGIONS语句。
EXPLAIN 语句只返回查询的执行计划,并不执行该查询。若要获取实际的执行时间,可执行该查询,或使用 EXPLAIN ANALYZE 语句:
{{< copyable "sql" >}}
EXPLAIN ANALYZE SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';+------------------------------+----------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------+----------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
| StreamAgg_20 | 1.00 | 1 | root | | time:1.031417203s, loops:2 | funcs:count(Column#13)->Column#11 | 632 Bytes | N/A |
| └─TableReader_21 | 1.00 | 56 | root | | time:1.031408123s, loops:2, cop_task: {num: 56, max: 782.147269ms, min: 5.759953ms, avg: 252.005927ms, p95: 609.294603ms, max_proc_keys: 910371, p95_proc_keys: 704775, tot_proc: 11.524s, tot_wait: 580ms, rpc_num: 56, rpc_time: 14.111932641s} | data:StreamAgg_9 | 328 Bytes | N/A |
| └─StreamAgg_9 | 1.00 | 56 | cop[tikv] | | proc max:640ms, min:8ms, p80:276ms, p95:480ms, iters:18695, tasks:56 | funcs:count(1)->Column#13 | N/A | N/A |
| └─Selection_19 | 250.00 | 11409 | cop[tikv] | | proc max:640ms, min:8ms, p80:276ms, p95:476ms, iters:18695, tasks:56 | ge(bikeshare.trips.start_date, 2017-07-01 00:00:00.000000), le(bikeshare.trips.start_date, 2017-07-01 23:59:59.000000) | N/A | N/A |
| └─TableFullScan_18 | 10000.00 | 19117643 | cop[tikv] | table:trips | proc max:612ms, min:8ms, p80:248ms, p95:460ms, iters:18695, tasks:56 | keep order:false, stats:pseudo | N/A | N/A |
+------------------------------+----------+----------+-----------+---------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
5 rows in set (1.03 sec)执行以上示例查询耗时 1.03 秒,说明执行性能不太理想。
以上 EXPLAIN ANALYZE 的结果中,actRows 表明一些 estRows 预估数不准确(预估返回 10000 行数据但实际返回 19117643 行)。└─TableFullScan_18 算子的 operator info 列 (stats:pseudo) 信息也表明该算子的预估数不准确。
如果先执行 ANALYZE TABLE 再执行 EXPLAIN ANALYZE,预估数与实际数会更接近:
{{< copyable "sql" >}}
ANALYZE TABLE trips;
EXPLAIN ANALYZE SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';Query OK, 0 rows affected (10.22 sec)
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
| StreamAgg_20 | 1.00 | 1 | root | | time:926.393612ms, loops:2 | funcs:count(Column#13)->Column#11 | 632 Bytes | N/A |
| └─TableReader_21 | 1.00 | 56 | root | | time:926.384792ms, loops:2, cop_task: {num: 56, max: 850.94424ms, min: 6.042079ms, avg: 234.987725ms, p95: 495.474806ms, max_proc_keys: 910371, p95_proc_keys: 704775, tot_proc: 10.656s, tot_wait: 904ms, rpc_num: 56, rpc_time: 13.158911952s} | data:StreamAgg_9 | 328 Bytes | N/A |
| └─StreamAgg_9 | 1.00 | 56 | cop[tikv] | | proc max:592ms, min:4ms, p80:244ms, p95:480ms, iters:18695, tasks:56 | funcs:count(1)->Column#13 | N/A | N/A |
| └─Selection_19 | 432.89 | 11409 | cop[tikv] | | proc max:592ms, min:4ms, p80:244ms, p95:480ms, iters:18695, tasks:56 | ge(bikeshare.trips.start_date, 2017-07-01 00:00:00.000000), le(bikeshare.trips.start_date, 2017-07-01 23:59:59.000000) | N/A | N/A |
| └─TableFullScan_18 | 19117643.00 | 19117643 | cop[tikv] | table:trips | proc max:564ms, min:4ms, p80:228ms, p95:456ms, iters:18695, tasks:56 | keep order:false | N/A | N/A |
+------------------------------+-------------+----------+-----------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------+-----------+------+
5 rows in set (0.93 sec)执行 ANALYZE TABLE 后,可以看到 └─TableFullScan_18 算子的预估行数是准确的,└─Selection_19 算子的预估行数也更接近实际行数。以上两个示例中的执行计划(即 TiDB 执行查询所使用的一组算子)未改变,但过时的统计信息常常导致 TiDB 选择到非最优的执行计划。
除 ANALYZE TABLE 外,达到 tidb_auto_analyze_ratio 阈值后,TiDB 会自动在后台重新生成统计数据。若要查看 TiDB 有多接近该阈值(即 TiDB 判断统计数据有多健康),可执行 SHOW STATS_HEALTHY 语句。
{{< copyable "sql" >}}
SHOW STATS_HEALTHY;+-----------+------------+----------------+---------+
| Db_name | Table_name | Partition_name | Healthy |
+-----------+------------+----------------+---------+
| bikeshare | trips | | 100 |
+-----------+------------+----------------+---------+
1 row in set (0.00 sec)当前执行计划是有效率的:
-
大部分任务是在 TiKV 内处理的,需要通过网络传输给 TiDB 处理的仅有 56 行数据,每行都满足过滤条件,而且都很短。
-
在 TiDB (
StreamAgg_20) 中和在 TiKV (└─StreamAgg_9) 中汇总行数都使用了 Stream Aggregate,该算法在内存使用方面很有效率。
当前执行计划存在的最大问题在于谓词 start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59' 并未立即生效,先是 TableFullScan 算子读取所有行数据,然后才进行过滤选择。可以在 SHOW CREATE TABLE trips 的返回结果中找出问题原因:
{{< copyable "sql" >}}
SHOW CREATE TABLE trips\G*************************** 1. row ***************************
Table: trips
Create Table: CREATE TABLE `trips` (
`trip_id` bigint NOT NULL AUTO_INCREMENT,
`duration` int NOT NULL,
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
`start_station_number` int DEFAULT NULL,
`start_station` varchar(255) DEFAULT NULL,
`end_station_number` int DEFAULT NULL,
`end_station` varchar(255) DEFAULT NULL,
`bike_number` varchar(255) DEFAULT NULL,
`member_type` varchar(255) DEFAULT NULL,
PRIMARY KEY (`trip_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin AUTO_INCREMENT=20477318
1 row in set (0.00 sec)以上返回结果显示,start_date 列没有索引。要将该谓词下推到 index reader 算子,还需要一个索引。添加索引如下:
{{< copyable "sql" >}}
ALTER TABLE trips ADD INDEX (start_date);Query OK, 0 rows affected (2 min 10.23 sec)注意:
你可通过执行
ADMIN SHOW DDL JOBS语句来查看 DDL 任务的进度。TiDB 中的默认值的设置较为保守,因此添加索引不会对生产环境下的负载造成太大影响。测试环境下,可以考虑调大tidb_ddl_reorg_batch_size和tidb_ddl_reorg_worker_cnt的值。在参照系统上,将批处理大小设为10240,将 worker count 并发度设置为32,该系统可获得 10 倍的性能提升(较之使用默认值)。
添加索引后,可以使用 EXPLAIN 重复该查询。在以下返回结果中,可见 TiDB 选择了新的执行计划,而且不再使用 TableFullScan 和 Selection 算子。
{{< copyable "sql" >}}
EXPLAIN SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';+-----------------------------+---------+-----------+-------------------------------------------+-------------------------------------------------------------------+
| id | estRows | task | access object | operator info |
+-----------------------------+---------+-----------+-------------------------------------------+-------------------------------------------------------------------+
| StreamAgg_17 | 1.00 | root | | funcs:count(Column#13)->Column#11 |
| └─IndexReader_18 | 1.00 | root | | index:StreamAgg_9 |
| └─StreamAgg_9 | 1.00 | cop[tikv] | | funcs:count(1)->Column#13 |
| └─IndexRangeScan_16 | 8471.88 | cop[tikv] | table:trips, index:start_date(start_date) | range:[2017-07-01 00:00:00,2017-07-01 23:59:59], keep order:false |
+-----------------------------+---------+-----------+-------------------------------------------+-------------------------------------------------------------------+
4 rows in set (0.00 sec)若要比较实际的执行时间,可再次使用 EXPLAIN ANALYZE 语句:
{{< copyable "sql" >}}
EXPLAIN ANALYZE SELECT count(*) FROM trips WHERE start_date BETWEEN '2017-07-01 00:00:00' AND '2017-07-01 23:59:59';+-----------------------------+---------+---------+-----------+-------------------------------------------+------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------+-----------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+-----------------------------+---------+---------+-----------+-------------------------------------------+------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------+-----------+------+
| StreamAgg_17 | 1.00 | 1 | root | | time:4.516728ms, loops:2 | funcs:count(Column#13)->Column#11 | 372 Bytes | N/A |
| └─IndexReader_18 | 1.00 | 1 | root | | time:4.514278ms, loops:2, cop_task: {num: 1, max:4.462288ms, proc_keys: 11409, rpc_num: 1, rpc_time: 4.457148ms} | index:StreamAgg_9 | 238 Bytes | N/A |
| └─StreamAgg_9 | 1.00 | 1 | cop[tikv] | | time:4ms, loops:12 | funcs:count(1)->Column#13 | N/A | N/A |
| └─IndexRangeScan_16 | 8471.88 | 11409 | cop[tikv] | table:trips, index:start_date(start_date) | time:4ms, loops:12 | range:[2017-07-01 00:00:00,2017-07-01 23:59:59], keep order:false | N/A | N/A |
+-----------------------------+---------+---------+-----------+-------------------------------------------+------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------+-----------+------+
4 rows in set (0.00 sec)从以上结果可看出,查询时间已从 1.03 秒减少到 0.0 秒。
注意:
以上示例另一个可用的优化方案是 coprocessor cache。如果你无法添加索引,可考虑开启 coprocessor cache 功能。开启后,只要算子上次执行以来 Region 未作更改,TiKV 将从缓存中返回值。这也有助于减少
TableFullScan和Selection算子的大部分运算成本。
在查询优化过程中,TiDB 会提前执行可以在优化阶段直接计算的子查询。例如:
CREATE TABLE t1(a int);
INSERT INTO t1 VALUES(1);
CREATE TABLE t2(a int);
EXPLAIN SELECT * FROM t2 WHERE a = (SELECT a FROM t1);+--------------------------+----------+-----------+---------------+--------------------------------+
| id | estRows | task | access object | operator info |
+--------------------------+----------+-----------+---------------+--------------------------------+
| TableReader_14 | 10.00 | root | | data:Selection_13 |
| └─Selection_13 | 10.00 | cop[tikv] | | eq(test.t2.a, 1) |
| └─TableFullScan_12 | 10000.00 | cop[tikv] | table:t2 | keep order:false, stats:pseudo |
+--------------------------+----------+-----------+---------------+--------------------------------+
3 rows in set (0.00 sec)在上述例子中 a = (SELECT a FROM t1) 子查询在优化阶段就进行了计算,表达式被改写为 t2.a=1。这种执行方式可以在优化阶段进行更多的常量传播和常量折叠优化,但是会影响 EXPLAIN 语句的执行时间。当子查询本身耗时较长时,EXPLAIN 语句无法执行完成,可能会影响线上问题的排查。
从 v7.3.0 开始,TiDB 引入 tidb_opt_enable_non_eval_scalar_subquery 系统变量,可以控制这类子查询在 EXPLAIN 语句中是否禁止提前执行计算展开。该变量默认值为 OFF,即提前计算子查询。你可以将该变量设置为 ON 来禁止子查询提前执行:
SET @@tidb_opt_enable_non_eval_scalar_subquery = ON;
EXPLAIN SELECT * FROM t2 WHERE a = (SELECT a FROM t1);+---------------------------+----------+-----------+---------------+---------------------------------+
| id | estRows | task | access object | operator info |
+---------------------------+----------+-----------+---------------+---------------------------------+
| Selection_13 | 8000.00 | root | | eq(test.t2.a, ScalarQueryCol#5) |
| └─TableReader_15 | 10000.00 | root | | data:TableFullScan_14 |
| └─TableFullScan_14 | 10000.00 | cop[tikv] | table:t2 | keep order:false, stats:pseudo |
| ScalarSubQuery_10 | N/A | root | | Output: ScalarQueryCol#5 |
| └─MaxOneRow_6 | 1.00 | root | | |
| └─TableReader_9 | 1.00 | root | | data:TableFullScan_8 |
| └─TableFullScan_8 | 1.00 | cop[tikv] | table:t1 | keep order:false, stats:pseudo |
+---------------------------+----------+-----------+---------------+---------------------------------+
7 rows in set (0.00 sec)可以看到,标量子查询在执行阶段并没有被展开,这样更便于理解该类 SQL 具体的执行过程。
注意:
tidb_opt_enable_non_eval_scalar_subquery目前仅控制EXPLAIN语句的行为,EXPLAIN ANALYZE语句仍然会将子查询提前展开。