Improve Analyze Performance and Stability #41930

@xuyifangreeneyes

Description

Enhancement

Currently, when we use the analyze command to collect statistics, we run into several problems, especially for large tables:

  • Analyze is slow. Since analyze needs to scan the full table, an analyze job on a large table may take hours or even days to finish.
  • Analyze may consume a lot of resources. Some users increase concurrency (for example, tidb_build_stats_concurrency and tidb_distsql_scan_concurrency) to speed up analyze. However, that may cost TiKV a lot of CPU/memory/IO (when scanning the table and sampling) and TiDB a lot of CPU/memory (when merging samples and building stats).
  • When a table has many columns, or some columns are large (such as text/blob/json columns), the samples may take up a lot of memory. When merging samples and building stats, TiDB may OOM, or the analyze job may be killed by the global memory control mechanism. We could consider skipping statistics collection for columns whose stats are barely used, such as JSON columns.
  • The execution of analyze is not fault-tolerant. If one analyze request to a region fails (perhaps because the region is unavailable), the whole analyze job fails and has to be rerun from the very beginning, which is unfriendly to users.

Here is the related issue in the tikv repo:
tikv/tikv#14231

Tasks

  • Use faster murmur3 hash function for FMSketch calculation
  • Reduce encoding cost
  • Avoid FMSketch calculation for single-column index
  • Sample-based NDV calculation
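To make the FMSketch and sample-based NDV tasks above concrete, here is a minimal Python sketch of both techniques. This is not TiDB's actual Go implementation: the hash is a SHA-1-derived 64-bit stand-in (murmur3 is not in the Python standard library), the doubling-mask FMSketch mirrors the standard approach of keeping only hashes whose masked low bits are zero, and the sample-based estimator shown is the classic GEE estimator, used here purely as an illustration. All names are hypothetical.

```python
import hashlib
from collections import Counter

def h64(value):
    # Stand-in 64-bit hash; TiDB uses murmur3, which is not in the
    # Python standard library, so we take 8 bytes of SHA-1 instead.
    digest = hashlib.sha1(str(value).encode()).digest()
    return int.from_bytes(digest[:8], "little")

class FMSketch:
    """FM-style distinct counter: keep only hashes whose low bits under
    `mask` are all zero; double the mask whenever the set overflows."""
    def __init__(self, max_size):
        self.mask = 0
        self.max_size = max_size
        self.hashset = set()

    def insert_value(self, value):
        h = h64(value)
        if h & self.mask != 0:
            return  # filtered out at the current precision
        self.hashset.add(h)
        while len(self.hashset) > self.max_size:
            # Halve the retained fraction and re-filter existing hashes.
            self.mask = self.mask * 2 + 1
            self.hashset = {x for x in self.hashset if x & self.mask == 0}

    def ndv(self):
        # Each retained hash stands for (mask + 1) distinct values on average.
        return (self.mask + 1) * len(self.hashset)

def gee_ndv(sample, total_rows):
    """Sample-based NDV via the GEE estimator:
    sqrt(N/n) * f1 + sum of f_j for j >= 2, where f_j is the number of
    values that appear exactly j times in the sample of size n."""
    counts = Counter(sample)          # value -> occurrences in the sample
    freq = Counter(counts.values())   # j -> number of values seen j times
    n = len(sample)
    f1 = freq.get(1, 0)
    rest = sum(c for j, c in freq.items() if j >= 2)
    return (total_rows / n) ** 0.5 * f1 + rest
```

With a sample-based estimator like this, analyze no longer needs a full-table scan to approximate NDV, which is the motivation behind the last task; the trade-off is estimation error that grows as the sampling rate shrinks.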
