Commit 65435c8

make row-store hit condition more clear (#2631)
## Versions
- [x] dev
- [x] 3.0
- [x] 2.1
- [ ] 2.0

## Languages
- [x] Chinese
- [x] English

## Docs Checklist
- [x] Checked by AI
- [ ] Test Cases Built
1 parent 26a1965 commit 65435c8

File tree

5 files changed

+190
-63
lines changed

docs/table-design/row-store.md

Lines changed: 55 additions & 20 deletions
Original file line number | Diff line number | Diff line change
@@ -35,39 +35,74 @@ When creating a table, specify whether to enable row storage, which columns to e
3535
A page is the smallest unit for storage read and write operations, and `page_size` is the size of a row-store page. Reading even a single row therefore incurs one page of IO. The larger the value, the better the compression and the lower the storage usage, but the higher the IO overhead and the lower the performance of point queries (because each IO reads at least one page). Conversely, the smaller the value, the higher the storage usage and the better the point query performance. The default of 16KB is a balanced choice in most cases. If you prioritize query performance, configure a smaller value such as 4KB or even lower. If you prioritize storage space, configure a larger value such as 64KB or even higher.
3636

3737

38-
## Example
38+
## Row Store Hit Conditions
3939

40-
The example below creates an 8-column table, where "key,v1,v3,v5,v7" are the 5 columns enabled for row storage. To optimize for high-concurrency point query performance, the page_size is configured to 4KB.
40+
A query can hit the row store in two scenarios: a high-concurrency primary key point query, which depends on the table properties and requires the query to satisfy the point query conditions, and a single-table `SELECT *` query. Both are explained below.
4141

42-
```
42+
- For high-concurrency primary key point queries, the table must be created with `"enable_unique_key_merge_on_write" = "true"` (MOW table) and either `"store_row_column" = "true"` (an extra copy of every column is stored in the row store, which has a relatively high storage cost) or `"row_store_columns" = "key,v1,v3,v5,v7"` (only the specified columns are stored in the row store). The `WHERE` clause must contain equality conditions on all primary key columns, joined by `AND`, for example `SELECT * FROM tbl WHERE k1 = 1 AND k2 = 2`, or `SELECT v1, v2 FROM tbl WHERE k1 = 1 AND k2 = 2` to query specific columns. If the row store contains only some columns (for example, v1) and a queried column (for example, v2) is not in the row store, the remaining columns are read from the column store. In this example, v1 is read from the row store and v2 from the column store (which has a larger page size and therefore more read amplification). Use `EXPLAIN` to confirm whether the high-concurrency primary key point query optimization is hit. For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query).
43+
44+
- For general non-primary-key point queries, using the row store requires that the table model be `DUPLICATE` or a MOW table (`"enable_unique_key_merge_on_write" = "true"`), and that `"store_row_column" = "true"` be set (an extra copy of every column is stored in the row store, which has a relatively high storage cost). Queries of the form `SELECT * FROM tbl [WHERE XXXXX] ORDER BY XXX LIMIT N` can then hit the row store, where the part in square brackets is an optional query condition. Currently only `SELECT *` is supported, and the query must hit the TOPN delayed materialization optimization, that is, `OPT TWO PHASE`. For details, refer to [TOPN Query Optimization](../query-acceleration/optimization-technology-principle/topn-optimization). Finally, use `EXPLAIN` and check for the `FETCH ROW STORE` marker to confirm the row store hit.
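
As a minimal sketch of the first scenario, the statements below assume a hypothetical MOW table `tbl` with a composite primary key (`k1`, `k2`) whose row store holds only `k1`, `k2`, and `v1`. The table name, column types, and the choice of row-store columns are illustrative assumptions; only the properties and query shapes come from the text above.

```sql
-- Hypothetical table: composite primary key (k1, k2), row store limited to k1, k2, v1.
CREATE TABLE `tbl` (
    `k1` int NULL,
    `k2` int NULL,
    `v1` varchar(30) NULL,
    `v2` varchar(30) NULL
) ENGINE=OLAP
UNIQUE KEY(`k1`, `k2`)
DISTRIBUTED BY HASH(`k1`) BUCKETS 1
PROPERTIES (
    "enable_unique_key_merge_on_write" = "true",
    "light_schema_change" = "true",
    "row_store_columns" = "k1,k2,v1"
);

-- Equality on every key column, joined by AND: the point query pattern described above.
EXPLAIN SELECT v1 FROM tbl WHERE k1 = 1 AND k2 = 2;

-- Same predicate, but v2 is not in the row store, so per the text above
-- v2 is read from the column store.
EXPLAIN SELECT v1, v2 FROM tbl WHERE k1 = 1 AND k2 = 2;

-- Only one of the two key columns is constrained, so this does not meet
-- the "all primary keys with equality conditions" requirement.
EXPLAIN SELECT v1 FROM tbl WHERE k1 = 1;
```

Comparing the three plans side by side is one way to see which query shapes qualify before relying on the optimization in production.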
45+
46+
47+
## Usage Examples
48+
49+
The following example creates a table with 8 columns, enables row store for the 5 columns `k, v1, v3, v5, v7`, and sets `page_size` to 4KB to favor high-concurrency point query performance.
50+
51+
```
4352
CREATE TABLE `tbl_point_query` (
44-
`key` int(11) NULL,
45-
`v1` decimal(27, 9) NULL,
46-
`v2` varchar(30) NULL,
47-
`v3` varchar(30) NULL,
48-
`v4` date NULL,
49-
`v5` datetime NULL,
50-
`v6` float NULL,
51-
`v7` datev2 NULL
53+
`k` int(11) NULL,
54+
`v1` decimal(27, 9) NULL,
55+
`v2` varchar(30) NULL,
56+
`v3` varchar(30) NULL,
57+
`v4` date NULL,
58+
`v5` datetime NULL,
59+
`v6` float NULL,
60+
`v7` datev2 NULL
5261
) ENGINE=OLAP
53-
UNIQUE KEY(`key`)
62+
UNIQUE KEY(`k`)
5463
COMMENT 'OLAP'
55-
DISTRIBUTED BY HASH(`key`) BUCKETS 1
64+
DISTRIBUTED BY HASH(`k`) BUCKETS 1
5665
PROPERTIES (
57-
"enable_unique_key_merge_on_write" = "true",
58-
"light_schema_change" = "true",
59-
"row_store_columns" = "key,v1,v3,v5,v7",
60-
"row_store_page_size" = "4096"
66+
"enable_unique_key_merge_on_write" = "true",
67+
"light_schema_change" = "true",
68+
"row_store_columns" = "k,v1,v3,v5,v7",
69+
"row_store_page_size" = "4096"
6170
);
6271
```
6372

64-
Query
73+
Query 1
74+
75+
```
76+
SELECT k, v1, v3, v5, v7 FROM tbl_point_query WHERE k = 100
77+
```
78+
The `EXPLAIN` output for the above statement should include the `SHORT-CIRCUIT` marker. For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query).
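
A short sketch of that check, using the `tbl_point_query` table defined above: prefix Query 1 with `EXPLAIN` and look for `SHORT-CIRCUIT` in the plan text. The second statement is an assumed counter-example added for contrast; because it uses a range predicate rather than an equality condition on the key, it is not expected to qualify as a point query.

```sql
-- Query 1 with EXPLAIN: the plan should include the SHORT-CIRCUIT marker.
EXPLAIN SELECT k, v1, v3, v5, v7 FROM tbl_point_query WHERE k = 100;

-- Counter-example (assumption for illustration): a range predicate on the key
-- is not an equality condition, so the point query conditions are not met.
EXPLAIN SELECT k, v1, v3, v5, v7 FROM tbl_point_query WHERE k > 100;
```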
79+
80+
The following example demonstrates how a `DUPLICATE` table can meet row store query conditions.
81+
6582
```
66-
SELECT key, v1, v3, v5, v7 FROM tbl_point_query WHERE key = 100;
83+
CREATE TABLE `tbl_duplicate` (
84+
`k` int(11) NULL,
85+
`v1` string NULL
86+
) ENGINE=OLAP
87+
DUPLICATE KEY(`k`)
88+
COMMENT 'OLAP'
89+
DISTRIBUTED BY HASH(`k`) BUCKETS 1
90+
PROPERTIES (
91+
"light_schema_change" = "true",
92+
"store_row_column" = "true",
93+
"row_store_page_size" = "4096"
94+
);
6795
```
6896

69-
For more information on point query usage, please refer to [High-Concurrent Point Query](../query-acceleration/high-concurrent-point-query).
97+
For a `DUPLICATE` table to hit the row store, `"store_row_column" = "true"` is required.
98+
99+
Query 2 (Note: It must hit TOPN query optimization and must be `SELECT *`)
100+
101+
```
102+
SELECT * FROM tbl_duplicate WHERE k < 10 ORDER BY k LIMIT 10
103+
```
70104

105+
The `EXPLAIN` output for the above statement should include the `FETCH ROW STORE` marker and the `OPT TWO PHASE` marker.
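
The sketch below shows how this might be verified against the `tbl_duplicate` table defined above. The first two statements follow the `SELECT * ... ORDER BY ... LIMIT N` pattern (the `WHERE` clause is optional). The third is an assumed counter-example: since only `SELECT *` is currently supported, an explicit column list is not expected to show `FETCH ROW STORE`.

```sql
-- Query 2 with EXPLAIN: look for both FETCH ROW STORE and OPT TWO PHASE in the plan.
EXPLAIN SELECT * FROM tbl_duplicate WHERE k < 10 ORDER BY k LIMIT 10;

-- The WHERE clause is optional in the supported pattern.
EXPLAIN SELECT * FROM tbl_duplicate ORDER BY k LIMIT 10;

-- Counter-example (assumption for illustration): an explicit column list
-- instead of SELECT * falls outside the supported pattern.
EXPLAIN SELECT k, v1 FROM tbl_duplicate WHERE k < 10 ORDER BY k LIMIT 10;
```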
71106

72107
## Notice
73108

i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/row-store.md

Lines changed: 40 additions & 8 deletions
@@ -22,7 +22,7 @@ Doris uses columnar storage by default, with each column stored contiguously. In analytical scenarios (such as
2222
"store_row_column" = "true"
2323
```
2424

25-
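2. Which columns have row store enabled: if `"store_row_column" = "true"`, row store is enabled for all columns by default. To enable it for only some columns, set the `row_store_columns` parameter (versions after 3.0) to a comma-separated list of column names
25+
2. Which columns have row store enabled: if `"store_row_column" = "true"`, row store is enabled for all columns by default. To enable it for only some columns, set the `row_store_columns` parameter (versions after 3.0) to a comma-separated list of column names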
2626
```
2727
"row_store_columns" = "column1,column2,column3"
2828
```
@@ -34,14 +34,21 @@ Doris uses columnar storage by default, with each column stored contiguously. In analytical scenarios (such as
3434

3535
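A page is the smallest unit for storage read and write operations, and page_size is the size of a row-store page, which means that reading even a single row incurs one page of IO. The larger the value, the better the compression and the lower the storage usage, but the higher the IO overhead and the lower the performance of point queries (because each IO reads at least one page). Conversely, the smaller the value, the higher the storage usage and the better the point query performance. The default of 16KB is a balanced choice in most cases. If you prioritize query performance, configure a smaller value such as 4KB or even lower. If you prioritize storage space, configure a larger value such as 64KB or even higher.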
3636

37+
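## Row Store Hit Conditions
38+
A query can hit the row store in two scenarios: a high-concurrency primary key point query, which depends on the table properties and requires the query to satisfy the point query conditions, and a single-table SELECT * query. Both are explained below.
39+
40+
- For high-concurrency primary key point queries, the table must be created with `"enable_unique_key_merge_on_write" = "true"` (MOW table) and either `"store_row_column" = "true"` (an extra copy of every column is stored in the row store, which has a relatively high storage cost) or `"row_store_columns" = "key,v1,v3,v5,v7"` (only the specified columns are stored in the row store). The WHERE clause must contain equality conditions on all primary key columns, joined by AND, for example `SELECT * FROM tbl WHERE k1 = 1 AND k2 = 2`, or `SELECT v1, v2 FROM tbl WHERE k1 = 1 AND k2 = 2` to query specific columns. If the row store contains only some columns (for example, v1) and a queried column (for example, v2) is not in the row store, the remaining columns are read from the column store. In this example, v1 is read from the row store and v2 from the column store (which has a larger page size and therefore more read amplification). Use EXPLAIN to confirm whether the high-concurrency primary key point query optimization is hit. For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query)
41+
42+
43+
- For general non-primary-key point queries, using the row store requires that the table model be DUPLICATE or a MOW table (`"enable_unique_key_merge_on_write" = "true"`), and that `"store_row_column" = "true"` be set (an extra copy of every column is stored in the row store, which has a relatively high storage cost). Queries of the form `SELECT * FROM tbl [WHERE XXXXX] ORDER BY XXX LIMIT N` can then hit the row store, where the part in square brackets is an optional query condition. Currently only `SELECT *` is supported, and the query must hit the TOPN delayed materialization optimization, that is, `OPT TWO PHASE`. For details, refer to [TOPN Query Optimization](../query-acceleration/optimization-technology-principle/topn-optimization). Finally, use EXPLAIN and check for the `FETCH ROW STORE` marker to confirm the row store hit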
3744

3845
## Usage Examples
3946

4047
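The following example creates a table with 8 columns, enables row store for the 5 columns "k,v1,v3,v5,v7", and sets page_size to 4KB for high-concurrency point query performance.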
4148

4249
```
4350
CREATE TABLE `tbl_point_query` (
44-
`key` int(11) NULL,
51+
`k` int(11) NULL,
4552
`v1` decimal(27, 9) NULL,
4653
`v2` varchar(30) NULL,
4754
`v3` varchar(30) NULL,
@@ -50,23 +57,48 @@ CREATE TABLE `tbl_point_query` (
5057
`v6` float NULL,
5158
`v7` datev2 NULL
5259
) ENGINE=OLAP
53-
UNIQUE KEY(`key`)
60+
UNIQUE KEY(`k`)
5461
COMMENT 'OLAP'
55-
DISTRIBUTED BY HASH(`key`) BUCKETS 1
62+
DISTRIBUTED BY HASH(`k`) BUCKETS 1
5663
PROPERTIES (
5764
"enable_unique_key_merge_on_write" = "true",
5865
"light_schema_change" = "true",
59-
"row_store_columns" = "key,v1,v3,v5,v7",
66+
"row_store_columns" = "k,v1,v3,v5,v7",
6067
"row_store_page_size" = "4096"
6168
);
6269
```
6370

64-
Query
71+
Query 1
72+
6573
```
66-
SELECT key, v1, v3, v5, v7 FROM tbl_point_query WHERE key = 100
74+
SELECT k, v1, v3, v5, v7 FROM tbl_point_query WHERE k = 100
6775
```
76+
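The `EXPLAIN` output for the above statement should include the `SHORT-CIRCUIT` marker. For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query)
77+
78+
The following example shows how a DUPLICATE table can meet the row store query conditions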
6879

69-
For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query)
80+
```
81+
CREATE TABLE `tbl_duplicate` (
82+
`k` int(11) NULL,
83+
`v1` string NULL
84+
) ENGINE=OLAP
85+
DUPLICATE KEY(`k`)
86+
COMMENT 'OLAP'
87+
DISTRIBUTED BY HASH(`k`) BUCKETS 1
88+
PROPERTIES (
89+
"light_schema_change" = "true",
90+
"store_row_column" = "true",
91+
"row_store_page_size" = "4096"
92+
);
93+
```
94+
`"store_row_column" = "true"` is required
95+
96+
Query 2 (note: it must hit the TOPN query optimization and must be `SELECT *`)
97+
98+
```
99+
SELECT * FROM tbl_duplicate WHERE k < 10 ORDER BY k LIMIT 10
100+
```
101+
The `EXPLAIN` output for the above statement should include the `FETCH ROW STORE` marker and the `OPT TWO PHASE` marker
70102

71103

72104
## Notice

i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/row-store.md

Lines changed: 3 additions & 8 deletions
@@ -22,12 +22,7 @@ Doris uses columnar storage by default, with each column stored contiguously. In analytical scenarios (such as
2222
"store_row_column" = "true"
2323
```
2424

25-
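2. Which columns have row store enabled: if `"store_row_column" = "true"`, row store is enabled for all columns by default. To enable it for only some columns, set the `row_store_columns` parameter (versions after 3.0) to a comma-separated list of column names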
26-
```
27-
"row_store_columns" = "column1,column2,column3"
28-
```
29-
30-
3. Row store page_size: defaults to 16KB.
25+
2. Row store page_size: defaults to 16KB.
3126
```
3227
"row_store_page_size" = "16384"
3328
```
@@ -37,7 +32,7 @@ A page is the smallest unit for storage read and write operations, and page_size is the size of a row-store page, which
3732

3833
## Usage Examples
3934

40-
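The following example creates a table with 8 columns, enables row store for the 5 columns "key,v1,v3,v5,v7", and sets page_size to 4KB for high-concurrency point query performance.
35+
The following example creates a table with 8 columns and sets page_size to 4KB for high-concurrency point query performance.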
4136

4237
```
4338
CREATE TABLE `tbl_point_query` (
@@ -56,7 +51,7 @@ DISTRIBUTED BY HASH(`key`) BUCKETS 1
5651
PROPERTIES (
5752
"enable_unique_key_merge_on_write" = "true",
5853
"light_schema_change" = "true",
59-
"row_store_columns" = "key,v1,v3,v5,v7",
54+
"store_row_column" = "true",
6055
"row_store_page_size" = "4096"
6156
);
6257
```

i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/row-store.md

Lines changed: 38 additions & 7 deletions
@@ -34,14 +34,21 @@ Doris uses columnar storage by default, with each column stored contiguously. In analytical scenarios (such as
3434

3535
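A page is the smallest unit for storage read and write operations, and page_size is the size of a row-store page, which means that reading even a single row incurs one page of IO. The larger the value, the better the compression and the lower the storage usage, but the higher the IO overhead and the lower the performance of point queries (because each IO reads at least one page). Conversely, the smaller the value, the higher the storage usage and the better the point query performance. The default of 16KB is a balanced choice in most cases. If you prioritize query performance, configure a smaller value such as 4KB or even lower. If you prioritize storage space, configure a larger value such as 64KB or even higher.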
3636

37+
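## Row Store Hit Conditions
38+
A query can hit the row store in two scenarios: a high-concurrency primary key point query, which depends on the table properties and requires the query to satisfy the point query conditions, and a single-table SELECT * query. Both are explained below.
39+
40+
- For high-concurrency primary key point queries, the table must be created with `"enable_unique_key_merge_on_write" = "true"` (MOW table) and either `"store_row_column" = "true"` (an extra copy of every column is stored in the row store, which has a relatively high storage cost) or `"row_store_columns" = "key,v1,v3,v5,v7"` (only the specified columns are stored in the row store). The WHERE clause must contain equality conditions on all primary key columns, joined by AND, for example `SELECT * FROM tbl WHERE k1 = 1 AND k2 = 2`, or `SELECT v1, v2 FROM tbl WHERE k1 = 1 AND k2 = 2` to query specific columns. If the row store contains only some columns (for example, v1) and a queried column (for example, v2) is not in the row store, the remaining columns are read from the column store. In this example, v1 is read from the row store and v2 from the column store (which has a larger page size and therefore more read amplification). Use EXPLAIN to confirm whether the high-concurrency primary key point query optimization is hit. For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query)
41+
42+
43+
- For general non-primary-key point queries, using the row store requires that the table model be DUPLICATE or a MOW table (`"enable_unique_key_merge_on_write" = "true"`), and that `"store_row_column" = "true"` be set (an extra copy of every column is stored in the row store, which has a relatively high storage cost). Queries of the form `SELECT * FROM tbl [WHERE XXXXX] ORDER BY XXX LIMIT N` can then hit the row store, where the part in square brackets is an optional query condition. Currently only `SELECT *` is supported, and the query must hit the TOPN delayed materialization optimization, that is, `OPT TWO PHASE`. For details, refer to [TOPN Query Optimization](../query-acceleration/optimization-technology-principle/topn-optimization). Finally, use EXPLAIN and check for the `FETCH ROW STORE` marker to confirm the row store hit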
3744

3845
## Usage Examples
3946

4047
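The following example creates a table with 8 columns, enables row store for the 5 columns "k,v1,v3,v5,v7", and sets page_size to 4KB for high-concurrency point query performance.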
4148

4249
```
4350
CREATE TABLE `tbl_point_query` (
44-
`key` int(11) NULL,
51+
`k` int(11) NULL,
4552
`v1` decimal(27, 9) NULL,
4653
`v2` varchar(30) NULL,
4754
`v3` varchar(30) NULL,
@@ -50,24 +57,48 @@ CREATE TABLE `tbl_point_query` (
5057
`v6` float NULL,
5158
`v7` datev2 NULL
5259
) ENGINE=OLAP
53-
UNIQUE KEY(`key`)
60+
UNIQUE KEY(`k`)
5461
COMMENT 'OLAP'
55-
DISTRIBUTED BY HASH(`key`) BUCKETS 1
62+
DISTRIBUTED BY HASH(`k`) BUCKETS 1
5663
PROPERTIES (
5764
"enable_unique_key_merge_on_write" = "true",
5865
"light_schema_change" = "true",
59-
"row_store_columns" = "key,v1,v3,v5,v7",
66+
"row_store_columns" = "k,v1,v3,v5,v7",
6067
"row_store_page_size" = "4096"
6168
);
6269
```
6370

64-
Query
71+
Query 1
72+
73+
```
74+
SELECT k, v1, v3, v5, v7 FROM tbl_point_query WHERE k = 100
75+
```
76+
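The `EXPLAIN` output for the above statement should include the `SHORT-CIRCUIT` marker. For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query)
77+
78+
The following example shows how a DUPLICATE table can meet the row store query conditions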
79+
6580
```
66-
SELECT key, v1, v3, v5, v7 FROM tbl_point_query WHERE key = 100;
81+
CREATE TABLE `tbl_duplicate` (
82+
`k` int(11) NULL,
83+
`v1` string NULL
84+
) ENGINE=OLAP
85+
DUPLICATE KEY(`k`)
86+
COMMENT 'OLAP'
87+
DISTRIBUTED BY HASH(`k`) BUCKETS 1
88+
PROPERTIES (
89+
"light_schema_change" = "true",
90+
"store_row_column" = "true",
91+
"row_store_page_size" = "4096"
92+
);
6793
```
94+
`"store_row_column" = "true"` is required
6895

69-
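For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query)
96+
Query 2 (note: it must hit the TOPN query optimization and must be `SELECT *`)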
7097

98+
```
99+
SELECT * FROM tbl_duplicate WHERE k < 10 ORDER BY k LIMIT 10
100+
```
101+
The `EXPLAIN` output for the above statement should include the `FETCH ROW STORE` marker and the `OPT TWO PHASE` marker
71102

72103
## Notice
73104
