## Versions
- [x] dev
- [x] 3.0
- [x] 2.1
- [ ] 2.0
## Languages
- [x] Chinese
- [x] English
## Docs Checklist
- [x] Checked by AI
- [ ] Test Cases Built
docs/table-design/row-store.md
+55 -20 (55 additions & 20 deletions)
@@ -35,39 +35,74 @@ When creating a table, specify whether to enable row storage, which columns to e
 A page is the smallest unit for storage read and write operations, and `page_size` refers to the size of a row-store page. This means that reading a single row requires generating a page IO. The larger this value is, the better the compression effect and the lower the storage space usage. However, the IO overhead during point queries increases, resulting in lower performance (because each IO operation reads at least one page). Conversely, the smaller the value, the higher the storage space usage and the better the performance for point queries. The default value of 16KB is a balanced choice in most cases. If you prioritize query performance, you can configure a smaller value, such as 4KB or even lower. If you prioritize storage space, you can configure a larger value, such as 64KB or even higher.
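As a rough illustration of the storage-oriented end of this trade-off, the sketch below sets a larger row-store page. The table name `tbl_storage_first` and its columns are hypothetical and not part of this change. The value is assumed to be given in bytes, so 64KB is written as `65536`, by analogy with `4096` for 4KB in the example in this document.

```sql
-- Hypothetical table that favors storage space over point query latency.
-- A 64KB page compresses better, but each point read fetches a larger page.
CREATE TABLE `tbl_storage_first` (
    `k` int(11) NULL,
    `v1` varchar(30) NULL
) ENGINE=OLAP
UNIQUE KEY(`k`)
DISTRIBUTED BY HASH(`k`) BUCKETS 1
PROPERTIES (
    "enable_unique_key_merge_on_write" = "true",
    "light_schema_change" = "true",
    "store_row_column" = "true",
    "row_store_page_size" = "65536"
);
```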
-## Example
+## Row Store Hit Conditions

-The example below creates an 8-column table, where "key,v1,v3,v5,v7" are the 5 columns enabled for row storage. To optimize for high-concurrency point query performance, the page_size is configured to 4KB.
+A query can hit the row store in two scenarios: high-concurrency primary key point queries, which depend on table properties and must satisfy the point query conditions, and single-table `SELECT *` queries. Both are explained below.
-```
+- For high-concurrency primary key point queries, the table must set `"enable_unique_key_merge_on_write" = "true"` (MOW table) together with either `"store_row_column" = "true"` (all columns are stored separately in the row store, which incurs relatively high storage costs) or `"row_store_columns" = "k,v1,v3,v5,v7"` (only the specified columns are stored in the row store). The `WHERE` clause must include equality conditions on all primary key columns, connected by `AND`, e.g., `SELECT * FROM tbl WHERE k1 = 1 AND k2 = 2`, or `SELECT v1, v2 FROM tbl WHERE k1 = 1 AND k2 = 2` when querying specific columns. If the row store contains only some of the queried columns (e.g., v1 but not v2), the remaining columns are read from the column store: here v1 is read from the row store, while v2 is read from the column store (which has a larger page size, leading to more read amplification). Use `EXPLAIN` to confirm whether the high-concurrency primary key point query optimization is hit. For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query).
+
+- For general non-primary key point queries, the table must be a `DUPLICATE` table or an MOW table (`"enable_unique_key_merge_on_write" = "true"`), and in either case must set `"store_row_column" = "true"` (all columns are stored separately in the row store, which incurs relatively high storage costs). Queries of the form `SELECT * FROM tbl [WHERE XXXXX] ORDER BY XXX LIMIT N` can then hit the row store, where the part in square brackets is an optional query condition. Currently only `SELECT *` is supported, and the query must hit the TOPN delayed materialization optimization (`OPT TWO PHASE`); for details, refer to [TOPN Query Optimization](../query-acceleration/optimization-technology-principle/topn-optimization). Finally, use `EXPLAIN` to check for the `FETCH ROW STORE` marker to confirm the row store hit.
+
+
+## Usage Examples
+
+The following example creates a table with 8 columns, where the 5 columns `k, v1, v3, v5, v7` are enabled for row store, and the `page_size` is set to 4KB for high-concurrency point query performance.
+
+```
 CREATE TABLE `tbl_point_query` (
-`key` int(11) NULL,
-`v1` decimal(27, 9) NULL,
-`v2` varchar(30) NULL,
-`v3` varchar(30) NULL,
-`v4` date NULL,
-`v5` datetime NULL,
-`v6` float NULL,
-`v7` datev2 NULL
+`k` int(11) NULL,
+`v1` decimal(27, 9) NULL,
+`v2` varchar(30) NULL,
+`v3` varchar(30) NULL,
+`v4` date NULL,
+`v5` datetime NULL,
+`v6` float NULL,
+`v7` datev2 NULL
 ) ENGINE=OLAP
-UNIQUE KEY(`key`)
+UNIQUE KEY(`k`)
 COMMENT 'OLAP'
-DISTRIBUTED BY HASH(`key`) BUCKETS 1
+DISTRIBUTED BY HASH(`k`) BUCKETS 1
 PROPERTIES (
-"enable_unique_key_merge_on_write" = "true",
-"light_schema_change" = "true",
-"row_store_columns" = "key,v1,v3,v5,v7",
-"row_store_page_size" = "4096"
+"enable_unique_key_merge_on_write" = "true",
+"light_schema_change" = "true",
+"row_store_columns" = "k,v1,v3,v5,v7",
+"row_store_page_size" = "4096"
 );
 ```
-Query
+Query 1
+
+```
+SELECT k, v1, v3, v5, v7 FROM tbl_point_query WHERE k = 100
+```
+The `EXPLAIN` output for the above statement should include the `SHORT-CIRCUIT` marker. For more details on point query usage, refer to [High-Concurrency Point Query](../query-acceleration/high-concurrent-point-query).
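To check for the marker, the query can be prefixed with `EXPLAIN`, as sketched below. The second statement is an illustrative contrast that is not part of the original example: it also selects `v2`, which is not listed in `row_store_columns`, so under the hit conditions described earlier that column would be read from the column store. What the plan output shows for it is not asserted here.

```sql
-- The plan output is expected to include the SHORT-CIRCUIT marker.
EXPLAIN SELECT k, v1, v3, v5, v7 FROM tbl_point_query WHERE k = 100;

-- Illustrative contrast (assumption): v2 is not in row_store_columns,
-- so it would be fetched from the column store rather than the row store.
EXPLAIN SELECT v1, v2 FROM tbl_point_query WHERE k = 100;
```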
+
+The following example demonstrates how a `DUPLICATE` table can meet row store query conditions.
+
 ```
-SELECT key, v1, v3, v5, v7 FROM tbl_point_query WHERE key = 100;
+CREATE TABLE `tbl_duplicate` (
+`k` int(11) NULL,
+`v1` string NULL
+) ENGINE=OLAP
+DUPLICATE KEY(`k`)
+COMMENT 'OLAP'
+DISTRIBUTED BY HASH(`k`) BUCKETS 1
+PROPERTIES (
+"light_schema_change" = "true",
+"store_row_column" = "true",
+"row_store_page_size" = "4096"
+);
 ```
-For more information on point query usage, please refer to [High-Concurrent Point Query](../query-acceleration/high-concurrent-point-query).
+`"store_row_column" = "true"` is required.
+
+Query 2 (Note: It must hit TOPN query optimization and must be `SELECT *`)
+
+```
+SELECT * FROM tbl_duplicate WHERE k < 10 ORDER BY k LIMIT 10
+```

+The `EXPLAIN` output for the above statement should include the `FETCH ROW STORE` marker and the `OPT TWO PHASE` marker.
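A minimal way to verify this is to run the query under `EXPLAIN`, as sketched below. The second statement is an illustrative contrast that is not part of the original example: because it is not `SELECT *`, the conditions described earlier suggest it would not fetch from the row store.

```sql
-- The plan output is expected to include both FETCH ROW STORE and OPT TWO PHASE.
EXPLAIN SELECT * FROM tbl_duplicate WHERE k < 10 ORDER BY k LIMIT 10;

-- Illustrative contrast (assumption): an explicit column list instead of SELECT *,
-- so this query is not expected to show FETCH ROW STORE.
EXPLAIN SELECT k, v1 FROM tbl_duplicate WHERE k < 10 ORDER BY k LIMIT 10;
```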