Commit e2af70b

Merge branch 'master' into data-token-test
2 parents 05f2ee9 + a7486b0 commit e2af70b

71 files changed, +2216 -581 lines changed

docs/content/append-table/query-performance.md

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ scenario. Using a bitmap may consume more space but can result in greater accura
 * `file-index.bloom-filter.<column_name>.items` to config the expected distinct items in one data file.
 
 `Bitmap`:
-* `file-index.bitmap.columns`: specify the columns that need bitmap index.
+* `file-index.bitmap.columns`: specify the columns that need bitmap index. See [Index Bitmap]({{< ref "concepts/spec/fileindex#index-bitmap" >}}).
 
 `Bit-Slice Index Bitmap`
 * `file-index.bsi.columns`: specify the columns that need bsi index.

docs/content/concepts/rest-catalog.md

Lines changed: 22 additions & 22 deletions
@@ -51,48 +51,48 @@ Paimon REST Catalog provides a lightweight implementation to access the catalog
 ```sql
 CREATE CATALOG `paimon-rest-catalog`
 WITH (
-'type' = 'paimon',
-'uri' = '<catalog server url>',
-'metastore' = 'rest',
-'token.provider' = 'bear'
-'token' = '<token>'
+'type' = 'paimon',
+'uri' = '<catalog server url>',
+'metastore' = 'rest',
+'token.provider' = 'bear'
+'token' = '<token>'
 );
 ```
 - DLF ak
 ```sql
 CREATE CATALOG `paimon-rest-catalog`
 WITH (
-'type' = 'paimon',
-'uri' = '<catalog server url>',
-'metastore' = 'rest',
-'token.provider' = 'dlf',
-'dlf.accessKeyId'='<accessKeyId>',
-'dlf.accessKeySecret'='<accessKeySecret>',
+'type' = 'paimon',
+'uri' = '<catalog server url>',
+'metastore' = 'rest',
+'token.provider' = 'dlf',
+'dlf.accessKeyId'='<accessKeyId>',
+'dlf.accessKeySecret'='<accessKeySecret>',
 );
 ```
 
 - DLF sts token
 ```sql
 CREATE CATALOG `paimon-rest-catalog`
 WITH (
-'type' = 'paimon',
-'uri' = '<catalog server url>',
-'metastore' = 'rest',
-'token.provider' = 'dlf',
-'dlf.accessKeyId'='<accessKeyId>',
-'dlf.accessKeySecret'='<accessKeySecret>',
-'dlf.securityToken'='<securityToken>'
+'type' = 'paimon',
+'uri' = '<catalog server url>',
+'metastore' = 'rest',
+'token.provider' = 'dlf',
+'dlf.accessKeyId'='<accessKeyId>',
+'dlf.accessKeySecret'='<accessKeySecret>',
+'dlf.securityToken'='<securityToken>'
 );
 ```
 
 - DLF sts token path
 ```sql
 CREATE CATALOG `paimon-rest-catalog`
 WITH (
-'type' = 'paimon',
-'uri' = '<catalog server url>',
-'metastore' = 'rest',
-'token.provider' = 'dlf'
+'type' = 'paimon',
+'uri' = '<catalog server url>',
+'metastore' = 'rest',
+'token.provider' = 'dlf'
 );
 ```

docs/content/concepts/spec/fileindex.md

Lines changed: 158 additions & 3 deletions
@@ -96,11 +96,76 @@ This class use (64-bits) long hash. Store the num hash function (one integer) an
 
 ## Index: Bitmap
 
-Define `'file-index.bitmap.columns'`.
+* `file-index.bitmap.columns`: specify the columns that need bitmap index.
+* `file-index.bitmap.version`: specify the bitmap index format version, default version is 1, latest version is 2.
+* `file-index.bitmap.<column_name>.index-block-size`: to config secondary index block size, default value is 16kb.
 
-Bitmap file index format (V1):
+
+Bitmap file index format (V2):
 
 <pre>
+
+Bitmap file index format (V2)
++-------------------------------------------------+-----------------
+| version (1 byte) = 2 |
++-------------------------------------------------+
+| row count (4 bytes int) |
++-------------------------------------------------+
+| non-null value bitmap number (4 bytes int) |
++-------------------------------------------------+
+| has null value (1 byte) |
++-------------------------------------------------+
+| null value offset (4 bytes if has null value) | HEAD
++-------------------------------------------------+
+| null bitmap length (4 bytes if has null value) |
++-------------------------------------------------+
+| bitmap index block number (4 bytes int) |
++-------------------------------------------------+
+| value 1 | offset 1 |
++-------------------------------------------------+
+| value 2 | offset 2 |
++-------------------------------------------------+
+| ... |
++-------------------------------------------------+
+| bitmap body offset (4 bytes int) |
++-------------------------------------------------+-----------------
+| bitmap index block 1 |
++-------------------------------------------------+
+| bitmap index block 2 | INDEX BLOCKS
++-------------------------------------------------+
+| ... |
++-------------------------------------------------+-----------------
+| serialized bitmap 1 |
++-------------------------------------------------+
+| serialized bitmap 2 |
++-------------------------------------------------+ BITMAP BLOCKS
+| serialized bitmap 3 |
++-------------------------------------------------+
+| ... |
++-------------------------------------------------+-----------------
+
+index block format:
++-------------------------------------------------+
+| entry number (4 bytes int) |
++-------------------------------------------------+
+| value 1 | offset 1 | length 1 |
++-------------------------------------------------+
+| value 2 | offset 2 | length 2 |
++-------------------------------------------------+
+| ... |
++-------------------------------------------------+
+
+value x: var bytes for any data type (as bitmap identifier)
+offset: 4 bytes int (when it is negative, it represents that there is only one value
+and its position is the inverse of the negative value)
+length: 4 bytes int
+
+</pre>
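
Reviewer note: the fixed-size fields at the start of the V2 HEAD in the diagram above can be decoded with a few big-endian unpacks. The following is a minimal Python sketch, not Paimon's actual (Java) reader. The function name `parse_bitmap_v2_head_prefix`, the returned dictionary keys, and the treatment of any non-zero `has null value` byte as true are assumptions made for illustration. The sketch stops before the `value | offset` entries, because their encoding depends on the column data type.

```python
import struct


def parse_bitmap_v2_head_prefix(buf: bytes) -> dict:
    """Decode the fixed-size fields at the start of a V2 bitmap index HEAD.

    Hypothetical helper for illustration only. Field order follows the
    diagram: version, row count, non-null value bitmap number, has null
    value, optional null offset/length, bitmap index block number.
    """
    pos = 0
    # ">" selects big-endian with no padding: 1 + 4 + 4 + 1 = 10 bytes.
    version, row_count, bitmap_count, has_null = struct.unpack_from(">BiiB", buf, pos)
    pos += struct.calcsize(">BiiB")
    if version != 2:
        raise ValueError(f"expected bitmap index version 2, got {version}")

    head = {
        "version": version,
        "row_count": row_count,
        "non_null_bitmap_count": bitmap_count,
        "has_null": bool(has_null),  # assumption: non-zero byte means true
    }

    # The two null-related fields are present only when has_null is set.
    if head["has_null"]:
        null_offset, null_length = struct.unpack_from(">ii", buf, pos)
        pos += struct.calcsize(">ii")
        head["null_value_offset"] = null_offset
        head["null_bitmap_length"] = null_length

    (block_count,) = struct.unpack_from(">i", buf, pos)
    pos += struct.calcsize(">i")
    head["index_block_count"] = block_count
    # Position of the first "value | offset" entry, which is not parsed here.
    head["entries_start"] = pos
    return head


# Usage with a synthetic buffer (the values are made up for the example).
sample = struct.pack(">BiiB", 2, 100, 3, 1) + struct.pack(">ii", 64, 16) + struct.pack(">i", 2)
head = parse_bitmap_v2_head_prefix(sample)
print(head)
```

Reading the optional fields conditionally mirrors the "4 bytes if has null value" annotations in the diagram. A real reader would go on to decode the typed values and the bitmap body offset.
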
+
+(Legacy) Bitmap file index format (V1):
+
+<pre>
+
 Bitmap file index format (V1)
 +-------------------------------------------------+-----------------
 | version (1 byte) |
@@ -135,7 +200,97 @@ offset: 4 bytes int (when it is negative, it represents t
 and its position is the inverse of the negative value)
 </pre>
 
-Integer are all BIG_ENDIAN.
+Integer are all BIG_ENDIAN. In the paimon version that supports v2, the bitmap index version defaults to v2.
+
+Bitmap only support the following data type:
+
+<table class="table table-bordered">
+<thead>
+<tr>
+<th class="text-left" style="width: 10%">Paimon Data Type</th>
+<th class="text-left" style="width: 5%">Supported</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td><code>TinyIntType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>SmallIntType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>IntType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>BigIntType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>DateType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>TimeType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>LocalZonedTimestampType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>TimestampType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>CharType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>VarCharType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>StringType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>BooleanType</code></td>
+<td>true</td>
+</tr>
+<tr>
+<td><code>DecimalType(precision, scale)</code></td>
+<td>false</td>
+</tr>
+<tr>
+<td><code>FloatType</code></td>
+<td>Not recommended</td>
+</tr>
+<tr>
+<td><code>DoubleType</code></td>
+<td>Not recommended</td>
+</tr>
+<tr>
+<td><code>VarBinaryType</code>, <code>BinaryType</code></td>
+<td>false</td>
+</tr>
+<tr>
+<td><code>RowType</code></td>
+<td>false</td>
+</tr>
+<tr>
+<td><code>MapType</code></td>
+<td>false</td>
+</tr>
+<tr>
+<td><code>ArrayType</code></td>
+<td>false</td>
+</tr>
+</tbody>
+</table>
+
 
 ## Index: Bit-Slice Index Bitmap

docs/content/primary-key-table/query-performance.md

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ Supported filter types:
 * `file-index.bloom-filter.<column_name>.items` to config the expected distinct items in one data file.
 
 `Bitmap`:
-* `file-index.bitmap.columns`: specify the columns that need bitmap index.
+* `file-index.bitmap.columns`: specify the columns that need bitmap index. See [Index Bitmap]({{< ref "concepts/spec/fileindex#index-bitmap" >}}).
 
 `Bit-Slice Index Bitmap`
 * `file-index.bsi.columns`: specify the columns that need bsi index.

docs/layouts/shortcodes/generated/kafka_sync_database.html

Lines changed: 1 addition & 0 deletions
@@ -86,6 +86,7 @@
 <li>"char-to-string": maps MySQL CHAR(length)/VARCHAR(length) types to STRING.</li>
 <li>"longtext-to-bytes": maps MySQL LONGTEXT types to BYTES.</li>
 <li>"bigint-unsigned-to-bigint": maps MySQL BIGINT UNSIGNED, BIGINT UNSIGNED ZEROFILL, SERIAL to BIGINT. You should ensure overflow won't occur when using this option.</li>
+<li>"decimal-no-change": Ignore decimal type change.</li>
 </ul>
 </td>
 </tr>

docs/layouts/shortcodes/generated/kafka_sync_table.html

Lines changed: 1 addition & 0 deletions
@@ -58,6 +58,7 @@
 <li>"char-to-string": maps MySQL CHAR(length)/VARCHAR(length) types to STRING.</li>
 <li>"longtext-to-bytes": maps MySQL LONGTEXT types to BYTES.</li>
 <li>"bigint-unsigned-to-bigint": maps MySQL BIGINT UNSIGNED, BIGINT UNSIGNED ZEROFILL, SERIAL to BIGINT. You should ensure overflow won't occur when using this option.</li>
+<li>"decimal-no-change": Ignore decimal type change.</li>
 </ul>
 </td>
 </tr>
