Commit 10dd353

Ekaterina Sokolova authored and Marina Polyakova committed
Update RUM to PostgreSQL 18
1. Rename some definitions and structures due to conflicts after 8492feb98f6d
2. Add vacuum argument due to e5b0b0ce1509
3. Update README according to the documentation
4. Update Travis CI and other minor fixes
1 parent 0dae718 commit 10dd353

10 files changed: +136 -112 lines changed

.travis.yml

Lines changed: 2 additions & 2 deletions
@@ -23,6 +23,8 @@ notifications:
 on_failure: always

 env:
+- PG_VERSION=18
+- PG_VERSION=18 LEVEL=hardcore
 - PG_VERSION=17
 - PG_VERSION=17 LEVEL=hardcore
 - PG_VERSION=16
@@ -32,6 +34,4 @@ env:
 - PG_VERSION=14
 - PG_VERSION=14 LEVEL=hardcore
 - PG_VERSION=13
-- PG_VERSION=13 LEVEL=hardcore
 - PG_VERSION=12
-- PG_VERSION=12 LEVEL=hardcore

README.md

Lines changed: 20 additions & 16 deletions
@@ -8,29 +8,33 @@

 ## Introduction

-The **rum** module provides an access method to work with a `RUM` index. It is based
-on the `GIN` access method's code.
+The **rum** module provides access method to work with the `RUM` indexes. It is based
+on the `GIN` access method code.

-A `GIN` index allows performing fast full-text search using `tsvector` and
-`tsquery` types. But full-text search with a GIN index has several problems:
+`GIN` index allows you to perform fast full-text search using `tsvector` and
+`tsquery` types. However, full-text search with `GIN` index has some performance
+issues because positional and other additional information is not stored.

-- Slow ranking. It needs positional information about lexemes to do ranking. A `GIN`
-index doesn't store positions of lexemes. So after index scanning, we need an
-additional heap scan to retrieve lexeme positions.
-- Slow phrase search with a `GIN` index. This problem relates to the previous
-problem. It needs positional information to perform phrase search.
-- Slow ordering by timestamp. A `GIN` index can't store some related information
-in the index with lexemes. So it is necessary to perform an additional heap scan.
+`RUM` solves these issues by storing additional information in a posting tree.
+As compared to `GIN`, `RUM` index has the following benefits:

-`RUM` solves these problems by storing additional information in a posting tree.
-For example, positional information of lexemes or timestamps. You can get an
-idea of `RUM` with the following diagram:
+- Faster ranking. Ranking requires positional information. And after the
+index scan we do not need an additional heap scan to retrieve lexeme positions
+because `RUM` index stores them.
+- Faster phrase search. This improvement is related to the previous one as
+phrase search also needs positional information.
+- Faster ordering by timestamp. `RUM` index stores additional information together
+with lexemes, so it is not necessary to perform a heap scan.
+- A possibility to perform depth-first search and therefore return first
+results immediately.
+
+You can get an idea of `RUM` with the following diagram:

 [![How RUM stores additional information](img/gin_rum.svg)](https://postgrespro.ru/docs/enterprise/current/rum?lang=en)

-A drawback of `RUM` is that it has slower build and insert times than `GIN`.
+The drawback of `RUM` is that it has slower build and insert time as compared to `GIN`
 This is because we need to store additional information besides keys and because
-`RUM` uses generic Write-Ahead Log (WAL) records.
+because `RUM` stores additional information together with keys and uses generic WAL records.

 ## License

3539
## License
3640

TODO

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 1. with naturalOrder=true make scan the rest to be consistent with seqscan [done]
 2. add leftlink to data page to privide backward scan on index (<=| op) [done]
-3. Compression of ItemPointer for use_alternative_order
+3. ItemPointer compression for indexes with order_by_attach
 4. Compression addInfo
 5. Remove FROM_STRATEGY ugly magick [done]

src/rum.h

Lines changed: 15 additions & 15 deletions
@@ -46,7 +46,7 @@ typedef struct RumPageOpaqueData
     BlockNumber rightlink; /* next page if any */
     OffsetNumber maxoff; /* number entries on RUM_DATA page: number of
      * heap ItemPointers on RUM_DATA|RUM_LEAF page
-     * or number of PostingItems on RUM_DATA &
+     * or number of RumPostingItems on RUM_DATA &
      * ~RUM_LEAF page. On RUM_LIST page, number of
      * heap tuples. */
     OffsetNumber freespace;
@@ -150,19 +150,19 @@ typedef struct RumMetaPageData
  * (which is InvalidBlockNumber/0) as well as from all normal item
  * pointers (which have item numbers in the range 1..MaxHeapTuplesPerPage).
  */
-#define ItemPointerSetMin(p) \
+#define RumItemPointerSetMin(p) \
     ItemPointerSet((p), (BlockNumber)0, (OffsetNumber)0)
-#define ItemPointerIsMin(p) \
+#define RumItemPointerIsMin(p) \
     (RumItemPointerGetOffsetNumber(p) == (OffsetNumber)0 && \
     RumItemPointerGetBlockNumber(p) == (BlockNumber)0)
-#define ItemPointerSetMax(p) \
+#define RumItemPointerSetMax(p) \
     ItemPointerSet((p), InvalidBlockNumber, (OffsetNumber)0xfffe)
-#define ItemPointerIsMax(p) \
+#define RumItemPointerIsMax(p) \
     (RumItemPointerGetOffsetNumber(p) == (OffsetNumber)0xfffe && \
     RumItemPointerGetBlockNumber(p) == InvalidBlockNumber)
 #define ItemPointerSetLossyPage(p, b) \
     ItemPointerSet((p), (b), (OffsetNumber)0xffff)
-#define ItemPointerIsLossyPage(p) \
+#define RumItemPointerIsLossyPage(p) \
     (RumItemPointerGetOffsetNumber(p) == (OffsetNumber)0xffff && \
     RumItemPointerGetBlockNumber(p) != InvalidBlockNumber)

@@ -175,7 +175,7 @@ typedef struct RumItem

 #define RumItemSetMin(item) \
 do { \
-    ItemPointerSetMin(&((item)->iptr)); \
+    RumItemPointerSetMin(&((item)->iptr)); \
     (item)->addInfoIsNull = true; \
     (item)->addInfo = (Datum) 0; \
 } while (0)
@@ -188,12 +188,12 @@ typedef struct
     /* We use BlockIdData not BlockNumber to avoid padding space wastage */
     BlockIdData child_blkno;
     RumItem item;
-} PostingItem;
+} RumPostingItem;

-#define PostingItemGetBlockNumber(pointer) \
+#define RumPostingItemGetBlockNumber(pointer) \
     BlockIdGetBlockNumber(&(pointer)->child_blkno)

-#define PostingItemSetBlockNumber(pointer, blockNumber) \
+#define RumPostingItemSetBlockNumber(pointer, blockNumber) \
     BlockIdSet(&((pointer)->child_blkno), (blockNumber))

 /*
@@ -265,8 +265,8 @@ typedef signed char RumNullCategory;
  * Data (posting tree) pages
  */
 /*
- * FIXME -- Currently RumItem is placed as a pages right bound and PostingItem
- * is placed as a non-leaf pages item. Both RumItem and PostingItem stores
+ * FIXME -- Currently RumItem is placed as a pages right bound and RumPostingItem
+ * is placed as a non-leaf pages item. Both RumItem and RumPostingItem stores
  * AddInfo as a raw Datum, which is bogus. It is fine for pass-by-value
  * attributes, but it isn't for pass-by-reference, which may have variable
  * length of data. This AddInfo is used only by order_by_attach indexes, so it
@@ -278,12 +278,12 @@ typedef signed char RumNullCategory;
 #define RumDataPageGetData(page) \
     (PageGetContents(page) + MAXALIGN(sizeof(RumItem)))
 #define RumDataPageGetItem(page,i) \
-    (RumDataPageGetData(page) + ((i)-1) * sizeof(PostingItem))
+    (RumDataPageGetData(page) + ((i)-1) * sizeof(RumPostingItem))

 #define RumDataPageGetFreeSpace(page) \
     (BLCKSZ - MAXALIGN(SizeOfPageHeaderData) \
     - MAXALIGN(sizeof(RumItem)) /* right bound */ \
-    - RumPageGetOpaque(page)->maxoff * sizeof(PostingItem) \
+    - RumPageGetOpaque(page)->maxoff * sizeof(RumPostingItem) \
     - MAXALIGN(sizeof(RumPageOpaqueData)))

 #define RumMaxLeafDataItems \
@@ -513,7 +513,7 @@ typedef struct RumBtreeData
     uint32 nitem;
     uint32 curitem;

-    PostingItem pitem;
+    RumPostingItem pitem;
 } RumBtreeData;

 extern RumBtreeStack *rumPrepareFindLeafPage(RumBtree btree, BlockNumber blkno);
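
Note on the renamed sentinel macros in src/rum.h: the comment in the diff says the min, max, and lossy-page item pointers are distinguishable from each other and from normal item pointers. The following is a minimal standalone sketch of that encoding, not the extension's code. All `Demo*` and `demo_*` names are hypothetical stand-ins introduced for illustration, and the value of `DEMO_INVALID_BLOCK` assumes PostgreSQL's `InvalidBlockNumber` (0xFFFFFFFF).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for BlockNumber, OffsetNumber and ItemPointerData. */
typedef uint32_t DemoBlockNumber;
typedef uint16_t DemoOffsetNumber;

/* Assumption: mirrors PostgreSQL's InvalidBlockNumber. */
#define DEMO_INVALID_BLOCK ((DemoBlockNumber) 0xFFFFFFFF)

typedef struct DemoItemPointer
{
    DemoBlockNumber  block;
    DemoOffsetNumber offset;
} DemoItemPointer;

/* Minimum sentinel: block 0, offset 0 (normal offsets start at 1). */
void demo_set_min(DemoItemPointer *p)
{
    p->block = 0;
    p->offset = 0;
}

bool demo_is_min(const DemoItemPointer *p)
{
    return p->offset == 0 && p->block == 0;
}

/* Maximum sentinel: invalid block, offset 0xfffe. */
void demo_set_max(DemoItemPointer *p)
{
    p->block = DEMO_INVALID_BLOCK;
    p->offset = 0xfffe;
}

bool demo_is_max(const DemoItemPointer *p)
{
    return p->offset == 0xfffe && p->block == DEMO_INVALID_BLOCK;
}

/* Lossy-page marker: a valid block number with offset 0xffff. */
void demo_set_lossy_page(DemoItemPointer *p, DemoBlockNumber b)
{
    p->block = b;
    p->offset = 0xffff;
}

bool demo_is_lossy_page(const DemoItemPointer *p)
{
    return p->offset == 0xffff && p->block != DEMO_INVALID_BLOCK;
}
```

Because each sentinel uses a distinct offset value, at most one of the three predicates can hold for a given pointer, and none holds for a pointer whose offset lies in the normal range.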

src/rumbtree.c

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ rumReFindLeafPage(RumBtree btree, RumBtreeStack * stack)
  * item pointer is less than item pointer previous to rightmost.
  */
 if (compareRumItem(btree->rumstate, btree->entryAttnum,
-    &(((PostingItem *) RumDataPageGetItem(page, maxoff - 1))->item),
+    &(((RumPostingItem *) RumDataPageGetItem(page, maxoff - 1))->item),
     &btree->items[btree->curitem]) >= 0)
 {
     break;
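
Note on the cast in this hunk: `RumDataPageGetItem(page, i)` yields a byte address computed as base plus `(i - 1) * sizeof(RumPostingItem)`, so the caller casts it to the item type, and the rename has to be applied in both places. The sketch below shows the same 1-based addressing and the split block-number storage in isolation. It is an illustration under assumptions, not the extension's code: `DemoPostingItem` and the `demo_*` functions are hypothetical, and the `key` field is only a placeholder for the `RumItem` payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a posting item. The child block number is kept
 * as two 16-bit halves, in the spirit of BlockIdData.
 */
typedef struct DemoPostingItem
{
    uint16_t child_hi;  /* high 16 bits of the child block number */
    uint16_t child_lo;  /* low 16 bits of the child block number */
    uint32_t key;       /* placeholder for the RumItem payload */
} DemoPostingItem;

/* Reassemble the 32-bit block number from its two halves. */
uint32_t demo_get_child(const DemoPostingItem *it)
{
    return ((uint32_t) it->child_hi << 16) | it->child_lo;
}

/* Split a 32-bit block number into its two halves. */
void demo_set_child(DemoPostingItem *it, uint32_t blkno)
{
    it->child_hi = (uint16_t) (blkno >> 16);
    it->child_lo = (uint16_t) (blkno & 0xffff);
}

/*
 * 1-based item lookup: item i starts (i - 1) items past the start of the
 * data area. The data pointer must be suitably aligned for DemoPostingItem.
 */
DemoPostingItem *demo_page_get_item(char *data, unsigned i)
{
    assert(i >= 1);
    return (DemoPostingItem *) (data + (size_t) (i - 1) * sizeof(DemoPostingItem));
}
```

With this addressing, index 1 maps to the first item and index `maxoff - 1` to the item just before the rightmost one, which is the item the comparison above inspects.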
