-
Notifications
You must be signed in to change notification settings - Fork 1k
Open
Labels
enhancementAn enhancement to an existing feature for functionalityAn enhancement to an existing feature for functionalitylow-hanging-fruitLabel for issues that are easy to implementLabel for issues that are easy to implement
Description
What type of enhancement is this?
Performance
What subsystems and features will be improved?
Compression, Query planner
What does the enhancement do?
Currently we apply Batch Sorted Merge only in queries on columnstores with segmentby when the query needs sorting on orderby columns. Example:
CREATE TABLE ht_seg_order (time int, device int, value int) WITH (
tsdb.hypertable, tsdb.partition_column = 'time',
tsdb.compress, tsdb.compress_segmentby = 'device', tsdb.compress_orderby = 'time DESC'
);
-- Can use Batch Sorted Merge:
select * from ht_seg_order order by time DESC;
We can use BSM on unordered chunks in more scenarios:
- For queries sorted on
orderbycolumns on columnstores withoutsegmentby - For queries sorted on
segmentby+orderbycolumns.
Examples, assuming that the tables have been filled via Direct Compress Insert/Copy with no client sorting:
CREATE TABLE ht_order (time int, device int, value int) WITH (
tsdb.hypertable, tsdb.partition_column = 'time',
tsdb.compress, tsdb.compress_orderby = 'time DESC'
);
-- Should be able to use Batch Sorted Merge:
select * from ht_order order by time DESC;
select * from ht_seg_order order by device, time DESC;
Implementation challenges
Should be relatively easy to implement.
We should add TS benchmarks to see any performance gain from doing BSM vs. no BSM parallel execution.
In a local experiment over ht_metrics_unordered table with 8.1M rows filled with time-overlapping Direct Compress Inserts from TSBench ht_metrics for one value of device, below query run in ~2.5 sec with BSM and in 4.5 sec without BSM with 2x parallelism.
CREATE TABLE ht_metrics_unordered(
time timestamptz,
device int,
value float
) WITH (
tsdb.hypertable, tsdb.partition_column = 'time',
tsdb.compress, tsdb.compress_segmentby = 'device', tsdb.compress_orderby = 'time DESC'
);
SET timescaledb.enable_direct_compress_insert = true;
INSERT INTO ht_metrics_unordered SELECT * FROM ht_metrics
where time > '2020-01-04' and time < '2020-01-25' and device = 1;
INSERT INTO ht_metrics_unordered SELECT * FROM ht_metrics
where time > '2020-01-03' and time < '2020-01-28' and device = 1;
INSERT INTO ht_metrics_unordered SELECT * FROM ht_metrics
where time > '2020-01-06' and time < '2020-01-21' and device = 1;
INSERT INTO ht_metrics_unordered SELECT * FROM ht_metrics
where time > '2020-01-08' and time < '2020-01-22' and device = 1;
INSERT INTO ht_metrics_unordered SELECT * FROM ht_metrics
where time > '2020-01-09' and time < '2020-01-28' and device = 1;
-- Can use BSM
postgres=# explain analyze select * from ht_metrics_unordered order by time;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Custom Scan (ChunkAppend) on ht_metrics_unordered (cost=107.15..85098.73 rows=8120537 width=20) (actual time=5.641..2303.312 rows=8121795 loops=1)
Order: ht_metrics_unordered."time"
...
-> Custom Scan (ColumnarScan) on _hyper_28_179_chunk (cost=80.10..11426.26 rows=1091008 width=20) (actual time=3.781..451.876 rows=1091866 loops=1)
-> Sort (cost=80.09..82.83 rows=1093 width=88) (actual time=2.310..5.697 rows=1093 loops=1)
Sort Key: compress_hyper_29_180_chunk._ts_meta_min_1
Sort Method: quicksort Memory: 134kB
-> Seq Scan on compress_hyper_29_180_chunk (cost=0.00..24.93 rows=1093 width=88) (actual time=1.779..2.122 rows=1093 loops=1)
Planning Time: 0.933 ms
Execution Time: 2500.302 ms
-- Cannot currently use BSM but the data processed is the same:
postgres=# explain analyze select * from ht_metrics_unordered order by device, time;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
------
Gather Merge (cost=571554.37..1361077.19 rows=6766866 width=20) (actual time=2446.764..4226.953 rows=8121795 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=570554.34..579012.93 rows=3383433 width=20) (actual time=2384.348..2754.357 rows=2707265 loops=3)
Sort Key: ht_metrics_unordered.device, ht_metrics_unordered."time"
Sort Method: external merge Disk: 90768kB
Worker 0: Sort Method: external merge Disk: 96912kB
Worker 1: Sort Method: external merge Disk: 82760kB
-> Parallel Append (cost=0.03..64845.08 rows=3383433 width=20) (actual time=3.225..439.964 rows=2707265 loops=3)
-> Custom Scan (ColumnarScan) on _hyper_28_173_chunk (cost=0.05..7237.43 rows=719821 width=20) (actual time=0.051..106.885 rows=1223996 loops
=1)
-> Parallel Seq Scan on compress_hyper_29_174_chunk (cost=0.00..39.21 rows=721 width=88) (actual time=0.007..0.388 rows=1226 loops=1)
...
Planning Time: 0.962 ms
Execution Time: 4457.521 ms
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
enhancementAn enhancement to an existing feature for functionalityAn enhancement to an existing feature for functionalitylow-hanging-fruitLabel for issues that are easy to implementLabel for issues that are easy to implement