Commit ee20eae

craig[bot], DrewKimball, and iskettaneh committed
153067: opt: make outside-of-histogram estimates more pessimistic r=DrewKimball a=DrewKimball

#### opt: improve outside-histogram optimizer tests

This commit makes the following improvements to the `outside-histogram` optimizer test:
* Fixes a minor mistake in the test, where the upper bounds on a histogram for a unique column were incorrect.
* Simplifies the test code for setting session variable defaults.
* Adds comments explaining each test case.

Epic: None
Release note: None

#### opt: add unique and distinct column to outside-histogram test

This commit adds a column `u` with a unique index, and a column `d` with a non-unique index to the outside-histogram test table. Both columns use the same histogram as the primary key `k` (and therefore have no duplicate values).

This commit also adds tests that are similar to `Q1`, but using the new columns instead of `k`. This shows the optimizer's behavior when the "good" plan requires an index join. The tests that filter on `d` show how the optimizer behaves when no plan limits the max cardinality of the query.

Epic: None
Release note: None

#### opt: add outside-histogram test for missing enum value

This commit adds a pair of `outside-histogram` optimizer test cases that simulate a query against an enum column for which one of the values was not sampled.

Epic: None
Release note: None

#### opt: add outside-histogram tests with large table

This commit adds to the existing optimizer tests for filters that select values outside of histograms. The new tests are against a large table to show how the optimizer's trust of low-selectivity estimates varies with table size.

Epic: None
Release note: None

#### opt: make outside-of-histogram estimates more pessimistic

This commit makes rowcount estimation fall back on distinct count estimates when a constraint includes zero histogram values. This biases the optimizer toward less risky plans when less information is known about the filtered values.

The pessimistic logic is triggered when the estimate derived from a histogram is smaller than `table_row_count / 10,000`. This threshold is chosen because we choose samples such that we expect to sample *nearly* every value with multiplicity down to `table_row_count / 10,000` (see computeNumberSamples). Selectivity estimates from a histogram below this resolution are suspect, since there is an increasing likelihood that a value was missed, either because it was omitted from the sample or because the statistics are stale.

Informs #130201

Release note (sql change): Added a clamp for row-count estimates over very large tables so that the optimizer assumes that at least one distinct value will be scanned. This reduces the chances of a catastrophic underestimate. The new logic is off by default, gated by the session setting `optimizer_clamp_low_histogram_selectivity`.

#### opt: use pessimistic estimates for inequality filters

This commit changes row count estimates for inequality filters, so that we expect at least `rowCount / (bucketCount * 100)` rows to "survive" the filter. This is in line with Postgres, which clamps inequality rowcount estimates in a similar fashion.

Informs #130201

Release note (sql change): Added a clamp for the estimated selectivity of inequality predicates that are unbounded on one or both sides (ex: `x > 5`). This reduces the chances of a catastrophic underestimate causing the optimizer to choose a poorly-constrained scan. The new logic is off by default, gated by the session setting `optimizer_clamp_inequality_selectivity`.

#### sql/opt: add telemetry counters for selectivity clamping

This commit adds telemetry counters for the new histogram selectivity clamping behavior, as well as log messages to indicate when the new behavior applies in a query's trace.

Epic: None
Release note: None

155554: kvserver: split meta1 and meta2 r=iskettaneh a=iskettaneh

Previously, we started meta1 and meta2 in one range at bootstrap, and relied on load-based splitting to split them if needed. However, in some cases load-based splitting doesn't work, because it decides to split at a point in meta1 (meta1 is not allowed to split).

This PR does two things:
1) At bootstrap, we create two separate ranges for meta1 and meta2.
2) Use spanconfig to install a split point at the start of meta2. This prevents the two ranges from getting merged together. It also splits the meta1 and meta2 ranges for clusters that were bootstrapped before this PR.

In order to ensure that we get the descriptors correct, I started a cluster (when meta1 and meta2 were on the same range), manually split them, and captured the descriptors:

```
[1] Meta Key: /Meta1/""
    Range r1:
      StartKey: /Min
      EndKey:   /Meta2/""
      Replicas: (n1,s1):1
      Generation: 1
[2] Meta Key: /Meta1/Max
    Range r79:
      StartKey: /Meta2/""
      EndKey:   /System/NodeLiveness
      Replicas: (n1,s1):1
      Generation: 1
[3] Meta Key: /Meta2/System/NodeLiveness
    Range r79:
      StartKey: /Meta2/""
      EndKey:   /System/NodeLiveness
      Replicas: (n1,s1):1
      Generation: 1
[4] Meta Key: /Meta2/System/NodeLivenessMax
    Range r2:
      StartKey: /System/NodeLiveness
      EndKey:   /System/NodeLivenessMax
      Replicas: (n1,s1):1
      Generation: 0
```

After this commit, the first few ranges are bootstrapped with the following descriptors:

```
[1] Meta Key: /Meta1/""
    Range r1:
      StartKey: /Min
      EndKey:   /Meta2/""
      Replicas: (n1,s1):1
      Generation: 0
[2] Meta Key: /Meta1/Max
    Range r2:
      StartKey: /Meta2/""
      EndKey:   /System/NodeLiveness
      Replicas: (n1,s1):1
      Generation: 0
[3] Meta Key: /Meta2/System/NodeLiveness
    Range r2:
      StartKey: /Meta2/""
      EndKey:   /System/NodeLiveness
      Replicas: (n1,s1):1
      Generation: 0
[4] Meta Key: /Meta2/System/NodeLivenessMax
    Range r3:
      StartKey: /System/NodeLiveness
      EndKey:   /System/NodeLivenessMax
      Replicas: (n1,s1):1
      Generation: 0
```

Fixes: #119421

Release note: None

Co-authored-by: Drew Kimball <[email protected]>
Co-authored-by: iskettaneh <[email protected]>
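The two selectivity clamps described in the opt commits of 153067 can be illustrated with a small Python sketch. This is not the optimizer's Go implementation. The function names and parameters are hypothetical, and the exact fallback used for the histogram clamp (one distinct value's share of the table, `table_rows / distinct_count`) is an assumption based on the release note's wording that "at least one distinct value will be scanned". Only the `table_row_count / 10,000` trigger and the `rowCount / (bucketCount * 100)` floor come from the commit message.

```python
# Hypothetical sketch of the two clamps; names are illustrative only.

# Trigger resolution from the commit message: table_row_count / 10,000.
HISTOGRAM_RESOLUTION = 10_000


def clamp_low_histogram_estimate(hist_rows, table_rows, distinct_count):
    """Clamp gated by optimizer_clamp_low_histogram_selectivity (per the
    release note). The fallback formula is an ASSUMPTION: one distinct
    value's average share of the table."""
    threshold = table_rows / HISTOGRAM_RESOLUTION
    if hist_rows >= threshold:
        # The histogram estimate is above the sampling resolution; trust it.
        return hist_rows
    # Below the resolution: fall back on the distinct-count estimate.
    rows_per_distinct = table_rows / max(distinct_count, 1)
    return max(hist_rows, rows_per_distinct)


def clamp_inequality_estimate(est_rows, row_count, bucket_count):
    """Clamp gated by optimizer_clamp_inequality_selectivity: expect at
    least rowCount / (bucketCount * 100) rows to survive the filter."""
    if bucket_count <= 0:
        # No histogram buckets, so there is no floor to apply.
        return est_rows
    floor = row_count / (bucket_count * 100)
    return max(est_rows, floor)


if __name__ == "__main__":
    # A value outside the histogram of a 1M-row table with 1,000 distinct values.
    print(clamp_low_histogram_estimate(0.0, 1_000_000, 1_000))
    # An inequality filter estimated at 0 rows with a 200-bucket histogram.
    print(clamp_inequality_estimate(0.0, 1_000_000, 200))
```

Under these assumptions, a unique column falls back to a single row, while a low-cardinality column (such as the enum case in the tests) falls back to a much larger estimate, which is what steers the optimizer away from risky plans.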
3 parents cd17f4e + 3d33fef + 0e31d7f commit ee20eae

60 files changed: +4404 -958 lines changed

pkg/ccl/logictestccl/testdata/logic_test/crdb_internal_tenant

Lines changed: 2 additions & 2 deletions
@@ -251,12 +251,12 @@ txn_id txn_fingerprint_id query implicit_txn session_id start_time end_tim
 query ITTI
 SELECT range_id, start_pretty, end_pretty, lease_holder FROM crdb_internal.ranges
 ----
-79 /Tenant/10 /Tenant/11 1
+80 /Tenant/10 /Tenant/11 1

 query ITT
 SELECT range_id, start_pretty, end_pretty FROM crdb_internal.ranges_no_leases
 ----
-79 /Tenant/10 /Tenant/11
+80 /Tenant/10 /Tenant/11

 query IT
 SELECT zone_id, target FROM crdb_internal.zones ORDER BY 1

pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error

Lines changed: 1 addition & 1 deletion
@@ -471,7 +471,7 @@ skipif config multiregion-9node-3region-3azs-vec-off
 query I retry
 SELECT DISTINCT range_id FROM [SHOW RANGES FROM TABLE messages_rbr]
 ----
-84
+85

 # Update does not fail when accessing all rows in messages_rbr because lookup
 # join does not error out the lookup table in phase 1.

pkg/ccl/logictestccl/testdata/logic_test/regional_by_row_insert_fast_path

Lines changed: 1 addition & 1 deletion
@@ -200,7 +200,7 @@ query T rowsort
 SELECT message FROM [SHOW KV TRACE FOR SESSION]
 WHERE message LIKE '%batch%' AND message LIKE '%Scan%'
 ----
-r78: sending batch 4 Scan to (n1,s1):1
+r79: sending batch 4 Scan to (n1,s1):1

 # Regression test for #115377.
 statement ok

pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/basic

Lines changed: 4 additions & 3 deletions
@@ -7,7 +7,8 @@ reconcile

 mutations
 ----
-upsert /{Min-System/NodeLiveness} ttl_seconds=3600 num_replicas=5
+upsert /M{in-eta2} ttl_seconds=3600 num_replicas=5
+upsert /{Meta2-System/NodeLiveness} ttl_seconds=3600 num_replicas=5
 upsert /System/NodeLiveness{-Max} ttl_seconds=600 num_replicas=5
 upsert /System/{NodeLivenessMax-tsd} range system
 upsert /System{/tsd-tse} range default
@@ -117,7 +118,7 @@ upsert /Table/10{7-8} num_replicas=7
 delete /Table/11{2-3}
 upsert /Table/11{2-3} num_replicas=7

-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -288,7 +289,7 @@ upsert /Table/7{5-6} ttl_seconds=100 ignore_strict_gc=true
 delete /Table/7{6-7}
 upsert /Table/7{6-7} ttl_seconds=100 ignore_strict_gc=true num_replicas=5 rangefeed_enabled=true

-state offset=5 limit=42
+state offset=6 limit=42
 ----
 ...
 /Table/{0-4} ttl_seconds=100 ignore_strict_gc=true num_replicas=5 rangefeed_enabled=true
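The span strings in these expectations, such as `/M{in-eta2}`, factor out the prefix shared by the start and end keys: `/M{in-eta2}` reads as the span from `/Min` to `/Meta2`. The Python sketch below reproduces that notation as inferred from the test output shown here. The function name `pretty_span` is hypothetical, and the sketch is not CockroachDB's actual pretty-printer, which may apply additional rules.

```python
import os


def pretty_span(start: str, end: str) -> str:
    """Render a [start, end) key span as prefix{start_suffix-end_suffix}.

    Illustrative only: the format is inferred from the test output above.
    """
    # os.path.commonprefix compares character by character, not by path
    # component, which is what the notation needs.
    prefix = os.path.commonprefix([start, end])
    return f"{prefix}{{{start[len(prefix):]}-{end[len(prefix):]}}}"


if __name__ == "__main__":
    # Before this commit: a single entry covering meta1 and meta2.
    print(pretty_span("/Min", "/System/NodeLiveness"))
    # After this commit: the split at /Meta2 yields two entries.
    print(pretty_span("/Min", "/Meta2"))
    print(pretty_span("/Meta2", "/System/NodeLiveness"))
```

Because the split adds one entry at the front of the span configuration state, every later `state offset=N` in these test files moves to `N+1`.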

pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/indexes

Lines changed: 5 additions & 5 deletions
@@ -18,7 +18,7 @@ mutations
 ----
 upsert /Table/10{6-7} range default

-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -69,7 +69,7 @@ delete /Table/10{6-7}
 upsert /Table/106/{2-3} num_replicas=7 num_voters=5
 upsert /Table/10{6/3-7} num_replicas=7

-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -122,7 +122,7 @@ upsert /Table/106/{2-3} ttl_seconds=25 num_replicas=7 num_vot
 delete /Table/10{6/3-7}
 upsert /Table/10{6/3-7} ttl_seconds=3600 num_replicas=7

-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -165,7 +165,7 @@ exec-sql
 ALTER TABLE db.t CONFIGURE ZONE USING num_replicas = 9
 ----

-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -221,7 +221,7 @@ delete /Table/106{-/2}
 delete /Table/106/{2-3}
 delete /Table/10{6/3-7}

-state offset=46
+state offset=47
 ----
 ...
 /Table/4{5-6} ttl_seconds=7200 ignore_strict_gc=true num_replicas=5 rangefeed_enabled=true exclude_data_from_backup=true

pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/multitenant/basic

Lines changed: 3 additions & 3 deletions
@@ -19,7 +19,7 @@ mutations

 # We should observe placeholder entries for both tenants (installed when
 # creating tenant records).
-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -130,7 +130,7 @@ upsert /Tenant/10/Table/7{4-5} database system (tenant)
 upsert /Tenant/10/Table/7{5-6} database system (tenant)
 upsert /Tenant/10/Table/7{6-7} database system (tenant)

-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -250,7 +250,7 @@ upsert /Tenant/10/Table/10{7-8} rangefeed_enabled=true
 upsert /Tenant/10/Table/11{2-3} rangefeed_enabled=true
 upsert /Tenant/10/Table/11{3-4} rangefeed_enabled=true

-state offset=81
+state offset=82
 ----
 ...
 /Tenant/10/Table/{7-8} database system (tenant)

pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/multitenant/protectedts

Lines changed: 4 additions & 3 deletions
@@ -16,7 +16,7 @@ mutations

 # We should observe placeholder entries for both tenants (installed when
 # creating tenant records).
-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -204,9 +204,10 @@ mutations tenant=10
 delete {source=10,target=10}

 # All system span config targets should have been removed at this point.
-state limit=4
+state limit=5
 ----
-/{Min-System/NodeLiveness} ttl_seconds=3600 num_replicas=5
+/M{in-eta2} ttl_seconds=3600 num_replicas=5
+/{Meta2-System/NodeLiveness} ttl_seconds=3600 num_replicas=5
 /System/NodeLiveness{-Max} ttl_seconds=600 num_replicas=5
 /System/{NodeLivenessMax-tsd} range system
 /System{/tsd-tse} range default

pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/multitenant/range_tenants

Lines changed: 3 additions & 3 deletions
@@ -35,7 +35,7 @@ initialize tenant=11
 # We should observe placeholder entries for both tenants (installed when
 # creating tenant records). tenant=11 should start off with whatever RANGE
 # TENANT was at the time.
-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -81,7 +81,7 @@ reconcile tenant=10
 mutations discard tenant=10
 ----

-state offset=47
+state offset=48
 ----
 ...
 /Table/4{6-7} database system (host)
@@ -214,7 +214,7 @@ reconcile tenant=11
 mutations discard tenant=11
 ----

-state offset=81
+state offset=82
 ----
 ...
 /Tenant/10/Table/{7-8} database system (tenant)

pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/multitenant/tenant_end_key_split

Lines changed: 4 additions & 4 deletions
@@ -14,7 +14,7 @@ initialize tenant=11

 # A record IS written for a key that logically belongs to the next tenant,
 # tenant=12, because tenant=12 DOES NOT exist.
-state offset=59
+state offset=60
 ----
 ...
 /Table/{59-60} database system (host)
@@ -49,7 +49,7 @@ reconcile tenant=11
 # Peek near the start of the span_configurations table where tenant=11's records
 # are stored. The first one is from the start of its keyspace to start of
 # table with ID=4: /Tenant/11{-/Table/4}.
-state offset=60 limit=3
+state offset=61 limit=3
 ----
 ...
 /Table/6{0-1} database system (host)
@@ -60,7 +60,7 @@ state offset=60 limit=3
 # Peek near the end of the span_configurations table where tenant=11's records
 # are stored. The last one is for its last system table. Right now the split is
 # at /Tenant/12. Which is fine.
-state offset=103
+state offset=104
 ----
 ...
 /Tenant/11/Table/3{5-6} database system (tenant)
@@ -186,7 +186,7 @@ initialize tenant=10

 # A record IS NOT written for a key that logically belongs to the next tenant,
 # tenant=11, because tenant=11 DOES exist.
-state offset=59 limit=5
+state offset=60 limit=5
 ----
 ...
 /Table/{59-60} database system (host)

pkg/ccl/spanconfigccl/spanconfigreconcilerccl/testdata/named_zones

Lines changed: 25 additions & 16 deletions
@@ -7,9 +7,10 @@ reconcile
 mutations discard
 ----

-state limit=5
+state limit=6
 ----
-/{Min-System/NodeLiveness} ttl_seconds=3600 num_replicas=5
+/M{in-eta2} ttl_seconds=3600 num_replicas=5
+/{Meta2-System/NodeLiveness} ttl_seconds=3600 num_replicas=5
 /System/NodeLiveness{-Max} ttl_seconds=600 num_replicas=5
 /System/{NodeLivenessMax-tsd} range system
 /System{/tsd-tse} range default
@@ -51,9 +52,10 @@ upsert /System/{NodeLivenessMax-tsd} range default
 delete /System{tse-/SystemSpanConfigKeys}
 upsert /System{tse-/SystemSpanConfigKeys} range default

-state limit=5
+state limit=6
 ----
-/{Min-System/NodeLiveness} ttl_seconds=3600 num_replicas=5
+/M{in-eta2} ttl_seconds=3600 num_replicas=5
+/{Meta2-System/NodeLiveness} ttl_seconds=3600 num_replicas=5
 /System/NodeLiveness{-Max} ttl_seconds=600 num_replicas=7
 /System/{NodeLivenessMax-tsd} range default
 /System{/tsd-tse} ttl_seconds=42
@@ -69,14 +71,17 @@ ALTER RANGE timeseries CONFIGURE ZONE DISCARD;

 mutations
 ----
-delete /{Min-System/NodeLiveness}
-upsert /{Min-System/NodeLiveness} range default
+delete /M{in-eta2}
+upsert /M{in-eta2} range default
+delete /{Meta2-System/NodeLiveness}
+upsert /{Meta2-System/NodeLiveness} range default
 delete /System{/tsd-tse}
 upsert /System{/tsd-tse} range default

-state limit=5
+state limit=6
 ----
-/{Min-System/NodeLiveness} range default
+/M{in-eta2} range default
+/{Meta2-System/NodeLiveness} range default
 /System/NodeLiveness{-Max} ttl_seconds=600 num_replicas=7
 /System/{NodeLivenessMax-tsd} range default
 /System{/tsd-tse} range default
@@ -100,8 +105,10 @@ ALTER RANGE default CONFIGURE ZONE USING gc.ttlseconds = 50;

 mutations
 ----
-delete /{Min-System/NodeLiveness}
-upsert /{Min-System/NodeLiveness} ttl_seconds=50
+delete /M{in-eta2}
+upsert /M{in-eta2} ttl_seconds=50
+delete /{Meta2-System/NodeLiveness}
+upsert /{Meta2-System/NodeLiveness} ttl_seconds=50
 delete /System/{NodeLivenessMax-tsd}
 upsert /System/{NodeLivenessMax-tsd} ttl_seconds=50
 delete /System{/tsd-tse}
@@ -111,16 +118,17 @@ upsert /System{tse-/SystemSpanConfigKeys} ttl_seconds=50
 delete /Table/10{6-7}
 upsert /Table/10{6-7} ttl_seconds=50

-state limit=5
+state limit=6
 ----
-/{Min-System/NodeLiveness} ttl_seconds=50
+/M{in-eta2} ttl_seconds=50
+/{Meta2-System/NodeLiveness} ttl_seconds=50
 /System/NodeLiveness{-Max} ttl_seconds=600 num_replicas=7
 /System/{NodeLivenessMax-tsd} ttl_seconds=50
 /System{/tsd-tse} ttl_seconds=50
 /System{tse-/SystemSpanConfigKeys} ttl_seconds=50
 ...

-state offset=46
+state offset=47
 ----
 ...
 /Table/4{5-6} ttl_seconds=7200 ignore_strict_gc=true num_replicas=5 rangefeed_enabled=true exclude_data_from_backup=true
@@ -165,7 +173,7 @@ mutations
 ----
 upsert /Table/10{7-8} ttl_seconds=50

-state offset=46
+state offset=47
 ----
 ...
 /Table/4{5-6} ttl_seconds=7200 ignore_strict_gc=true num_replicas=5 rangefeed_enabled=true exclude_data_from_backup=true
@@ -213,9 +221,10 @@ upsert /System/{NodeLivenessMax-tsd} ttl_seconds=100
 delete /System{tse-/SystemSpanConfigKeys}
 upsert /System{tse-/SystemSpanConfigKeys} ttl_seconds=100

-state limit=5
+state limit=6
 ----
-/{Min-System/NodeLiveness} ttl_seconds=50
+/M{in-eta2} ttl_seconds=50
+/{Meta2-System/NodeLiveness} ttl_seconds=50
 /System/NodeLiveness{-Max} ttl_seconds=600 num_replicas=7
 /System/{NodeLivenessMax-tsd} ttl_seconds=100
 /System{/tsd-tse} ttl_seconds=50
