sql: don't throw errors for skipped auto stats jobs
Previously, auto stats jobs would throw errors and increase failed jobs
counters if they attempted to start while a stats collection was already
in progress on the table. For large clusters with
'sql.stats.automatic_job_check_before_creating_job.enabled' set to true,
this could create quite a few failed jobs. These failed jobs don't seem
to cause any performance issues, but they clutter logs, potentially
obscuring real problems and alarming customers, who then file tickets
with support to figure out why their jobs are failing.
This patch:
* refactors the autostats checks to reduce code duplication.
* swallows the error for concurrent auto stats creation, logging at
INFO level instead.
* changes the create stats jobs test so that it no longer expects these
job creations to fail and instead expects the stats to not be
collected.
* fixes a bug in the create stats jobs test that would cause it to hang
instead of exiting on error.
* adds a cluster setting,
sql.stats.error_on_concurrent_create_stats.enabled, which controls
this new behavior. By default the old behavior is maintained.
Fixes: #148413
Release note (ops change): CockroachDB now has a cluster setting,
sql.stats.error_on_concurrent_create_stats.enabled, which modifies how
it reacts to concurrent auto stats jobs. The default, true, maintains
the previous behavior. Setting this to false will cause the concurrent
auto stats job to be skipped with just a log entry and no increased
error counters.
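The behavior described above can be illustrated with a minimal Go sketch. This is not the code from the patch: `startAutoStats`, `errConcurrentCreateStats`, and the boolean parameters are hypothetical names introduced for illustration, and the log line is simulated with a plain print. It only shows the decision the setting is described as controlling: return an error (previous behavior) or skip with a log entry.

```go
package main

import (
	"errors"
	"fmt"
)

// errConcurrentCreateStats is a hypothetical sentinel standing in for the
// error raised when a stats collection is already running on the table.
var errConcurrentCreateStats = errors.New("another CREATE STATISTICS job is already running")

// startAutoStats is a hypothetical helper, not CockroachDB's actual code.
// errorOnConcurrent mirrors the value of
// sql.stats.error_on_concurrent_create_stats.enabled.
// It reports whether the job was skipped, and any error.
func startAutoStats(statsRunning, errorOnConcurrent bool) (bool, error) {
	if !statsRunning {
		return false, nil // no conflict: the job proceeds
	}
	if errorOnConcurrent {
		// Previous behavior: the job fails and counts as a failed job.
		return false, errConcurrentCreateStats
	}
	// New behavior: skip the job and leave only an INFO-level log entry.
	fmt.Println("INFO: skipping auto stats job; collection already in progress")
	return true, nil
}

func main() {
	for _, errorOnConcurrent := range []bool{true, false} {
		skipped, err := startAutoStats(true, errorOnConcurrent)
		fmt.Printf("setting=%t skipped=%t err=%v\n", errorOnConcurrent, skipped, err)
	}
}
```

With the setting at its default of true the sketch returns the error; with false it returns no error and reports the job as skipped.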
docs/generated/settings/settings-for-tenants.txt
1 addition & 0 deletions
@@ -355,6 +355,7 @@ sql.stats.automatic_partial_collection.fraction_stale_rows float 0.05 target fra
sql.stats.automatic_partial_collection.min_stale_rows integer 100 target minimum number of stale rows per table that will trigger a partial statistics refresh application
sql.stats.detailed_latency_metrics.enabled boolean false label latency metrics with the statement fingerprint. Workloads with tens of thousands of distinct query fingerprints should leave this setting false. (experimental, affects performance for workloads with high fingerprint cardinality) application
+sql.stats.error_on_concurrent_create_stats.enabled boolean true set to true to error on concurrent CREATE STATISTICS jobs, instead of skipping them application
sql.stats.flush.enabled boolean true if set, SQL execution statistics are periodically flushed to disk application
sql.stats.flush.interval duration 10m0s the interval at which SQL execution statistics are flushed to disk, this value must be less than or equal to 1 hour application
sql.stats.forecasts.enabled boolean true when true, enables generation of statistics forecasts by default for all tables application
docs/generated/settings/settings.html
1 addition & 0 deletions
@@ -310,6 +310,7 @@
<tr><td><div id="setting-sql-stats-automatic-partial-collection-min-stale-rows" class="anchored"><code>sql.stats.automatic_partial_collection.min_stale_rows</code></div></td><td>integer</td><td><code>100</code></td><td>target minimum number of stale rows per table that will trigger a partial statistics refresh</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-sql-stats-cleanup-recurrence" class="anchored"><code>sql.stats.cleanup.recurrence</code></div></td><td>string</td><td><code>@hourly</code></td><td>cron-tab recurrence for SQL Stats cleanup job</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-sql-stats-detailed-latency-metrics-enabled" class="anchored"><code>sql.stats.detailed_latency_metrics.enabled</code></div></td><td>boolean</td><td><code>false</code></td><td>label latency metrics with the statement fingerprint. Workloads with tens of thousands of distinct query fingerprints should leave this setting false. (experimental, affects performance for workloads with high fingerprint cardinality)</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
+<tr><td><div id="setting-sql-stats-error-on-concurrent-create-stats-enabled" class="anchored"><code>sql.stats.error_on_concurrent_create_stats.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>set to true to error on concurrent CREATE STATISTICS jobs, instead of skipping them</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-sql-stats-flush-enabled" class="anchored"><code>sql.stats.flush.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>if set, SQL execution statistics are periodically flushed to disk</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-sql-stats-flush-interval" class="anchored"><code>sql.stats.flush.interval</code></div></td><td>duration</td><td><code>10m0s</code></td><td>the interval at which SQL execution statistics are flushed to disk, this value must be less than or equal to 1 hour</td><td>Serverless/Dedicated/Self-Hosted</td></tr>
<tr><td><div id="setting-sql-stats-forecasts-enabled" class="anchored"><code>sql.stats.forecasts.enabled</code></div></td><td>boolean</td><td><code>true</code></td><td>when true, enables generation of statistics forecasts by default for all tables</td><td>Serverless/Dedicated/Self-Hosted</td></tr>