 Optimize SQL and InfluxQL queries to improve performance and reduce their memory and compute (CPU) requirements.
 Learn how to use observability tools to analyze query execution and view metrics.

 - [Why is my query slow?](#why-is-my-query-slow)
 - [Strategies for improving query performance](#strategies-for-improving-query-performance)
+  - [Query only the data you need](#query-only-the-data-you-need)
 - [Analyze and troubleshoot queries](#analyze-and-troubleshoot-queries)

 ## Why is my query slow?

 Query performance depends on time range and complexity.
 If a query is slower than you expect, it might be due to the following reasons:

-- It queries data from a large time range.
+- It queries data from a large time range.
 - It includes intensive operations, such as querying many string values or `ORDER BY` sorting or re-sorting large amounts of data.

 ## Strategies for improving query performance

 The following design strategies generally improve query performance and resource use:

 - Follow [schema design best practices](/influxdb/cloud-dedicated/write-data/best-practices/schema-design/) to make querying easier and more performant.
-- Query only the data you need--for example, include a [`WHERE` clause](/influxdb/cloud-dedicated/reference/sql/where/) that filters data by a time range.
-  InfluxDB v3 stores data in a Parquet file for each measurement and day, and retrieves files from the Object store to answer a query.
-  The smaller the time range in your query, the fewer files InfluxDB needs to retrieve from the Object store.
+- [Query only the data you need](#query-only-the-data-you-need).
 - [Downsample data](/influxdb/cloud-dedicated/process-data/downsample/) to reduce the amount of data you need to query.

 Some bottlenecks may be out of your control and are the result of a suboptimal execution plan, such as:
@@ -52,9 +52,39 @@ Some bottlenecks may be out of your control and are the result of a suboptimal e
 {{% note %}}
 #### Analyze query plans to view metrics and recognize bottlenecks

-To view runtime metrics for a query, such as the number of files scanned, use the [`EXPLAIN ANALYZE` keywords](/influxdb/cloud-dedicated/reference/sql/explain/#explain-analyze) and learn how to [analyze a query plan](/influxdb/cloud-dedicated/query-data/troubleshoot-and-optimize/analyze-query-plan/).
+To view runtime metrics for a query, such as the number of files scanned, use
+the [`EXPLAIN ANALYZE` keywords](/influxdb/cloud-dedicated/reference/sql/explain/#explain-analyze)
+and learn how to [analyze a query plan](/influxdb/cloud-dedicated/query-data/troubleshoot-and-optimize/analyze-query-plan/).
 {{% /note %}}

+### Query only the data you need
+
+#### Include a WHERE clause
+
+InfluxDB v3 stores data in a Parquet file for each measurement and day, and
+retrieves files from the Object store to answer a query.
+To reduce the number of files that a query needs to retrieve from the Object store,
+include a [`WHERE` clause](/influxdb/cloud-dedicated/reference/sql/where/) that
+filters data by a time range.
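
The effect of a time filter on file retrieval can be illustrated with a small Python model. This is only a sketch under stated assumptions: the `files_for_range` helper and the `<measurement>/<YYYY-MM-DD>.parquet` naming are hypothetical, chosen to mirror the "one file per measurement and day" description, and don't reflect the actual Object store layout or any InfluxDB API.

```python
from datetime import datetime, timedelta, timezone


def files_for_range(measurement: str, start: datetime, stop: datetime) -> list:
    """List the hypothetical per-day files that overlap the range [start, stop)."""
    if stop <= start:
        return []
    first_day = start.date()
    # `stop` is exclusive, so a range that ends exactly at midnight
    # doesn't touch the following day's file.
    last_day = (stop - timedelta(microseconds=1)).date()
    day_count = (last_day - first_day).days + 1
    return [
        f"{measurement}/{first_day + timedelta(days=offset)}.parquet"
        for offset in range(day_count)
    ]


stop = datetime(2024, 1, 31, tzinfo=timezone.utc)
last_hour = files_for_range("home", stop - timedelta(hours=1), stop)
last_30_days = files_for_range("home", stop - timedelta(days=30), stop)
print(len(last_hour), len(last_30_days))  # 1 30
```

In this model, narrowing the time range from 30 days to one hour reduces the number of files to retrieve from 30 to 1.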
+
+#### SELECT only columns you need
+
+Because InfluxDB v3 is a columnar database, it only processes the columns
+selected in a query, which can mitigate the query performance impact of
 |  | `--retention-period` | Database retention period (default is `0s`, infinite) |
 |  | `--max-tables` | Maximum tables per database (default is 500, `0` uses default) |
-|  | `--max-columns` | Maximum columns per table (default is 250, `0` uses default) |
+|  | `--max-columns` | Maximum columns per table (default is 1000, `0` uses default) |
 |  | `--template-tag` | Tag to add to partition template (can include multiple of this flag) |
 |  | `--template-tag-bucket` | Tag and number of buckets to partition tag values into separated by a comma--for example: `tag1,100` (can include multiple of this flag) |
 |  | `--template-timeformat` | Timestamp format for partition template (default is `%Y-%m-%d`) |
@@ -91,8 +100,9 @@ question as you design your schema.
 - String
 - Boolean

-{{% product-name %}} doesn't index tag values or field values.
-Tag keys, field keys, and other metadata are indexed to optimize performance.
+{{% product-name %}} indexes tag keys, field keys, and other metadata
+to optimize performance.
+It doesn't index tag values or field values.

 {{% note %}}
 The InfluxDB v3 storage engine supports infinite tag value and series cardinality.
@@ -106,26 +116,37 @@ cardinality doesn't affect the overall performance of your database.
 ### Do not use duplicate names for tags and fields

-Tags and fields within the same table can't be named the same.
-All tags and fields are stored as unique columns in a table representing the
-table on disk.
+Use unique names for tags and fields within the same table.
+{{% product-name %}} stores tags and fields as unique columns in a table that
+represents the table on disk.
 If you attempt to write a table that contains tags or fields with the same name,
 the write fails due to a column conflict.
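
A client-side check before writing can catch such conflicts early. The following Python sketch is illustrative only: `conflicting_names` is a hypothetical helper, not part of any InfluxDB client library, and it assumes tags and fields are held in plain dictionaries keyed by name.

```python
def conflicting_names(tags: dict, fields: dict) -> set:
    """Return the names used as both a tag key and a field key."""
    # Tags and fields share one column namespace per table, so any
    # name present in both dictionaries would map to the same column.
    return set(tags) & set(fields)


# Hypothetical point: "temp" is used as both a tag and a field.
tags = {"room": "kitchen", "temp": "high"}
fields = {"temp": 22.5, "hum": 40.1}

print(conflicting_names(tags, fields))  # {'temp'}
```

Renaming one side (for example, a `temp_level` tag) removes the overlap, and the check returns an empty set.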

-### Tables can contain up to 250 columns
+### Maximum number of columns per table
+
+A table has a [maximum number of columns](/influxdb/cloud-dedicated/admin/databases/#column-limit).
+Each row must include a time column.
+As a result, a table can have the following:
+
+- a time column
+- field and tag columns up to the configured maximum.

-A table can contain **up to 250 columns**. Each row requires a time column,
-but the rest represent tags and fields stored in the table.
-Therefore, a table can contain one time column and 249 total field and tag columns.
-If you attempt to write to a table and exceed the 250 column limit, the
-write request fails and InfluxDB returns an error.
+If you attempt to write to a table and exceed the column limit, then the write
+request fails and InfluxDB returns an error.
+
+InfluxData identified 1000 columns as the safe limit for maintaining system
+performance and stability.
+Exceeding this threshold can result in
+[wide schemas](#avoid-wide-schemas), which can negatively impact performance
+and resource use, depending on the shape and data types in your schema.

 ---

 ## Design for performance

-How you structure your schema within a table can affect the overall
-performance of queries against that table.
+How you structure your schema within a table can affect resource use and
+the performance of queries against that table.
+
 The following guidelines help to optimize query performance:

 - [Avoid wide schemas](#avoid-wide-schemas)
@@ -135,26 +156,45 @@ The following guidelines help to optimize query performance:
 ### Avoid wide schemas

-A wide schema is one with many tags and fields and corresponding columns for each.
-With the InfluxDB v3 storage engine, wide schemas don't impact query execution performance.
-Because InfluxDB v3 is a columnar database, it executes queries only against columns selected in the query.
+A wide schema refers to a schema with a large number of columns (tags and fields).

-Although a wide schema won't affect query performance, it can lead to the following:
+Wide schemas can lead to the following issues:

-- More resources required for persisting and compacting data during ingestion.
-- Decreased sorting performance due to complex primary keys with [too many tags](#avoid-too-many-tags).
+- Increased resource usage for persisting and compacting data during ingestion.
+- Reduced sorting performance due to complex primary keys with [too many tags](#avoid-too-many-tags).
+- Reduced query performance when [using non-specific queries](#avoid-non-specific-queries).

-The InfluxDB v3 storage engine has a
-[limit of 250 columns per table](#tables-can-contain-up-to-250-columns).
+To prevent wide schema issues, limit the number of tags and fields stored in a table.
+If you need to store more than the [maximum number of columns](/influxdb/cloud-dedicated/admin/databases/),
+consider segmenting your fields into separate tables.

-To avoid a wide schema, limit the number of tags and fields stored in a table.
-If you need to store more than 249 total tags and fields, consider segmenting
-your fields into a separate table.
+#### Avoid non-specific queries
+
+Because InfluxDB v3 is a columnar database, it only processes the columns
+selected in a query, which can mitigate the query performance impact of wide schemas.
+If you [query only the data that you need](/influxdb/cloud-dedicated/query-data/troubleshoot-and-optimize/optimize-queries/#strategies-for-improving-query-performance),
+then a wide schema might not impact query performance.
+
+However, a non-specific query that retrieves a large number of columns from a
+wide schema
+is slower and less efficient than a more targeted query--for example, consider
+the following queries:
+
+- `SELECT time,a,b,c`
+- `SELECT *`
+
+If the table contains 10 columns, the difference in performance between the
+two queries is minimal.
+In a table with over 1000 columns, the `SELECT *` query is slower and
+less efficient.
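
The comparison above can be approximated with a toy columnar model in Python. This is a rough sketch, not how the InfluxDB v3 query engine works: `make_table` and `select` are hypothetical helpers, a table is modeled as a dictionary of column lists, and "cost" is simply the number of values read.

```python
from __future__ import annotations


def make_table(n_columns: int, n_rows: int) -> dict[str, list]:
    """Build a toy table: one `time` column plus (n_columns - 1) field columns."""
    table: dict[str, list] = {"time": list(range(n_rows))}
    for i in range(n_columns - 1):
        table[f"f{i}"] = [0.0] * n_rows
    return table


def select(table: dict[str, list], columns: list[str] | None = None):
    """Project columns and count the values read; `columns=None` models SELECT *."""
    names = list(table) if columns is None else columns
    projected = {name: table[name] for name in names}
    values_read = sum(len(values) for values in projected.values())
    return projected, values_read


targeted = ["time", "f0", "f1", "f2"]
for width in (10, 1000):
    table = make_table(width, n_rows=100)
    _, targeted_cost = select(table, targeted)
    _, star_cost = select(table)
    print(width, targeted_cost, star_cost)
# 10 400 1000
# 1000 400 100000
```

In this model, the targeted query reads the same number of values regardless of table width, while the cost of selecting every column grows with the number of columns.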

 #### Avoid too many tags

-In InfluxDB, the primary key for a row is the combination of the point's timestamp and _tag set_ - the collection of [tag keys](/influxdb/cloud-dedicated/reference/glossary/#tag-key) and [tag values](/influxdb/cloud-dedicated/reference/glossary/#tag-value) on the point.
-A point that contains more tags has a more complex primary key, which could impact sorting performance if you sort using all parts of the key.
+In InfluxDB, the primary key for a row is the combination of the point's
+timestamp and _tag set_ - the collection of [tag keys](/influxdb/cloud-dedicated/reference/glossary/#tag-key)
+and [tag values](/influxdb/cloud-dedicated/reference/glossary/#tag-value) on the point.
+A point that contains more tags has a more complex primary key, which could
+impact sorting performance if you sort using all parts of the key.

 ### Avoid sparse schemas

@@ -275,7 +315,8 @@ Without regular expressions, your queries will be easier to write and more perfo
 #### Not recommended {.orange}

-For example, consider the following [line protocol](/influxdb/cloud-dedicated/reference/syntax/line-protocol/) that embeds multiple attributes (location, model, and ID) into a `sensor` tag value:
+For example, consider the following [line protocol](/influxdb/cloud-dedicated/reference/syntax/line-protocol/)
+that embeds multiple attributes (location, model, and ID) into a `sensor` tag value:
A **bucket** is a named location where time series data is stored.
@@ -30,6 +34,8 @@ support InfluxQL and the InfluxDB v1 API `/write` and `/query` endpoints, which
 See how to [map v1 databases and retention policies to buckets](/influxdb/cloud-serverless/guides/api-compatibility/v1/#map-v1-databases-and-retention-policies-to-buckets).

 **If coming from InfluxDB v2 or InfluxDB Cloud**, _buckets_ are functionally equivalent.
+
+**If coming from InfluxDB Cloud Dedicated or InfluxDB Clustered**, _database_ and _bucket_ are synonymous.