content/shared/v3-distributed-admin-custom-partitions/_index.md
Lines changed: 15 additions & 15 deletions
Original file line number
Diff line number
Diff line change
@@ -1,4 +1,4 @@
1
-
When writing data to {{< product-name >}}, the InfluxDB v3 storage engine stores data in [Apache Parquet](https://parquet.apache.org/) format in the [Object store](/influxdb/cloud-dedicated/reference/internals/storage-engine/#object-store). Each Parquet file represents a _partition_--a logical grouping of data.
1
+
When writing data to {{< product-name >}}, the InfluxDB v3 storage engine stores data in [Apache Parquet](https://parquet.apache.org/) format in the [Object store](/influxdb/version/reference/internals/storage-engine/#object-store). Each Parquet file represents a _partition_--a logical grouping of data.
2
2
By default, InfluxDB partitions each table _by day_.
3
3
If this default strategy yields unsatisfactory performance for single-series queries,
4
4
you can define a custom partitioning strategy by specifying tag values and different time intervals to optimize query performance for your specific schema and workload.
@@ -20,8 +20,8 @@ you can define a custom partitioning strategy by specifying tag values and diffe
20
20
>
21
21
> Consider custom partitioning if:
22
22
>
23
-
> 1. You have taken steps to [optimize your queries](/influxdb/cloud-dedicated/query-data/troubleshoot-and-optimize/optimize-queries/), and
24
-
> 2. Performance for _single-series queries_ (querying for a specific [tag value](/influxdb/cloud-dedicated/reference/glossary/#tag-value) or [tag set](/influxdb/cloud-dedicated/reference/glossary/#tag-set)) is still unsatisfactory.
23
+
> 1. You have taken steps to [optimize your queries](/influxdb/version/query-data/troubleshoot-and-optimize/optimize-queries/), and
24
+
> 2. Performance for _single-series queries_ (querying for a specific [tag value](/influxdb/version/reference/glossary/#tag-value) or [tag set](/influxdb/version/reference/glossary/#tag-set)) is still unsatisfactory.
25
25
>
26
26
> Before choosing a partitioning strategy, weigh the [advantages](#advantages), [disadvantages](#disadvantages), and [limitations](#limitations) of custom partitioning.
27
27
@@ -40,21 +40,21 @@ storage structure to improve query performance specific to your schema and workl
40
40
## Disadvantages
41
41
42
42
Using custom partitioning may increase the load on other parts of the
but you can scale each part individually to address the added load.
45
45
46
46
{{% note %}}
47
47
_The weight of these disadvantages depends upon the cardinality of
48
48
tags and the specificity of time intervals used for partitioning._
49
49
{{% /note %}}
50
50
51
-
- **Increased load on the [Ingester](/influxdb/cloud-dedicated/reference/internals/storage-engine/#ingester)**
51
+
- **Increased load on the [Ingester](/influxdb/version/reference/internals/storage-engine/#ingester)**
52
52
as it groups data into smaller partitions and files.
53
-
- **Increased load on the [Catalog](/influxdb/cloud-dedicated/reference/internals/storage-engine/#catalog)**
53
+
- **Increased load on the [Catalog](/influxdb/version/reference/internals/storage-engine/#catalog)**
54
54
as more references to partition Parquet file locations are stored and queried.
55
-
- **Increased load on the [Compactor](/influxdb/cloud-dedicated/reference/internals/storage-engine/#compactor)**
55
+
- **Increased load on the [Compactor](/influxdb/version/reference/internals/storage-engine/#compactor)**
56
56
as it needs to compact more partition Parquet files.
57
-
- **Increased costs associated with [Object storage](/influxdb/cloud-dedicated/reference/internals/storage-engine/#object-storage)**
57
+
- **Increased costs associated with [Object storage](/influxdb/version/reference/internals/storage-engine/#object-storage)**
58
58
as more partition Parquet files are created and stored.
59
59
- **Increased latency**. The amount of time for InfluxDB to process a query and return results increases linearly, although slightly, with the total partition count for a table.
60
60
- **Risk of decreased performance for queries that don't use tags in the WHERE clause**.
@@ -74,10 +74,10 @@ After you have considered the [advantages](#advantages), [disadvantages](#disadv
74
74
custom partitioning, use the guides in this section to:
2. Follow [best practices](/influxdb/cloud-dedicated/admin/custom-partitions/best-practices/) for defining partitions and managing partition
77
+
2. Follow [best practices](/influxdb/version/admin/custom-partitions/best-practices/) for defining partitions and managing partition
78
78
growth
79
-
3. [Define custom partitions](/influxdb/cloud-dedicated/admin/custom-partitions/define-custom-partitions/) for your data
80
-
4. Take steps to [limit the number of partition files](/influxdb/cloud-dedicated/admin/custom-partitions/best-practices/#limit-the-number-of-partition-files)
79
+
3. [Define custom partitions](/influxdb/version/admin/custom-partitions/define-custom-partitions/) for your data
80
+
4. Take steps to [limit the number of partition files](/influxdb/version/admin/custom-partitions/best-practices/#limit-the-number-of-partition-files)
81
81
82
82
## How partitioning works
83
83
@@ -88,7 +88,7 @@ and determines the time interval that InfluxDB partitions data by.
88
88
Partition templates use tag values and
89
89
[Rust strftime date and time formatting syntax](https://docs.rs/chrono/latest/chrono/format/strftime/index.html).
90
90
91
-
_For more detailed information, see [Partition templates](/influxdb/cloud-dedicated/admin/custom-partitions/partition-templates/)._
91
+
_For more detailed information, see [Partition templates](/influxdb/version/admin/custom-partitions/partition-templates/)._
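
To make the idea of a partition template concrete, here is a minimal Python sketch of how a partition key might be derived from a tag value and a strftime-style time format. It is an illustration only: the function name `partition_key`, the `room` tag, and the `|` delimiter are assumptions, and Python's `strftime` stands in for the Rust syntax (the `%Y`, `%m`, and `%d` specifiers used here exist in both).

```python
from datetime import datetime, timezone


def partition_key(tags, timestamp, tag_parts, time_format="%Y-%m-%d", delimiter="|"):
    """Build an illustrative partition key from tag values and a time format.

    tag_parts lists the tag names to partition by (hypothetical names).
    A point that lacks one of those tags raises KeyError in this sketch.
    """
    parts = [tags[name] for name in tag_parts]
    parts.append(timestamp.strftime(time_format))
    return delimiter.join(parts)


point_time = datetime(2024, 3, 15, 12, 30, tzinfo=timezone.utc)

# Partition by the "room" tag and by day.
print(partition_key({"room": "kitchen"}, point_time, ["room"]))
# Partition by the "room" tag and by month.
print(partition_key({"room": "kitchen"}, point_time, ["room"], "%Y-%m"))
```

Points that share a key land in the same partition, so a coarser time format (month instead of day) yields fewer, larger partitions.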
content/shared/v3-distributed-admin-custom-partitions/best-practices.md
Lines changed: 17 additions & 26 deletions
Original file line number
Diff line number
Diff line change
@@ -21,7 +21,7 @@ query engine to more quickly identify what partitions contain the relevant data.
21
21
22
22
Partitioning using distinct values of tags with many (10K+) unique values can
23
23
actually hurt query performance as partitions are created for each unique tag value.
24
-
Instead, use [tag buckets](/influxdb/cloud-dedicated/admin/custom-partitions/partition-templates/#tag-bucket-part-templates)
24
+
Instead, use [tag buckets](/influxdb/version/admin/custom-partitions/partition-templates/#tag-bucket-part-templates)
25
25
to partition by high-cardinality tags.
26
26
This method of partitioning groups tag values into "buckets" and partitions by bucket.
27
27
{{% /note %}}
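
The bucketing idea described in the note can be sketched in a few lines of Python. This is a hedged illustration, not InfluxDB's implementation: the CRC32 hash, the `tag_bucket` function, and the `customer-…` tag values are all assumptions chosen to show how many distinct tag values collapse into a fixed number of buckets.

```python
import zlib


def tag_bucket(tag_value: str, bucket_count: int) -> int:
    """Map a tag value to a bucket index in the range [0, bucket_count).

    CRC32 is used here only because it is deterministic across runs;
    the hash that InfluxDB actually uses is not stated in this document.
    """
    return zlib.crc32(tag_value.encode("utf-8")) % bucket_count


# 10,000 hypothetical distinct tag values fall into at most 50 buckets,
# so the tag contributes at most 50 partitions per time interval.
values = [f"customer-{i}" for i in range(10_000)]
buckets = {tag_bucket(v, 50) for v in values}
print(len(buckets) <= 50)
```

Because the same tag value always maps to the same bucket, a query that filters on one tag value still needs to read only the partitions for a single bucket.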
@@ -33,46 +33,37 @@ If points don't have a value for the tag, InfluxDB can't store them in the corre
33
33
34
34
## Avoid over-partitioning
35
35
36
-
As you plan your partitioning strategy, keep in mind that data can be
37
-
"over-partitioned"--meaning partitions are so granular that queries end up
38
-
having to retrieve and read many partitions from the object store, which
39
-
hurts query performance.
36
+
As you plan your partitioning strategy, keep in mind that over-partitioning your data can hurt query performance. If partitions are too granular, queries may need to retrieve and read many partitions from the [Object store](/influxdb/version/reference/internals/storage-engine/#object-store).
40
37
41
-
- Balance the partition time interval with the actual amount of data written
42
-
during each interval. If a single interval doesn't contain a lot of data,
43
-
it is better to partition by larger time intervals.
44
-
- Don't partition by tags that you typically don't use in your query workload.
45
-
- Don't partition by distinct values of high-cardinality tags.
46
-
Instead, [use tag buckets](#use-tag-buckets-for-high-cardinality-tags) to
47
-
partition by these tags.
38
+
- Balance the partition time interval with the actual amount of data written during each interval. If a single interval doesn't contain a lot of data, partition by larger time intervals.
39
+
- Avoid partitioning by tags that you typically don't use in your query workload.
40
+
- Avoid partitioning by distinct values of high-cardinality tags. Instead, [use tag buckets](#use-tag-buckets-for-high-cardinality-tags) to partition by these tags.
48
41
49
42
## Limit the number of partition files
50
43
51
-
Avoid exceeding **10,000** total partition files.
44
+
Avoid exceeding **10,000** total partitions.
52
45
Limiting the total partition count can help manage system performance and costs.
53
46
54
-
While planning your strategy include the following steps to keep the total
55
-
partition count below 10,000 files over the next few years:
47
+
While planning your strategy, take the following steps to limit your total
48
+
partition count.
49
+
We currently recommend planning to keep the total partition count below 10,000.
56
50
57
51
- [Estimate the total partition count](#estimate-the-total-partition-count) for the lifespan of your data
58
-
- Take the following steps to limit the total partition count:
59
-
60
-
- **Set a [database retention period](/influxdb/cloud-dedicated/admin/databases/#retention-period)**
61
-
to prevent the number of files from growing unbounded.
62
-
- **Partition by month or year** to [avoid over-partitioning](#avoid-over-partitioning)
63
-
and creating too many partition files.
64
-
- **Don't partition on high cardinality tags** unless you also use [tag buckets](#use-tag-buckets-for-high-cardinality-tags)
52
+
- **Set a [database retention period](/influxdb/version/admin/databases/#retention-period)**
53
+
to prevent the number of partitions from growing unbounded
54
+
- **Partition by month or year** to [avoid over-partitioning](#avoid-over-partitioning)
55
+
- **Don't partition on high cardinality tags** unless you also use [tag buckets](#use-tag-buckets-for-high-cardinality-tags)
65
56
66
57
### Estimate the total partition count
67
58
68
-
Use the following formula to estimate the total partition file count over the
59
+
Use the following formula to estimate the total partition count over the
- `total_partition_count`: The number of partition files in [Object storage](/influxdb/cloud-dedicated/reference/internals/storage-engine/#object-storage)
66
+
- `total_partition_count`: The number of partition files in [Object storage](/influxdb/version/reference/internals/storage-engine/#object-storage)
76
67
- `cardinality_of_partitioned_tag`: The number of distinct values for a tag
77
-
- `data_lifespan`: The [database retention period](/influxdb/cloud-dedicated/admin/databases/#retention-period), if set, or the expected lifetime of the database
78
-
- `partition_duration`: The partition time interval, defined by the [tine part template](/influxdb/cloud-dedicated/admin/custom-partitions/partition-templates/#time-part-templates)
68
+
- `data_lifespan`: The [database retention period](/influxdb/version/admin/databases/#retention-period), if set, or the expected lifetime of the database
69
+
- `partition_duration`: The partition time interval, defined by the [time part template](/influxdb/version/admin/custom-partitions/partition-templates/#time-part-templates)
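
The formula itself falls outside the lines shown in this diff, so the following Python sketch rests on an assumption: that the estimate is the tag cardinality multiplied by the number of partition intervals in the data lifespan. The function name, the day-based units, and the sample numbers are hypothetical and only illustrate how the variables listed above might combine.

```python
import math


def estimate_total_partition_count(
    cardinality_of_partitioned_tag: int,
    data_lifespan_days: float,
    partition_duration_days: float,
) -> int:
    """Estimate partitions as tag cardinality times the number of intervals.

    Assumed relationship (not confirmed by the lines shown in this diff):
    total = cardinality * ceil(data_lifespan / partition_duration)
    """
    intervals = math.ceil(data_lifespan_days / partition_duration_days)
    return cardinality_of_partitioned_tag * intervals


# 20 distinct tag values, a 365-day retention period, daily partitions.
daily = estimate_total_partition_count(20, 365, 1)
# The same data partitioned by roughly month-long (30-day) intervals.
monthly = estimate_total_partition_count(20, 365, 30)

print(daily, daily < 10_000)      # 7300 True
print(monthly, monthly < 10_000)  # 260 True
```

Under this assumption, raising the tag cardinality to 50 with daily partitions gives 18,250, which exceeds the recommended 10,000 and suggests a longer partition interval or tag buckets.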