From 14c0ea1f55e439bc3a43a14c8d736f0f21cd3934 Mon Sep 17 00:00:00 2001 From: billy-the-fish Date: Wed, 10 Dec 2025 12:48:35 +0100 Subject: [PATCH 1/2] chore: add tiering workflow. --- .../data-tiering/about-data-tiering.md | 61 ++++++++++++++++--- 1 file changed, 51 insertions(+), 10 deletions(-) diff --git a/use-timescale/data-tiering/about-data-tiering.md b/use-timescale/data-tiering/about-data-tiering.md index f858d98539..b45a8eb667 100644 --- a/use-timescale/data-tiering/about-data-tiering.md +++ b/use-timescale/data-tiering/about-data-tiering.md @@ -14,7 +14,7 @@ import NotSupportedAzure from "versionContent/_partials/_not-supported-for-azure # About storage tiers -The tiered storage architecture in $CLOUD_LONG includes a high-performance storage tier and a low-cost object storage tier. You use the high-performance tier for data that requires quick access, and the object tier for rarely used historical data. Tiering policies move older data asynchronously and periodically from high-performance to low-cost storage, sparing you the need to do it manually. Chunks from a single hypertable, including compressed chunks, can stretch across these two storage tiers. +The tiered storage architecture in $CLOUD_LONG includes a high-performance storage tier and a low-cost object storage tier. You use the high-performance tier for data that requires quick access, and the object tier for rarely used historical data. Tiering policies move older data asynchronously and periodically from high-performance to low-cost storage, sparing you the need to do it manually. Chunks from a single $HYPERTABLE, including compressed chunks, can stretch across these two storage tiers. ![$CLOUD_LONG tiered storage](https://assets.timescale.com/docs/images/timescale-tiered-storage-architecture.png) @@ -33,7 +33,7 @@ $CLOUD_LONG high-performance storage comes in the following types: -Once you [enable tiered storage][manage-tiering], you can start moving rarely used data to the object tier. 
The object tier is based on AWS S3 and stores your data in the [Apache Parquet][parquet] format. Within a Parquet file, a set of rows is grouped together to form a row group. Within a row group, values for a single column across multiple rows are stored together. The original size of the data in your $SERVICE_SHORT, compressed or uncompressed, does not correspond directly to its size in S3. A compressed hypertable may even take more space in S3 than it does in $CLOUD_LONG. +Once you [enable tiered storage][manage-tiering], you can start moving rarely used data to the object tier. The object tier is based on AWS S3 and stores your data in the [Apache Parquet][parquet] format. Within a Parquet file, a set of rows is grouped together to form a row group. Within a row group, values for a single column across multiple rows are stored together. The original size of the data in your $SERVICE_SHORT, compressed or uncompressed, does not correspond directly to its size in S3. A compressed $HYPERTABLE may even take more space in S3 than it does in $CLOUD_LONG. @@ -89,19 +89,19 @@ The object storage tier is more than an archiving solution. It is also: - **Scalable:** scale past the restrictions of even the enhanced high-performance storage tier. - **Online:** your data is always there and can be [queried when needed][querying-tiered-data]. -By default, tiered data is not included when you query from a $SERVICE_LONG. To access tiered data, you [enable tiered reads][querying-tiered-data] for a query, a session, or even for all sessions. After you enable tiered reads, when you run regular SQL queries, a behind-the-scenes process transparently pulls data from wherever it's located: the standard high-performance storage tier, the object storage tier, or both. You can `JOIN` against tiered data, build views, and even define continuous aggregates on it. In fact, because the implementation of continuous aggregates also uses hypertables, they can be tiered to low-cost storage as well. 
+By default, tiered data is not included when you query from a $SERVICE_LONG. To access tiered data, you [enable tiered reads][querying-tiered-data] for a query, a session, or even for all sessions. After you enable tiered reads, when you run regular SQL queries, a behind-the-scenes process transparently pulls data from wherever it's located: the standard high-performance storage tier, the object storage tier, or both. You can `JOIN` against tiered data, build views, and even define continuous aggregates on it. In fact, because the implementation of continuous aggregates also uses $HYPERTABLEs, they can be tiered to low-cost storage as well. The low-cost storage tier comes with the following limitations: - **Limited schema modifications**: some schema modifications are not allowed - on hypertables with tiered chunks. + on $HYPERTABLEs with tiered chunks. - _Allowed_ modifications include: renaming the hypertable, adding columns - with `NULL` defaults, adding indexes, changing or renaming the hypertable + _Allowed_ modifications include: renaming the $HYPERTABLE, adding columns + with `NULL` defaults, adding indexes, changing or renaming the $HYPERTABLE schema, and adding `CHECK` constraints. For `CHECK` constraints, only untiered data is verified. Columns can also be deleted, but you cannot subsequently add a new column - to a tiered hypertable with the same name as the now-deleted column. + to a tiered $HYPERTABLE with the same name as the now-deleted column. _Disallowed_ modifications include: adding a column with non-`NULL` defaults, renaming a column, changing the data type of a @@ -121,16 +121,57 @@ The low-cost storage tier comes with the following limitations: execution time of queries in latency-sensitive environments, especially lighter queries. -* **Number of dimensions**: you cannot use tiered storage with hypertables - partitioned on more than one dimension. 
Make sure your hypertables are +* **Number of dimensions**: you cannot use tiered storage with $HYPERTABLEs + partitioned on more than one dimension. Make sure your $HYPERTABLEs are partitioned on time only, before you enable tiered storage. +## The tiered storage workflow + +The typical workflow for using tiered storage in $CLOUD_LONG is: + + + +1. **[Enable tiered storage][manage-tiering]** + + You enable tiered storage for each $SERVICE_SHORT individually. + +1. **[Tier your data][move-data]** + + Choose how to move data to the low-cost tier: + - **Automated tiering**: create an interval-based policy using + `add_tiering_policy()` to automatically tier chunks older than a + specified age. By default, policies run hourly and continuously manage + data placement. + - **Manual tiering**: identify specific chunks to tier by querying the + `timescaledb_information.chunks` view, then use the `tier_chunk()` + function to move individual chunks. The tiering process is asynchronous: + chunks are scheduled for migration and handled by background services. + +1. **[Query your data][querying-tiered-data]** + 1. To access data stored in the object tier, set `timescaledb.enable_tiered_reads = true` for your session, query, or + all future sessions. + + Without this setting, queries only access data in the high-performance tier. + 1. Run standard SQL queries against your $HYPERTABLEs. + + The query planner automatically determines which chunks to access across both storage + tiers based on your query filters. Chunk pruning, row group pruning, + and column pruning optimize query performance. + +1. **[Monitor and manage][monitor-data]** + + Track chunks that are scheduled for tiering using the `timescaledb_osm.chunks_queued_for_tiering` + view. Modify or remove tiering policies as needed using `alter_job` and + `remove_tiering_policy`.
+ + [blog-data-tiering]: https://www.timescale.com/blog/expanding-the-boundaries-of-postgresql-announcing-a-bottomless-consumption-based-object-storage-layer-built-on-amazon-s3/ [querying-tiered-data]: /use-timescale/:currentVersion:/data-tiering/querying-tiered-data/ [parquet]: https://parquet.apache.org/ -[manage-tiering]: /use-timescale/:currentVersion:/data-tiering/enabling-data-tiering/#enable-tiered-storage +[manage-tiering]: /use-timescale/:currentVersion:/data-tiering/enabling-data-tiering/#low-cost-object-storage-tier [move-data]: /use-timescale/:currentVersion:/data-tiering/enabling-data-tiering/#automate-tiering-with-policies +[monitor-data]: /use-timescale/:currentVersion:/data-tiering/enabling-data-tiering/#tier-chunks [hypercore]: /use-timescale/:currentVersion:/hypercore [aws-gp3]: https://docs.aws.amazon.com/ebs/latest/userguide/general-purpose.html [ebs-io2]: https://docs.aws.amazon.com/ebs/latest/userguide/provisioned-iops.html#io2-block-express From 810e8b938c56229071f4f625104d075b0eb9b548 Mon Sep 17 00:00:00 2001 From: billy-the-fish Date: Wed, 10 Dec 2025 13:00:28 +0100 Subject: [PATCH 2/2] chore: add tiering workflow. 
--- use-timescale/data-tiering/about-data-tiering.md | 2 +- use-timescale/data-tiering/index.md | 14 ++++++++------ 2 files changed, 9 insertions(+), 7 deletions(-) diff --git a/use-timescale/data-tiering/about-data-tiering.md b/use-timescale/data-tiering/about-data-tiering.md index b45a8eb667..d749c2a10a 100644 --- a/use-timescale/data-tiering/about-data-tiering.md +++ b/use-timescale/data-tiering/about-data-tiering.md @@ -127,7 +127,7 @@ The low-cost storage tier comes with the following limitations: ## The tiered storage workflow -The typical workflow for using tiered storage in $CLOUD_LONG is: +The typical workflow to use tiered storage in $CLOUD_LONG is: diff --git a/use-timescale/data-tiering/index.md b/use-timescale/data-tiering/index.md index 046e307f69..19fabaafbe 100644 --- a/use-timescale/data-tiering/index.md +++ b/use-timescale/data-tiering/index.md @@ -39,12 +39,13 @@ we do the work for you. -In this section, you: -* [Learn more about storage tiers][about-data-tiering]: understand how the tiers are built and how they differ. -* [Manage storage and tiering][enabling-data-tiering]: configure high-performance storage, object storage, and data tiering. -* [Query tiered data][querying-tiered-data]: query the data in the object storage. -* [Learn about replicas and forks with tiered data][replicas-and-forks]: understand how tiered storage works - with forks and replicas of your $SERVICE_SHORT. 
+In this section, you see: +* [How tiered storage works][about-data-tiering]: understand how the tiers are built and how they differ +* [The tiered storage workflow][data-tiering-workflow]: the steps to enable, manage, and query data in low-cost storage +* [Manage storage and tiering][enabling-data-tiering]: configure high-performance storage, object storage, and data tiering +* [Query tiered data][querying-tiered-data]: query the data in the object storage +* [Replicas and forks with tiered data][replicas-and-forks]: understand how tiered storage works with forks + and replicas of your $SERVICE_SHORT @@ -60,6 +61,7 @@ Coupled with other optimizations, $CLOUD_LONG high-performance storage makes sur [about-data-tiering]: /use-timescale/:currentVersion:/data-tiering/about-data-tiering/ +[data-tiering-workflow]: /use-timescale/:currentVersion:/data-tiering/about-data-tiering/#the-tiered-storage-workflow [enabling-data-tiering]: /use-timescale/:currentVersion:/data-tiering/enabling-data-tiering/ [replicas-and-forks]: /use-timescale/:currentVersion:/data-tiering/tiered-data-replicas-forks/ [creating-data-tiering-policy]: /use-timescale/:currentVersion:/data-tiering/enabling-data-tiering/#automate-tiering-with-policies