diff --git a/api/compression/compress_chunk.md b/api/compression/compress_chunk.md
index d964535992..9f67b98b96 100644
--- a/api/compression/compress_chunk.md
+++ b/api/compression/compress_chunk.md
@@ -51,11 +51,11 @@ SELECT compress_chunk('_timescaledb_internal._hyper_1_2_chunk');
## Optional arguments
-| Name | Type | Default | Required | Description |
-|----------------------|--|---------|--|----------------------------------------------------------------------------------------------------------------------------------------------------|
-| `chunk` | REGCLASS | - |✔| Name of the chunk to add to the $COLUMNSTORE. |
-| `if_not_columnstore` | BOOLEAN | `true` |✖| Set to `false` so this job fails with an error rather than a warning if `chunk` is already in the $COLUMNSTORE. |
-| `recompress` | BOOLEAN | `false` |✖| Set to true to recompress. In-memory recompression will be attempted first; otherwise it will fall back to internal decompress/compress. |
+| Name | Type | Default | Required | Description |
+|----------------------|--|---------|--|--------------------------------------------------------------------------------------------------------------------------------|
+| `chunk` | REGCLASS | - |✔| Name of the chunk to add to the $COLUMNSTORE. |
+| `if_not_columnstore` | BOOLEAN | `true` |✖| Set to `false` so this job fails with an error rather than a warning if `chunk` is already in the $COLUMNSTORE. |
+| `recompress` | BOOLEAN | `false` |✖| Set to `true` to recompress. In-memory recompression is attempted first; if that is not possible, it falls back to internal decompress/compress. |
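+
+For example, a quick sketch based on the arguments above, using the chunk shown in the earlier sample to force recompression:
+
+```sql
+SELECT compress_chunk('_timescaledb_internal._hyper_1_2_chunk', recompress => true);
+```
+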
## Returns
diff --git a/api/hypercore/convert_to_columnstore.md b/api/hypercore/convert_to_columnstore.md
index 916c0d9847..7856de202c 100644
--- a/api/hypercore/convert_to_columnstore.md
+++ b/api/hypercore/convert_to_columnstore.md
@@ -34,11 +34,11 @@ CALL convert_to_columnstore('_timescaledb_internal._hyper_1_2_chunk');
## Arguments
-| Name | Type | Default | Required | Description |
-|----------------------|--|---------|--|----------------------------------------------------------------------------------------------------------------------------------------------------|
-| `chunk` | REGCLASS | - |✔| Name of the chunk to add to the $COLUMNSTORE. |
-| `if_not_columnstore` | BOOLEAN | `true` |✖| Set to `false` so this job fails with an error rather than a warning if `chunk` is already in the $COLUMNSTORE. |
-| `recompress` | BOOLEAN | `false` |✖| Set to true to recompress. In-memory recompression will be attempted first; otherwise it will fall back to internal decompress/compress. |
+| Name | Type | Default | Required | Description |
+|----------------------|--|---------|--|---------------------------------------------------------------------------------------------------------------------------------|
+| `chunk` | REGCLASS | - |✔| Name of the chunk to add to the $COLUMNSTORE. |
+| `if_not_columnstore` | BOOLEAN | `true` |✖| Set to `false` so this job fails with an error rather than a warning if `chunk` is already in the $COLUMNSTORE. |
+| `recompress` | BOOLEAN | `false` |✖| Set to `true` to recompress. In-memory recompression is attempted first; if that is not possible, it falls back to internal decompress/compress. |
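+
+For example, a quick sketch based on the arguments above, using the chunk shown in the earlier sample to force recompression:
+
+```sql
+CALL convert_to_columnstore('_timescaledb_internal._hyper_1_2_chunk', recompress => true);
+```
+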
## Returns
diff --git a/use-timescale/write-data/index.md b/use-timescale/write-data/index.md
index 51b86efb3d..7b90313133 100644
--- a/use-timescale/write-data/index.md
+++ b/use-timescale/write-data/index.md
@@ -18,12 +18,12 @@ using `INSERT`, `UPDATE`, and `DELETE` statements.
* [Upsert data][upsert] into hypertables
* [Delete data][delete] from hypertables
-For more information about using third-party tools to write data
-into $TIMESCALE_DB, see the [Ingest data from other sources][ingest-data] section.
+To find out how to add and sync data to your $SERVICE_SHORT from other sources, see
+[Import and sync][ingest-data].
[about-writing-data]: /use-timescale/:currentVersion:/write-data/about-writing-data/
[delete]: /use-timescale/:currentVersion:/write-data/delete/
-[ingest-data]: /use-timescale/:currentVersion:/ingest-data/
+[ingest-data]: /migrate/:currentVersion:
[insert]: /use-timescale/:currentVersion:/write-data/insert/
[update]: /use-timescale/:currentVersion:/write-data/update/
[upsert]: /use-timescale/:currentVersion:/write-data/upsert/
diff --git a/use-timescale/write-data/insert.md b/use-timescale/write-data/insert.md
index 3d79f95444..220d88e031 100644
--- a/use-timescale/write-data/insert.md
+++ b/use-timescale/write-data/insert.md
@@ -1,22 +1,25 @@
---
title: Insert data
-excerpt: Insert single and multiple rows and return data in TimescaleDB with SQL
+excerpt: Insert single and multiple rows and bulk load data into TimescaleDB with SQL
products: [cloud, mst, self_hosted]
-keywords: [ingest]
-tags: [insert, write, hypertables]
+keywords: [ingest, bulk load]
+tags: [insert, write, hypertables, copy]
---
import EarlyAccess2230 from "versionContent/_partials/_early_access_2_23_0.mdx";
# Insert data
-Insert data into a hypertable with a standard [`INSERT`][postgres-insert] SQL
-command.
+You insert data into a $HYPERTABLE using the following standard SQL commands:
+
+- [`INSERT`][postgres-insert]: single rows or small batches
+- [`COPY`][postgres-copy]: bulk data loading
+
+To improve performance, insert time series data directly into the $COLUMNSTORE using [direct compress][direct-compress].
## Insert a single row
-To insert a single row into a hypertable, use the syntax `INSERT INTO ...
-VALUES`. For example, to insert data into a hypertable named `conditions`:
+To insert a single row into a $HYPERTABLE, use the syntax `INSERT INTO ... VALUES`:
```sql
INSERT INTO conditions(time, location, temperature, humidity)
@@ -25,11 +28,11 @@ INSERT INTO conditions(time, location, temperature, humidity)
## Insert multiple rows
-You can also insert multiple rows into a hypertable using a single `INSERT`
-call. This works even for thousands of rows at a time. This is more efficient
-than inserting data row-by-row, and is recommended when possible.
+A more efficient method than inserting data row-by-row is to insert multiple rows into a $HYPERTABLE using a single
+`INSERT` call. This works even for thousands of rows at a time. $TIMESCALE_DB batches the rows by chunk, then writes to
+each chunk in a single transaction.
-Use the same syntax, separating rows with a comma:
+You use the same syntax, separating rows with a comma:
```sql
INSERT INTO conditions
@@ -39,18 +42,13 @@ INSERT INTO conditions
(NOW(), 'garage', 77.0, 65.2);
```
-
-
-You can insert multiple rows belonging to different
-chunks within the same `INSERT` statement. Behind the scenes, $TIMESCALE_DB batches the rows by chunk, and writes to each chunk in a single
-transaction.
-
-
+If you `INSERT` unsorted data, call [`convert_to_columnstore`][convert_to_columnstore] with `recompress => true`
+on the $CHUNK to reorder and optimize your data.
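+
+For example, assuming the new rows landed in a $CHUNK named `_timescaledb_internal._hyper_1_2_chunk`:
+
+```sql
+CALL convert_to_columnstore('_timescaledb_internal._hyper_1_2_chunk', recompress => true);
+```
+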
## Insert and return data
-In the same `INSERT` command, you can return some or all of the inserted data by
-adding a `RETURNING` clause. For example, to return all the inserted data, run:
+You can return some or all of the inserted data by adding a `RETURNING` clause to the `INSERT` command. For example,
+to return all the inserted data, run:
```sql
INSERT INTO conditions
@@ -67,29 +65,96 @@ time | location | temperature | humidity
(1 row)
```
-## Direct compress on INSERT
-This columnar format enables fast scanning and
-aggregation, optimizing performance for analytical workloads while also saving significant storage space. In the
-$COLUMNSTORE conversion, $HYPERTABLE chunks are compressed by up to 98%, and organized for efficient, large-scale
-queries.
+## Bulk insert with COPY
-To improve performance, you can compress data during `INSERT` so that it is injected directly into chunks
-in the $COLUMNSTORE rather than waiting for the policy.
+The `COPY` command is the most efficient way to load large amounts of data into a $HYPERTABLE. For
+bulk data loading, `COPY` can be 2-3x faster than `INSERT`, or even more, especially when combined with
+[direct compress][direct-compress].
-To enable direct compress on INSERT, enable the following [GUC parameters][gucs]:
+`COPY` supports loading from:
-```sql
-SET timescaledb.enable_compressed_insert = true;
-SET timescaledb.enable_compressed_insert_sort_batches = true;
-SET timescaledb.enable_compressed_insert_client_sorted = true;
-```
+- **CSV files**:
+
+ ```sql
+ COPY conditions(time, location, temperature, humidity)
+ FROM '/path/to/data.csv'
+ WITH (FORMAT CSV, HEADER);
+ ```
-When you set `enable_compressed_insert_client_sorted` to `true`, you must ensure that data in the input
-stream is sorted.
+- **Standard input**
+
+ To load data from your application or script using standard input:
+
+ ```sql
+ COPY conditions(time, location, temperature, humidity)
+ FROM STDIN
+ WITH (FORMAT CSV);
+ ```
+
+ To signal the end of input, add `\.` on a new line.
+
+- **Program output**
+
+ To load data generated by a program or script:
+
+ ```sql
+ COPY conditions(time, location, temperature, humidity)
+ FROM PROGRAM 'generate_data.sh'
+ WITH (FORMAT CSV);
+ ```
+
+If you `COPY` unsorted data, call [`convert_to_columnstore`][convert_to_columnstore] with `recompress => true`
+on the $CHUNK to reorder and optimize your data.
+
+## Improve performance with direct compress
+The columnar format in the $COLUMNSTORE enables fast scanning and aggregation, optimizing performance for
+analytical workloads while also saving significant storage space. In the $COLUMNSTORE conversion, $HYPERTABLE chunks are
+compressed by up to 98%, and organized for efficient, large-scale queries.
+
+To improve performance, compress data during `INSERT` and `COPY` operations so that it is injected
+directly into chunks in the $COLUMNSTORE rather than waiting for the $COLUMNSTORE policy. Direct compress writes data
+in the compressed format in memory, significantly reducing I/O and improving ingestion performance.
+
+When you enable direct compress, ensure that your data is already sorted by the table's compression `order_by` columns.
+Incorrectly sorted data results in poor compression and query performance.
+
+- **Enable direct compress on `INSERT`**
+
+ Set the following [GUC parameters][gucs]:
+ ```sql
+ SET timescaledb.enable_direct_compress_insert = true;
+ SET timescaledb.enable_direct_compress_insert_client_sorted = true;
+ ```
+
+- **Enable direct compress on `COPY`**
+
+  Set the following [GUC parameters][gucs]:
+
+ ```sql
+ SET timescaledb.enable_direct_compress_copy = true;
+ SET timescaledb.enable_direct_compress_copy_client_sorted = true;
+ ```
+
+Keep the following in mind when you use direct compress:
+
+- **Optimal batch size**: you get the best results with batches of 1,000 to 10,000 records
+- **Cardinality**: high-cardinality datasets do not compress well and may degrade query performance
+- **Batch format**: the $COLUMNSTORE is optimized for 1,000 records per batch per segment
+- **WAL efficiency**: compressed batches are written to the WAL, rather than individual tuples
+- **Continuous aggregates**: not supported with direct compress
+- **Unique constraints**: tables with unique constraints cannot use direct compress
+
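+For example, a minimal sketch that combines direct compress with `COPY`, assuming `conditions` is compressed with
+`order_by` on `time` and the rows in the file are already sorted by `time`:
+
+```sql
+-- Enable direct compress for COPY and declare that the input stream is pre-sorted
+SET timescaledb.enable_direct_compress_copy = true;
+SET timescaledb.enable_direct_compress_copy_client_sorted = true;
+
+-- Bulk load the pre-sorted data; batches are compressed in memory and written to the columnstore
+COPY conditions(time, location, temperature, humidity)
+FROM '/path/to/sorted_data.csv'
+WITH (FORMAT CSV, HEADER);
+```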
+
+
+
+[postgres-insert]: https://www.postgresql.org/docs/current/sql-insert.html
+[postgres-copy]: https://www.postgresql.org/docs/current/sql-copy.html
+[upsert]: /use-timescale/:currentVersion:/write-data/upsert/
+[gucs]: /api/:currentVersion:/configuration/gucs/
[postgres-update]: https://www.postgresql.org/docs/current/sql-update.html
[hypertable-create-table]: /api/:currentVersion:/hypertable/create_table/
[add_columnstore_policy]: /api/:currentVersion:/hypercore/add_columnstore_policy/
@@ -97,6 +162,4 @@ stream is sorted.
[create_table_arguments]: /api/:currentVersion:/hypertable/create_table/#arguments
[alter_job_samples]: /api/:currentVersion:/jobs-automation/alter_job/#samples
[convert_to_columnstore]: /api/:currentVersion:/hypercore/convert_to_columnstore/
-[gucs]: /api/:currentVersion:/configuration/gucs/
-
-[postgres-insert]: https://www.postgresql.org/docs/current/sql-insert.html
+[direct-compress]: /use-timescale/:currentVersion:/write-data/insert/#improve-performance-with-direct-compress
diff --git a/use-timescale/write-data/upsert.md b/use-timescale/write-data/upsert.md
index 70ca96dbab..5bffab3f27 100644
--- a/use-timescale/write-data/upsert.md
+++ b/use-timescale/write-data/upsert.md
@@ -2,77 +2,57 @@
title: Upsert data
excerpt: Insert a new row or update an existing row in a hypertable using UPSERT
products: [cloud, mst, self_hosted]
-keywords: [upsert, hypertables]
+keywords: [upsert, hypertables, bulk load, copy]
+tags: [insert, write, unique constraints]
---
# Upsert data
-Upserting is an operation that performs both:
+Upserting is an operation that adds data to your database as follows:
-* Inserting a new row if a matching row doesn't already exist
-* Either updating the existing row, or doing nothing, if a matching row
- already exists
+* **A matching row does not exist**: the new row is inserted
+* **A matching row exists**: the existing row is either updated, or left unchanged
-Upserts only work when you have a unique index or constraint. A matching row is
-one that has identical values for the columns covered by the index or
-constraint.
+## Upsert, unique indexes, and constraints
-
-
-In $PG, a primary key is a unique index with a `NOT NULL` constraint.
+Upserts work when you have a unique index or constraint. A matching row is one that has identical values for the columns
+covered by the index or constraint. In $PG, a primary key is a unique index with a `NOT NULL` constraint.
If you have a primary key, you automatically have a unique index.
-
-
-## Create a table with a unique constraint
-
-The examples in this section use a `conditions` table with a unique constraint
-on the columns `(time, location)`. To create a unique constraint, use `UNIQUE
-()` while defining your table:
-
-```sql
-CREATE TABLE conditions (
- time TIMESTAMPTZ NOT NULL,
- location TEXT NOT NULL,
- temperature DOUBLE PRECISION NULL,
- humidity DOUBLE PRECISION NULL,
- UNIQUE (time, location)
-);
-```
+Unique constraints must include all partitioning columns. That means unique
+constraints on a $HYPERTABLE must include the time column. If you added other
+partitioning columns to your $HYPERTABLE, the constraint must include those as
+well. For more information, see [Enforce constraints with unique indexes][hypertables-and-unique-indexes].
-You can also create a unique constraint after the table is created. Use the
-syntax `ALTER TABLE ... ADD CONSTRAINT ... UNIQUE`. In this example, the
-constraint is named `conditions_time_location`:
-```sql
-ALTER TABLE conditions
- ADD CONSTRAINT conditions_time_location
- UNIQUE (time, location);
-```
+The examples on this page use a `conditions` table with a unique constraint
+on the columns `(time, location)`. To create a unique constraint, either:
-When you add a unique constraint to a table, you can't insert data that violates
-the constraint. In other words, if you try to insert data that has identical
-values to another row, within the columns covered by the constraint, you get an
-error.
+- Use `UNIQUE ()` when you define your table:
-
-
-Unique constraints must include all partitioning columns. That means unique
-constraints on a hypertable must include the time column. If you added other
-partitioning columns to your hypertable, the constraint must include those as
-well. For more information, see the section on
-[hypertables and unique indexes](/use-timescale/latest/hypertables/hypertables-and-unique-indexes/).
+ ```sql
+ CREATE TABLE conditions (
+ time TIMESTAMPTZ NOT NULL,
+ location TEXT NOT NULL,
+ temperature DOUBLE PRECISION NULL,
+ humidity DOUBLE PRECISION NULL,
+ UNIQUE (time, location)
+ );
+ ```
-
+- Use `ALTER TABLE` after the table is created:
-## Insert or update data to a table with a unique constraint
+ ```sql
+ ALTER TABLE conditions
+ ADD CONSTRAINT conditions_time_location
+ UNIQUE (time, location);
+ ```
-You can tell the database to insert new data if it doesn't violate the
-constraint, and to update the existing row if it does. Use the syntax `INSERT
-INTO ... VALUES ... ON CONFLICT ... DO UPDATE`.
+## Insert or update data
-For example, to update the `temperature` and `humidity` values if a row with the
-specified `time` and `location` already exists, run:
+To insert new data that doesn't violate the constraint, and to update the existing row if it does, use the syntax
+`INSERT INTO ... VALUES ... ON CONFLICT ... DO UPDATE`. For example, to update the `temperature` and `humidity` values
+if a row with the specified `time` and `location` already exists, run:
```sql
INSERT INTO conditions
@@ -82,12 +62,11 @@ INSERT INTO conditions
humidity = excluded.humidity;
```
-## Insert or do nothing to a table with a unique constraint
+## Insert or do nothing
-You can also tell the database to do nothing if the constraint is violated. The
-new data is not inserted, and the old row is not updated. This is useful when
-writing many rows as one batch, to prevent the entire transaction from failing.
-The database engine skips the row and moves on.
+You can also tell the database to do nothing if the constraint is violated: the new data is not inserted, the old row
+is not updated, and the database engine skips the row and moves on. This is useful to prevent the entire transaction
+from failing when writing many rows as one batch.
To insert or do nothing, use the syntax `INSERT INTO ... VALUES ... ON CONFLICT
DO NOTHING`:
@@ -98,4 +77,47 @@ INSERT INTO conditions
ON CONFLICT DO NOTHING;
```
+## Bulk upsert using COPY
+
+When you need to upsert large amounts of data, [`COPY`][postgres-copy] is significantly faster than `INSERT`. However,
+`COPY` doesn't support `ON CONFLICT` clauses directly, so best practice is to bulk load into a staging table, then
+upsert from it. This approach combines the speed of `COPY` for bulk loading with the flexibility of
+`INSERT ... ON CONFLICT` for upsert logic. For large datasets, it is much faster than using `INSERT ... ON CONFLICT` directly.
+
+To load data efficiently with `COPY`, then upsert:
+
+
+
+1. **Create a staging table with the same structure as the destination table**
+ ```sql
+ CREATE TEMP TABLE conditions_staging (LIKE conditions);
+ ```
+
+1. **Use `COPY` to bulk load data into the staging table**
+ ```sql
+ COPY conditions_staging(time, location, temperature, humidity)
+ FROM '/path/to/data.csv'
+ WITH (FORMAT CSV, HEADER);
+ ```
+
+1. **Upsert from the staging table to the destination table**
+ ```sql
+ INSERT INTO conditions
+ SELECT * FROM conditions_staging
+ ON CONFLICT (time, location) DO UPDATE
+ SET temperature = EXCLUDED.temperature,
+ humidity = EXCLUDED.humidity;
+ ```
+ To skip duplicate rows, set `ON CONFLICT (time, location) DO NOTHING`.
+
+1. **Clean up the staging table**
+ ```sql
+ DROP TABLE conditions_staging;
+ ```
+
+
+
+
[postgres-upsert]: https://www.postgresql.org/docs/current/static/sql-insert.html#SQL-ON-CONFLICT
+[postgres-copy]: https://www.postgresql.org/docs/current/sql-copy.html
+[hypertables-and-unique-indexes]: /use-timescale/:currentVersion:/hypertables/hypertables-and-unique-indexes/