| title | Ingest data | |||||
|---|---|---|---|---|---|---|
| excerpt | Ingest data into your services running in Managed Service for TimescaleDB | |||||
| products |
|
|||||
| keywords |
|
|||||
| tags |
|
There are several different ways of ingesting your data into $MST_LONG. This section contains instructions to:
- Bulk upload from a
.csvfile - Insert data directly using a client driver, such as JDBC, ODBC, or Node.js
- Insert data directly using a message queue, such as Kafka
Before you begin, make sure you have
created your $MST_SERVICE_LONG,
and can connect to it using psql.
-
Use
psqlto connect to your $MST_SERVICE_SHORT.psql -h <HOSTNAME> -p <PORT> -U <USERNAME> -W -d <DATABASE_NAME>
You retrieve the $MST_SERVICE_SHORT URL, port, and login credentials from the $MST_SERVICE_SHORT overview in the $MST_SHORT dashboard.
-
Create a new database for your data. In this example, the new database is called
new_db:CREATE DATABASE new_db; \c new_db;
-
Create a new SQL table in your database. The columns you create for the table must match the columns in your source data. In this example, the table is storing weather condition data, and has columns for the timestamp, location, and temperature:
CREATE TABLE conditions ( time TIMESTAMPTZ NOT NULL, location text NOT NULL, temperature DOUBLE PRECISION NULL );
-
Load the
timescaledb$PG extension:CREATE EXTENSION timescaledb; \dx
-
Convert the SQL table into a hypertable:
The `by_range` dimension builder is an addition to $TIMESCALE_DB 2.13.SELECT create_hypertable('conditions', by_range('time'));
When you have successfully set up your new database, you can ingest data using one of these methods.
If you have a dataset stored in a .csv file, you can import it into an empty
hypertable. You need to begin by creating the new table, before you
import the data.
Before you begin, make sure you have prepared your new database.
-
Insert data into the new hypertable using the
timescaledb-parallel-copytool. You should already have the tool installed, but you can install it manually from our GitHub repository if you need to. In this example, we are inserting the data using four workers:timescaledb-parallel-copy \ --connection '<service_url>' \ --table conditions \ --file ~/Downloads/example.csv \ --workers 4 \ --copy-options "CSV" \ --skip-header
We recommend that you set the number of workers lower than the number of available CPU cores on your client machine or server, to prevent the workers having to compete for resources. This helps your ingest go faster.
-
OPTIONAL: If you don't want to use the
timescaledb-parallel-copytool, or if you have a very small dataset, you can use the $PGCOPYcommand instead:psql '<service_url>/new_db?sslmode=require' -c "\copy conditions FROM <example.csv> WITH (FORMAT CSV, HEADER)"
You can use a client driver such as JDBC, Python, or Node.js, to insert data directly into your new database.
See the $PG instructions for using the ODBC driver.
See the Code Quick Starts for using various languages, including Python and node.js.
If you have data stored in a message queue, you can import it into your $MST_SERVICE_SHORT. This section provides instructions on using the Kafka Connect $PG connector.
This connector deploys $PG change events from Kafka Connect to a runtime service. It monitors one or more schemas in a $MST_SERVICE_LONG, and writes all change events to Kafka topics, which can then be independently consumed by one or more clients. Kafka Connect can be distributed to provide fault tolerance, which ensures the connectors are running and continually keeping up with changes in the database.
You can also use the $PG connector as a library without Kafka or Kafka Connect. This allows applications and services to directly connect to $MST_SHORT and obtain the ordered change events. In this environment, the application must record the progress of the connector so that when it is restarted, the connect can continue where it left off. This approach can be useful for less critical use cases. However, for production use cases, we recommend that you use the connector with Kafka and Kafka Connect.
See these instructions for using the Kafka connector.