Ingest data

There are several different ways of ingesting your data into $MST_LONG. This section contains instructions to:

Bulk upload from a .csv file
Insert data directly using a client driver, such as JDBC, ODBC, or Node.js
Insert data directly using a message queue, such as Kafka

Before you begin, make sure you have created your $MST_SERVICE_LONG, and can connect to it using psql.

Preparing your new database

Use psql to connect to your $MST_SERVICE_SHORT.
```
psql -h <HOSTNAME> -p <PORT> -U <USERNAME> -W -d <DATABASE_NAME>
```
You retrieve the $MST_SERVICE_SHORT URL, port, and login credentials from the $MST_SERVICE_SHORT overview in the $MST_SHORT dashboard.
Create a new database for your data. In this example, the new database is called new_db:
```
CREATE DATABASE new_db;
\c new_db;
```
Create a new SQL table in your database. The columns you create for the table must match the columns in your source data. In this example, the table is storing weather condition data, and has columns for the timestamp, location, and temperature:
```
CREATE TABLE conditions (
  time        TIMESTAMPTZ         NOT NULL,
  location    text                NOT NULL,
  temperature DOUBLE PRECISION    NULL
);
```
Load the timescaledb $PG extension:
```
CREATE EXTENSION timescaledb;
\dx
```
Convert the SQL table into a hypertable:
```
SELECT create_hypertable('conditions', by_range('time'));
```
The `by_range` dimension builder is an addition to $TIMESCALE_DB 2.13.

When you have successfully set up your new database, you can ingest data using one of these methods.

Bulk upload from CSV files

If you have a dataset stored in a .csv file, you can import it into an empty hypertable. You need to begin by creating the new table, before you import the data.

Before you begin, make sure you have prepared your new database.

Bulk uploading from a CSV file

Insert data into the new hypertable using the timescaledb-parallel-copy tool. You should already have the tool installed, but you can install it manually from our GitHub repository if you need to. In this example, we are inserting the data using four workers:
```
timescaledb-parallel-copy \
--connection '<service_url>' \
--table conditions \
--file ~/Downloads/example.csv \
--workers 4 \
--copy-options "CSV" \
--skip-header
```
We recommend that you set the number of workers lower than the number of available CPU cores on your client machine or server, to prevent the workers having to compete for resources. This helps your ingest go faster.
OPTIONAL: If you don't want to use the timescaledb-parallel-copy tool, or if you have a very small dataset, you can use the $PG COPY command instead:
```
psql '<service_url>/new_db?sslmode=require' -c "\copy conditions FROM <example.csv> WITH (FORMAT CSV, HEADER)"
```

Insert data directly using a client driver

You can use a client driver such as JDBC, Python, or Node.js, to insert data directly into your new database.

See the $PG instructions for using the ODBC driver.

See the Code Quick Starts for using various languages, including Python and node.js.

Insert data directly using a message queue

If you have data stored in a message queue, you can import it into your $MST_SERVICE_SHORT. This section provides instructions on using the Kafka Connect $PG connector.

This connector deploys $PG change events from Kafka Connect to a runtime service. It monitors one or more schemas in a $MST_SERVICE_LONG, and writes all change events to Kafka topics, which can then be independently consumed by one or more clients. Kafka Connect can be distributed to provide fault tolerance, which ensures the connectors are running and continually keeping up with changes in the database.

You can also use the $PG connector as a library without Kafka or Kafka Connect. This allows applications and services to directly connect to $MST_SHORT and obtain the ordered change events. In this environment, the application must record the progress of the connector so that when it is restarted, the connect can continue where it left off. This approach can be useful for less critical use cases. However, for production use cases, we recommend that you use the connector with Kafka and Kafka Connect.

See these instructions for using the Kafka connector.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ingest data

Preparing your new database

Bulk upload from CSV files

Bulk uploading from a CSV file

Insert data directly using a client driver

Insert data directly using a message queue

FilesExpand file tree

ingest-data.md

Latest commit

History

ingest-data.md

File metadata and controls

Ingest data

Preparing your new database

Bulk upload from CSV files

Bulk uploading from a CSV file

Insert data directly using a client driver

Insert data directly using a message queue