
Commit 6b28dda

feat(dwh): update
1 parent 9744fae commit 6b28dda

2 files changed: +75 additions, −67 deletions

pages/data-warehouse/index.mdx

Lines changed: 0 additions & 7 deletions
@@ -37,13 +37,6 @@ meta:
       url="/data-warehouse/how-to/"
     />
     <SummaryCard
-      title="How-tos"
-      icon="help-circle-outline"
-      description="Check our guides to creating, using, and managing Data Warehouse for ClickHouse® deployments and their features."
-      label="View How-tos"
-      url="/data-warehouse/how-to/"
-    />
-    <SummaryCard
       title="Additional Content"
       icon="book-open-outline"
       description="Go further with detailed, in-depth information on Data Warehouse for ClickHouse®."

pages/data-warehouse/quickstart.mdx

Lines changed: 75 additions & 60 deletions
@@ -85,93 +85,108 @@ You are now connected to your Data Warehouse for ClickHouse® deployment and can
 
 ## How to import and query an example data set
 
-### Creating a database and ingesting data
-
 <Message type="note">
-  This example is based on the **New York Taxi Data** from the [Official ClickHouse documentation](https://clickhouse.com/docs/getting-started/example-datasets/nyc-taxi).
+  This example is based on the **Environmental Sensors Data** from the [Official ClickHouse® documentation](https://clickhouse.com/docs/getting-started/example-datasets/environmental-sensors).
 </Message>
 
-1. Run the command below to create a new database:
-    ```sql
-    CREATE DATABASE nyc_taxi;
-    ```
+### Creating a database and ingesting data
 
-2. Create a new table in the database you just created:
+1. Run the command below to create a new table in the default database:
     ```sql
-    CREATE TABLE nyc_taxi.trips_small (
-        trip_id UInt32,
-        pickup_datetime DateTime,
-        dropoff_datetime DateTime,
-        pickup_longitude Nullable(Float64),
-        pickup_latitude Nullable(Float64),
-        dropoff_longitude Nullable(Float64),
-        dropoff_latitude Nullable(Float64),
-        passenger_count UInt8,
-        trip_distance Float32,
-        fare_amount Float32,
-        extra Float32,
-        tip_amount Float32,
-        tolls_amount Float32,
-        total_amount Float32,
-        payment_type Enum('CSH' = 1, 'CRE' = 2, 'NOC' = 3, 'DIS' = 4, 'UNK' = 5),
-        pickup_ntaname LowCardinality(String),
-        dropoff_ntaname LowCardinality(String)
+    CREATE TABLE sensors
+    (
+        sensor_id UInt16,
+        sensor_type Enum('BME280', 'BMP180', 'BMP280', 'DHT22', 'DS18B20', 'HPM', 'HTU21D', 'PMS1003', 'PMS3003', 'PMS5003', 'PMS6003', 'PMS7003', 'PPD42NS', 'SDS011'),
+        location UInt32,
+        lat Float32,
+        lon Float32,
+        timestamp DateTime,
+        P1 Float32,
+        P2 Float32,
+        P0 Float32,
+        durP1 Float32,
+        ratioP1 Float32,
+        durP2 Float32,
+        ratioP2 Float32,
+        pressure Float32,
+        altitude Float32,
+        pressure_sealevel Float32,
+        temperature Float32,
+        humidity Float32,
+        date Date MATERIALIZED toDate(timestamp)
     )
     ENGINE = MergeTree
-    PRIMARY KEY (pickup_datetime, dropoff_datetime);
+    ORDER BY (timestamp, sensor_id);
+
     ```
 
-3. Insert data from an Amazon S3 bucket
+2. Insert data in the table you just created:
     ```sql
-    INSERT INTO nyc_taxi.trips_small
-    SELECT
-        trip_id,
-        pickup_datetime,
-        dropoff_datetime,
-        pickup_longitude,
-        pickup_latitude,
-        dropoff_longitude,
-        dropoff_latitude,
-        passenger_count,
-        trip_distance,
-        fare_amount,
-        extra,
-        tip_amount,
-        tolls_amount,
-        total_amount,
-        payment_type,
-        pickup_ntaname,
-        dropoff_ntaname
-    FROM s3(
-        'https://datasets-documentation.s3.eu-west-3.amazonaws.com/nyc-taxi/trips_{0..2}.gz',
-        'TabSeparatedWithNames'
-    );
+    INSERT INTO sensors
+    SELECT *
+    FROM s3Cluster(
+        'default',
+        'https://datawarehouse-samples.s3.fr-par.scw.cloud/2019-06_bmp180.csv.zst',
+        'CSVWithNames',
+        $$ sensor_id UInt16,
+           sensor_type String,
+           location UInt32,
+           lat Float32,
+           lon Float32,
+           timestamp DateTime,
+           P1 Float32,
+           P2 Float32,
+           P0 Float32,
+           durP1 Float32,
+           ratioP1 Float32,
+           durP2 Float32,
+           ratioP2 Float32,
+           pressure Float32,
+           altitude Float32,
+           pressure_sealevel Float32,
+           temperature Float32,
+           humidity Float32 $$
+    )
+    SETTINGS
+        format_csv_delimiter = ';',
+        input_format_allow_errors_ratio = '0.5',
+        input_format_allow_errors_num = 10000,
+        input_format_parallel_parsing = 0,
+        date_time_input_format = 'best_effort',
+        max_insert_threads = 32,
+        parallel_distributed_insert_select = 1;
+
     ```
 
 ### Querying the database
 
 1. Run the command below to count the rows you inserted in the table:
     ```sql
     SELECT count()
-    FROM nyc_taxi.trips_small;
+    FROM sensors;
     ```
 
 2. Run the command below to list the 10 first rows of your table:
     ```sql
     SELECT *
-    FROM nyc_taxi.trips_small
+    FROM sensors
     LIMIT 10;
     ```
 
-3. Run the command below to display the 10 first neighborhoods with the most frequent pickups:
+3. Run the command below to display the storage used by the `sensors` table:
     ```sql
     SELECT
-        pickup_ntaname,
-        count(*) AS count
-    FROM nyc_taxi.trips_small WHERE pickup_ntaname != ''
-    GROUP BY pickup_ntaname
-    ORDER BY count DESC
-    LIMIT 10;
+        disk_name,
+        formatReadableSize(sum(data_compressed_bytes) AS size) AS compressed,
+        formatReadableSize(sum(data_uncompressed_bytes) AS usize) AS uncompressed,
+        round(usize / size, 2) AS compr_rate,
+        sum(rows) AS rows,
+        count() AS part_count
+    FROM system.parts
+    WHERE (active = 1) AND (table = 'sensors')
+    GROUP BY
+        disk_name
+    ORDER BY size DESC;
     ```
 
 To perform more in-depth tests with larger data sets, refer to our [dedicated documentation](/data-warehouse/reference-content/example-datasets/).
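Beyond the queries added in this commit, the new `sensors` table also lends itself to simple aggregations over the `date` column materialized from `timestamp`. The query below is a minimal illustrative sketch, not part of this commit, and assumes only the schema shown in the diff above (temperature readings from the sample BMP180 file):

```sql
-- Illustrative follow-up query (not part of this commit):
-- daily temperature summary per sensor type, using the MATERIALIZED `date` column.
SELECT
    date,
    sensor_type,
    round(avg(temperature), 2) AS avg_temperature,
    min(temperature) AS min_temperature,
    max(temperature) AS max_temperature
FROM sensors
GROUP BY
    date,
    sensor_type
ORDER BY
    date,
    sensor_type;
```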
