docs/guides/creating-tables.md
12 additions & 12 deletions (+12 -12)
@@ -42,23 +42,23 @@ In the example above, `my_first_table` is a `MergeTree` table with four columns:
 There are many engines to choose from, but for a simple table on a single-node ClickHouse server, [MergeTree](/engines/table-engines/mergetree-family/mergetree.md) is your likely choice.
 :::

-## A Brief Intro to Primary Keys
+## A Brief Intro to Primary Keys {#a-brief-intro-to-primary-keys}

-Before you go any further, it is important to understand how primary keys work in ClickHouse (the implementation
-of primary keys might seem unexpected!):
+Before you go any further, it is important to understand how primary keys work in ClickHouse (the implementation
+of primary keys might seem unexpected!):

-- primary keys in ClickHouse are **_not unique_** for each row in a table
+- primary keys in ClickHouse are **_not unique_** for each row in a table

-The primary key of a ClickHouse table determines how the data is sorted when written to disk. Every 8,192 rows or 10MB of
-data (referred to as the **index granularity**) creates an entry in the primary key index file. This granularity concept
-creates a **sparse index** that can easily fit in memory, and the granules represent a stripe of the smallest amount of
-column data that gets processed during `SELECT` queries.
+The primary key of a ClickHouse table determines how the data is sorted when written to disk. Every 8,192 rows or 10MB of
+data (referred to as the **index granularity**) creates an entry in the primary key index file. This granularity concept
+creates a **sparse index** that can easily fit in memory, and the granules represent a stripe of the smallest amount of
+column data that gets processed during `SELECT` queries.

-The primary key can be defined using the `PRIMARY KEY` parameter. If you define a table without a `PRIMARY KEY` specified,
-then the key becomes the tuple specified in the `ORDER BY` clause. If you specify both a `PRIMARY KEY` and an `ORDER BY`, the primary key must be a prefix of the sort order.
+The primary key can be defined using the `PRIMARY KEY` parameter. If you define a table without a `PRIMARY KEY` specified,
+then the key becomes the tuple specified in the `ORDER BY` clause. If you specify both a `PRIMARY KEY` and an `ORDER BY`, the primary key must be a prefix of the sort order.

-The primary key is also the sorting key, which is a tuple of `(user_id, timestamp)`. Therefore, the data stored in each
-column file will be sorted by `user_id`, then `timestamp`.
+The primary key is also the sorting key, which is a tuple of `(user_id, timestamp)`. Therefore, the data stored in each
+column file will be sorted by `user_id`, then `timestamp`.

 :::tip
 For more details, check out the [Modeling Data training module](https://learn.clickhouse.com/visitor_catalog_class/show/1328860/?utm_source=clickhouse&utm_medium=docs) in ClickHouse Academy.
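
The primary-key passage in this hunk describes a sparse index with one entry per granule, plus the rule that a `PRIMARY KEY` must be a prefix of the `ORDER BY`. The sketch below is only a toy model of those two ideas, not ClickHouse's implementation. It assumes a single-column key, counts granules by rows only (the 10MB limit is ignored), and every function and variable name in it is hypothetical.

```python
from bisect import bisect_left, bisect_right

GRANULE_ROWS = 8192  # rows-per-granule figure quoted in the text; the 10MB limit is ignored here


def build_sparse_index(sorted_keys, granule_rows=GRANULE_ROWS):
    """Keep one mark per granule: the key of the granule's first row."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule_rows)]


def granules_to_scan(marks, key):
    """Return the granule numbers that could contain `key`.

    Granule g holds keys between marks[g] and marks[g + 1] (inclusive): rows are
    sorted and keys are not unique, so one key can straddle a granule boundary.
    """
    first = max(bisect_left(marks, key) - 1, 0)
    last = bisect_right(marks, key) - 1  # -1 when key sorts before every mark
    return range(first, last + 1)


def is_valid_primary_key(primary_key, order_by):
    """When both are given, PRIMARY KEY must be a prefix of ORDER BY."""
    return tuple(order_by[:len(primary_key)]) == tuple(primary_key)


# Hypothetical data: 100,000 rows already sorted by user_id, 1,000 rows per user.
user_ids = [row // 1000 for row in range(100_000)]
marks = build_sparse_index(user_ids)

print(len(marks))                         # 13 marks describe 100,000 rows
print(list(granules_to_scan(marks, 42)))  # [5]: one granule is read
print(list(granules_to_scan(marks, 49)))  # [5, 6]: user 49 straddles a boundary
print(is_valid_primary_key(("user_id",), ("user_id", "timestamp")))    # True
print(is_valid_primary_key(("timestamp",), ("user_id", "timestamp")))  # False
```

The index stays small because it stores one key per granule rather than one per row, and a lookup reads whole granules, which is why the granule is the smallest unit of data processed.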
docs/tutorial.md
7 additions & 7 deletions (+7 -7)
@@ -8,15 +8,15 @@ import SQLConsoleDetail from '@site/docs/_snippets/_launch_sql_console.md';
 # Advanced Tutorial

-## What to Expect from This Tutorial?
+## What to Expect from This Tutorial? {#what-to-expect-from-this-tutorial}

 In this tutorial, you will create a table and insert a large dataset (two million rows of the [New York taxi data](/getting-started/example-datasets/nyc-taxi.md)). Then you will run queries on the dataset, including an example of how to create a dictionary and use it to perform a JOIN.

 :::note
 This tutorial assumes you have access to a running ClickHouse service. If not, check out the [Quick Start](./quick-start.mdx).
 :::

-## 1. Create a New Table
+## 1. Create a New Table {#1-create-a-new-table}

 The New York City taxi data contains the details of millions of taxi rides, with columns like pickup and drop-off times and locations, cost, tip amount, tolls, payment type and so on. Let's create a table to store this data...
@@ -81,7 +81,7 @@ The New York City taxi data contains the details of millions of taxi rides, with
 ORDER BY pickup_datetime;
 ```

-## 2. Insert the Dataset
+## 2. Insert the Dataset {#2-insert-the-dataset}

 Now that you have a table created, let's add the NYC taxi data. It is in CSV files in S3, and you can load the data from there.
@@ -163,7 +163,7 @@ Now that you have a table created, let's add the NYC taxi data. It is in CSV fil
 This query has to process 2M rows and return 190 values, but notice it does this in about 1 second. The `pickup_ntaname` column represents the name of the neighborhood in New York City where the taxi ride originated.

-## 3. Analyze the Data
+## 3. Analyze the Data {#3-analyze-the-data}

 Let's run some queries to analyze the 2M rows of data...
@@ -341,7 +341,7 @@ Let's run some queries to analyze the 2M rows of data...
+## 4. Create a Dictionary {#4-create-a-dictionary}

 If you are new to ClickHouse, it is important to understand how ***dictionaries*** work. A simple way of thinking about a dictionary is a mapping of key->value pairs that is stored in memory. The details and all the options for dictionaries are linked at the end of the tutorial.
@@ -442,7 +442,7 @@ If you are new to ClickHouse, it is important to understand how ***dictionaries*
 ```


-## 5. Perform a Join
+## 5. Perform a Join {#5-perform-a-join}

 Let's write some queries that join the `taxi_zone_dictionary` with your `trips` table.
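
The tutorial text in these hunks describes a dictionary as an in-memory mapping of key->value pairs that is then joined with the `trips` table. As a rough mental model only, the Python sketch below enriches a few rows by looking up a key in such a mapping. It is not how ClickHouse executes the join, and the rows, field names, and the `borough_of` helper are hypothetical.

```python
# Hypothetical stand-in for taxi_zone_dictionary: zone id -> borough, held in memory.
taxi_zones = {1: "EWR", 2: "Queens", 3: "Bronx"}

# Hypothetical stand-in for a few rows of the trips table.
trips = [
    {"pickup_zone": 2, "total_amount": 10.0},
    {"pickup_zone": 3, "total_amount": 7.25},
    {"pickup_zone": 2, "total_amount": 5.5},
    {"pickup_zone": 99, "total_amount": 4.0},  # zone id missing from the dictionary
]


def borough_of(zone_id, default="Unknown"):
    """Look up one key in the in-memory mapping, falling back to a default."""
    return taxi_zones.get(zone_id, default)


# "Join" by lookup: each trip is enriched with its borough, then amounts are summed.
totals = {}
for trip in trips:
    borough = borough_of(trip["pickup_zone"])
    totals[borough] = totals.get(borough, 0.0) + trip["total_amount"]

print(totals)
```

Because the mapping lives in memory, each row costs a single key lookup instead of a scan of a second table, which is the property the tutorial relies on.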
@@ -487,7 +487,7 @@ Let's write some queries that join the `taxi_zone_dictionary` with your `trips`
 LIMIT 1000
 ```

-#### Congrats!
+#### Congrats! {#congrats}

 Well done - you made it through the tutorial, and hopefully you have a better understanding of how to use ClickHouse. Here are some options for what to do next: