Commit a25bc15

committed
finish 2 shards and 1 replica
1 parent cf526d6 commit a25bc15

8 files changed

+757
-54
lines changed

docs/deployment-guides/replication-sharding-examples/01_1_shard_2_replicas.md

Lines changed: 86 additions & 16 deletions
@@ -8,22 +8,22 @@ description: 'Page describing an example architecture with five servers configur
88

99
import Image from '@theme/IdealImage';
1010
import ReplicationShardingTerminology from '@site/docs/_snippets/_replication-sharding-terminology.md';
11+
import ReplicationArchitecture from '@site/static/images/deployment-guides/replication-sharding-examples/replication.png';
1112
import ConfigFileNote from '@site/docs/_snippets/_config-files.md';
1213
import KeeperConfigFileNote from '@site/docs/_snippets/_keeper-config-files.md';
13-
import ReplicationArchitecture from '@site/static/images/deployment-guides/replication-sharding-examples/replication.png';
1414
import ConfigExplanation from '@site/docs/deployment-guides/replication-sharding-examples/_snippets/_config_explanation.mdx';
1515
import ListenHost from '@site/docs/deployment-guides/replication-sharding-examples/_snippets/_listen_host.mdx';
1616
import ServerParameterTable from '@site/docs/deployment-guides/replication-sharding-examples/_snippets/_server_parameter_table.mdx';
1717
import KeeperConfig from '@site/docs/deployment-guides/replication-sharding-examples/_snippets/_keeper_config.mdx';
1818
import KeeperConfigExplanation from '@site/docs/deployment-guides/replication-sharding-examples/_snippets/_keeper_explanation.mdx';
1919
import VerifyKeeperStatus from '@site/docs/deployment-guides/replication-sharding-examples/_snippets/_verify_keeper_using_mntr.mdx';
2020
import DedicatedKeeperServers from '@site/docs/deployment-guides/replication-sharding-examples/_snippets/_dedicated_keeper_servers.mdx';
21+
import ExampleFiles from '@site/docs/deployment-guides/replication-sharding-examples/_snippets/_working_example.mdx';
2122

2223
> In this example, you'll learn how to set up a simple ClickHouse cluster which
2324
replicates the data. There are five servers configured. Two are used to host
2425
copies of the data. The other three servers are used to coordinate the replication
25-
of data. With this example, we'll create a database and table that will be
26-
replicated across both data nodes using the `ReplicatedMergeTree` table engine.
26+
of data.
2727

2828
The architecture of the cluster you will be setting up is shown below:
2929

@@ -41,6 +41,8 @@ The architecture of the cluster you will be setting up is shown below:
4141

4242
## Set up directory structure and test environment {#set-up}
4343

44+
<ExampleFiles/>
45+
4446
In this tutorial, you will use [Docker compose](https://docs.docker.com/compose/) to
4547
set up the ClickHouse cluster. This setup could be modified to work
4648
for separate local machines, virtual machines or cloud instances as well.
@@ -533,11 +535,11 @@ SHOW DATABASES;
533535

534536
## Create a table on the cluster {#creating-a-table}
535537

536-
Now that the database has been created, create a distributed table on the cluster.
538+
Now that the database has been created, create a table on the cluster.
537539
Run the following query from any of the host clients:
538540

539541
```sql
540-
CREATE TABLE IF NOT EXISTS uk.uk_price_paid
542+
CREATE TABLE IF NOT EXISTS uk.uk_price_paid_local
541543
--highlight-next-line
542544
ON CLUSTER cluster_1S_2R
543545
(
@@ -587,10 +589,13 @@ SHOW TABLES IN uk;
587589

588590
## Insert data {#inserting-data}
589591

590-
Now insert data from `clickhouse-01`:
592+
As the data set is large and takes a few minutes to completely ingest, we will
593+
insert only a small subset to begin with.
594+
595+
Insert a smaller subset of the data using the query below from `clickhouse-01`:
591596

592597
```sql
593-
INSERT INTO uk.uk_price_paid
598+
INSERT INTO uk.uk_price_paid_local
594599
SELECT
595600
toUInt32(price_string) AS price,
596601
parseDateTimeBestEffortUS(time) AS date,
@@ -626,18 +631,27 @@ FROM url(
626631
d String,
627632
e String'
628633
) LIMIT 10000 SETTINGS max_http_get_redirects=10;
629635
```
630636

631-
Query the table from `clickhouse-02` or `clickhouse-01`:
637+
Notice that the data is completely replicated on each host:
632638

633-
```sql title="Query"
634-
SELECT count(*) FROM uk.uk_price_paid;
635-
```
639+
```sql
640+
-- clickhouse-01
641+
SELECT count(*)
642+
FROM uk.uk_price_paid_local;
636643

637-
```response title="Response"
638-
┌──count()─┐
639-
1. │ 30212555 │ -- 30.21 million
640-
└──────────┘
644+
-- ┌─count()─┐
645+
-- 1.│ 10000 │
646+
-- └─────────┘
647+
648+
-- clickhouse-02
649+
SELECT count(*)
650+
FROM uk.uk_price_paid_local;
651+
652+
-- ┌─count()─┐
653+
-- 1.│ 10000 │
654+
-- └─────────┘
641655
```
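
The two per-host counts above can also be compared from a single client session. The query below is a minimal sketch, assuming the `clusterAllReplicas` table function is available in the ClickHouse version in use and that the cluster is named `cluster_1S_2R`, as in the earlier queries:

```sql
-- Count rows on every replica of the cluster from one client session.
-- Assumes the cluster name cluster_1S_2R and the table created above.
SELECT
    hostName() AS host,
    count() AS row_count
FROM clusterAllReplicas('cluster_1S_2R', uk.uk_price_paid_local)
GROUP BY host
ORDER BY host;
```

If replication has caught up, each host should report the same row count.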
642656

643657
To demonstrate what happens when one of the hosts fails, create a simple test database
@@ -715,11 +729,67 @@ SELECT * FROM test.test_table
715729
└────┴────────────────────┘
716730
```
717731

732+
If at this stage you would like to ingest the full UK property price dataset
733+
to play around with, you can run the following queries to do so:
734+
735+
```sql
736+
TRUNCATE TABLE uk.uk_price_paid_local ON CLUSTER cluster_1S_2R;
737+
INSERT INTO uk.uk_price_paid_local
738+
SELECT
739+
toUInt32(price_string) AS price,
740+
parseDateTimeBestEffortUS(time) AS date,
741+
splitByChar(' ', postcode)[1] AS postcode1,
742+
splitByChar(' ', postcode)[2] AS postcode2,
743+
transform(a, ['T', 'S', 'D', 'F', 'O'], ['terraced', 'semi-detached', 'detached', 'flat', 'other']) AS type,
744+
b = 'Y' AS is_new,
745+
transform(c, ['F', 'L', 'U'], ['freehold', 'leasehold', 'unknown']) AS duration,
746+
addr1,
747+
addr2,
748+
street,
749+
locality,
750+
town,
751+
district,
752+
county
753+
FROM url(
754+
'http://prod1.publicdata.landregistry.gov.uk.s3-website-eu-west-1.amazonaws.com/pp-complete.csv',
755+
'CSV',
756+
'uuid_string String,
757+
price_string String,
758+
time String,
759+
postcode String,
760+
a String,
761+
b String,
762+
c String,
763+
addr1 String,
764+
addr2 String,
765+
street String,
766+
locality String,
767+
town String,
768+
district String,
769+
county String,
770+
d String,
771+
e String'
772+
) SETTINGS max_http_get_redirects=10;
774+
```
775+
776+
Query the table from `clickhouse-02` or `clickhouse-01`:
777+
778+
```sql title="Query"
779+
SELECT count(*) FROM uk.uk_price_paid_local;
780+
```
781+
782+
```response title="Response"
783+
┌──count()─┐
784+
1. │ 30212555 │ -- 30.21 million
785+
└──────────┘
786+
```
787+
718788
</VerticalStepper>
719789

720790
## Conclusion {#conclusion}
721791

722-
As you saw, the advantage of this cluster topology is that with two replicas,
792+
The advantage of this cluster topology is that with two replicas,
723793
your data exists on two separate hosts. If one host fails, the other replica
724794
continues serving data without any loss. This eliminates single points of
725795
failure at the storage level.
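
One way to observe this redundancy is to inspect replica health through the `system.replicas` system table. The query below is a minimal sketch, assuming `uk.uk_price_paid_local` was created with a replicated table engine and that the query is run on either data node:

```sql
-- Show replication status for the example table on the current host.
-- Assumes the database and table names used earlier in this guide.
SELECT
    database,
    table,
    replica_name,
    is_readonly,
    total_replicas,
    active_replicas,
    queue_size,
    absolute_delay
FROM system.replicas
WHERE database = 'uk' AND table = 'uk_price_paid_local';
```

With both hosts up, `active_replicas` would be expected to equal `total_replicas`. After one host is stopped, the surviving node should report one fewer active replica while continuing to serve reads.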
