docs/en/about-us/history.md
Lines changed: 10 additions & 9 deletions
@@ -2,16 +2,17 @@
 slug: /en/about-us/history
 sidebar_label: ClickHouse History
 sidebar_position: 40
-description: Where it all began...
+description: History of ClickHouse development
+tags: ['history', 'development', 'Metrica']
 ---

 # ClickHouse History {#clickhouse-history}

-ClickHouse has been developed initially to power [Yandex.Metrica](https://metrica.yandex.com/), [the second largest web analytics platform in the world](http://w3techs.com/technologies/overview/traffic_analysis/all), and continues to be the core component of this system. With more than 13 trillion records in the database and more than 20 billion events daily, ClickHouse allows generating custom reports on the fly directly from non-aggregated data. This article briefly covers the goals of ClickHouse in the early stages of its development.
+ClickHouse was initially developed to power [Yandex.Metrica](https://metrica.yandex.com/), [the second largest web analytics platform in the world](http://w3techs.com/technologies/overview/traffic_analysis/all), and continues to be its core component. With more than 13 trillion records in the database and more than 20 billion events daily, ClickHouse allows generating custom reports on the fly directly from non-aggregated data. This article briefly covers the goals of ClickHouse in the early stages of its development.

-Yandex.Metrica builds customized reports on the fly based on hits and sessions, with arbitrary segments defined by the user. Doing so often requires building complex aggregates, such as the number of unique users. New data for building a report arrives in real-time.
+Yandex.Metrica builds customized reports on the fly based on hits and sessions, with arbitrary segments defined by the user. Doing so often requires building complex aggregates, such as the number of unique users, with new data for building reports arriving in real-time.

-As of April 2014, Yandex.Metrica was tracking about 12 billion events (page views and clicks) daily. All these events must be storedto build custom reports. A single query may require scanning millions of rows within a few hundred milliseconds, or hundreds of millions of rows in just a few seconds.
+As of April 2014, Yandex.Metrica was tracking about 12 billion events (page views and clicks) daily. All these events needed to be stored, in order to build custom reports. A single query may have required scanning millions of rows within a few hundred milliseconds, or hundreds of millions of rows in just a few seconds.

 ## Usage in Yandex.Metrica and Other Yandex Services {#usage-in-yandex-metrica-and-other-yandex-services}

@@ -26,30 +27,30 @@ ClickHouse also plays a key role in the following processes:
 - Running queries for debugging the Yandex.Metrica engine.
 - Analyzing logs from the API and the user interface.

-Nowadays, there are multiple dozen ClickHouse installations in other Yandex services and departments: search verticals, e-commerce, advertisement, business analytics, mobile development, personal services, and others.
+Nowadays, there are a multiple dozen ClickHouse installations in other Yandex services and departments: search verticals, e-commerce, advertisement, business analytics, mobile development, personal services, and others.

 ## Aggregated and Non-aggregated Data {#aggregated-and-non-aggregated-data}

 There is a widespread opinion that to calculate statistics effectively, you must aggregate data since this reduces the volume of data.

-But data aggregation comes with a lot of limitations:
+However data aggregation comes with a lot of limitations:

 - You must have a pre-defined list of required reports.
 - The user can’t make custom reports.
 - When aggregating over a large number of distinct keys, the data volume is barely reduced, so aggregation is useless.
 - For a large number of reports, there are too many aggregation variations (combinatorial explosion).
 - When aggregating keys with high cardinality (such as URLs), the volume of data is not reduced by much (less than twofold).
 - For this reason, the volume of data with aggregation might grow instead of shrink.
-- Users do not view all the reports we generate for them. A large portion of those calculations is useless.
-- The logical integrity of data may be violated for various aggregations.
+- Users do not view all the reports we generate for them. A large portion of those calculations are useless.
+- The logical integrity of the data may be violated for various aggregations.

 If we do not aggregate anything and work with non-aggregated data, this might reduce the volume of calculations.

 However, with aggregation, a significant part of the work is taken offline and completed relatively calmly. In contrast, online calculations require calculating as fast as possible, since the user is waiting for the result.

 Yandex.Metrica has a specialized system for aggregating data called Metrage, which was used for the majority of reports.
 Starting in 2009, Yandex.Metrica also used a specialized OLAP database for non-aggregated data called OLAPServer, which was previously used for the report builder.
-OLAPServer worked well for non-aggregated data, but it had many restrictions that did not allow it to be used for all reports as desired. These included the lack of support for data types (only numbers), and the inability to incrementally update data in real-time (it could only be done by rewriting data daily). OLAPServer is not a DBMS, but a specialized DB.
+OLAPServer worked well for non-aggregated data, but it had many restrictions that did not allow it to be used for all reports as desired. These included a lack of support for data types (numbers only), and the inability to incrementally update data in real-time (it could only be done by rewriting data daily). OLAPServer is not a DBMS, but a specialized DB.

 The initial goal for ClickHouse was to remove the limitations of OLAPServer and solve the problem of working with non-aggregated data for all reports, but over the years, it has grown into a general-purpose database management system suitable for a wide range of analytical tasks.

docs/en/cloud/reference/cloud-compatibility.md
Lines changed: 1 addition & 15 deletions
@@ -122,19 +122,5 @@ ClickHouse Cloud is tuned for variable workloads, and for that reason most syste
 As part of creating the ClickHouse service, we create a default database, and the default user that has broad permissions to this database. This initial user can create additional users and assign their permissions to this database. Beyond this, the ability to enable the following security features within the database using Kerberos, LDAP, or SSL X.509 certificate authentication are not supported at this time.

 ## Roadmap
-The table below summarizes our efforts to expand some of the capabilities described above. If you have feedback, please [submit it here](mailto:[email protected]).
The table below summarizes our efforts to expand some of the capabilities described above. If you have feedback, please [submit it here](mailto:[email protected]).

docs/en/guides/developer/lightweight-update.md
Lines changed: 2 additions & 6 deletions
@@ -5,12 +5,8 @@ title: Lightweight Update
 keywords: [lightweight update]
 ---

-import CloudAvailableBadge from '@theme/badges/CloudAvailableBadge';
-
 ## Lightweight Update

-<CloudAvailableBadge/>
-
 When lightweight updates are enabled, updated rows are marked as updated immediately and subsequent `SELECT` queries will automatically return with the changed values. When lightweight updates are not enabled, you may have to wait for your mutations to be applied via a background process to see the changed values.

 Lightweight updates can be enabled for `MergeTree`-family tables by enabling the query-level setting `apply_mutations_on_fly`.
@@ -23,7 +19,7 @@ SET apply_mutations_on_fly = 1;

 Let's create a table and run some mutations:
 ```sql
-CREATE TABLE test_on_fly_mutations (id UInt64, v String)
+CREATE TABLE test_on_fly_mutations (id UInt64, v String)
 ENGINE = MergeTree ORDER BY id;

 -- Disable background materialization of mutations to showcase
@@ -93,4 +89,4 @@ These behaviours are controlled by the following settings:
 - `mutations_execute_nondeterministic_on_initiator` - if true, non-deterministic functions are executed on the initiator replica and are replaced as literals in `UPDATE` and `DELETE` queries. Default value: `false`.
 - `mutations_execute_subqueries_on_initiator` - if true, scalar subqueries are executed on the initiator replica and are replaced as literals in `UPDATE` and `DELETE` queries. Default value: `false`.
 - `mutations_max_literal_size_to_replace` - The maximum size of serialized literals in bytes to replace in `UPDATE` and `DELETE` queries. Default value: `16384` (16 KiB).

docs/en/integrations/language-clients/java/client-v1.md
Lines changed: 2 additions & 10 deletions
@@ -1,21 +1,13 @@
----
-sidebar_label: Client V1
-sidebar_position: 3
-keywords: [clickhouse, java, client, integrate]
-description: Java ClickHouse Connector v1
-slug: /en/integrations/java/client-v1
----
-
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 import CodeBlock from '@theme/CodeBlock';

-# Client (V1)
+# Client (0.7.x and earlier)

 Java client library to communicate with a DB server thru its protocols. Current implementation supports only [HTTP interface](/docs/en/interfaces/http). The library provides own API to send requests to a server.

 :::warning Deprecation
-This library will be deprecated soon. Use Client-v2 for new projects
+This library will be deprecated soon. Use the latest [Java Client](/docs/en/integrations/language-clients/java/client-v2.md) for new projects

docs/en/integrations/language-clients/java/client-v2.md
Lines changed: 8 additions & 3 deletions
@@ -1,8 +1,8 @@
 ---
-sidebar_label: Client V2
+sidebar_label: Client 0.8+
 sidebar_position: 2
 keywords: [clickhouse, java, client, integrate]
-description: Java ClickHouse Connector v2
+description: Java ClickHouse Connector 0.8+
 slug: /en/integrations/java/client-v2
 ---

@@ -12,7 +12,12 @@ import CodeBlock from '@theme/CodeBlock';

 # Java Client (V2)

-Java client library to communicate with a DB server through its protocols. The current implementation only supports the [HTTP interface](/docs/en/interfaces/http). The library provides its own API to send requests to a server. The library also provides tools to work with different binary data formats (RowBinary* & Native*).
+Java client library to communicate with a DB server through its protocols. The current implementation only supports the [HTTP interface](/docs/en/interfaces/http).
+The library provides its own API to send requests to a server. The library also provides tools to work with different binary data formats (RowBinary* & Native*).
+
+:::note
+If you're looking for a prior version of the java client docs, please see [here](/docs/en/integrations/language-clients/java/client-v1.md).