Commit d27a80a

Apply suggestions from code review
Co-authored-by: Shaun Struwig <[email protected]>
1 parent 07dd4c1 commit d27a80a

1 file changed: +3 -2 lines changed

docs/use-cases/observability/clickstack/migration/elastic/migrating-data.md

Lines changed: 3 additions & 2 deletions
@@ -524,7 +524,7 @@ This strict schema has a number of benefits:

 - **Data validation** – enforcing a strict schema avoids the risk of column explosion, outside of specific structures.
 - **Avoids risk of column explosion**: although the JSON type scales to potentially thousands of columns, where subcolumns are stored as dedicated columns, this can lead to a column file explosion where an excessive number of column files are created that impacts performance. To mitigate this, the underlying [Dynamic type](/sql-reference/data-types/dynamic) used by JSON offers a [`max_dynamic_paths`](/sql-reference/data-types/newjson#reading-json-paths-as-sub-columns) parameter, which limits the number of unique paths stored as separate column files. Once the threshold is reached, additional paths are stored in a shared column file using a compact encoded format, maintaining performance and storage efficiency while supporting flexible data ingestion. Accessing this shared column file is, however, not as performant. Note, however, that the JSON column can be used with [type hints](/integrations/data-formats/json/schema#using-type-hints-and-skipping-paths). "Hinted" columns will deliver the same performance as dedicated columns.
-- **Simpler introspection of paths and types** - Although the JSON type supports [introspection functions](/sql-reference/data-types/newjson#introspection-functions) to determine the types and paths that have been inferred, static structures can be simpler to explore e.g. with `DESCRIBE`.
+- **Simpler introspection of paths and types**: although the JSON type supports [introspection functions](/sql-reference/data-types/newjson#introspection-functions) to determine the types and paths that have been inferred, static structures can be simpler to explore e.g. with `DESCRIBE`.
 <br/>
 Alternatively, users can simply create a table with one `JSON` column.

@@ -580,6 +580,7 @@ export ELASTICDUMP_INPUT_PASSWORD=
 export CLICKHOUSE_HOST=
 export CLICKHOUSE_PASSWORD=
 export CLICKHOUSE_USER=default
+
 # command to run - modify as required
 elasticdump --input=${ELASTICSEARCH_URL} --type=data --input-index ${ELASTICSEARCH_INDEX} --output=$ --sourceOnly --searchAfter --pit=true |
 clickhouse-client --host ${CLICKHOUSE_HOST} --secure --password ${CLICKHOUSE_PASSWORD} --user ${CLICKHOUSE_USER} --max_insert_block_size=1000 \
@@ -594,7 +595,7 @@ Note the use of the following flags for `elasticdump`:
 - `sourceOnly` flag ensuring we omit metadata fields in our response.
 - `searchAfter` flag to use the [`searchAfter` API](https://www.elastic.co/docs/reference/elasticsearch/rest-apis/paginate-search-results#search-after) for efficient pagination of results.
 - `pit=true` to ensure consistent results between queries using the [point in time API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-open-point-in-time).
-
+<br/>
 Our ClickHouse client parameters here (aside from credentials):

 - `max_insert_block_size=1000` - ClickHouse client will send data once this number of rows is reached. Increasing improves throughput at the expense of time to formulate a block - thus increasing time till data appears in ClickHouse.
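
The exports in the second hunk are left empty for the reader to fill in, so the pipeline would expand blank values if one is missed. A minimal sketch of a guard that could run before the `elasticdump` command, assuming a POSIX shell; `require_vars` is a hypothetical helper and is not part of the changed file:

```shell
# Hypothetical helper (an assumption, not in the documented script):
# returns non-zero if any named variable is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing required variable: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

It could be called as `require_vars ELASTICSEARCH_URL ELASTICSEARCH_INDEX CLICKHOUSE_HOST CLICKHOUSE_PASSWORD CLICKHOUSE_USER || exit 1` ahead of the pipeline, using only the variable names that appear in the hunk above.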
