Merge pull request #3342 from ClickHouse/anchors

gingerwizard · web-flow · commit 472637f7a555 · 2025-02-25T13:33:17.000Z
Enable explicit anchors
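
For reviewers, the check this change turns on can be approximated outside markdownlint. The sketch below is a simplified, hypothetical stand-in, not the rule in `headings_have_custom_anchors.js`: the function name `findAnchorProblems` and the line-based heading detection are assumptions made for illustration. It reuses the anchor pattern from the diff, skips H1 headings the way the new `token.markup === "#"` guard does, and reports duplicates on the assumption that the rule enforces the "unique per page" wording in the config comment.

```javascript
// Hypothetical standalone sketch of the anchor check; the real rule works on
// markdown-it tokens via filterTokens, whereas this scans raw lines.
const ANCHOR = /\{#([a-zA-Z0-9_-]+)\}/; // same pattern as in the diff

function findAnchorProblems(markdown) {
  const problems = [];
  const seen = new Set();
  let inFence = false;

  markdown.split("\n").forEach((line, index) => {
    // Toggle on fence markers so "# comment" lines inside code are ignored.
    if (/^\s*(`{3}|~{3})/.test(line)) {
      inFence = !inFence;
      return;
    }
    if (inFence) return;

    // ATX headings only (assumption): up to three spaces, then 1-6 "#".
    const heading = /^ {0,3}(#{1,6})\s+\S/.exec(line);
    if (!heading) return;

    // H1 headings are exempt, mirroring the `token.markup === "#"` guard.
    if (heading[1] === "#") return;

    const match = ANCHOR.exec(line);
    if (!match) {
      problems.push({ line: index + 1, reason: "missing anchor" });
    } else if (seen.has(match[1])) {
      problems.push({ line: index + 1, reason: "duplicate anchor: " + match[1] });
    } else {
      seen.add(match[1]);
    }
  });

  return problems;
}
```

Run against a page whose `## 2. Insert the Dataset` heading lacks a `{#2-insert-the-dataset}` suffix, this sketch reports that line as `missing anchor`, while the `# Advanced Tutorial` title passes without one.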
diff --git a/docs/guides/creating-tables.md b/docs/guides/creating-tables.md
@@ -42,23 +42,23 @@ In the example above, `my_first_table` is a `MergeTree` table with four columns:
   There are many engines to choose from, but for a simple table on a single-node ClickHouse server, [MergeTree](/engines/table-engines/mergetree-family/mergetree.md) is your likely choice.
   :::
 
-  ## A Brief Intro to Primary Keys
+## A Brief Intro to Primary Keys {#a-brief-intro-to-primary-keys}
 
-  Before you go any further, it is important to understand how primary keys work in ClickHouse (the implementation
-  of primary keys might seem unexpected!):
+Before you go any further, it is important to understand how primary keys work in ClickHouse (the implementation
+of primary keys might seem unexpected!):
 
-    - primary keys in ClickHouse are **_not unique_** for each row in a table
+  - primary keys in ClickHouse are **_not unique_** for each row in a table
 
-  The primary key of a ClickHouse table determines how the data is sorted when written to disk. Every 8,192 rows or 10MB of
-  data (referred to as the **index granularity**) creates an entry in the primary key index file. This granularity concept
-  creates a **sparse index** that can easily fit in memory, and the granules represent a stripe of the smallest amount of
-  column data that gets processed during `SELECT` queries.
+The primary key of a ClickHouse table determines how the data is sorted when written to disk. Every 8,192 rows or 10MB of
+data (referred to as the **index granularity**) creates an entry in the primary key index file. This granularity concept
+creates a **sparse index** that can easily fit in memory, and the granules represent a stripe of the smallest amount of
+column data that gets processed during `SELECT` queries.
 
-  The primary key can be defined using the `PRIMARY KEY` parameter. If you define a table without a `PRIMARY KEY` specified,
-  then the key becomes the tuple specified in the `ORDER BY` clause. If you specify both a `PRIMARY KEY` and an `ORDER BY`, the primary key must be a prefix of the sort order.
+The primary key can be defined using the `PRIMARY KEY` parameter. If you define a table without a `PRIMARY KEY` specified,
+then the key becomes the tuple specified in the `ORDER BY` clause. If you specify both a `PRIMARY KEY` and an `ORDER BY`, the primary key must be a prefix of the sort order.
 
-  The primary key is also the sorting key, which is a tuple of `(user_id, timestamp)`.  Therefore, the data stored in each
-  column file will be sorted by `user_id`, then `timestamp`.
+The primary key is also the sorting key, which is a tuple of `(user_id, timestamp)`.  Therefore, the data stored in each
+column file will be sorted by `user_id`, then `timestamp`.
 
 :::tip
 For more details, check out the [Modeling Data training module](https://learn.clickhouse.com/visitor_catalog_class/show/1328860/?utm_source=clickhouse&utm_medium=docs) in ClickHouse Academy.
diff --git a/docs/tutorial.md b/docs/tutorial.md
@@ -8,15 +8,15 @@ import SQLConsoleDetail from '@site/docs/_snippets/_launch_sql_console.md';
 
 # Advanced Tutorial
 
-## What to Expect from This Tutorial?
+## What to Expect from This Tutorial? {#what-to-expect-from-this-tutorial}
 
 In this tutorial, you will create a table and insert a large dataset (two million rows of the [New York taxi data](/getting-started/example-datasets/nyc-taxi.md)). Then you will run queries on the dataset, including an example of how to create a dictionary and use it to perform a JOIN.
 
 :::note
 This tutorial assumes you have access to a running ClickHouse service.  If not, check out the [Quick Start](./quick-start.mdx).
 :::
 
-## 1. Create a New Table
+## 1. Create a New Table {#1-create-a-new-table}
 
 The New York City taxi data contains the details of millions of taxi rides, with columns like pickup and drop-off times and locations, cost, tip amount, tolls, payment type and so on. Let's create a table to store this data...
 
@@ -81,7 +81,7 @@ The New York City taxi data contains the details of millions of taxi rides, with
     ORDER BY pickup_datetime;
     ```
 
-## 2. Insert the Dataset
+## 2. Insert the Dataset {#2-insert-the-dataset}
 
 Now that you have a table created, let's add the NYC taxi data. It is in CSV files in S3, and you can load the data from there.
 
@@ -163,7 +163,7 @@ Now that you have a table created, let's add the NYC taxi data. It is in CSV fil
 
     This query has to process 2M rows and return 190 values, but notice it does this in about 1 second. The `pickup_ntaname` column represents the name of the neighborhood in New York City where the taxi ride originated.
 
-## 3. Analyze the Data
+## 3. Analyze the Data {#3-analyze-the-data}
 
 Let's run some queries to analyze the 2M rows of data...
 
@@ -341,7 +341,7 @@ Let's run some queries to analyze the 2M rows of data...
     │ 2015-07-01 00:41:48 │ 2015-07-01 00:44:45 │          6.3 │                 -94 │                  132 │ JFK          │ 2015 │   1 │    0 │
     │ 2015-07-01 01:06:18 │ 2015-07-01 01:14:43 │        11.76 │                  37 │                  132 │ JFK          │ 2015 │   1 │    1 │
     ```
-## 4. Create a Dictionary
+## 4. Create a Dictionary {#4-create-a-dictionary}
 
 If you are new to ClickHouse, it is important to understand how ***dictionaries*** work. A simple way of thinking about a dictionary is a mapping of key->value pairs that is stored in memory. The details and all the options for dictionaries are linked at the end of the tutorial.
 
@@ -442,7 +442,7 @@ If you are new to ClickHouse, it is important to understand how ***dictionaries*
     ```
 
 
-## 5. Perform a Join
+## 5. Perform a Join {#5-perform-a-join}
 
 Let's write some queries that join the `taxi_zone_dictionary` with your `trips` table.
 
@@ -487,7 +487,7 @@ Let's write some queries that join the `taxi_zone_dictionary` with your `trips`
     LIMIT 1000
     ```
 
-#### Congrats!
+#### Congrats! {#congrats}
 
 Well done - you made it through the tutorial, and hopefully you have a better understanding of how to use ClickHouse. Here are some options for what to do next:
 
diff --git a/scripts/.markdownlint-cli2.yaml b/scripts/.markdownlint-cli2.yaml
@@ -7,7 +7,7 @@ config:
   default:                false
   MD040:                  false  # Fenced code blocks should have a language specified
   links-url-type:         false  # Disallow relative links to a .md or .mdx file
-  custom-anchor-headings: false  # Headings must have a custom anchor which is unique per page eg. # A Heading {#a-heading}
+  custom-anchor-headings: true  # Headings must have a custom anchor which is unique per page eg. # A Heading {#a-heading}
 
   # Keep this item last due to length
   proper-names:                     # MD044
@@ -22,6 +22,8 @@ ignores:
   - "docs/zh"
   - "docs/en/whats-new"
   - "docs/en/_placeholders"
+  - "docs/operations/settings/settings.md" # autogenerated
+  - "docs/operations/settings/settings-formats.md" # autogenerated
 customRules:
   # add custom rules here
   - "./markdownlint/rules/links_url_type.js"
diff --git a/scripts/markdownlint/rules/headings_have_custom_anchors.js b/scripts/markdownlint/rules/headings_have_custom_anchors.js
@@ -21,6 +21,9 @@ module.exports = {
     function: (params, onError) => {
         const headingIds = {};
         filterTokens(params, "heading_open", (token) => {
+            if (token.markup === "#") {
+                return;
+            }
             const headingLine = params.lines[token.map[0]];
             const match = /\{#([a-zA-Z0-9_-]+)\}/.exec(headingLine);