Commit 5063265

updated search scripts
1 parent 2abdda4 commit 5063265

5 files changed: +182 -104 lines changed

docs/en/managing-data/deleting-data/overview.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 ---
 slug: /en/deletes/overview
-title: Overview
+title: Delete Overview
 description: How to delete data in ClickHouse
 keywords: [delete, truncate, drop, lightweight delete]
 ---

scripts/search/README.md

Lines changed: 8 additions & 1 deletion
@@ -34,4 +34,11 @@ options:
 | **Date** | **Average nDCG** | **Results** | **Changes** |
 |-------------|------------------|----------------------------------------------------------------------------------------------|----------------------------------------------|
 | 20/01/2024 | 0.4700 | [View Results](https://pastila.nl/?008231f5/bc107912f8a5074d70201e27b1a66c6c#cB/yJOsZPOWi9h8xAkuTUQ==) | Baseline |
-| 21/01/2024 | 0.4783 | [View Results](https://pastila.nl/?00bb2c2f/936a9a3af62a9bdda186af5f37f55782#m7Hg0i9F1YCesMW6ot25yA==) | Index `_` character and move language to English |
+| 21/01/2024 | 0.5021 | [View Results](https://pastila.nl/?00bb2c2f/936a9a3af62a9bdda186af5f37f55782#m7Hg0i9F1YCesMW6ot25yA==) | Index `_` character and move language to English |
+
+
+## Issues
+
+1. Some pages are not optimized for retrieval e.g.
+   a. https://clickhouse.com/docs/en/sql-reference/aggregate-functions/combinators#-if will never return for `countIf`.
+2.

scripts/search/compute_ndcg.py

Lines changed: 8 additions & 3 deletions
@@ -3,13 +3,19 @@
 import argparse
 from algoliasearch.search.client import SearchClientSync
 
+ALGOLIA_INDEX_NAME = "clickhouse"
+
+
 # Initialize Algolia client
 ALGOLIA_APP_ID = "5H9UG7CX5W"
 ALGOLIA_API_KEY = "4a7bf25cf3edbef29d78d5e1eecfdca5"
-ALGOLIA_INDEX_NAME = "clickhouse"
 
-client = SearchClientSync(ALGOLIA_APP_ID, ALGOLIA_API_KEY)
+# old search engine using crawler
+# ALGOLIA_APP_ID = "62VCH2MD74"
+# ALGOLIA_API_KEY = "b78244d947484fe3ece7bc5472e9f2af"
+
 
+client = SearchClientSync(ALGOLIA_APP_ID, ALGOLIA_API_KEY)
 
 
 def compute_dcg(relevance_scores):
     """Compute Discounted Cumulative Gain (DCG)."""
@@ -32,7 +38,6 @@ def main(input_csv, detailed, k=3):
     with open(input_csv, mode='r', newline='', encoding='utf-8') as file:
         reader = csv.reader(file)
         rows = list(reader)
-
     results = []
     total_ndcg = 0
     for row in rows:
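
The hunks above show only the signature and docstring of `compute_dcg`; its body is not part of this commit. For readers unfamiliar with the metric reported in the README table, here is a minimal sketch of the standard DCG/nDCG formulas. It assumes linear gain and a log2 rank discount. The `compute_ndcg` helper and the way it applies the `k` cutoff are hypothetical and may differ from what the script actually does.

```python
import math


def compute_dcg(relevance_scores):
    """Compute Discounted Cumulative Gain (DCG).

    Assumes linear gain: each score is divided by log2(rank + 1),
    where rank starts at 1 (so the 0-based index i uses log2(i + 2)).
    """
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevance_scores))


def compute_ndcg(relevance_scores, k=3):
    """Hypothetical helper: normalize DCG@k by the ideal DCG@k.

    The ideal ordering is the same scores sorted from most to least relevant.
    Returns 0.0 when no result is relevant, to avoid dividing by zero.
    """
    dcg = compute_dcg(relevance_scores[:k])
    ideal_dcg = compute_dcg(sorted(relevance_scores, reverse=True)[:k])
    return dcg / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    # A single relevant hit at rank 3 scores lower than the same hit at rank 1.
    print(compute_ndcg([0, 0, 1], k=3))
    print(compute_ndcg([1, 0, 0], k=3))
```

Under these assumptions, an average nDCG such as the 0.5021 in the README would be the mean of this per-query value over all rows of the input CSV.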
