Commit 64bf611

Merge pull request #3102 from ClickHouse/add_style_check
2 parents 5626a28 + 16fc045 commit 64bf611

175 files changed, +1338 -846 lines changed
.github/workflows/check-build.yml

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+---
+name: Style check
+
+env:
+  # Force the stdout and stderr streams to be unbuffered
+  PYTHONUNBUFFERED: 1
+
+on:
+  pull_request:
+    types:
+      - synchronize
+      - reopened
+      - opened
+jobs:
+  stylecheck:
+    runs-on: ubuntu-latest
+
+    steps:
+      # Step 1: Check out the repository
+      - name: Check out repository
+        uses: actions/checkout@v3
+
+      # Step 2: Set up environment if required (e.g., installing Aspell)
+      - name: Install Aspell
+        run: sudo apt-get update && sudo apt-get install -y aspell aspell-en
+
+      # Step 3: Run the spellcheck script
+      - name: Run spellcheck
+        run: |
+          ./scripts/check-doc-aspell
+        continue-on-error: true
+        id: spellcheck
+
+      # Step 4: Fail the build if the script returns exit code 1
+      - name: Check exit code
+        run: |
+          if [ ${{ steps.spellcheck.outcome }} == 'failure' ]; then
+            echo "Spellcheck failed. See the logs for details."
+            exit 1
+          fi

.github/workflows/pull-request.yml

Lines changed: 0 additions & 2 deletions
@@ -14,8 +14,6 @@ on: # yamllint disable-line rule:truthy
       - synchronize
       - reopened
       - opened
-    branches-ignore:
-      - 'new-nav'
 
 # Cancel the previous wf run in PRs.
 concurrency:

docs/en/about-us/adopters.md

Lines changed: 2 additions & 2 deletions
@@ -132,7 +132,7 @@ The following list of companies using ClickHouse and their success stories is as
 | [DNSMonster](https://dnsmonster.dev/) | Software & Technology | DNS Monitoring ||| [GitHub Repository](https://github.com/mosajjal/dnsmonster) |
 | [Darwinium](https://www.darwinium.com/) | Software & Technology | Security and Fraud Analytics ||| [Blog Post, July 2022](https://clickhouse.com/blog/fast-feature-rich-and-mutable-clickhouse-powers-darwiniums-security-and-fraud-analytics-use-cases) |
 | [Dash0](https://www.dash0.com/) | APM Platform | Main product ||| [Careers page](https://careers.dash0.com/senior-product-engineer-backend/en) |
-| [Dashdive](https://www.dashdive.com/) | Infrastructure management | Analytics ||| [Hackernews, 2024](https://news.ycombinator.com/item?id=39178753) |
+| [Dashdive](https://www.dashdive.com/) | Infrastructure management | Analytics ||| [Hacker News, 2024](https://news.ycombinator.com/item?id=39178753) |
 | [Dassana](https://lake.dassana.io/) | Cloud data platform | Main product | - | - | [Blog Post, Jan 2023](https://clickhouse.com/blog/clickhouse-powers-dassanas-security-data-lake) [Direct reference, April 2022](https://news.ycombinator.com/item?id=31111432) |
 | [Datafold](https://www.datafold.com/) | Data Reliability Platform |||| [Job advertisement, April 2022](https://www.datafold.com/careers) |
 | [Dataliance for China Telecom](https://www.chinatelecomglobal.com/) | Telecom | Analytics ||| [Slides in Chinese, January 2018](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup12/telecom.pdf) |
@@ -146,7 +146,7 @@ The following list of companies using ClickHouse and their success stories is as
 | [Didi](https://web.didiglobal.com/) | Transportation & Ride Sharing | Observability | 400+ logging, 40 tracing | PBs/day / 40GB/s write throughput, 15M queries/day, 200 QPS peak | [Blog, Apr 2024](https://clickhouse.com/blog/didi-migrates-from-elasticsearch-to-clickHouse-for-a-new-generation-log-storage-system) |
 | [DigiCert](https://www.digicert.com) | Network Security | DNS Platform || over 35 billion events per day | [Job posting, Aug 2022](https://www.indeed.com/viewjob?t=Senior+Principal+Software+Engineer+Architect&c=DigiCert&l=Lehi,+UT&jk=403c35f96c46cf37&rtk=1g9mnof7qk7dv800) |
 | [Disney+](https://www.disneyplus.com/) | Video Streaming | Analytics || 395 TiB | [Meetup Video, December 2022](https://www.youtube.com/watch?v=CVVp6N8Xeoc&list=PL0Z2YDlm0b3iNDUzpY1S3L_iV4nARda_U&index=8) [Slides, December 2022](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup67/Disney%20plus%20ClickHouse.pdf) |
-| [Dittofeed](https://dittofeed.com/) | Software & Technology | Open Source Customer Engagement ||| [Hackernews, June 2023](https://news.ycombinator.com/item?id=36061344) |
+| [Dittofeed](https://dittofeed.com/) | Software & Technology | Open Source Customer Engagement ||| [Hacker News, June 2023](https://news.ycombinator.com/item?id=36061344) |
 | [Diva-e](https://www.diva-e.com) | Digital consulting | Main Product ||| [Slides in English, September 2019](https://github.com/ClickHouse/clickhouse-presentations/blob/master/meetup29/ClickHouse-MeetUp-Unusual-Applications-sd-2019-09-17.pdf) |
 | [Dolphin Emulator](https://dolphin-emu.org/) | Games | Analytics ||| [Twitter, September 2022](https://twitter.com/delroth_/status/1567300096160665601) |
 | [DoorDash](https://www.doordash.com/home) | E-commerce | Monitoring ||| [Meetup, December 2024](https://github.com/ClickHouse/clickhouse-presentations/blob/master/2024-meetup-san-francisco/Clickhouse%20Meetup%20Slides%20(1).pdf) |

docs/en/chdb/guides/jupysql.md

Lines changed: 6 additions & 6 deletions
@@ -3,10 +3,10 @@ title: JupySQL and chDB
 sidebar_label: JupySQL
 slug: /en/chdb/guides/jupysql
 description: How to install chDB for Bun
-keywords: [chdb, jupysql]
+keywords: [chdb, JupySQL]
 ---
 
-[JupySQL](https://jupysql.ploomber.io/en/latest/quick-start.html) is a Python library that lets you run SQL in Jupyter notebooks and the iPython shell.
+[JupySQL](https://jupysql.ploomber.io/en/latest/quick-start.html) is a Python library that lets you run SQL in Jupyter notebooks and the IPython shell.
 In this guide, we're going to learn how to query data using chDB and JupySQL.
 
 <div class='vimeo-container'>
@@ -22,13 +22,13 @@ python -m venv .venv
 source .venv/bin/activate
 ```
 
-And then, we'll install JupySQL, iPython, and Jupyter Lab:
+And then, we'll install JupySQL, IPython, and Jupyter Lab:
 
 ```bash
 pip install jupysql ipython jupyterlab
 ```
 
-We can use JupySQL in iPython, which we can launch by running:
+We can use JupySQL in IPython, which we can launch by running:
 
 ```bash
 ipython
@@ -65,7 +65,7 @@ for file in files:
 
 ## Configuring chDB and JupySQL
 
-Next, let's import chDB's `dbapi` module:
+Next, let's import the `dbapi` module for chDB:
 
 ```python
 from chdb import dbapi
@@ -168,7 +168,7 @@ The default database doesn't persist data on disk, so we need to create another
 %sql CREATE DATABASE atp
 ```
 
-And now we're going to create a table called `rankings` whos schema will be derived from the structure of the data in the CSV files:
+And now we're going to create a table called `rankings` whose schema will be derived from the structure of the data in the CSV files:
 
 ```python
 %%sql

docs/en/chdb/guides/query-remote-clickhouse.md

Lines changed: 2 additions & 2 deletions
@@ -41,7 +41,7 @@ You can also use the code in a Python script or in your favorite notebook.
 ## An intro to ClickPy
 
 The remote ClickHouse server that we're going to query is [ClickPy](https://clickpy.clickhouse.com).
-ClickPy keeps track of all the downloads of PyPi packages and lets you explore the stats of packages via a UI.
+ClickPy keeps track of all the downloads of PyPI packages and lets you explore the stats of packages via a UI.
 The underlying database is available to query using the `play` user.
 
 You can learn more about ClickPy in [its GitHub repository](https://github.com/ClickHouse/clickpy).
@@ -150,7 +150,7 @@ df.head(n=5)
 4 2018-03-02 5 23842
 ```
 
-We can then compute the ratio of Open AI downloads to scikit-learn downloads like this:
+We can then compute the ratio of Open AI downloads to `scikit-learn` downloads like this:
 
 ```python
 df['ratio'] = df['y_openai'] / df['y_sklearn']

docs/en/chdb/guides/querying-apache-arrow.md

Lines changed: 3 additions & 3 deletions
@@ -3,7 +3,7 @@ title: How to query Apache Arrow with chDB
 sidebar_label: Querying Apache Arrow
 slug: /en/chdb/guides/apache-arrow
 description: In this guide, we'll learn how to query Apache Arrow tables with chDB
-keywords: [chdb, apache-arrow]
+keywords: [chdb, Apache Arrow]
 ---
 
 [Apache Arrow](https://arrow.apache.org/) is a standardized column-oriented memory format that's gained popularity in the data community.
@@ -25,7 +25,7 @@ Make sure you have version 2.0.2 or higher:
 pip install "chdb>=2.0.2"
 ```
 
-And now we're going to install pyarrow, pandas, and ipython:
+And now we're going to install PyArrow, pandas, and ipython:
 
 ```bash
 pip install pyarrow pandas ipython
@@ -55,7 +55,7 @@ If you want to download more files, use `aws s3 ls` to get a list of all the fil
 
 
 
-Next, we'll import the Parquet module from the pyarrow package:
+Next, we'll import the Parquet module from the `pyarrow` package:
 
 ```python
 import pyarrow.parquet as pq

docs/en/chdb/guides/querying-parquet.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ Make sure you have version 2.0.2 or higher:
 pip install "chdb>=2.0.2"
 ```
 
-And now we're going to install iPython:
+And now we're going to install IPython:
 
 ```bash
 pip install ipython

docs/en/chdb/guides/querying-s3-bucket.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ Make sure you have version 2.0.2 or higher:
 pip install "chdb>=2.0.2"
 ```
 
-And now we're going to install iPython:
+And now we're going to install IPython:
 
 ```bash
 pip install ipython

docs/en/chdb/install/nodejs.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ title: Installing chDB for NodeJS
 sidebar_label: NodeJS
 slug: /en/chdb/install/nodejs
 description: How to install chDB for NodeJS
-keywords: [chdb, embedded, clickhouse-lite, nodejs, install]
+keywords: [chdb, embedded, clickhouse-lite, NodeJS, install]
 ---
 
 # Installing chDB for NodeJS

docs/en/chdb/install/python.md

Lines changed: 4 additions & 4 deletions
@@ -67,7 +67,7 @@ res = chdb.query('select * from file("data.csv", CSV)', 'CSV'); print(res)
 print(f"SQL read {res.rows_read()} rows, {res.bytes_read()} bytes, elapsed {res.elapsed()} seconds")
 ```
 
-**Pandas dataframe output**
+**Pandas DataFrame output**
 ```python
 # See more in https://clickhouse.com/docs/en/interfaces/formats
 chdb.query('select * from file("data.parquet", Parquet)', 'Dataframe')
@@ -165,7 +165,7 @@ Some notes on the chDB Python UDF (User Defined Function) decorator.
 import json
 ...
 ```
-6. The Python interpertor used is the same as the one used to run the script. You can get it from `sys.executable`.
+6. The Python interpreter used is the same as the one used to run the script. You can get it from `sys.executable`.
 
 see also: [test_udf.py](https://github.com/chdb-io/chdb/blob/main/tests/test_udf.py).
 
@@ -207,7 +207,7 @@ chdb.query(
 
 1. You must inherit from chdb.PyReader class and implement the `read` method.
 2. The `read` method should:
-1. return a list of lists, the first demension is the column, the second dimension is the row, the columns order should be the same as the first arg `col_names` of `read`.
+1. return a list of lists, the first dimension is the column, the second dimension is the row, the columns order should be the same as the first arg `col_names` of `read`.
 1. return an empty list when there is no more data to read.
 1. be stateful, the cursor should be updated in the `read` method.
 3. An optional `get_schema` method can be implemented to return the schema of the table. The prototype is `def get_schema(self) -> List[Tuple[str, str]]:`, the return value is a list of tuples, each tuple contains the column name and the column type. The column type should be one of [the following](/en/sql-reference/data-types).
@@ -247,7 +247,7 @@ See also: [test_query_py.py](https://github.com/chdb-io/chdb/blob/main/tests/tes
 
 ## Limitations
 
-1. Column types supported: pandas.Series, pyarrow.array, chdb.PyReader
+1. Column types supported: `pandas.Series`, `pyarrow.array`,`chdb.PyReader`
 1. Data types supported: Int, UInt, Float, String, Date, DateTime, Decimal
 1. Python Object type will be converted to String
 1. Pandas DataFrame performance is all of the best, Arrow Table is better than PyReader