Commit 5370f75

ops: add pre-commit and hooks for code checking (#641)
* ops: add pre-commit and hooks for checking
* chore: pass pre-commit checks
* docs: add pre-commit setup instructions
* fix: run maturin before other hooks on relevant file changes
1 parent 40e84b6 commit 5370f75

62 files changed: +201 -146 lines changed

.github/ISSUE_TEMPLATE/💡-feature-request.md

Lines changed: 1 addition & 1 deletion
@@ -17,4 +17,4 @@ assignees: ''

 ---
 ❤️ Contributors, please refer to 📙[Contributing Guide](https://cocoindex.io/docs/about/contributing).
-Unless the PR can be sent immediately (e.g. just a few lines of code), we recommend you to leave a comment on the issue like **`I'm working on it`** or **`Can I work on this issue?`** to avoid duplicating work. Our [Discord server](https://discord.com/invite/zpA9S2DR7s) is always open and friendly.
+Unless the PR can be sent immediately (e.g. just a few lines of code), we recommend you to leave a comment on the issue like **`I'm working on it`** or **`Can I work on this issue?`** to avoid duplicating work. Our [Discord server](https://discord.com/invite/zpA9S2DR7s) is always open and friendly.

.github/scripts/update_version.sh

Lines changed: 1 addition & 1 deletion
@@ -19,4 +19,4 @@ else
 fi

 # Update Cargo.toml
-sed "${SED_INLINE[@]}" "s/^version = .*/version = \"$VERSION\"/" Cargo.toml
+sed "${SED_INLINE[@]}" "s/^version = .*/version = \"$VERSION\"/" Cargo.toml

.pre-commit-config.yaml

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+ci:
+  autofix_prs: false
+  autoupdate_schedule: 'monthly'
+
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v5.0.0
+    hooks:
+      - id: check-case-conflict
+        # Check for files with names that would conflict on a case-insensitive
+        # filesystem like MacOS HFS+ or Windows FAT.
+      - id: check-merge-conflict
+        # Check for files that contain merge conflict strings.
+      - id: check-symlinks
+        # Checks for symlinks which do not point to anything.
+        exclude: ".*(.github.*)$"
+      - id: detect-private-key
+        # Checks for the existence of private keys.
+      - id: end-of-file-fixer
+        # Makes sure files end in a newline and only a newline.
+        exclude: ".*(data.*|licenses.*|_static.*|\\.ya?ml|\\.jpe?g|\\.png|\\.svg|\\.webp)$"
+      - id: trailing-whitespace
+        # Trims trailing whitespace.
+        exclude_types: [python] # Covered by Ruff W291.
+        exclude: ".*(data.*|licenses.*|_static.*|\\.ya?ml|\\.jpe?g|\\.png|\\.svg|\\.webp)$"
+
+  - repo: local
+    hooks:
+      - id: maturin-develop
+        name: maturin develop
+        entry: maturin develop
+        language: system
+        files: ^(python/|src/|Cargo\.toml|pyproject\.toml)
+        pass_filenames: false
+
+      - id: cargo-fmt
+        name: cargo fmt
+        entry: cargo fmt
+        language: system
+        types: [rust]
+        pass_filenames: false
+
+      - id: cargo-test
+        name: cargo test
+        entry: cargo test
+        language: system
+        types: [rust]
+        pass_filenames: false
+
+      - id: mypy-check
+        name: mypy type check
+        entry: mypy
+        language: system
+        types: [python]
+        pass_filenames: false
+
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.12.0
+    hooks:
+      - id: ruff-format
+        types: [python]
+        pass_filenames: true
+
+  - repo: https://github.com/christophmeissner/pytest-pre-commit
+    rev: 1.0.0
+    hooks:
+      - id: pytest
+        language: system
+        types: [python]
+        pass_filenames: false
+        always_run: false
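
Review note: the `files` and `exclude` values in this config are regular expressions, and it can be easy to misjudge which paths they select. The snippet below is a minimal sketch of that filtering, assuming the patterns are applied with `re.search` against repo-relative paths, as the pre-commit documentation describes. `hook_selects` is a hypothetical helper written for illustration, not part of pre-commit, and every path except `Cargo.toml` and `README.md` is invented. It does not model `types` or `exclude_types`.

```python
import re

# Patterns copied from the .pre-commit-config.yaml added in this commit.
# In the YAML double-quoted string, "\\." unescapes to the regex "\.".
MATURIN_FILES = r"^(python/|src/|Cargo\.toml|pyproject\.toml)"
FIXER_EXCLUDE = r".*(data.*|licenses.*|_static.*|\.ya?ml|\.jpe?g|\.png|\.svg|\.webp)$"


def hook_selects(path: str, files: str = "", exclude: str = "^$") -> bool:
    """Hypothetical helper: True if `path` passes the files/exclude filters.

    Assumption: both patterns are matched with re.search on the
    repo-relative path. The defaults select every path and exclude none.
    """
    return re.search(files, path) is not None and re.search(exclude, path) is None


if __name__ == "__main__":
    # Which (illustrative) paths would trigger the maturin-develop hook?
    for path in ["src/lib.rs", "Cargo.toml", "docs/guide.md"]:
        print(path, hook_selects(path, files=MATURIN_FILES))

    # Which (illustrative) paths survive the end-of-file-fixer exclude?
    for path in ["README.md", "config.yaml", "docs/metadata.md"]:
        print(path, hook_selects(path, exclude=FIXER_EXCLUDE))
```

One thing the sketch makes visible: the `data.*` alternative is unanchored, so under these assumptions any path containing the substring `data` (such as the invented `docs/metadata.md`) is excluded, not only paths inside a `data` directory. Whether that breadth is intended is for the authors to confirm.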

.vscode/settings.json

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@
 ],
 "editor.formatOnSave": true,
 "python.formatting.provider": "ruff"
-}
+}

CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-We love contributions from our community ❤️. Please check out our [contributing guide](https://cocoindex.io/docs/about/contributing).
+We love contributions from our community ❤️. Please check out our [contributing guide](https://cocoindex.io/docs/about/contributing).

README.md

Lines changed: 9 additions & 9 deletions
@@ -32,10 +32,10 @@ Unlike a workflow orchestration framework where data is usually opaque, in CocoI

 ```python
 # import
-data['content'] = flow_builder.add_source(...)
+data['content'] = flow_builder.add_source(...)

 # transform
-data['out'] = data['content']
+data['out'] = data['content']
     .transform(...)
     .transform(...)

@@ -56,17 +56,17 @@ As a data framework, CocoIndex takes it to the next level on data freshness. **I
 The frameworks takes care of
 - Change data capture.
 - Figure out what exactly needs to be updated, and only updating that without having to recompute everything.
-
+
 This makes it fast to reflect any source updates to the target store. If you have concerns with surfacing stale data to AI agents and are spending lots of efforts working on infra piece to optimize the latency, the framework actually handles it for you.


 ## Quick Start:
-If you're new to CocoIndex, we recommend checking out
+If you're new to CocoIndex, we recommend checking out
 - 📖 [Documentation](https://cocoindex.io/docs)
 -[Quick Start Guide](https://cocoindex.io/docs/getting_started/quickstart)
-- 🎬 [Quick Start Video Tutorial](https://youtu.be/gv5R8nOXsWU?si=9ioeKYkMEnYevTXT)
+- 🎬 [Quick Start Video Tutorial](https://youtu.be/gv5R8nOXsWU?si=9ioeKYkMEnYevTXT)

-### Setup
+### Setup

 1. Install CocoIndex Python library

@@ -136,8 +136,8 @@ It defines an index flow like this:
 | [Google Drive Text Embedding](examples/gdrive_text_embedding) | Index text documents from Google Drive |
 | [Docs to Knowledge Graph](examples/docs_to_knowledge_graph) | Extract relationships from Markdown documents and build a knowledge graph |
 | [Embeddings to Qdrant](examples/text_embedding_qdrant) | Index documents in a Qdrant collection for semantic search |
-| [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
-| [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
+| [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
+| [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
 | [Image Search with Vision API](examples/image_search) | Generates detailed captions for images using a vision model, embeds them, enables live-updating semantic search via FastAPI and served on a React frontend|

 More coming and stay tuned 👀!
@@ -159,7 +159,7 @@ Join our community here:
 - 📜 [Read our blog posts](https://cocoindex.io/blogs/)

 ## Support us:
-We are constantly improving, and more features and examples are coming soon. If you love this project, please drop us a star ⭐ at GitHub repo [![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex) to stay tuned and help us grow.
+We are constantly improving, and more features and examples are coming soon. If you love this project, please drop us a star ⭐ at GitHub repo [![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex) to stay tuned and help us grow.

 ## License
 CocoIndex is Apache 2.0 licensed.

check.sh

Lines changed: 0 additions & 12 deletions
This file was deleted.

docs/docs/about/contributing.md

Lines changed: 19 additions & 10 deletions
@@ -15,22 +15,22 @@ We use [GitHub Issues](https://github.com/cocoindex-io/cocoindex/issues) to trac

 We tag issues with the ["good first issue"](https://github.com/cocoindex-io/cocoindex/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) label for beginner contributors.

-## How to Contribute
+## How to Contribute
 - If you decide to work on an issue, unless the PR can be sent immediately (e.g. just a few lines of code), we recommend you to leave a comment on the issue like **`I'm working on it`** or **`Can I work on this issue?`** to avoid duplicating work.
 - For larger features, we recommend you to discuss with us first in our [Discord server](https://discord.com/invite/zpA9S2DR7s) to coordinate the design and work.
 - Our [Discord server](https://discord.com/invite/zpA9S2DR7s) are constantly open. If you are unsure about anything, it is a good place to discuss! We'd love to collaborate and will always be friendly.

-## Start hacking! Setting Up Development Environment
+## Start hacking! Setting Up Development Environment
 Following the steps below to get cocoindex build on latest codebase locally - if you are making changes to cocoindex funcionality and want to test it out.

 - 🦀 [Install Rust](https://rust-lang.org/tools/install)
-
+
 If you don't have Rust installed, run
 ```sh
 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
 ```
-Already have Rust? Make sure it's up to date
-```sh
+Already have Rust? Make sure it's up to date
+```sh
 rustup update
 ```

@@ -46,14 +46,19 @@ Following the steps below to get cocoindex build on latest codebase locally - if

 - Install required tools:
 ```sh
-pip install maturin mypy ruff
+pip install maturin mypy pre-commit
 ```

 - Build the library. Run at the root of cocoindex directory:
 ```sh
 maturin develop
 ```

+- Install and enable pre-commit hooks. This ensures all checks run automatically before each commit:
+```sh
+pre-commit install
+```
+
 - Before running a specific example, set extra environment variables, for exposing extra traces, allowing dev UI, etc.
 ```sh
 . ./.env.lib_debug
@@ -67,10 +72,14 @@ To submit your code:
 1. Fork the [CocoIndex repository](https://github.com/cocoindex-io/cocoindex)
 2. [Create a new branch](https://docs.github.com/en/desktop/making-changes-in-a-branch/managing-branches-in-github-desktop) on your fork
 3. Make your changes
-4. Make sure all tests and linting pass by running
-```sh
-./check.sh
-```
+4. Run the pre-commit checks (automatically triggered on `git commit`)
+
+:::tip
+To run them manually (same as CI):
+```sh
+pre-commit run --all-files
+```
+:::

 5. [Open a Pull Request (PR)](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork) when your work is ready for review

docs/docs/ai/llm.mdx

Lines changed: 3 additions & 3 deletions
@@ -136,9 +136,9 @@ pip install 'litellm[proxy]'
 **Example for OpenAI:**
 ```yaml
 model_list:
-  - model_name: "*"
+  - model_name: "*"
     litellm_params:
-      model: openai/*
+      model: openai/*
       api_key: os.environ/LITELLM_API_KEY
 ```

@@ -176,7 +176,7 @@ litellm --config config.yml
 ```python
 cocoindex.LlmSpec(
     api_type=cocoindex.LlmApiType.LITE_LLM,
-    model="deepseek-r1",
+    model="deepseek-r1",
     address="http://127.0.0.1:4000", # default url of LiteLLM
 )
 ```

docs/docs/core/basics.md

Lines changed: 2 additions & 2 deletions
@@ -71,7 +71,7 @@ An indexing flow, once set up, maintains a long-lived relationship between data

 * **One time update**: Once triggered, CocoIndex updates the target data to reflect the version of source data up to the current moment.
 * **Live update**: CocoIndex continuously reacts to changes of source data and updates the target data accordingly, based on various **change capture mechanisms** for the source.
-
+
 See more details in the [build / update target data](flow_methods#build--update-target-data) section.

 3. CocoIndex intelligently reprocesses to propagate source changes to target by:
@@ -101,4 +101,4 @@ As an indexing flow is long-lived, it needs to store intermediate data to keep t
 CocoIndex uses internal storage for this purpose.

 Currently, CocoIndex uses Postgres database as the internal storage.
-See [Settings](settings#databaseconnectionspec) for configuring its location, and `cocoindex setup` CLI command (see [CocoIndex CLI](cli)) creates tables for the internal storage.
+See [Settings](settings#databaseconnectionspec) for configuring its location, and `cocoindex setup` CLI command (see [CocoIndex CLI](cli)) creates tables for the internal storage.
