This repository was archived by the owner on Nov 2, 2025. It is now read-only.

Commit 3556e34

pqn and Copybara Bot authored
Project import generated by Copybara. (#3)
GitOrigin-RevId: b25b2d3469e0423a67b645a4ebccf3cecef61cdc
Co-authored-by: Copybara Bot <[email protected]>
1 parent 193e9a3 commit 3556e34

9 files changed (+243, -0 lines)

.github/workflows/ci.yml

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+name: CI
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref_name }}-${{ github.ref_name != 'main' || github.sha }}
+  cancel-in-progress: true
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Check out repository code
+        uses: actions/checkout@v3
+        with:
+          ref: ${{ github.event.pull_request.head.sha || github.sha }}
+          token: ${{ secrets.ACTIONS }}
+      - name: Download parse binary
+        run: ./download_parse.sh
+      - name: Run tests
+        run: ./test.sh

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+parse.gz
+parse

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2023 Exafunction
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

README.md

Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
+<p align="center">
+<img width="300" alt="Codeium" src="codeium.svg"/>
+</p>
+
+---
+
+[![Discord](https://img.shields.io/discord/1027685395649015980?label=community&color=5865F2&logo=discord&logoColor=FFFFFF)](https://discord.gg/3XFf78nAx5)
+[![Twitter Follow](https://img.shields.io/badge/style--blue?style=social&logo=twitter&label=Follow%20%40codeiumdev)](https://twitter.com/intent/follow?screen_name=codeiumdev)
+![License](https://img.shields.io/github/license/Exafunction/codeium-parse)
+
+[![Visual Studio](https://img.shields.io/visual-studio-marketplace/i/Codeium.codeium?label=Visual%20Studio&logo=visualstudio)](https://marketplace.visualstudio.com/items?itemName=Codeium.codeium)
+[![JetBrains](https://img.shields.io/jetbrains/plugin/d/20540?label=JetBrains)](https://plugins.jetbrains.com/plugin/20540-codeium/)
+[![Open VSX](https://img.shields.io/open-vsx/dt/Codeium/codeium?label=Open%20VSX)](https://open-vsx.org/extension/Codeium/codeium)
+[![Google Chrome](https://img.shields.io/chrome-web-store/users/hobjkcpmjhlegmobgonaagepfckjkceh?label=Google%20Chrome&logo=googlechrome&logoColor=FFFFFF)](https://chrome.google.com/webstore/detail/codeium/hobjkcpmjhlegmobgonaagepfckjkceh)
+
+# codeium-parse
+
+This repository contains tools built with [tree-sitter](https://github.com/tree-sitter/tree-sitter) that let you:
+* Inspect the concrete syntax tree of a source file
+* Use pre-written tree-sitter query files to locate important symbols in source code
+* Optionally format output in JSON to use the results in your own applications
+
+Contributions welcome. These queries are used by Codeium Search to index your
+code locally for semantic search! Adding queries for your language here will
+enable Codeium Search to work better on your own code!
+
+In particular, this repo provides a binary prepackaged with:
+* A recent version of the tree-sitter library
+* A large number of tree-sitter grammars
+* An implementation of many common query predicates
+
+## Usage example
+
+```console
+$ ./download_parse.sh
+$ ./parse -file examples/example.js -named_only
+program [0, 0] - [4, 0] "// Adds two numbers.\n…"
+  comment [0, 0] - [0, 20] "// Adds two numbers."
+  function_declaration [1, 0] - [3, 1] "function add(a, b) {\n…"
+    name: identifier [1, 9] - [1, 12] "add"
+    parameters: formal_parameters [1, 12] - [1, 18] "(a, b)"
+      identifier [1, 13] - [1, 14] "a"
+      identifier [1, 16] - [1, 17] "b"
+    body: statement_block [1, 19] - [3, 1] "{\n…"
+      return_statement [2, 4] - [2, 17] "return a + b;"
+        binary_expression [2, 11] - [2, 16] "a + b"
+          left: identifier [2, 11] - [2, 12] "a"
+          right: identifier [2, 15] - [2, 16] "b"
+$ ./parse -file examples/example.js -use_tags_query -json | jq ".captures.doc[0].text"
+"// Adds two numbers."
+```
+
+## Support status
+
+### Queries
+
+Queries try to follow the [conventions established by tree-sitter.](https://tree-sitter.github.io/tree-sitter/code-navigation-systems)
+
+Most captures also include documentation as `@doc`. `@definition.function` and `@definition.method` also capture `@codeium.parameters`.
+
+| | Python | TypeScript | JavaScript | Go |
+| ---------------------- | ------ | ---------- | ---------- | --- |
+| `@definition.class` |||||
+| `@definition.function` ||[^3] |||
+| `@definition.method` |[^1] |[^3] |||
+| `@definition.interface` | N/A || N/A ||
+| `@definition.namespace` | N/A || N/A | N/A |
+| `@definition.module` | N/A || N/A | N/A |
+| `@definition.type` | N/A || N/A ||
+| `@definition.constant` |||||
+| `@definition.enum` |||||
+| `@reference.call` |||||
+| `@reference.class` |[^2] ||||
+
+[^1]: Currently functions and methods are not distinguished in Python.
+[^2]: Function calls and class instantiation are indistinguishable in Python.
+[^3]: Function and method signatures are captured individually in TypeScript. Therefore, the `@doc` capture may not exist on all nodes.
+
+Want to write a query for a new language? `tags.scm` and other queries in each language's tree-sitter repository, [like tree-sitter-javascript](https://github.com/tree-sitter/tree-sitter-javascript/blob/5720b249490b3c17245ba772f6be4a43edb4e3b7/queries/tags.scm), are a good place to start.
+
+### Query predicates
+
+```console
+$ ./parse -supported_predicates
+#eq?/#not-eq?
+(#eq? <@capture|"literal"> <@capture|"literal">)
+Checks if two values are equal.
+
+#has-parent?/#not-has-parent?
+(#has-parent? @capture node_type...)
+Checks if @capture has a parent node of any of the given types.
+
+#has-type?/#not-has-type?
+(#has-type? @capture node_type...)
+Checks if @capture has a node of any of the given types.
+
+#match?/#not-match?
+(#match? @capture "regex")
+Checks if the text for @capture matches the given regular expression.
+
+#select-adjacent!
+(#select-adjacent! @capture @anchor)
+Selects @capture nodes contiguous with @anchor (all starting and ending on
+adjacent lines).
+
+#strip!
+(#strip! @capture "regex")
+Removes all matching text from all @capture nodes.
+```
+
+Need a predicate which hasn't been implemented? [File an issue!](https://github.com/Exafunction/codeium-parse/issues/new) We try to use [predicates from nvim-treesitter.](https://github.com/nvim-treesitter/nvim-treesitter/blob/980f0816cc28c20e45715687a0a21b5b39af59eb/lua/nvim-treesitter/query_predicates.lua)
+
+### Grammars
+
+```console
+$ ./parse -supported_languages
+c
+cpp
+csharp
+css
+dart
+go
+hcl
+html
+java
+javascript
+json
+kotlin
+latex
+markdown
+php
+protobuf
+python
+ruby
+rust
+shell
+svelte
+toml
+tsx
+typescript
+vue
+yaml
+```
+
+Looking for support for another language? [File an issue](https://github.com/Exafunction/codeium-parse/issues/new) with a link to the repo that contains the grammar.
+
+## Contributing
+
+Pull requests are welcome. For non-issue discussions about `codeium-parse`, [join
+our Discord.](https://discord.gg/3XFf78nAx5)
+
+### Adding and testing queries
+
+* You can create new source files with patterns you want to target in `test_files/`.
+* Look at the syntax tree using `./parse -file test_files/<your file>` to get a sense of how to capture the pattern.
+* Learn the query syntax from [tree-sitter documentation.](https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries)
+* Run `./goldens.sh` to see what your query captures.
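
Note on the README's usage example: it pipes the `-json` output through `jq` to read the first `@doc` capture. As a rough sketch of reusing that in a script, the helper below reads JSON from standard input and prints that capture's text. The name `first_doc_text` is hypothetical and not part of this repository. The only thing assumed about the JSON shape is the `.captures.doc[0].text` path shown in the README; nothing else about the output format is known here.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not part of codeium-parse): print the text of the first
# `doc` capture from JSON on stdin. Prints nothing if that path is absent.
first_doc_text() {
    jq -r '.captures.doc[0].text // empty'
}

# Example usage, assuming ./parse has already been downloaded:
#   ./parse -file examples/example.js -use_tags_query -json | first_doc_text
```

Using `jq -r` emits the raw string without JSON quoting, which is usually easier to consume from other shell tools.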

codeium.svg

Lines changed: 9 additions & 0 deletions

download_parse.sh

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+#!/bin/bash
+set -euo pipefail
+
+cd "$(dirname "${BASH_SOURCE[0]}")"
+VERSION="v0.0.1"
+rm -f parse.gz parse
+curl -Lo parse.gz "https://github.com/Exafunction/codeium-parse/releases/download/$VERSION/parse.gz"
+gzip -d parse.gz
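
Note on `download_parse.sh`: `gzip -d` keeps the file mode of the archive it decompresses, so whether `parse` ends up executable depends on how `parse.gz` was written to disk. Below is a minimal sketch, under that assumption, of an unpack step that validates the archive and sets the executable bit explicitly. The function name `unpack_binary` is hypothetical and does not appear in the commit.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not in the commit): unpack_binary <archive.gz> <dest>
unpack_binary() {
    local archive="$1" dest="$2"
    # Fail early if the download is truncated or is not gzip data
    # (for example, an HTML error page saved by curl).
    gzip -t "$archive"
    # Decompress to the destination, leaving the archive in place.
    gzip -dc "$archive" > "$dest"
    # Make the result executable regardless of the archive's mode.
    chmod +x "$dest"
}

# Example usage after the curl step:
#   unpack_binary parse.gz parse && rm -f parse.gz
```

Decompressing with `gzip -dc` into a new file, instead of in place, keeps the archive around until the binary has been written successfully.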

examples/example.js

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+// Adds two numbers.
+function add(a, b) {
+    return a + b;
+}

goldens.sh

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+#!/bin/bash
+set -euo pipefail
+
+cd "$(dirname "${BASH_SOURCE[0]}")"
+for test_file in test_files/*; do
+    test_file="$(basename "$test_file")"
+    echo "$test_file"
+    ./parse -file "test_files/$test_file" -use_tags_query -tags_query_dir "queries" > "goldens/$test_file.golden"
+done

test.sh

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+#!/bin/bash
+set -euo pipefail
+
+cd "$(dirname "${BASH_SOURCE[0]}")"
+for test_file in test_files/*; do
+    test_file="$(basename "$test_file")"
+    echo "$test_file"
+    ./parse -file "test_files/$test_file" -use_tags_query -tags_query_dir "queries" > "goldens/$test_file.golden.tmp"
+    trap 'rm "goldens/$test_file.golden.tmp"' EXIT
+    diff -u "goldens/$test_file.golden" "goldens/$test_file.golden.tmp"
+done
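
Note on `test.sh`: the `EXIT` trap is re-registered on every iteration, and its single-quoted body expands `$test_file` only when the script exits. As far as we can tell, that means only the temporary file for the last processed test is removed, and earlier `.golden.tmp` files stay behind. One possible alternative, sketched here as an assumption and not as the project's method, is to do the comparison in a helper that cleans up its own temporary file. The name `check_golden` is hypothetical.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper (not in the commit): check_golden <golden_file> <command...>
# Runs the command, diffs its stdout against the golden file, and always
# removes the temporary file before returning the resulting status.
check_golden() {
    local golden="$1"
    shift
    local tmp status=0
    tmp="$(mktemp)"
    if "$@" > "$tmp"; then
        # Command succeeded: a non-zero diff status means a mismatch.
        diff -u "$golden" "$tmp" || status=$?
    else
        # Command itself failed: propagate its exit status.
        status=$?
    fi
    rm -f "$tmp"
    return "$status"
}

# Example usage inside the loop of test.sh:
#   check_golden "goldens/$test_file.golden" \
#       ./parse -file "test_files/$test_file" -use_tags_query -tags_query_dir "queries"
```

Because the helper returns a non-zero status on a mismatch, a caller running under `set -e` still stops at the first failing file, as the original loop does.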
