
Commit 51caa47

Build a tree-sitter playground with the grammars that Piranha uses (#736)
Piranha uses several grammar repositories with custom patches to support the transformations. While these patches are being upstreamed, there may be discrepancies between the grammars in this repository and the upstream grammars.

This PR adds a build script that:

(1) instantiates the `index.html.template` (slightly modified from [tree-sitter-cli-playground-html]) and copies it to the given `dist` directory;

(2) clones the (custom) grammar repositories that we use in Piranha (by parsing the `Cargo.toml` file), checks out the specific versions, builds the WASM files for the grammars, and copies them to the `dist/assets` directory.

The resulting `dist` directory can then be served by any web server (e.g., `python -m http.server`).

We also add a GitHub Actions job to automatically deploy to GitHub Pages whenever the grammars have been updated in Piranha (or the build script / GitHub Actions configuration changes).

This is tested in my fork: https://yuxincs.github.io/piranha/tree-sitter-playground/

[tree-sitter-cli-playground-html]: https://github.com/tree-sitter/tree-sitter/blob/eaa10b279f208b47f65e77833d65763f072f3030/crates/cli/src/playground.html#L13
1 parent b7a83b1 commit 51caa47

7 files changed, +729 -0 lines changed

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+name: Deploy Tree-sitter Playground
+
+on:
+  push:
+    branches: [ master ]
+    paths:
+      - 'playground/tree-sitter/**'
+      - '**/Cargo.toml'
+      - '.github/workflows/deploy_tree_sitter_playground.yml'
+
+jobs:
+  build-and-deploy:
+    runs-on: ubuntu-latest
+
+    permissions:
+      contents: read
+      pages: write
+      id-token: write
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Setup Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.x'
+
+      - name: Setup Rust toolchain
+        uses: actions-rust-lang/setup-rust-toolchain@v1
+        with:
+          toolchain: stable
+
+      - name: Install tree-sitter CLI
+        # We have to lock tree-sitter CLI to 0.24 since newer CLI does not support building
+        # the old grammars we use in Piranha.
+        run: cargo install tree-sitter-cli --version "=0.24"
+
+      - name: Build playground
+        run: python build.py --dist-dir ./dist/tree-sitter-playground
+        working-directory: playground/tree-sitter
+
+      - name: Setup Pages
+        uses: actions/configure-pages@v4
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: playground/tree-sitter/dist
+
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
File renamed without changes.

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -45,3 +45,6 @@ env/
 npm-debug.log*
 yarn-debug.log*
 yarn-error.log*
+
+# Built website directories
+dist/

README.md

Lines changed: 9 additions & 0 deletions
@@ -99,6 +99,15 @@ A few additional links on Piranha:
 
 If you have any questions on how to use Piranha or find any bugs, please [open a GitHub issue](https://github.com/uber/piranha/issues).
 
+## Piranha Development
+
+Piranha uses several grammar repositories with custom patches to support the transformations. While
+these patches are being upstreamed, there may be discrepancies between the grammars in this
+repository and the upstream grammars. Therefore, we have built a custom tree-sitter playground
+that can be used to test the grammars and queries for easier development:
+
+https://uber.github.io/piranha/tree-sitter-playground/
+
 ## License
 Piranha is licensed under the Apache 2.0 license. See the LICENSE file for more information.
 

playground/tree-sitter/README.md

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+# Tree-sitter Playground for Piranha
+
+Piranha uses several grammar repositories with custom patches to support the transformations. While
+these patches are being upstreamed, there may be discrepancies between the grammars in this
+repository and the upstream grammars.
+
+This directory contains the build script to
+
+(1) instantiate the index.html.template (slightly modified from [tree-sitter-cli-playground-html])
+and copy it to the given `dist` directory.
+
+(2) clone the (custom) grammar repositories that we use in Piranha (by parsing the `Cargo.toml`
+file), check out the specific versions, and then build the WASM files for the grammars and copy
+them to the `dist/assets` directory.
+
+(3) the `dist` directory can be served by any web server (e.g., `python -m http.server`).
+
+We host the playground at https://uber.github.io/piranha/tree-sitter-playground/ via GitHub Pages.
+
+[tree-sitter-cli-playground-html]: https://github.com/tree-sitter/tree-sitter/blob/eaa10b279f208b47f65e77833d65763f072f3030/crates/cli/src/playground.html#L13
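
The README above notes that the built `dist` directory can be served by any web server, e.g. `python -m http.server`. For local testing from a script, a rough programmatic equivalent might look like the sketch below. It is not part of this commit: the helper name `serve_dist`, the loopback address, and the default port are assumptions chosen for illustration.

```python
from __future__ import annotations

import functools
import http.server
import threading
from pathlib import Path


def serve_dist(dist_dir: Path, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Serve dist_dir over HTTP from a background thread (hypothetical helper)."""
    # Bind the static-file handler to dist_dir instead of the current directory.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(dist_dir)
    )
    # Assumption: loopback only, since this is meant for local previews.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A caller would keep the returned server object and call `server.shutdown()` followed by `server.server_close()` when done. Passing `port=0` lets the OS pick a free port, which can then be read from `server.server_address[1]`.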

playground/tree-sitter/build.py

Lines changed: 209 additions & 0 deletions
@@ -0,0 +1,209 @@
+#!/usr/bin/env python3
+
+from __future__ import annotations
+
+import argparse
+import json
+import shutil
+import subprocess
+import tempfile
+import urllib.parse
+from pathlib import Path
+from typing import Dict, List
+
+_HERE = Path(__file__).parent
+
+_LANGUAGES = ["swift", "kotlin", "java", "go", "python"]
+
+
+def run_command(cmd: List[str], cwd: Path | None = None) -> subprocess.CompletedProcess[str]:
+    """Run a shell command and return the result."""
+    try:
+        result = subprocess.run(
+            cmd, cwd=cwd, capture_output=True, text=True, check=True
+        )
+        return result
+    except subprocess.CalledProcessError as e:
+        print(f"Command failed: {' '.join(cmd)}")
+        print(f"Error: {e.stderr}")
+        raise
+
+
+def extract_tree_sitter_deps(repo_root: Path) -> list[tuple[str, str, dict[str, str]]]:
+    """Extract tree-sitter dependencies from cargo metadata."""
+    cmd = ["cargo", "metadata", "--format-version", "1"]
+    result = run_command(cmd, cwd=repo_root)
+    metadata = json.loads(result.stdout)
+
+    deps = []
+
+    for package in metadata["packages"]:
+        if not package["name"].startswith("tree-sitter-"):
+            continue
+
+        lang_name = package["name"].replace("tree-sitter-", "")
+        if lang_name in _LANGUAGES:
+            deps.append((lang_name, package["name"], package))
+
+    return deps
+
+
+def parse_git_source(dep_info: Dict) -> tuple[str, str]:
+    """Parse git source information from dependency."""
+    source: str = dep_info["source"]
+    # An example git string: "git+https://github.com/danieltrt/tree-sitter-go.git?rev=ea5ceb716012db8813a2c05fab23c3a020988724#ea5ceb716012db8813a2c05fab23c3a020988724"
+    # So we first remove the "git+" prefix and remove the "#" part if it exists.
+    source = source.removeprefix("git+").split("#")[0].strip()
+
+    if "?" not in source:
+        raise ValueError(f"Expecting ? in git source string: {source}")
+
+    git_url, query_string = source.split("?", 1)
+    params = urllib.parse.parse_qs(query_string)
+    rev = (
+        params.get("rev", [None])[0]
+        or params.get("branch", [None])[0]
+        or params.get("tag", [None])[0]
+    )
+    if not rev:
+        raise ValueError(f"Missing rev/branch/tag information in git source: {source}")
+
+    return git_url, rev
+
+
+def clone_grammar(name: str, dep_info: Dict, temp_dir: Path) -> Path:
+    """Clone a grammar repository to temporary directory."""
+    source: str = dep_info["source"]
+    # If it is a git source, parse it. Otherwise, it is a registry source, and we can assume it is
+    # from tree-sitter official repo.
+    if source.startswith("git+"):
+        git_url, version = parse_git_source(dep_info)
+    elif source.startswith("registry+"):
+        repo_name = name.replace("tree-sitter-", "")
+        git_url = f"https://github.com/tree-sitter/tree-sitter-{repo_name}"
+        version = "v" + dep_info["version"]
+    else:
+        raise ValueError(f"Unsupported source type for {name}: {source}")
+
+    clone_dir = temp_dir / name
+
+    print(f"Cloning {name} from {git_url} and checking out {version}")
+    run_command(["git", "clone", git_url, str(clone_dir)])
+    run_command(["git", "checkout", version], cwd=clone_dir)
+
+    return clone_dir
+
+
+def build_wasm(grammar_dir: Path, name: str) -> Path:
+    """Build WASM file for a grammar."""
+    print(f"Building WASM for {name}")
+
+    # Note that we have to use tree-sitter CLI 0.24 since the main tree-sitter and grammars
+    # we use in Piranha are old and not compatible with the latest tree-sitter CLI.
+    # TODO: remove this restriction once we upstream all our changes to tree-sitter grammars
+    # and upgrade to latest tree-sitter in Piranha.
+    try:
+        proc = run_command(["tree-sitter", "--version"])
+        version = proc.stdout.strip().split()[1]
+        if not version.startswith("0.24"):
+            raise RuntimeError(f"tree-sitter CLI version {version} not supported")
+    except (subprocess.CalledProcessError, FileNotFoundError, RuntimeError):
+        raise RuntimeError(
+            "tree-sitter CLI version 0.24.x is required. Install with: cargo install tree-sitter-cli --version 0.24.4"
+        )
+
+    print(f"Using tree-sitter CLI version: {proc.stdout.strip()}")
+
+    run_command(["tree-sitter", "build", "--wasm"], cwd=grammar_dir)
+
+    wasm_file = grammar_dir / f"{name}.wasm"
+    if not wasm_file.exists():
+        raise FileNotFoundError(f"WASM file not found for {name}")
+
+    return wasm_file
+
+
+def copy_wasm_to_assets(wasm_file: Path, lang_name: str, assets_dir: Path) -> Path:
+    """Copy WASM file to assets directory."""
+    assets_dir.mkdir(exist_ok=True)
+
+    dest_file = assets_dir / f"tree-sitter-{lang_name}.wasm"
+
+    print(f"Copying {wasm_file} to {dest_file}")
+    shutil.copy2(wasm_file, dest_file)
+
+    return dest_file
+
+
+def instantiate_index_html(template_path: Path, output_path: Path):
+    with template_path.open("r") as inp, output_path.open("w") as out:
+        content = inp.read()
+        languages = [
+            f'<option value="{lang}">{lang.title()}</option>' for lang in _LANGUAGES
+        ]
+        content = content.replace("{{ LANGUAGE_OPTIONS }}", "\n".join(languages))
+        out.write(content)
+
+
+def main():
+    """Build WASM files for all supported tree-sitter dependencies.
+
+    Main entry point with argument parsing."""
+    parser = argparse.ArgumentParser(
+        description="Build tree-sitter playground with WASM files"
+    )
+    parser.add_argument(
+        "--dist-dir",
+        "-d",
+        type=Path,
+        help="Directory to copy playground files and build WASM files to",
+        default=Path.cwd() / "dist",
+    )
+
+    args = parser.parse_args()
+    dist_dir = Path(args.dist_dir)
+
+    if dist_dir.exists():
+        print(f"Dist directory {dist_dir} already exists, clearing it...")
+        shutil.rmtree(dist_dir)
+
+    proc = run_command(["git", "rev-parse", "--show-toplevel"])
+    repo_root = Path(proc.stdout.strip())
+    print(f"Using repo root: {repo_root}")
+    print()
+
+    print("Instantiating index.html.template to dist directory...")
+    dist_dir.mkdir(parents=True, exist_ok=True)
+    instantiate_index_html(_HERE / "index.html.template", dist_dir / "index.html")
+
+    print("Building WASM files for all supported tree-sitter dependencies...")
+
+    print("Extracting tree-sitter dependencies to build WASM grammars...")
+    deps = extract_tree_sitter_deps(repo_root)
+
+    if not deps:
+        raise RuntimeError("No supported tree-sitter dependencies found")
+
+    print(f"Found {len(deps)} supported tree-sitter dependencies:")
+    for lang_name, pkg_name, _ in deps:
+        print(f" - {pkg_name} ({lang_name})")
+
+    with tempfile.TemporaryDirectory() as temp_dir_str:
+        temp_dir = Path(temp_dir_str)
+
+        for lang_name, pkg_name, dep_info in deps:
+            print(f"\n--- Processing {pkg_name} ---")
+
+            grammar_dir = clone_grammar(pkg_name, dep_info, temp_dir)
+            wasm_file = build_wasm(grammar_dir, pkg_name)
+            copy_wasm_to_assets(wasm_file, lang_name, dist_dir / "assets")
+
+            print(f"✓ Successfully built {pkg_name}")
+
+    print("\n=== Build Complete ===")
+    print(f"Successfully built {len(deps)} grammars: {[lang for lang, _, _ in deps]}")
+    print(f"Output directory: {dist_dir}")
+
+
+if __name__ == "__main__":
+    main()
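
To make the source-string handling in `parse_git_source` and `clone_grammar` concrete, here is a small self-contained sketch that resolves a `cargo metadata` source string to a clone URL and a ref to check out. It is an illustration rather than the code in this commit: `resolve_checkout` is a hypothetical name, the version number passed in the usage example is made up, and the `v<version>` tag convention for registry crates is the same assumption `build.py` makes.

```python
from __future__ import annotations

import urllib.parse


def resolve_checkout(name: str, source: str, version: str) -> tuple[str, str]:
    """Map a cargo-metadata package to (git_url, ref). Hypothetical helper."""
    if source.startswith("git+"):
        # Drop the "git+" prefix and the trailing "#<resolved-commit>" fragment.
        stripped = source[len("git+"):].split("#")[0].strip()
        git_url, _, query = stripped.partition("?")
        params = urllib.parse.parse_qs(query)
        # Prefer an exact revision, then a branch, then a tag.
        for key in ("rev", "branch", "tag"):
            if params.get(key):
                return git_url, params[key][0]
        raise ValueError(f"Missing rev/branch/tag in git source: {source}")
    if source.startswith("registry+"):
        # Assumption (as in build.py): the crate comes from the official
        # tree-sitter organization and releases are tagged "v<version>".
        return f"https://github.com/tree-sitter/{name}", "v" + version
    raise ValueError(f"Unsupported source type for {name}: {source}")


if __name__ == "__main__":
    # The git source string is the example quoted in build.py's comment.
    src = (
        "git+https://github.com/danieltrt/tree-sitter-go.git"
        "?rev=ea5ceb716012db8813a2c05fab23c3a020988724"
        "#ea5ceb716012db8813a2c05fab23c3a020988724"
    )
    print(resolve_checkout("tree-sitter-go", src, "0.20.0"))
```

For the git source above, the result is the repository URL without the `git+` prefix, paired with the pinned revision. A registry source falls back to the official repository and a `v`-prefixed tag, which only works if that tag actually exists upstream.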
