| 1 | +# Adding Tree-Sitter Language Support via Git Submodules |
| 2 | + |
| 3 | +This document provides step-by-step instructions for adding new tree-sitter language parsers (specifically tree-sitter-abl and tree-sitter-df) to the Roo Code codebase using git submodules. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The goal is to integrate the following tree-sitter repositories: |
| 8 | + |
| 9 | +- [tree-sitter-abl](https://github.com/usagi-coffee/tree-sitter-abl) - For OpenEdge ABL language support |
| 10 | +- [tree-sitter-df](https://github.com/usagi-coffee/tree-sitter-df) - For OpenEdge Data Dictionary (.df) file support |
| 11 | + |
| 12 | +## Step-by-Step Instructions |
| 13 | + |
| 14 | +### 1. Add Git Submodules |
| 15 | + |
| 16 | +First, create a `deps/` directory in the project root and add the tree-sitter repositories as submodules: |
| 17 | + |
| 18 | +```bash |
| 19 | +# Create deps directory if it doesn't exist |
| 20 | +mkdir -p deps |
| 21 | + |
| 22 | +# Add tree-sitter-abl as a submodule |
| 23 | +git submodule add https://github.com/usagi-coffee/tree-sitter-abl.git deps/tree-sitter-abl |
| 24 | + |
| 25 | +# Add tree-sitter-df as a submodule |
| 26 | +git submodule add https://github.com/usagi-coffee/tree-sitter-df.git deps/tree-sitter-df |
| 27 | + |
| 28 | +# Initialize and update submodules |
| 29 | +git submodule update --init --recursive |
| 30 | +``` |
| 31 | + |
| 32 | +### 2. Build WASM Files from Submodules |
| 33 | + |
| 34 | +You'll need to compile the tree-sitter grammars to WASM format. This requires the tree-sitter CLI tool: |
| 35 | + |
| 36 | +```bash |
| 37 | +# Install tree-sitter CLI if not already installed |
| 38 | +npm install -g tree-sitter-cli |
| 39 | + |
| 40 | +# Build WASM for tree-sitter-abl |
| 41 | +cd deps/tree-sitter-abl |
| 42 | +tree-sitter build --wasm |
| 43 | +# This creates tree-sitter-abl.wasm |
| 44 | + |
| 45 | +# Build WASM for tree-sitter-df |
| 46 | +cd ../tree-sitter-df |
| 47 | +tree-sitter build --wasm |
| 48 | +# This creates tree-sitter-df.wasm |
| 49 | + |
| 50 | +cd ../.. |
| 51 | +``` |
| 52 | + |
| 53 | +### 3. Update Build Process to Copy WASM Files |
| 54 | + |
| 55 | +Modify `packages/build/src/esbuild.ts` to include the new WASM files in the build process: |
| 56 | + |
| 57 | +```typescript |
| 58 | +// In the copyWasms function, after copying from tree-sitter-wasms: |
| 59 | + |
| 60 | +// Copy custom tree-sitter WASM files from deps |
| 61 | +const customWasmFiles = [ |
| 62 | + { source: "deps/tree-sitter-abl/tree-sitter-abl.wasm", name: "tree-sitter-abl.wasm" }, |
| 63 | + { source: "deps/tree-sitter-df/tree-sitter-df.wasm", name: "tree-sitter-df.wasm" }, |
| 64 | +] |
| 65 | + |
| 66 | +customWasmFiles.forEach(({ source, name }) => { |
| 67 | + const sourcePath = path.join(srcDir, "..", source) |
| 68 | + if (fs.existsSync(sourcePath)) { |
| 69 | + fs.copyFileSync(sourcePath, path.join(distDir, name)) |
| 70 | + console.log(`[copyWasms] Copied custom ${name} to ${distDir}`) |
| 71 | + } else { |
| 72 | + console.warn(`[copyWasms] Custom WASM file not found: ${sourcePath}`) |
| 73 | + } |
| 74 | +}) |
| 75 | +``` |
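The two hard-coded entries above follow one naming pattern, so the list could also be derived from grammar ids. The following is a minimal sketch, not the existing build script: `customWasmEntries` and `CustomWasmEntry` are hypothetical names, and it assumes each submodule lives in `deps/tree-sitter-<id>` with its `tree-sitter-<id>.wasm` in the submodule root, as step 2 describes.

```typescript
// Hypothetical helper; it only builds the entry list and does no file I/O.
interface CustomWasmEntry {
	source: string // path relative to the project root
	name: string // file name to write into the dist directory
}

// Assumption: the grammar for <id> is checked out at deps/tree-sitter-<id>
// and its built WASM file is named tree-sitter-<id>.wasm.
function customWasmEntries(grammarIds: string[]): CustomWasmEntry[] {
	return grammarIds.map((id) => {
		const name = `tree-sitter-${id}.wasm`
		return { source: `deps/tree-sitter-${id}/${name}`, name }
	})
}

const wasmEntries = customWasmEntries(["abl", "df"])
```

The result has the same shape as `customWasmFiles`, so adding another grammar would only mean adding its id to the array.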
| 76 | + |
| 77 | +### 4. Add File Extensions to Scanner |
| 78 | + |
| 79 | +Update `src/services/tree-sitter/index.ts` to include the new file extensions: |
| 80 | + |
| 81 | +```typescript |
| 82 | +const extensions = [ |
| 83 | + // ... existing extensions ... |
| 84 | + |
| 85 | + // OpenEdge ABL |
| 86 | + "p", // ABL procedure files |
| 87 | + "i", // ABL include files |
| 88 | + "w", // ABL window files |
| 89 | + "cls", // ABL class files |
| 90 | + |
| 91 | + // OpenEdge Data Dictionary |
| 92 | + "df", // Data dictionary files |
| 93 | + |
| 94 | + // ... rest of extensions ... |
| 95 | +].map((e) => `.${e}`) |
| 96 | +``` |
| 97 | + |
| 98 | +### 5. Add Language Parser Support |
| 99 | + |
| 100 | +Update `src/services/tree-sitter/languageParser.ts` to handle the new languages: |
| 101 | + |
| 102 | +```typescript |
| 103 | +// Add imports for the new query strings (create these first - see step 6) |
| 104 | +import ablQuery from "./queries/abl" |
| 105 | +import dfQuery from "./queries/df" |
| 106 | + |
| 107 | +// In the loadRequiredLanguageParsers function, add cases: |
| 108 | +case "p": |
| 109 | +case "i": |
| 110 | +case "w": |
| 111 | +case "cls": |
| 112 | + language = await loadLanguage("abl", sourceDirectory) |
| 113 | + query = new Query(language, ablQuery) |
| 114 | + break |
| 115 | + |
| 116 | +case "df": |
| 117 | + language = await loadLanguage("df", sourceDirectory) |
| 118 | + query = new Query(language, dfQuery) |
| 119 | + break |
| 120 | +``` |
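The `case` labels above amount to a lookup from file extension to grammar. As an illustration only, the same mapping can be written as a table. `languageForFile` and `extensionToLanguage` are hypothetical names, not part of `languageParser.ts`, and lowercasing the extension is an assumption about how the scanner should treat files such as `Customer.CLS`.

```typescript
type CustomLanguage = "abl" | "df"

// Extension (without the dot) to grammar id, mirroring the switch cases above.
const extensionToLanguage = new Map<string, CustomLanguage>([
	["p", "abl"],
	["i", "abl"],
	["w", "abl"],
	["cls", "abl"],
	["df", "df"],
])

// Returns the grammar id for a path, or undefined when the extension is not
// one of the OpenEdge extensions listed in step 4.
function languageForFile(filePath: string): CustomLanguage | undefined {
	const dot = filePath.lastIndexOf(".")
	if (dot < 0 || dot === filePath.length - 1) return undefined
	// Assumption: extensions are matched case-insensitively.
	return extensionToLanguage.get(filePath.slice(dot + 1).toLowerCase())
}
```

A table like this keeps the extension list from step 4 and the parser cases in one place, which may make it harder for the two to drift apart.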
| 121 | + |
| 122 | +### 6. Create Query Files |
| 123 | + |
| 124 | +Create query files for the new languages: |
| 125 | + |
| 126 | +**src/services/tree-sitter/queries/abl.ts:** |
| 127 | + |
| 128 | +```typescript |
| 129 | +export default ` |
| 130 | +; ABL Query for code definitions |
| 131 | +; Based on tree-sitter-abl grammar |
| 132 | + |
| 133 | +; Procedure definitions |
| 134 | +(procedure_statement |
| 135 | + name: (identifier) @name.definition.function) |
| 136 | + |
| 137 | +; Function definitions |
| 138 | +(function_statement |
| 139 | + name: (identifier) @name.definition.function) |
| 140 | + |
| 141 | +; Method definitions |
| 142 | +(method_statement |
| 143 | + name: (identifier) @name.definition.method) |
| 144 | + |
| 145 | +; Class definitions |
| 146 | +(class_statement |
| 147 | + name: (identifier) @name.definition.class) |
| 148 | + |
| 149 | +; Interface definitions |
| 150 | +(interface_statement |
| 151 | + name: (identifier) @name.definition.interface) |
| 152 | + |
| 153 | +; Variable definitions |
| 154 | +(define_variable_statement |
| 155 | + name: (identifier) @name.definition.variable) |
| 156 | + |
| 157 | +; Property definitions |
| 158 | +(define_property_statement |
| 159 | + name: (identifier) @name.definition.property) |
| 160 | + |
| 161 | +; Temp-table definitions |
| 162 | +(define_temp_table_statement |
| 163 | + name: (identifier) @name.definition.table) |
| 164 | +` |
| 165 | +``` |
| 166 | + |
| 167 | +**src/services/tree-sitter/queries/df.ts:** |
| 168 | + |
| 169 | +```typescript |
| 170 | +export default ` |
| 171 | +; Data Dictionary Query for schema definitions |
| 172 | +; Based on tree-sitter-df grammar |
| 173 | + |
| 174 | +; Table definitions |
| 175 | +(table_definition |
| 176 | + name: (identifier) @name.definition.table) |
| 177 | + |
| 178 | +; Field definitions |
| 179 | +(field_definition |
| 180 | + name: (identifier) @name.definition.field) |
| 181 | + |
| 182 | +; Index definitions |
| 183 | +(index_definition |
| 184 | + name: (identifier) @name.definition.index) |
| 185 | + |
| 186 | +; Sequence definitions |
| 187 | +(sequence_definition |
| 188 | + name: (identifier) @name.definition.sequence) |
| 189 | +` |
| 190 | +``` |
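Before wiring a query file into the parser, it can help to list the capture names it declares and compare them with the names the rest of the code expects. The sketch below is a regex-based heuristic, not tree-sitter's query parser: `captureNames` is a hypothetical helper, it skips only whole-line `;` comments, and it does not check that the node types exist in the grammar.

```typescript
// Collects the distinct @capture names in a query string, in order of first use.
function captureNames(query: string): string[] {
	const withoutComments = query
		.split("\n")
		.filter((line) => !/^\s*;/.test(line)) // drop whole-line comments
		.join("\n")
	const matches = withoutComments.match(/@[A-Za-z_][\w.-]*/g) || []
	return Array.from(new Set(matches.map((m) => m.slice(1))))
}

// A shortened copy of the df query from this step, used as sample input.
const sampleQuery = `
; Table definitions
(table_definition
  name: (identifier) @name.definition.table)

; Field definitions
(field_definition
  name: (identifier) @name.definition.field)
`

const foundCaptures = captureNames(sampleQuery)
```

For the sample input, `foundCaptures` holds `name.definition.table` and `name.definition.field`. A misspelled capture would show up here before any WASM file is loaded.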
| 191 | + |
| 192 | +### 7. Add to Fallback Extensions (Optional) |
| 193 | + |
| 194 | +If the parsers are not stable or complete, you may want to add these extensions to the fallback list in `src/services/code-index/shared/supported-extensions.ts`: |
| 195 | + |
| 196 | +```typescript |
| 197 | +export const fallbackExtensions = [ |
| 198 | + // ... existing extensions ... |
| 199 | + ".p", // ABL - use fallback if parser is incomplete |
| 200 | + ".i", // ABL include |
| 201 | + ".w", // ABL window |
| 202 | + ".cls", // ABL class |
| 203 | + ".df", // Data dictionary |
| 204 | +] |
| 205 | +``` |
| 206 | + |
| 207 | +### 8. Update GitHub Actions Workflow |
| 208 | + |
| 209 | +Modify `.github/workflows/code-qa.yml` to handle submodules: |
| 210 | + |
| 211 | +```yaml |
| 212 | +- name: Checkout code |
| 213 | + uses: actions/checkout@v4 |
| 214 | + with: |
| 215 | + submodules: recursive # Add this line to checkout submodules |
| 216 | + |
| 217 | +# Add a step to build WASM files from submodules |
| 218 | +- name: Build custom tree-sitter WASM files |
| 219 | + run: | |
| 220 | + # Install tree-sitter CLI |
| 221 | + npm install -g tree-sitter-cli |
| 222 | + |
| 223 | + # Build ABL WASM |
| 224 | + if [ -d "deps/tree-sitter-abl" ]; then |
| 225 | + cd deps/tree-sitter-abl |
| 226 | + tree-sitter build --wasm |
| 227 | + cd ../.. |
| 228 | + fi |
| 229 | + |
| 230 | + # Build DF WASM |
| 231 | + if [ -d "deps/tree-sitter-df" ]; then |
| 232 | + cd deps/tree-sitter-df |
| 233 | + tree-sitter build --wasm |
| 234 | + cd ../.. |
| 235 | + fi |
| 236 | +``` |
| 237 | + |
| 238 | +### 9. Add Tests |
| 239 | + |
| 240 | +Create test files to verify the new language support: |
| 241 | + |
| 242 | +**src/services/tree-sitter/__tests__/parseSourceCodeDefinitions.abl.spec.ts:** |
| 243 | + |
| 244 | +```typescript |
| 245 | +import { describe, it, expect } from "vitest" |
| 246 | +import { parseTestFile } from "./helpers" |
| 247 | +import ablQuery from "../queries/abl" |
| 248 | + |
| 249 | +describe("parseSourceCodeDefinitions - ABL", () => { |
| 250 | + it("should parse ABL procedure definitions", async () => { |
| 251 | + const { captures } = await parseTestFile({ |
| 252 | + language: "abl", |
| 253 | + wasmFile: "tree-sitter-abl.wasm", |
| 254 | + queryString: ablQuery, |
| 255 | + content: ` |
| 256 | +PROCEDURE myProcedure: |
| 257 | + DEFINE VARIABLE x AS INTEGER NO-UNDO. |
| 258 | + x = 10. |
| 259 | +END PROCEDURE. |
| 260 | + |
| 261 | +FUNCTION myFunction RETURNS INTEGER: |
| 262 | + RETURN 42. |
| 263 | +END FUNCTION. |
| 264 | + `, |
| 265 | + }) |
| 266 | + |
| 267 | + expect(captures).toContainEqual( |
| 268 | + expect.objectContaining({ |
| 269 | + name: "name.definition.function", |
| 270 | + node: expect.objectContaining({ text: "myProcedure" }), |
| 271 | + }), |
| 272 | + ) |
| 273 | + }) |
| 274 | +}) |
| 275 | +``` |
| 276 | + |
| 277 | +### 10. Update Documentation |
| 278 | + |
| 279 | +Add the new languages to any relevant documentation: |
| 280 | + |
| 281 | +1. Update README.md to mention OpenEdge ABL support |
| 282 | +2. Add to the list of supported languages in documentation |
| 283 | +3. Update CHANGELOG.md with the new feature |
| 284 | + |
| 285 | +## Building and Testing |
| 286 | + |
| 287 | +After making all changes: |
| 288 | + |
| 289 | +```bash |
| 290 | +# Install dependencies |
| 291 | +pnpm install |
| 292 | + |
| 293 | +# Build the project |
| 294 | +pnpm build |
| 295 | + |
| 296 | +# Run tests |
| 297 | +pnpm test |
| 298 | + |
| 299 | +# Bundle the extension |
| 300 | +pnpm bundle |
| 301 | +``` |
| 302 | + |
| 303 | +## Maintenance |
| 304 | + |
| 305 | +### Updating Submodules |
| 306 | + |
| 307 | +To update the submodules to their latest versions: |
| 308 | + |
| 309 | +```bash |
| 310 | +git submodule update --remote --merge |
| 311 | +``` |
| 312 | + |
| 313 | +### Adding More Languages |
| 314 | + |
| 315 | +Follow the same pattern: |
| 316 | + |
| 317 | +1. Add the submodule under `deps/` |
| 318 | +2. Build WASM file |
| 319 | +3. Add to build process |
| 320 | +4. Add file extensions |
| 321 | +5. Add parser cases |
| 322 | +6. Create query files |
| 323 | +7. Add tests |
| 324 | + |
| 325 | +## Troubleshooting |
| 326 | + |
| 327 | +### WASM Build Failures |
| 328 | + |
| 329 | +If the tree-sitter CLI fails to build WASM: |
| 330 | + |
| 331 | +- Ensure you have the latest tree-sitter CLI: `npm update -g tree-sitter-cli` |
| 332 | +- Check that the grammar has a valid `grammar.js` file |
| 333 | +- Verify Node.js version compatibility |
| 334 | + |
| 335 | +### Parser Not Working |
| 336 | + |
| 337 | +If files are not being parsed: |
| 338 | + |
| 339 | +1. Check that file extensions are added to `src/services/tree-sitter/index.ts` |
| 340 | +2. Verify WASM files are being copied to dist directory |
| 341 | +3. Check the extension host output or the VS Code Developer Tools console for WASM loading errors |
| 342 | +4. Test with fallback chunking first to isolate parser issues |
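The second check in the list above can be automated with a small comparison. This is a sketch under assumptions: `missingWasmFiles` is a hypothetical helper, and the `tree-sitter-<id>.wasm` naming follows steps 2 and 3. It takes the dist directory listing as an argument, so it does no file I/O itself.

```typescript
// Returns the expected WASM file names that are absent from the dist listing.
function missingWasmFiles(languageIds: string[], distFiles: string[]): string[] {
	const present = new Set(distFiles)
	return languageIds
		.map((id) => `tree-sitter-${id}.wasm`) // assumed naming convention
		.filter((fileName) => !present.has(fileName))
}

// Example listing in which the df grammar was never copied.
const missingFiles = missingWasmFiles(
	["abl", "df"],
	["tree-sitter-abl.wasm", "tree-sitter-typescript.wasm"],
)
```

Here `missingFiles` contains only `tree-sitter-df.wasm`. In a real check, the second argument could come from `fs.readdirSync` on the dist directory.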
| 343 | + |
| 344 | +### Query Issues |
| 345 | + |
| 346 | +If queries don't capture expected definitions: |
| 347 | + |
| 348 | +- Use tree-sitter playground to test queries |
| 349 | +- Check the grammar's node types match query patterns |
| 350 | +- Start with simple queries and gradually add complexity |
| 351 | + |
| 352 | +## Alternative Approach: Using npm Packages |
| 353 | + |
| 354 | +If the repositories provide npm packages with prebuilt WASM files, you could alternatively: |
| 355 | + |
| 356 | +1. Add them as dependencies in `src/package.json` |
| 357 | +2. Import WASM files from node_modules |
| 358 | +3. Skip the submodule approach entirely |
| 359 | + |
| 360 | +This would be simpler but requires the maintainers to publish npm packages with WASM builds. |
| 361 | + |
| 362 | +## References |
| 363 | + |
| 364 | +- [Tree-sitter Documentation](https://tree-sitter.github.io/tree-sitter/) |
| 365 | +- [Web Tree-sitter](https://github.com/tree-sitter/tree-sitter/tree/master/lib/binding_web) |
| 366 | +- [Creating Tree-sitter Parsers](https://tree-sitter.github.io/tree-sitter/creating-parsers) |
| 367 | +- [Tree-sitter Queries](https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries) |