diff --git a/.github/chatmodes/analyze.chatmode.md b/.github/chatmodes/analyze.chatmode.md index 99c27ee..844c5c7 100644 --- a/.github/chatmodes/analyze.chatmode.md +++ b/.github/chatmodes/analyze.chatmode.md @@ -3,8 +3,18 @@ # SPDX-FileContributor: Adam Poulemanos # SPDX-License-Identifier: MIT OR Apache-2.0 description: 'Code Analysis' -tools: ["codebase", "githubRepo", "context7", "sequential-thinking", ] +tools: ["codebase", "githubRepo", "context7", "sequential-thinking", "View", "GrepTool", "BatchTool", "GlobTool"] --- # Expert Code Analyst -You're an experienced code analyst who specializes in identifying and resolving issues in codebases. Your primary focus is on improving code quality through best practices and identifying opportunities to refactor or restructure code to make it more flexible and easier to maintain. The user will ask you to research specific code, modules, or packages within the codebase. They may ask for a specific analysis or aspect of the code to focus on, or they may request a broader overview of the codebase's structure and design and recommendations for improvements. If you identify an opportunity for improving the code quality, you should provide actionable suggestions and code examples to help the user implement the improvements. Unless the user requests a different result, you should produce a report summarizing your findings with specific recommendations and references to specific code snippets by line number and filename. +You're an experienced code analyst who specializes in identifying and resolving issues in codebases. Your primary focus is on improving code quality through best practices and identifying opportunities to refactor or restructure code to make it more flexible and easier to maintain. + +## Instructions + +The user will ask you to research specific code, modules, or packages within the codebase. They may ask for a specific analysis or aspect of the code to focus on, or they may request a broader overview of the codebase's structure and design and recommendations for improvements. + +If you identify an opportunity to improve code quality: + +- Provide actionable suggestions and code examples to help the user implement the improvements. +- Produce a report summarizing your findings. The report should include specific recommendations and references to relevant code snippets by line number and filename. + - If the user requests a different result or output, follow their instructions. diff --git a/.github/chatmodes/docwriter.chatmode.md b/.github/chatmodes/docwriter.chatmode.md new file mode 100644 index 0000000..4cbdae3 --- /dev/null +++ b/.github/chatmodes/docwriter.chatmode.md @@ -0,0 +1,38 @@ +--- +# SPDX-FileCopyrightText: 2025 Knitli Inc. +# SPDX-FileContributor: Adam Poulemanos +# SPDX-License-Identifier: MIT OR Apache-2.0 +description: 'Documentation Writer' +tools: ["codebase", "githubRepo", "context7", "sequential-thinking", "View", "GrepTool", "BatchTool", "GlobTool"] +--- + +# Your Role + +## You Are an Expert Technical Writer + +You are a very experienced developer and technical writer. You specialize in creating clear, comprehensive documentation for software projects. You use your deep engineering background to communicate complex ideas in a simple, easy-to-understand way. You use plain language and provide realistic and concrete examples when code might be difficult to understand.
You write useful and informative documentation, including README, CONTRIBUTING, other guides, and in-code documentation for modules, classes/structs, and methods. + +Your approach is to carefully consider what someone new to the project would need to know to understand and use the codebase quickly and easily. You don't assume readers have previous knowledge of the codebase, the libraries it uses, or the functionality it provides, and aim to briefly communicate this information. + +## Instructions (unless the user instructs differently) + +- Write all documentation in active voice and present tense. +- Don't use filler words and phrases like "This function..." or "this module..." +- **Use plain language**, and don't assume readers are familiar with very technical concepts. + - Avoid technical terms and jargon, and explain them when you must use them. + - Use analogies and examples to illustrate complex ideas. + - Effectively use markdown formatting to emphasize and illustrate information, such as headers, bold/italic, tables. + - Include code snippets and examples to clarify complex concepts. +- Always consider the most likely audience for each piece of documentation, for example: + - Documentation for Public APIs should focus on use cases, and provide practical information for using the API effectively. + - Documentation for non-public or internal APIs should focus on implementation details and explaining the role of the API within the codebase. + - Documentation for end-users should focus on how to use the software, including installation instructions, tutorials, and examples. + - Consider a broad audience for README files and usage guides that may include non-technical users. +- Write documentation that will be easy to maintain and update. +- Respond with direct edits to files, and create them if they aren't there. +- Keep code comments brief and follow idiomatic structure for quality Rust documentation. +- Don't add unnecessary comments, like on functions that are self-explanatory (for example, a function `add_numbers` that takes two integers as input and returns an integer). +- Use Rustdoc-style code linking to provide useful context to in-code documentation, but don't link to specific lines of code (this is very hard to maintain). +- Save more robust comments for the most complex or important parts of the code, and use clear and realistic examples to illustrate difficult sections. +- Provide clear and explanatory comments for every module, trait, and struct. Document functions and methods that are important or not obvious. +- Focus on communicating the important concepts that a developer new to the code would need to use and work with the code effectively.
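To make the expected in-code documentation style concrete, here is a minimal Rustdoc sketch in the spirit of these instructions. The `doc_sketch` crate name and the `Span` type are invented for illustration only; they are not part of Thread's API.

```rust
//! Utilities for tracking source locations.
//!
//! A [`Span`] records where a piece of code starts and ends, so tools can
//! point users at the exact text a match came from.

/// A half-open byte range inside a source file.
///
/// Create one directly, or combine two with [`Span::merge`].
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    /// Byte offset where the range starts (inclusive).
    pub start: usize,
    /// Byte offset where the range ends (exclusive).
    pub end: usize,
}

impl Span {
    /// Returns the smallest span covering both `self` and `other`.
    ///
    /// # Examples
    ///
    /// ```
    /// # use doc_sketch::Span;
    /// let a = Span { start: 0, end: 4 };
    /// let b = Span { start: 10, end: 12 };
    /// assert_eq!(a.merge(b), Span { start: 0, end: 12 });
    /// ```
    pub fn merge(self, other: Span) -> Span {
        Span {
            start: self.start.min(other.start),
            end: self.end.max(other.end),
        }
    }
}
```

The module and struct carry the explanatory comments and intra-doc links; `merge` is documented because its behavior isn't obvious from the name, while trivially self-explanatory items are left without comments.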
diff --git a/.github/workflows/cla.yml b/.github/workflows/cla.yml index f258c08..33c6ce2 100644 --- a/.github/workflows/cla.yml +++ b/.github/workflows/cla.yml @@ -138,7 +138,7 @@ jobs: path-to-document: 'https://github.com/knitli/thread/blob/main/CONTRIBUTORS_LICENSE_AGREEMENT.md' branch: 'staging' allowlist: > - bashandbone,codegen-sh[bot],dependabot[bot],github-actions[bot],actions-user,changeset-bot + bashandbone,codegen-sh[bot],dependabot[bot],github-actions[bot],actions-user,changeset-bot,claude create-file-commit-message: 'Adding file for tracking CLA signatures' signed-commit-message: > diff --git a/.github/workflows/claude.yml b/.github/workflows/claude.yml new file mode 100644 index 0000000..e7e9cf6 --- /dev/null +++ b/.github/workflows/claude.yml @@ -0,0 +1,168 @@ +name: Claude Assistant +on: + issue_comment: + types: [created] + pull_request_review_comment: + types: [created] + issues: + types: [opened, assigned, labeled] + pull_request_review: + types: [submitted] +permissions: + actions: read + checks: write + issues: write + contents: write + discussions: read + pull-requests: write +jobs: + claude-response: + runs-on: ubuntu-latest + steps: + - name: "PR Review" + if: github.event_name == 'pull_request_review' + uses: anthropics/claude-code-action@beta + with: + claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} + github_token: ${{ secrets.GITHUB_TOKEN }} + trigger_phrase: "@claude" + assignee_trigger: "claude" + label_trigger: "claude" + base_branch: "staging" + max_turns: "30" + allowed_tools: &allowed_tools | + mcp__context7__resolve-library-id + mcp__context7__get-library-docs + mcp__sequential-thinking__sequentialthinking + Bash(git:*) + Bash(jj:*) + Bash(mkdir:*) + Bash(cp:*) + Bash(mv:*) + Bash(llm-edit.sh:*) + Bash(install-mise.sh:*) + Bash(mise:*) + Bash(eval 'mise activate') + Bash(hk:*) + Bash(jq:*) + Bash(cargo:*) + Bash(ast-grep:*) + Bash(pkl:*) + Bash(reuse:*) + Bash(uv:*) + Bash(taplo:*) + Bash(yamlfmt:*) + Bash(rustup:*) + View + GlobTool + GrepTool + BatchTool + mcp_config: &mcp_config | + { + "mcpServers": { + "context7": { + "args": [ + "-y", + "@upstash/context7-mcp@latest" + ], + "command": "npx", + "type": "stdio" + }, + "sequential-thinking": { + "args": [ + "-y", + "@modelcontextprotocol/server-sequential-thinking" + ], + "command": "npx", + "type": "stdio" + } + } + } + direct_prompt: | + Please review this pull request and identify: + - bugs + - security issues and potential vulnerabilities + - performance issues + If you identify issues, briefly describe them. Provide a recommended fix with example implementation. + + Keep your feedback focused, actionable, and concise. + + - name: "Issue Opened" + if: github.event_name == 'issues' && github.event.action == 'opened' + uses: anthropics/claude-code-action@beta + with: + claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} + github_token: ${{ secrets.GITHUB_TOKEN }} + trigger_phrase: "@claude" + assignee_trigger: "claude" + label_trigger: "claude" + base_branch: "staging" + max_turns: "30" + allowed_tools: *allowed_tools + mcp_config: *mcp_config + direct_prompt: | + When a new issue is opened: + - Review and summarize the issue. + - Include any relevant context or background. + - Look for related issues or discussions and link to them. + - Assign relevant labels, or if you can't assign them, suggest them. + - If the issue covers the same topic as an existing open or closed issue, recommend closing the issue and linking to the relevant PR or issue. 
+ - Identify potential fixes and briefly describe them with links to relevant code. + - If it's a feature request, estimate the difficulty of implementing the feature and potential impact on existing functionality and API. + + - name: "PR Review Comment" + if: github.event_name == 'pull_request_review_comment' + uses: anthropics/claude-code-action@beta + with: + claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} + github_token: ${{ secrets.GITHUB_TOKEN }} + trigger_phrase: "@claude" + assignee_trigger: "claude" + label_trigger: "claude" + base_branch: "staging" + max_turns: "30" + allowed_tools: *allowed_tools + mcp_config: *mcp_config + direct_prompt: | + When you are asked to review a pull request: + - Review the changes made in the PR. + - Provide feedback on the code quality, functionality, and adherence to best practices. + - Consider the library's existing code style and whether the code aligns with it. + - Consider possible security or performance effects. + - Suggest improvements or alternatives where applicable. + - If the changes are satisfactory and the code passes checks, approve the PR with a comment. + + - name: "Issue Assigned or Labeled Claude" + if: > + (github.event_name == 'issues' && github.event.action == 'assigned') || + (github.event_name == 'issues' && github.event.action == 'labeled' && github.event.label.name == 'claude') + uses: anthropics/claude-code-action@beta + with: + claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} + github_token: ${{ secrets.GITHUB_TOKEN }} + trigger_phrase: "@claude" + assignee_trigger: "claude" + label_trigger: "claude" + base_branch: "staging" + max_turns: "30" + allowed_tools: *allowed_tools + mcp_config: *mcp_config + direct_prompt: | + When you are assigned an issue or it's labeled 'claude': + - Your job is to resolve it. + - Gather all necessary information about the issue from discussions and comments and the codebase. + - If the issue involves external libraries, use the context7 tool to get the latest information on the API. + - Communicate with the issue reporter for clarification if needed. + - Create an issue branch. + - Develop a detailed plan to fix the problem. + - Write your plan and information from your research to a markdown file. Continually refer to this as you work. + - Use the sequential-thinking tool to plan your actions. + - Implement the fix and test it thoroughly. + - If the fix might affect core functionality, update or add tests focused on that functionality. + - Run all pre-commit lint checks and ensure everything is formatted correctly ('hk check', 'hk fix'). + - Use conventional commits format. + - Copy your planning file into your PR and then delete it before submitting. + - Submit your changes in a pull request: + - Document your changes and the reasoning behind them. + - Provide your markdown file with the plan and research information. + - Submit your solution for review. 
diff --git a/.gitignore b/.gitignore index a888866..8dc68e7 100644 --- a/.gitignore +++ b/.gitignore @@ -214,8 +214,8 @@ tags *~ -crates/rule-engine/serialization_analysis/serialization_analysis -!crates/rule-engine/serialization_analysis/serialization_analysis.rs +crates/rule-engine/serialization_analysis/analyze_serialization +!crates/rule-engine/serialization_analysis/analyze_serialization.rs # temporary files which can be created if a process still has a handle open of a deleted file .fuse_hidden* diff --git a/CLAUDE.md b/CLAUDE.md index ea4ea04..ef3d12f 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,156 +1,247 @@ - - # CLAUDE.md This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. ## Project Overview -Thread is a Rust code analysis engine designed to generate intelligent context for AI assistants. The core goal is to parse code into a queryable graph that can provide exactly the right context when an AI asks about specific functions, dependencies, or code relationships. +Thread is a safe, fast, flexible code analysis and parsing library built in Rust. It provides powerful AST-based pattern matching and transformation capabilities using tree-sitter parsers. The project is forked from ast-grep and enhanced for production use as a code analysis engine for AI context generation. + +## Architecture + +Thread follows a modular architecture with six main crates: + +### Core Crates + +- **`thread-ast-engine`** - Core AST parsing, pattern matching, and transformation engine (forked from ast-grep-core) +- **`thread-rule-engine`** - Rule-based scanning and transformation system with YAML configuration support +- **`thread-language`** - Language definitions and tree-sitter parser integrations (supports 20+ languages) +- **`thread-utils`** - Shared utilities including SIMD optimizations and hash functions +- **`thread-services`** - High-level service interfaces and API abstractions +- **`thread-wasm`** - WebAssembly bindings for browser and edge deployment + +### Build System + +- **`xtask`** - Custom build tasks, primarily for WASM compilation with optimization + +## Development Commands + +### Building -**Current Status**: Day 2 of 30-day implementation plan. Basic scaffolding exists, but most crates contain placeholder code from earlier architectural iterations. 
+```bash +# Build everything (except WASM) +mise run build +# or: cargo build --workspace -## Simplified Architecture +# Build in release mode +mise run build-release +# or: cargo build --workspace --release --features inline -Thread follows a single-representation approach: +# Build WASM for development +mise run build-wasm +# or: cargo run -p xtask build-wasm -```plaintext -File → ast-grep (parsing) → petgraph (analysis) → Content store (dedup) → API +# Build WASM in release mode +mise run build-wasm-release +# or: cargo run -p xtask build-wasm --release ``` -### Core Components +### Testing and Quality -- **ast-grep**: Parsing orchestrator with tree-sitter integration and language detection -- **petgraph**: Single source of truth for code structure (nodes = functions/classes, edges = calls/imports) -- **Content-addressable storage**: Deduplication using rapidhash -- **fmmap**: Memory mapping for large files -- **thread-fs**: Filesystem operations (separated for WASM compatibility) +```bash +# Run all tests +mise run test +# or: hk run test +# or: cargo nextest run --all-features --no-fail-fast -j 1 -## Idiomatic Crate Structure +# Full linting +mise run lint +# or: hk run check -The workspace follows Rust conventions with core types separated from implementations: +# Auto-fix formatting and linting issues +mise run fix +# or: hk fix -- `thread-core/` - Core traits, types, and error definitions only -- `thread-engine/` - Main analysis implementation using petgraph -- `thread-parse/` - ast-grep integration and language detection -- `thread-store/` - Content-addressable storage + memory mapping -- `thread-fs/` - Filesystem operations (WASM-compatible abstraction) -- `thread-diff/` - Vendored difftastic diff algorithms -- `thread-cli/` - Command line interface -- `thread-wasm/` - WebAssembly bindings -- `xtask/` - Build automation for WASM targets +# Run CI pipeline locally +mise run ci +``` -### Design Rationale +### Single Test Execution -This structure follows the pattern used by `serde` (core traits) vs `serde_json` (implementation): +```bash +# Run specific test +cargo nextest run --manifest-path Cargo.toml test_name --all-features -- `thread-core` defines `LanguageParser` trait, `CodeElement` types, `Result` types -- `thread-engine` implements the actual analysis logic and graph building -- Other crates can depend on `thread-core` for types without pulling in the full engine +# Run tests for specific crate +cargo nextest run -p thread-ast-engine --all-features -## Development Commands +# Run benchmarks +cargo bench -p thread-rule-engine +``` + +### Utility Commands -### Build Commands +```bash +# Update dependencies +mise run update +# or: cargo update && cargo update --workspace -- `mise run build` or `mise run b` - Build all crates (except WASM) -- `mise run build-release` or `mise run br` - Release build -- `mise run build-wasm` or `mise run bw` - Build WASM for development (single-threaded) -- `mise run build-wasm-release` or `mise run bwr` - Build WASM for production +# Clean build artifacts +mise run clean -### WASM Build Options +# Update license headers +mise run update-licenses +# or: ./scripts/update-licenses.py +``` -- `cargo run -p xtask build-wasm` - Basic WASM build -- `cargo run -p xtask build-wasm --multi-threading` - Multi-threaded for browsers -- `cargo run -p xtask build-wasm --release` - Production optimized -- `cargo run -p xtask build-wasm --profiling` - With profiling enabled +## Key Language Support -### Testing and Quality +The `thread-language` crate provides 
built-in support for major programming languages via tree-sitter: + +**Tier 1 Languages** (primary focus): + +- Rust, JavaScript/TypeScript, Python, Go, Java + +**Tier 2 Languages** (full support): -- `mise run test` or `mise run t` - Run tests with `cargo nextest` -- `mise run lint` or `mise run c` - Full linting via `hk run check` -- `mise run fix` or `mise run f` - Auto-fix formatting and linting -- `mise run ci` - Run all CI checks (build + lint + test) +- C/C++, C#, PHP, Ruby, Swift, Kotlin, Scala -### Development Setup +**Tier 3 Languages** (basic support): -- `mise run install` - Install dev tools and git hooks -- `mise run update` - Update all dev tools -- `mise run clean` - Clean build artifacts and caches +- Bash, CSS, HTML, JSON, YAML, Lua, Elixir, Haskell -## Implementation Plan Context +## Pattern Matching System -### Current Sprint (Week 1) +Thread's core strength is AST-based pattern matching using meta-variables: -- **Day 1**: ✅ Project cleanup and setup -- **Day 2**: 🔄 Basic ast-grep integration (current focus) -- **Day 3**: Petgraph integration -- **Day 4**: End-to-end MVP -- **Day 5**: Content-addressable storage -- **Day 6**: Basic CLI interface -- **Day 7**: Week 1 demo and testing +### Meta-Variable Syntax + +- `$VAR` - Captures a single AST node +- `$$$ITEMS` - Captures multiple consecutive nodes (ellipsis) +- `$_` - Matches any node without capturing + +### Example Usage + +```rust +// Find function declarations +root.find("function $NAME($$$PARAMS) { $$$BODY }") + +// Find variable assignments +root.find_all("let $VAR = $VALUE") + +// Complex pattern matching +root.find("if ($COND) { $$$THEN } else { $$$ELSE }") +``` + +## Rule System + +The `thread-rule-engine` supports YAML-based rule definitions for code analysis: + +```yaml +id: no-var-declarations +message: "Use 'let' or 'const' instead of 'var'" +language: JavaScript +rule: + pattern: "var $NAME = $VALUE" +fix: "let $NAME = $VALUE" +``` + +## Performance Considerations + +### Optimization Features + +- SIMD optimizations in `thread-utils` for fast string operations +- Parallel processing capabilities with rayon +- Memory-efficient AST representation +- Content-addressable storage for deduplication + +### Build Profiles + +- **dev**: Fast compilation with basic optimizations +- **dev-debug**: Cranelift backend for faster debug builds +- **release**: Full LTO optimization +- **wasm-release**: Size-optimized for WebAssembly + +## WASM Deployment + +Thread compiles to WebAssembly for edge deployment: + +```bash +# Basic WASM build (for Cloudflare Workers) +cargo run -p xtask build-wasm + +# Multi-threading WASM (for browsers) +cargo run -p xtask build-wasm --multi-threading + +# Optimized release build +cargo run -p xtask build-wasm --release +``` -### Near-term Goals +## Testing Infrastructure -The immediate target is a working `analyze_rust_file()` function that: +### Test Organization -1. Parses Rust code with ast-grep -2. Extracts functions, calls, and imports -3. Builds a petgraph representation -4. 
Provides basic graph queries +- Unit tests: In each crate's `src/` directory +- Integration tests: In `tests/` directories +- Benchmarks: In `benches/` directories +- Test data: In `test_data/` directories -### MVP Definition +### Quality Tooling -A CLI tool that can analyze Rust files and generate AI-friendly context showing: +- **cargo-nextest**: Parallel test execution +- **hk**: Git hooks and linting orchestration +- **mise**: Development environment management +- **typos**: Spell checking +- **reuse**: License compliance -- Function definitions with line numbers -- Call relationships (what calls what) -- Import dependencies -- Context-relevant code snippets for AI assistants +## Dependencies -## Key Design Decisions +### Core Dependencies -### What to Skip for MVP +- `tree-sitter`: AST parsing foundation +- `regex`: Pattern matching support +- `serde`: Configuration serialization +- `bit-set`: Efficient set operations +- `rayon`: Parallel processing -- ❌ type-sitter (build complexity) -- ❌ tree-sitter-graph (memory management complexity) -- ❌ ropey (incremental editing - add later) -- ❌ Multi-language support initially (Rust first) +### Performance Dependencies -### What to Keep +- `rapidhash`: Fast non-cryptographic hashing +- `memchr`: SIMD string searching +- `simdeez`: SIMD abstractions -- ✅ ast-grep (mature parsing with language detection) -- ✅ petgraph (single source of truth) -- ✅ Content-addressable storage (essential for deduplication) -- ✅ Memory mapping (critical for large repos) +## Contributing Workflow -## Testing Strategy +1. Run `mise run install-tools` to set up development environment +2. Make changes following existing patterns +3. Run `mise run fix` to apply formatting and linting +4. Run `mise run test` to verify functionality +5. Use `mise run ci` to run full CI pipeline locally -- Uses `cargo nextest` for parallel test execution -- Single-threaded execution (`-j 1`) to prevent race conditions -- `--no-fail-fast` for development, `--fail-fast` for CI -- Full backtraces enabled (`RUST_BACKTRACE=1`) +## License Structure -## WASM Considerations +- Main codebase: AGPL-3.0-or-later +- Forked ast-grep components: AGPL-3.0-or-later AND MIT +- Documentation and config: MIT OR Apache-2.0 +- See `VENDORED.md` files for specific attribution -- Default build is single-threaded for Cloudflare Workers -- Multi-threaded builds available for browser environments -- Core logic separated from filesystem operations for portability -- Uses `wasm-opt` for size and performance optimization +--- -## Context Generation Goal +## Tools for AI Assistants -When an AI asks: "How does the `parse` function work in Thread?" +The library provides multiple tools to help AI assistants work more efficiently: -Thread should provide: +- MCP Tools: + - You always have access to `sequential-thinking`. Use this to plan out tasks before executing and document things you learn along the way. Regularly refer back to it. + - `context7` provides a library of up-to-date code examples and API documentation for almost any library. +- The `llm-edit.sh` script: + - The script in `scripts/llm-edit.sh` gives you an easy interface for providing multiple file edits in one go. + Full details on how to use it are in `scripts/README-llm-edit.md` -1. **Function location**: Exact file and line numbers -2. **Dependencies**: What functions `parse` calls -3. **Usage**: What functions call `parse` -4. 
**Context**: Related code snippets with line numbers +### Multi-File Output System (llm-edit) -This enables AI assistants to get precisely the context they need without dumping entire files. +- When the user mentions "multi-file output", "generate files as json", or similar requests for bundled file generation, use the multi-file output system +- Execute using: `./llm-edit.sh ` +- Provide output as a single JSON object following the schema in `./README-llm-edit.md` +- The JSON must include an array of files, each with file_name, file_type, and file_content fields +- For binary files, encode content as base64 and set file_type to "binary" +- NEVER include explanatory text or markdown outside the JSON structure diff --git a/Cargo.lock b/Cargo.lock index 3b2c828..42094f6 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -744,9 +744,9 @@ dependencies = [ [[package]] name = "serde_json" -version = "1.0.140" +version = "1.0.141" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "20068b6e96dc6c9bd23e01df8827e6c7e1f2fddd43c21810382803c136b99373" +checksum = "30b9eff21ebe718216c6ec64e1d9ac57087aad11efc64e32002bce4a0d4c03d3" dependencies = [ "indexmap", "itoa", diff --git a/Cargo.toml b/Cargo.toml index 3867f3d..21ce6fc 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -66,7 +66,7 @@ serde = { version = "1.0.219", features = ["derive"] } thiserror = "2.0.12" tree-sitter = "0.25.8" -serde_json = "1.0.140" +serde_json = "1.0.141" serde_yaml = { version = "0.0.12", package = "serde_yml" } # speed! diff --git a/PLAN.md b/PLAN.md index 9b3a49a..a1ae8ea 100644 --- a/PLAN.md +++ b/PLAN.md @@ -30,7 +30,7 @@ Here's how the pieces actually fit together: ``` File → ast-grep (parsing) → petgraph (analysis) → Content store (dedup) → API ↓ - ropey (editing) → incremental updates + lasso (editing) → incremental updates ``` **That's it.** No type-sitter, no tree-sitter-graph, no redundant representations. @@ -51,7 +51,7 @@ Let me explain each piece in plain terms: - **Why you need it**: Fast queries, graph algorithms, memory-efficient storage - **This is your primary data structure** - everything else feeds into or reads from this -### ropey: Your Text Editor +### lasso: Your Text Editor - **What it does**: Efficient text editing with line/column tracking - **Why you need it**: When code changes, you can update specific parts without reparsing everything @@ -152,7 +152,7 @@ thread/ │ ├── thread-core/ # Main analysis engine + petgraph │ ├── thread-parse/ # ast-grep integration │ ├── thread-store/ # Content-addressable storage + fmmap -│ ├── thread-edit/ # ropey integration for live updates +│ ├── thread-edit/ # lasso integration for live updates │ ├── thread-diff/ # difftastic algorithms (vendored) │ ├── thread-cli/ # Command line interface │ └── thread-wasm/ # WASM bindings @@ -171,7 +171,7 @@ You're right to question some of the complexity. 
Here's what to **skip** for you - **type-sitter**: Adds build complexity and compile-time dependency management for marginal benefit - **tree-sitter-graph**: Complicates WASM builds and memory management; petgraph is more flexible - **difftastic parsing**: Only vendor their diff algorithms, use ast-grep for parsing -- **ropey for now**: Start with simple string replacement, add incremental editing later +- **lasso for now**: Start with simple string replacement, add incremental editing later ### ✅ Keep These (Core Value) @@ -321,7 +321,7 @@ fn parse(&self, content: &str) -> Result, Error> { You can add these features incrementally: - **Week 2**: Memory mapping for large files -- **Week 3**: Incremental updates with ropey +- **Week 3**: Incremental updates with lasso - **Week 4**: WASM compilation - **Week 5**: Difftastic integration for change tracking diff --git a/_unused.toml b/_unused.toml index 6d6a2fd..983d236 100644 --- a/_unused.toml +++ b/_unused.toml @@ -4,12 +4,11 @@ # SPDX-License-Identifier: MIT OR Apache-2.0 -# dashmap = { version = "6.1.0", features = ["rayon", "inline"] } # fmmap = { version = "0.4.0", features = ["tokio"] } # memory map for handling large files efficiently # ignore = { version = "0.4.23", features = ["simd-accel"] } # gitignore # rapidhash = { version = "1.4.0", features = ["std"] } # fast hashing for content addressing # rayon = { version = "1.10.0", features = ["std"] } -# ropey = "1.6.1" +# lasso # serde = { version = "1.0.219", features = ["derive"] } # serde_json = "1.0.140" diff --git a/crates/ast-engine/Cargo.toml b/crates/ast-engine/Cargo.toml index eba3db0..8f724dc 100644 --- a/crates/ast-engine/Cargo.toml +++ b/crates/ast-engine/Cargo.toml @@ -28,11 +28,11 @@ thread-utils = { workspace = true, default-features = false, features = [ "simd", ] } thiserror.workspace = true +bit-set.workspace = true # Tree-sitter required for parsing tree-sitter = { workspace = true, optional = true } -# Bit-set and regex required for pattern matching -bit-set = { workspace = true, optional = true } +# regex required for pattern matching regex = { workspace = true, optional = true } [features] @@ -40,7 +40,7 @@ default = ["parsing", "matching"] # The 'parsing' feature enables the tree-sitter backend parsing = ["dep:tree-sitter"] # The 'matching' feature enables the pattern matching engine -matching = ["dep:regex", "dep:bit-set"] +matching = ["dep:regex"] [dev-dependencies] tree-sitter-typescript = "0.23.2" diff --git a/crates/ast-engine/README.md b/crates/ast-engine/README.md index 2e23dca..dfd4e09 100644 --- a/crates/ast-engine/README.md +++ b/crates/ast-engine/README.md @@ -4,3 +4,162 @@ SPDX-FileContributor: Adam Poulemanos SPDX-License-Identifier: MIT OR Apache-2.0 --> +# thread-ast-engine + +**Core AST engine for Thread: parsing, matching, and transforming code using AST patterns.** + +## Overview + +`thread-ast-engine` provides powerful tools for working with Abstract Syntax Trees (ASTs). Forked from [`ast-grep-core`](https://github.com/ast-grep/ast-grep/), it offers language-agnostic APIs for code analysis and transformation. + +### What You Can Do + +- **Parse** source code into ASTs using [tree-sitter](https://tree-sitter.github.io/tree-sitter/) +- **Search** for code patterns using flexible meta-variables (like `$VAR`) +- **Transform** code by replacing matched patterns with new code +- **Navigate** AST nodes with intuitive tree traversal methods + +Perfect for building code linters, refactoring tools, and automated code modification systems. 
+ +## Quick Start + +Add to your `Cargo.toml`: + +```toml +[dependencies] +thread-ast-engine = { version = "0.1.0", features = ["parsing", "matching", "replacing"] } +``` + +### Basic Example: Find and Replace Variables + +```rust +use thread_ast_engine::Language; +use thread_ast_engine::tree_sitter::LanguageExt; + +// Parse JavaScript/TypeScript code +let mut ast = Language::Tsx.ast_grep("var a = 1; var b = 2;"); + +// Replace all 'var' declarations with 'let' +ast.replace("var $NAME = $VALUE", "let $NAME = $VALUE")?; + +// Get the transformed code +println!("{}", ast.generate()); +// Output: "let a = 1; let b = 2;" +``` + +### Finding Code Patterns + +```rust +use thread_ast_engine::matcher::MatcherExt; + +let ast = Language::Tsx.ast_grep("function add(a, b) { return a + b; }"); +let root = ast.root(); + +// Find all function declarations +if let Some(func) = root.find("function $NAME($$$PARAMS) { $$$BODY }") { + println!("Function name: {}", func.get_env().get_match("NAME").unwrap().text()); +} + +// Find all return statements +for ret_stmt in root.find_all("return $EXPR") { + println!("Returns: {}", ret_stmt.get_env().get_match("EXPR").unwrap().text()); +} +``` + +### Working with Meta-Variables + +Meta-variables capture parts of the matched code: + +- `$VAR` - Captures a single AST node +- `$$$ITEMS` - Captures multiple consecutive nodes (ellipsis) +- `$_` - Matches any node but doesn't capture it + +```rust +let ast = Language::Tsx.ast_grep("console.log('Hello', 'World', 123)"); +let root = ast.root(); + +if let Some(call) = root.find("console.log($$$ARGS)") { + let args = call.get_env().get_multiple_matches("ARGS"); + println!("Found {} arguments", args.len()); // Output: Found 3 arguments +} +``` + +## Core Components + +### [`Node`](src/node.rs) - AST Navigation + +Navigate and inspect AST nodes with methods like `children()`, `parent()`, and `find()`. + +### [`Pattern`](src/matchers/pattern.rs) - Code Matching + +Match code structures using tree-sitter patterns with meta-variables. + +### [`MetaVarEnv`](src/meta_var.rs) - Variable Capture + +Store and retrieve captured meta-variables from pattern matches. + +### [`Replacer`](src/replacer.rs) - Code Transformation + +Replace matched code with new content, supporting template-based replacement. + +### [`Language`](src/language.rs) - Language Support + +Abstract interface for different programming languages via tree-sitter grammars. + +## Feature Flags + +- **`parsing`** - Enables tree-sitter parsing (includes tree-sitter dependency) +- **`matching`** - Enables pattern matching and node transformation engine. 
+ +Use `default-features = false` to opt out of all features and enable only what you need: + +```toml +[dependencies] +thread-ast-engine = { version = "0.1.0", default-features = false, features = ["matching"] } +``` + +## Advanced Examples + +### Custom Pattern Matching + +```rust +use thread_ast_engine::ops::Op; + +// Combine multiple patterns with logical operators +let pattern = Op::either("let $VAR = $VALUE") + .or("const $VAR = $VALUE") + .or("var $VAR = $VALUE"); + +let ast = Language::Tsx.ast_grep("const x = 42;"); +let root = ast.root(); + +if let Some(match_) = root.find(pattern) { + println!("Found variable declaration"); +} +``` + +### Tree Traversal + +```rust +let ast = Language::Tsx.ast_grep("if (condition) { doSomething(); } else { doOther(); }"); +let root = ast.root(); + +// Traverse all descendants +for node in root.dfs() { + if node.kind() == "identifier" { + println!("Identifier: {}", node.text()); + } +} + +// Check relationships between nodes +if let Some(if_stmt) = root.find("if ($COND) { $$$THEN }") { + println!("If statement condition: {}", + if_stmt.get_env().get_match("COND").unwrap().text()); +} +``` + +## License + +Original ast-grep code is licensed under the [MIT license](./LICENSE-MIT). All changes introduced in this project are licensed under [AGPL-3.0-or-later](./LICENSE-AGPL-3.0-or-later). + +See [`VENDORED.md`](VENDORED.md) for details about our fork, changes, and licensing. diff --git a/crates/ast-engine/benches/performance_improvements.rs b/crates/ast-engine/benches/performance_improvements.rs index 48f9fe9..14e2cec 100644 --- a/crates/ast-engine/benches/performance_improvements.rs +++ b/crates/ast-engine/benches/performance_improvements.rs @@ -10,7 +10,7 @@ use criterion::{Criterion, criterion_group, criterion_main}; use std::hint::black_box; use thread_ast_engine::{Pattern, Root}; -use thread_language::Tsx; +use thread_language::{Tsx}; use thread_utils::RapidMap; fn bench_pattern_conversion(c: &mut Criterion) { @@ -34,7 +34,7 @@ fn bench_pattern_conversion(c: &mut Criterion) { c.bench_function("pattern_conversion_optimized", |b| { b.iter(|| { - let pattern = Pattern::new(black_box(pattern_str), Tsx); + let pattern = Pattern::new(black_box(pattern_str), &Tsx); let root = Root::str(black_box(source_code), Tsx); let node = root.root(); let matches: Vec<_> = node.find_all(&pattern).collect(); @@ -49,8 +49,8 @@ fn bench_meta_var_env_conversion(c: &mut Criterion) { c.bench_function("meta_var_env_conversion", |b| { b.iter(|| { - let pattern = Pattern::new(black_box(pattern_str), Tsx); - let root = Root::str(black_box(source_code), Tsx); + let pattern = Pattern::new(black_box(pattern_str), &Tsx); + let root = Root::str(black_box(source_code), &Tsx); let matches: Vec<_> = root.root().find_all(&pattern).collect(); // Test the optimized string concatenation @@ -76,7 +76,7 @@ fn bench_pattern_children_collection(c: &mut Criterion) { c.bench_function("pattern_children_collection", |b| { b.iter(|| { let root = Root::str(black_box(source_code), Tsx); - let pattern = Pattern::new("class $NAME { $$$METHODS }", Tsx); + let pattern = Pattern::new("class $NAME { $$$METHODS }", &Tsx); let matches: Vec<_> = root.root().find_all(&pattern).collect(); black_box(matches); }) diff --git a/crates/ast-engine/src/language.rs b/crates/ast-engine/src/language.rs index 9f415a1..615bda6 100644 --- a/crates/ast-engine/src/language.rs +++ b/crates/ast-engine/src/language.rs @@ -3,7 +3,34 @@ // SPDX-FileContributor: Adam Poulemanos // // SPDX-License-Identifier: AGPL-3.0-or-later AND 
MIT - +//! # Language Abstraction for AST Parsing +//! +//! This module defines the [`Language`](crates/ast-engine/src/language.rs:16) trait, which abstracts over language-specific details for AST parsing and pattern matching. +//! +//! ## Purpose +//! +//! - **Meta-variable Handling:** Configure how meta-variables (e.g., `$A`) are recognized and processed for different languages. +//! - **Pattern Preprocessing:** Normalize pattern code before matching, adapting to language-specific quirks. +//! - **Tree-sitter Integration:** Map node kinds and fields to tree-sitter IDs for efficient AST traversal. +//! - **Extensibility:** Support custom language implementations (see [`Tsx`](crates/ast-engine/src/language.rs:63) for TypeScript/TSX). +//! +//! ## Key Components +//! +//! - [`Language`](crates/ast-engine/src/language.rs:16): Core trait for language-specific AST operations. +//! - [`Tsx`](crates/ast-engine/src/language.rs:63): Example implementation for TypeScript/TSX. +//! - Meta-variable extraction and normalization utilities. +//! +//! ## Example +//! +//! ```rust,no_run +//! use thread_ast_engine::language::Language; +//! +//! let lang = Tsx {}; +//! let pattern = lang.pre_process_pattern("var $A = $B"); +//! let meta_var = lang.extract_meta_var("$A"); +//! ``` +#[allow(unused_imports)] +#[cfg(feature = "matching")] use super::{Pattern, PatternBuilder, PatternError}; use crate::meta_var::{MetaVariable, extract_meta_var}; use std::borrow::Cow; @@ -48,6 +75,7 @@ pub trait Language: Clone + 'static { fn kind_to_id(&self, kind: &str) -> u16; fn field_to_id(&self, field: &str) -> Option; + #[cfg(feature = "matching")] fn build_pattern(&self, builder: &PatternBuilder) -> Result; } diff --git a/crates/ast-engine/src/lib.rs b/crates/ast-engine/src/lib.rs index 3a8ca1c..5c84634 100644 --- a/crates/ast-engine/src/lib.rs +++ b/crates/ast-engine/src/lib.rs @@ -3,15 +3,174 @@ // SPDX-FileContributor: Adam Poulemanos // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT - -/*! -This module contains the core engine for Thread. - -It provides APIs for parsing, traversing, searching and replacing tree-sitter nodes. -The functionality is feature-gated to allow for selective compilation: -- `parsing`: Enables tree-sitter parsing backend -- `matching`: Enables pattern matching and replacement capabilities -*/ +//! # thread-ast-engine +//! +//! **Core AST engine for Thread: parsing, matching, and transforming code using AST patterns.** +//! +//! ## Overview +//! +//! `thread-ast-engine` provides powerful tools for working with Abstract Syntax Trees (ASTs). +//! Forked from [`ast-grep-core`](https://github.com/ast-grep/ast-grep/), it offers language-agnostic +//! APIs for code analysis and transformation. +//! +//! ### What You Can Do +//! +//! - **Parse** source code into ASTs using [tree-sitter](https://tree-sitter.github.io/tree-sitter/) +//! - **Search** for code patterns using flexible meta-variables (like `$VAR`) +//! - **Transform** code by replacing matched patterns with new code +//! - **Navigate** AST nodes with intuitive tree traversal methods +//! +//! Perfect for building code linters, refactoring tools, and automated code modification systems. +//! +//! ## Quick Start +//! +//! Add to your `Cargo.toml`: +//! ```toml +//! [dependencies] +//! thread-ast-engine = { version = "0.1.0", features = ["parsing", "matching"] } +//! ``` +//! +//! ### Basic Example: Find and Replace Variables +//! +//! ```rust,no_run +//! use thread_ast_engine::Language; +//! 
use thread_ast_engine::tree_sitter::LanguageExt; +//! +//! // Parse JavaScript/TypeScript code +//! let mut ast = Language::Tsx.ast_grep("var a = 1; var b = 2;"); +//! +//! // Replace all 'var' declarations with 'let' +//! ast.replace("var $NAME = $VALUE", "let $NAME = $VALUE")?; +//! +//! // Get the transformed code +//! println!("{}", ast.generate()); +//! // Output: "let a = 1; let b = 2;" +//! # Ok::<(), String>(()) +//! ``` +//! +//! ### Finding Code Patterns +//! +//! ```rust,no_run +//! use thread_ast_engine::matcher::MatcherExt; +//! # use thread_ast_engine::Language; +//! # use thread_ast_engine::tree_sitter::LanguageExt; +//! +//! let ast = Language::Tsx.ast_grep("function add(a, b) { return a + b; }"); +//! let root = ast.root(); +//! +//! // Find all function declarations +//! if let Some(func) = root.find("function $NAME($$$PARAMS) { $$$BODY }") { +//! println!("Function name: {}", func.get_env().get_match("NAME").unwrap().text()); +//! } +//! +//! // Find all return statements +//! for ret_stmt in root.find_all("return $EXPR") { +//! println!("Returns: {}", ret_stmt.get_env().get_match("EXPR").unwrap().text()); +//! } +//! ``` +//! +//! ### Working with Meta-Variables +//! +//! Meta-variables capture parts of the matched code: +//! +//! - `$VAR` - Captures a single AST node +//! - `$$$ITEMS` - Captures multiple consecutive nodes (ellipsis) +//! - `$_` - Matches any node but doesn't capture it +//! +//! ```rust,no_run +//! # use thread_ast_engine::Language; +//! # use thread_ast_engine::tree_sitter::LanguageExt; +//! # use thread_ast_engine::matcher::MatcherExt; +//! let ast = Language::Tsx.ast_grep("console.log('Hello', 'World', 123)"); +//! let root = ast.root(); +//! +//! if let Some(call) = root.find("console.log($$$ARGS)") { +//! let args = call.get_env().get_multiple_matches("ARGS"); +//! println!("Found {} arguments", args.len()); // Output: Found 3 arguments +//! } +//! ``` +//! +//! ## Core Components +//! +//! ### [`Node`] - AST Navigation +//! Navigate and inspect AST nodes with methods like [`Node::children`], [`Node::parent`], and [`Node::find`]. +//! +//! ### [`Pattern`] - Code Matching +//! Match code structures using tree-sitter patterns with meta-variables. +//! +//! ### [`MetaVarEnv`] - Variable Capture +//! Store and retrieve captured meta-variables from pattern matches. +//! +//! ### [`Replacer`] - Code Transformation +//! Replace matched code with new content, supporting template-based replacement. +//! +//! ### [`Language`] - Language Support +//! Abstract interface for different programming languages via tree-sitter grammars. +//! +//! ## Feature Flags +//! +//! - **`parsing`** - Enables tree-sitter parsing (includes tree-sitter dependency) +//! - **`matching`** - Enables pattern matching and node replacement/transformation engine. +//! +//! Use `default-features = false` to opt out of all features and enable only what you need: +//! +//! ```toml +//! [dependencies] +//! thread-ast-engine = { version = "0.1.0", default-features = false, features = ["matching"] } +//! ``` +//! +//! ## Advanced Examples +//! +//! ### Custom Pattern Matching +//! +//! ```rust,no_run +//! use thread_ast_engine::ops::Op; +//! # use thread_ast_engine::Language; +//! # use thread_ast_engine::tree_sitter::LanguageExt; +//! # use thread_ast_engine::matcher::MatcherExt; +//! +//! // Combine multiple patterns with logical operators +//! let pattern = Op::either("let $VAR = $VALUE") +//! .or("const $VAR = $VALUE") +//! .or("var $VAR = $VALUE"); +//! +//! 
let ast = Language::Tsx.ast_grep("const x = 42;"); +//! let root = ast.root(); +//! +//! if let Some(match_) = root.find(pattern) { +//! println!("Found variable declaration"); +//! } +//! ``` +//! +//! ### Tree Traversal +//! +//! ```rust,no_run +//! # use thread_ast_engine::Language; +//! # use thread_ast_engine::tree_sitter::LanguageExt; +//! # use thread_ast_engine::matcher::MatcherExt; +//! let ast = Language::Tsx.ast_grep("if (condition) { doSomething(); } else { doOther(); }"); +//! let root = ast.root(); +//! +//! // Traverse all descendants +//! for node in root.dfs() { +//! if node.kind() == "identifier" { +//! println!("Identifier: {}", node.text()); +//! } +//! } +//! +//! // Check relationships between nodes +//! if let Some(if_stmt) = root.find("if ($COND) { $$$THEN }") { +//! println!("If statement condition: {}", +//! if_stmt.get_env().get_match("COND").unwrap().text()); +//! } +//! ``` +//! +//! ## License +//! +//! Original ast-grep code is licensed under the [MIT license](./LICENSE-MIT), +//! all changes introduced in this project are licensed under the [AGPL-3.0-or-later](./LICENSE-AGPL-3.0-or-later). +//! +//! See [`VENDORED.md`](crates/ast-engine/VENDORED.md) for more information on our fork, changes, and reasons. pub mod language; pub mod source; @@ -46,7 +205,8 @@ pub mod replacer; // the bare types with no implementations #[cfg(not(feature = "matching"))] pub use matchers::{ - MatchStrictness, Pattern, PatternBuilder, PatternError, PatternNode, matcher::Matcher, + MatchStrictness, Pattern, PatternBuilder, PatternError, PatternNode, + matcher::{Matcher, MatcherExt, NodeMatch}, }; // implemented types diff --git a/crates/ast-engine/src/match_tree/match_node.rs b/crates/ast-engine/src/match_tree/match_node.rs index f73ea28..635651b 100644 --- a/crates/ast-engine/src/match_tree/match_node.rs +++ b/crates/ast-engine/src/match_tree/match_node.rs @@ -243,11 +243,11 @@ mod test { use super::*; use crate::language::Tsx; use crate::matcher::KindMatcher; - use crate::matcher::types::Pattern; + use crate::matcher::Pattern; use crate::{Matcher, Root, meta_var::MetaVarEnv}; use std::borrow::Cow; fn match_tree(p: &str, n: &str, strictness: MatchStrictness) -> MatchOneNode { - let pattern = Pattern::new(p, Tsx); + let pattern = Pattern::new(p, &Tsx); let kind = pattern.potential_kinds().expect("should have kind"); let kind = KindMatcher::from_id(kind.into_iter().next().expect("should have kind") as u16); let n = Root::str(n, Tsx); diff --git a/crates/ast-engine/src/match_tree/mod.rs b/crates/ast-engine/src/match_tree/mod.rs index 11993ae..e2c2b28 100644 --- a/crates/ast-engine/src/match_tree/mod.rs +++ b/crates/ast-engine/src/match_tree/mod.rs @@ -173,7 +173,7 @@ mod test { } fn test_match(s1: &str, s2: &str) -> RapidMap { - let goal = Pattern::new(s1, Tsx); + let goal = Pattern::new(s1, &Tsx); let cand = Root::str(s2, Tsx); let cand = cand.root(); let mut env = Cow::Owned(MetaVarEnv::new()); @@ -187,7 +187,7 @@ mod test { } fn test_non_match(s1: &str, s2: &str) { - let goal = Pattern::new(s1, Tsx); + let goal = Pattern::new(s1, &Tsx); let cand = Root::str(s2, Tsx); let cand = cand.root(); let mut env = Cow::Owned(MetaVarEnv::new()); @@ -310,15 +310,15 @@ mod test { fn find_end_recursive(goal: &Pattern, node: &Node>) -> Option { match_end_non_recursive(goal, node).or_else(|| { node.children() - .find_map(|sub| find_end_recursive(goal, sub)) + .find_map(|sub| find_end_recursive(goal, &sub)) }) } fn test_end(s1: &str, s2: &str) -> Option { - let goal = Pattern::new(s1, Tsx); + let 
goal = Pattern::new(s1, &Tsx); let cand = Root::str(s2, Tsx); let cand = cand.root(); - find_end_recursive(&goal, cand.clone()) + find_end_recursive(&goal, &cand) } #[test] diff --git a/crates/ast-engine/src/match_tree/strictness.rs b/crates/ast-engine/src/match_tree/strictness.rs index c350673..e6be95b 100644 --- a/crates/ast-engine/src/match_tree/strictness.rs +++ b/crates/ast-engine/src/match_tree/strictness.rs @@ -4,6 +4,40 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT +//! # Pattern Matching Strictness Implementation +//! +//! Implements the logic for different levels of pattern matching strictness, +//! controlling how precisely patterns must match AST structure. +//! +//! ## Strictness Levels +//! +//! - **CST (Concrete Syntax Tree)** - Exact matching including all punctuation +//! - **Smart** - Ignores unnamed tokens but matches all named nodes +//! - **AST (Abstract Syntax Tree)** - Matches only named/structural nodes +//! - **Relaxed** - AST matching while ignoring comments +//! - **Signature** - Matches structure only, ignoring text content +//! +//! ## Core Types +//! +//! - [`MatchOneNode`] - Result of comparing a single pattern node to a candidate +//! - [`MatchStrictness`] - Enum defining strictness levels (re-exported) +//! +//! ## Usage +//! +//! This module is primarily used internally by the pattern matching engine. +//! Users typically interact with strictness through pattern configuration: +//! +//! ```rust,ignore +//! let pattern = Pattern::new("function $NAME() {}", language) +//! .with_strictness(MatchStrictness::Relaxed); +//! ``` +//! +//! The strictness level determines: +//! - Which nodes in the AST are considered for matching +//! - Whether whitespace and punctuation must match exactly +//! - How comments are handled during matching +//! - Whether text content is compared or just structure + use crate::Doc; pub use crate::matcher::MatchStrictness; use crate::matcher::{PatternNode, kind_utils}; @@ -12,12 +46,22 @@ use crate::node::Node; use std::iter::Peekable; use std::str::FromStr; +/// Result of comparing a single pattern node against a candidate AST node. +/// +/// Represents the different outcomes when the matching algorithm compares +/// one element of a pattern against one AST node, taking into account +/// the current strictness level. #[derive(Debug, Clone)] pub enum MatchOneNode { + /// Both pattern and candidate node match - continue with next elements MatchedBoth, + /// Skip both pattern and candidate (e.g., both are unnamed tokens in AST mode) SkipBoth, + /// Skip the pattern element (e.g., unnamed token in pattern during AST matching) SkipGoal, + /// Skip the candidate node (e.g., unnamed token in candidate during AST matching) SkipCandidate, + /// No match possible - pattern fails NoMatch, } diff --git a/crates/ast-engine/src/matcher.rs b/crates/ast-engine/src/matcher.rs index 2c183cc..d0a1ef3 100644 --- a/crates/ast-engine/src/matcher.rs +++ b/crates/ast-engine/src/matcher.rs @@ -4,29 +4,200 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT -//! This module defines the core `Matcher` trait in ast-grep. +//! # Pattern Matching Engine //! -//! `Matcher` has three notable implementations in this module: -//! * `Pattern`: matches against a tree-sitter node based on its tree structure. -//! * `KindMatcher`: matches a node based on its `kind` -//! * `RegexMatcher`: matches a node based on its textual content using regex. - -use crate::Doc; -use crate::{Node, meta_var::MetaVarEnv}; +//! 
Core pattern matching functionality for finding and matching AST nodes. +//! +//! ## Key Traits and Types +//! +//! - [`Matcher`] - Core trait for matching AST nodes against patterns +//! - [`MatcherExt`] - Extension trait providing utility methods like [`MatcherExt::find_node`] +//! - [`Pattern`] - Matches nodes based on AST structure with meta-variables +//! - [`NodeMatch`] - Result of a successful pattern match, containing the matched node and captured variables +//! +//! ## Pattern Types +//! +//! The engine supports several types of matchers: +//! +//! - **`Pattern`** - Structural matching based on AST shape (most common) +//! - **`KindMatcher`** - Simple matching based on node type/kind +//! - **`RegexMatcher`** - Text-based matching using regular expressions +//! - **`MatchAll`** / **`MatchNone`** - Utility matchers for always/never matching +//! +//! ## Examples +//! +//! ### Basic Pattern Matching +//! +//! ```rust,no_run +//! # use thread_ast_engine::Language; +//! # use thread_ast_engine::tree_sitter::LanguageExt; +//! # use thread_ast_engine::matcher::MatcherExt; +//! let ast = Language::Tsx.ast_grep("let x = 42;"); +//! let root = ast.root(); +//! +//! // Find variable declarations +//! if let Some(decl) = root.find("let $VAR = $VALUE") { +//! let var_name = decl.get_env().get_match("VAR").unwrap(); +//! let value = decl.get_env().get_match("VALUE").unwrap(); +//! println!("Variable {} = {}", var_name.text(), value.text()); +//! } +//! ``` +//! +//! ### Finding Multiple Matches +//! +//! ```rust,no_run +//! # use thread_ast_engine::Language; +//! # use thread_ast_engine::tree_sitter::LanguageExt; +//! # use thread_ast_engine::matcher::MatcherExt; +//! let code = "let a = 1; let b = 2; let c = 3;"; +//! let ast = Language::Tsx.ast_grep(code); +//! let root = ast.root(); +//! +//! // Find all variable declarations +//! for decl in root.find_all("let $VAR = $VALUE") { +//! let var_name = decl.get_env().get_match("VAR").unwrap(); +//! println!("Found variable: {}", var_name.text()); +//! } +//! ``` +//! +//! ### NodeMatch +//! +//! #### Pattern Match Results with Meta-Variable Capture +//! +//! Contains the implementation for the [`NodeMatch`] type that represents +//! the result of a successful pattern match, including both the matched AST node and +//! any captured meta-variables. +//! +//! When a pattern like `"function $NAME($$$PARAMS) { $$$BODY }"` matches an AST node, +//! it creates a [`NodeMatch`] that stores: +//! - The matched node (the function declaration) +//! - The captured variables (`$NAME`, `$PARAMS`, `$BODY`) +//! +//! #### Key Features +//! +//! - **Node access**: Use like a regular [`Node`] through [`Deref`] +//! - **Meta-variable access**: Get captured variables via [`NodeMatch::get_env`] +//! - **Code replacement**: Generate edits with [`NodeMatch::replace_by`] +//! - **Type safety**: Lifetime-bound to ensure memory safety +//! +//! #### Example Usage +//! +//! ```rust,ignore +//! // Find function declarations +//! let matches = root.find_all("function $NAME($$$PARAMS) { $$$BODY }"); +//! +//! for match_ in matches { +//! // Use as a regular node +//! println!("Function at line {}", match_.start_pos().line()); +//! +//! // Access captured meta-variables +//! let env = match_.get_env(); +//! let name = env.get_match("NAME").unwrap(); +//! println!("Function name: {}", name.text()); +//! +//! // Generate replacement code +//! let edit = match_.replace_by("async function $NAME($$$PARAMS) { $$$BODY }"); +//! } +//! 
``` -use bit_set::BitSet; -use std::borrow::Cow; +use crate::{Doc, Node, meta_var::MetaVarEnv, source::Edit as E}; pub use crate::matchers::kind::*; -pub use crate::matchers::matcher::Matcher; -pub use crate::matchers::node_match::*; +pub use crate::matchers::matcher::{Matcher, MatcherExt, NodeMatch}; pub use crate::matchers::pattern::*; pub use crate::matchers::text::*; +use bit_set::BitSet; +use std::borrow::{Borrow, Cow}; +use std::ops::Deref; + +use crate::replacer::Replacer; -/// `MatcherExt` provides additional utility methods for `Matcher`. -/// It is implemented for all types that implement `Matcher`. -/// N.B. This trait is not intended to be implemented by users. -pub trait MatcherExt: Matcher { +type Edit = E<::Source>; + +impl<'tree, D: Doc> NodeMatch<'tree, D> { + pub const fn new(node: Node<'tree, D>, env: MetaVarEnv<'tree, D>) -> Self { + Self(node, env) + } + + pub const fn get_node(&self) -> &Node<'tree, D> { + &self.0 + } + + /// Returns the populated `MetaVarEnv` for this match. + pub const fn get_env(&self) -> &MetaVarEnv<'tree, D> { + &self.1 + } + pub const fn get_env_mut(&mut self) -> &mut MetaVarEnv<'tree, D> { + &mut self.1 + } + /// # Safety + /// should only called for readopting nodes + pub(crate) const unsafe fn get_node_mut(&mut self) -> &mut Node<'tree, D> { + &mut self.0 + } +} + +impl NodeMatch<'_, D> { + pub fn replace_by>(&self, replacer: R) -> Edit { + let range = self.range(); + let position = range.start; + let deleted_length = range.len(); + let inserted_text = replacer.generate_replacement(self); + Edit:: { + position, + deleted_length, + inserted_text, + } + } + + #[doc(hidden)] + pub fn make_edit(&self, matcher: &M, replacer: &R) -> Edit + where + M: Matcher, + R: Replacer, + { + let range = replacer.get_replaced_range(self, matcher); + let inserted_text = replacer.generate_replacement(self); + Edit:: { + position: range.start, + deleted_length: range.len(), + inserted_text, + } + } +} + +impl<'tree, D: Doc> From> for NodeMatch<'tree, D> { + fn from(node: Node<'tree, D>) -> Self { + Self(node, MetaVarEnv::new()) + } +} + +/// `NodeMatch` is an immutable view to Node +impl<'tree, D: Doc> From> for Node<'tree, D> { + fn from(node_match: NodeMatch<'tree, D>) -> Self { + node_match.0 + } +} + +/// `NodeMatch` is an immutable view to Node +impl<'tree, D: Doc> Deref for NodeMatch<'tree, D> { + type Target = Node<'tree, D>; + fn deref(&self) -> &Self::Target { + &self.0 + } +} + +/// `NodeMatch` is an immutable view to Node +impl<'tree, D: Doc> Borrow> for NodeMatch<'tree, D> { + fn borrow(&self) -> &Node<'tree, D> { + &self.0 + } +} + +impl MatcherExt for T +where + T: Matcher, +{ fn match_node<'tree, D: Doc>(&self, node: Node<'tree, D>) -> Option> { // in future we might need to customize initial MetaVarEnv let mut env = Cow::Owned(MetaVarEnv::new()); @@ -44,8 +215,6 @@ pub trait MatcherExt: Matcher { } } -impl MatcherExt for T where T: Matcher {} - impl Matcher for str { fn match_node_with_env<'tree, D: Doc>( &self, @@ -114,3 +283,52 @@ impl Matcher for MatchNone { Some(BitSet::new()) } } + +#[cfg(test)] +mod test { + use super::*; + use crate::language::Tsx; + use crate::tree_sitter::{LanguageExt, StrDoc}; + + fn use_node(n: &Node>) -> String { + n.text().to_string() + } + + fn borrow_node<'a, D, B>(b: B) -> String + where + D: Doc + 'static, + B: Borrow>, + { + b.borrow().text().to_string() + } + + #[test] + fn test_node_match_as_node() { + let root = Tsx.ast_grep("var a = 1"); + let node = root.root(); + let src = node.text().to_string(); + let nm = 
NodeMatch::from(node); + let ret = use_node(&*nm); + assert_eq!(ret, src); + assert_eq!(use_node(&*nm), borrow_node(nm)); + } + + #[test] + fn test_node_env() { + let root = Tsx.ast_grep("var a = 1"); + let find = root.root().find("var $A = 1").expect("should find"); + let env = find.get_env(); + let node = env.get_match("A").expect("should find"); + assert_eq!(node.text(), "a"); + } + + #[test] + fn test_replace_by() { + let root = Tsx.ast_grep("var a = 1"); + let find = root.root().find("var $A = 1").expect("should find"); + let fixed = find.replace_by("var b = $A"); + assert_eq!(fixed.position, 0); + assert_eq!(fixed.deleted_length, 9); + assert_eq!(fixed.inserted_text, "var b = a".as_bytes()); + } +} diff --git a/crates/ast-engine/src/matchers/kind.rs b/crates/ast-engine/src/matchers/kind.rs index 48efe67..d21311e 100644 --- a/crates/ast-engine/src/matchers/kind.rs +++ b/crates/ast-engine/src/matchers/kind.rs @@ -4,6 +4,44 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT +//! # AST Node Kind Matching +//! +//! Provides matchers that filter AST nodes based on their syntactic type (kind). +//! Every AST node has a "kind" that describes what syntax element it represents +//! (e.g., "function_declaration", "identifier", "string_literal"). +//! +//! ## Core Types +//! +//! - [`KindMatcher`] - Matches nodes of a specific syntactic type +//! - [`KindMatcherError`] - Errors when creating matchers with invalid kinds +//! - [`kind_utils`] - Utilities for working with node kinds +//! +//! ## Example Usage +//! +//! ```rust,ignore +//! use thread_ast_engine::matchers::KindMatcher; +//! use thread_ast_engine::matcher::MatcherExt; +//! +//! // Match all function declarations +//! let matcher = KindMatcher::new("function_declaration", &language); +//! let functions: Vec<_> = root.find_all(&matcher).collect(); +//! +//! // Match parsing errors in source code +//! let error_matcher = KindMatcher::error_matcher(); +//! let errors: Vec<_> = root.find_all(&error_matcher).collect(); +//! ``` +//! +//! ## Node Kind Concepts +//! +//! - **Named nodes** - Represent actual language constructs (functions, variables, etc.) +//! - **Anonymous nodes** - Represent punctuation and keywords (`{`, `}`, `let`, etc.) +//! - **Error nodes** - Represent unparsable syntax (syntax errors) +//! +//! Kind matching is useful for: +//! - Finding all nodes of a specific type (all functions, all classes, etc.) +//! - Detecting syntax errors in source code +//! - Building language-specific analysis tools + use super::matcher::Matcher; use crate::language::Language; @@ -22,14 +60,50 @@ use thiserror::Error; const TS_BUILTIN_SYM_END: KindId = 0; const TS_BUILTIN_SYM_ERROR: KindId = 65535; +/// Errors that can occur when creating a [`KindMatcher`]. #[derive(Debug, Error)] pub enum KindMatcherError { + /// The specified node kind name doesn't exist in the language grammar. + /// + /// This happens when you try to match a node type that isn't defined + /// in the tree-sitter grammar for the language. #[error("Kind `{0}` is invalid.")] InvalidKindName(String), } +/// Matcher that finds AST nodes based on their syntactic type (kind). +/// +/// `KindMatcher` is the simplest type of matcher - it matches nodes whose +/// type matches a specific string. Every AST node has a "kind" that describes +/// what syntax element it represents. 
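The relocated `NodeMatch` impls earlier in this hunk make `replace_by` the bridge between matching and editing: it returns a plain `Edit` value instead of mutating the tree. A minimal sketch of that round trip, modeled on the `test_replace_by` case in this hunk and assuming the same test-only imports:

```rust,ignore
use crate::language::Tsx;            // test-only TSX language used in this test module
use crate::tree_sitter::LanguageExt; // provides `ast_grep`

let root = Tsx.ast_grep("var a = 1");
let found = root.root().find("var $A = 1").expect("pattern should match");

// `replace_by` interpolates the captured `$A` into the template and reports
// the byte span it would replace, without touching the tree itself.
let edit = found.replace_by("let $A = 1");
assert_eq!(edit.position, 0);                        // edit starts at the matched node
assert_eq!(edit.deleted_length, 9);                  // "var a = 1" is 9 bytes
assert_eq!(edit.inserted_text, "let a = 1".as_bytes());
```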
+/// +/// # Examples +/// +/// ```rust,ignore +/// // Match all function declarations +/// let matcher = KindMatcher::new("function_declaration", &language); +/// let functions: Vec<_> = root.find_all(&matcher).collect(); +/// +/// // Match all identifiers +/// let id_matcher = KindMatcher::new("identifier", &language); +/// let identifiers: Vec<_> = root.find_all(&id_matcher).collect(); +/// +/// // Find syntax errors in code +/// let error_matcher = KindMatcher::error_matcher(); +/// let errors: Vec<_> = root.find_all(&error_matcher).collect(); +/// ``` +/// +/// # Common Node Kinds +/// +/// The exact node kinds depend on the language, but common examples include: +/// - `"function_declaration"` - Function definitions +/// - `"identifier"` - Variable/function names +/// - `"string_literal"` - String values +/// - `"number"` - Numeric literals +/// - `"ERROR"` - Syntax errors #[derive(Debug, Clone)] pub struct KindMatcher { + /// The numeric ID of the node kind to match kind: KindId, } diff --git a/crates/ast-engine/src/matchers/mod.rs b/crates/ast-engine/src/matchers/mod.rs index ec7191f..4feb692 100644 --- a/crates/ast-engine/src/matchers/mod.rs +++ b/crates/ast-engine/src/matchers/mod.rs @@ -3,26 +3,63 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later #![allow(clippy::redundant_pub_crate)] -//! Module imports for pattern matching. Feature gated except for unimplemented `types` module. +//! # Pattern Matching Module Organization //! -//! ## Implementation Notes +//! Conditional module imports for pattern matching functionality with feature flag support. //! -//! We changed the structure here from Ast-Grep, which uses a pattern like what's still -//! in [`crate::replacer`], where the root `parent.rs` module contains all -//! the submodules. +//! ## Module Structure //! -//! ### Why this structure? +//! This module organizes pattern matching components with conditional compilation: +//! - **Core types** (always available) - Pattern definitions and interfaces +//! - **Implementations** (feature-gated) - Actual matching logic and algorithms //! -//! We needed to access the type definitions without the `matching` feature flag, so we: -//! - Moved type definitions to `types.rs` (which we created). -//! - renamed the directory from `matcher` to `matchers` -//! - Created this `mod.rs` to import the submodules conditionally based on the `matching` feature flag. -//! - Kept trait implementations behind the feature flag. -//! - Moved [`types::MatchStrictness`] to `types.rs` in this module from `crate::match_tree::strictness` (not the implementation, just the type definition). +//! ## Feature Flag Design //! -//! #### Practical Implications +//! The `matching` feature flag controls access to pattern matching implementations: +//! - **With `matching`** - Full pattern matching capabilities available +//! - **Without `matching`** - Only type definitions for API compatibility //! -//! From an API perspective, nothing changed -- `matcher` is still the main entry point for pattern matching (if the feature is enabled). +//! ## Available Components +//! +//! ### Always Available +//! - [`types`] - Core pattern matching types and traits +//! - exported here if `matching` feature is not enabled +//! - exported in `matcher.rs` if `matching` feature is enabled +//! - Types **always** available from lib.rs: +//! ```rust,ignore +//! use thread_ast_engine::{ +//! Matcher, MatcherExt, Pattern, MatchStrictness, +//! NodeMatch, PatternNode, PatternBuilder, PatternError, +//! }; +//! ``` +//! 
- [`Matcher`] trait - Interface for all pattern matchers +//! +//! ### Feature-Gated (`matching` feature) +//! - [`pattern`] - Structural pattern matching from source code strings +//! - [`kind`] - AST node type matching +//! - [`text`] - Regex-based text content matching +//! +//! ## Architecture Benefits +//! +//! - **Reduced compilation** - Skip complex matching logic when not needed +//! - **API stability** - Type definitions remain available for library interfaces +//! - **Modular usage** - Enable only required pattern matching features +//! +//! ## Usage +//! +//! ```toml +//! # In Cargo.toml - enable full pattern matching +//! thread-ast-engine = { version = "...", features = ["matching"] } +//! ``` +//! +//! ```rust,ignore +//! // Types always available +//! use thread_ast_engine::matchers::types::{Matcher, Pattern}; +//! +//! // Implementations require 'matching' feature +//! #[cfg(feature = "matching")] +//! use thread_ast_engine::matchers::pattern::Pattern; +//! ``` #[cfg(feature = "matching")] pub(crate) mod pattern; @@ -30,16 +67,15 @@ pub(crate) mod pattern; #[cfg(feature = "matching")] pub(crate) mod kind; -#[cfg(feature = "matching")] -pub(crate) mod node_match; - #[cfg(feature = "matching")] pub(crate) mod text; pub(crate) mod types; #[cfg(not(feature = "matching"))] -pub use types::*; +pub use types::{ + MatchStrictness, Pattern, PatternBuilder, PatternError, PatternNode +}; pub(crate) mod matcher { - pub use super::types::Matcher; + pub use super::types::{Matcher, MatcherExt, NodeMatch}; } diff --git a/crates/ast-engine/src/matchers/node_match.rs b/crates/ast-engine/src/matchers/node_match.rs deleted file mode 100644 index d5e491b..0000000 --- a/crates/ast-engine/src/matchers/node_match.rs +++ /dev/null @@ -1,151 +0,0 @@ -// SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> -// SPDX-FileCopyrightText: 2025 Knitli Inc. -// SPDX-FileContributor: Adam Poulemanos -// -// SPDX-License-Identifier: AGPL-3.0-or-later AND MIT - -use super::matcher::Matcher; -use crate::meta_var::MetaVarEnv; -use crate::replacer::Replacer; -use crate::source::Edit as E; -use crate::{Doc, Node}; - -use std::borrow::Borrow; -use std::ops::Deref; - -type Edit = E<::Source>; - -/// Represents the matched node with populated `MetaVarEnv`. -/// It derefs to the `Node` so you can use it as a `Node`. -/// To access the underlying `MetaVarEnv`, call `get_env` method. -#[derive(Clone)] -pub struct NodeMatch<'t, D: Doc>(Node<'t, D>, MetaVarEnv<'t, D>); - -impl<'tree, D: Doc> NodeMatch<'tree, D> { - pub const fn new(node: Node<'tree, D>, env: MetaVarEnv<'tree, D>) -> Self { - Self(node, env) - } - - pub const fn get_node(&self) -> &Node<'tree, D> { - &self.0 - } - - /// Returns the populated `MetaVarEnv` for this match. 
- pub const fn get_env(&self) -> &MetaVarEnv<'tree, D> { - &self.1 - } - pub const fn get_env_mut(&mut self) -> &mut MetaVarEnv<'tree, D> { - &mut self.1 - } - /// # Safety - /// should only called for readopting nodes - pub(crate) const unsafe fn get_node_mut(&mut self) -> &mut Node<'tree, D> { - &mut self.0 - } -} - -impl NodeMatch<'_, D> { - pub fn replace_by>(&self, replacer: R) -> Edit { - let range = self.range(); - let position = range.start; - let deleted_length = range.len(); - let inserted_text = replacer.generate_replacement(self); - Edit:: { - position, - deleted_length, - inserted_text, - } - } - - #[doc(hidden)] - pub fn make_edit(&self, matcher: &M, replacer: &R) -> Edit - where - M: Matcher, - R: Replacer, - { - let range = replacer.get_replaced_range(self, matcher); - let inserted_text = replacer.generate_replacement(self); - Edit:: { - position: range.start, - deleted_length: range.len(), - inserted_text, - } - } -} - -impl<'tree, D: Doc> From> for NodeMatch<'tree, D> { - fn from(node: Node<'tree, D>) -> Self { - Self(node, MetaVarEnv::new()) - } -} - -/// `NodeMatch` is an immutable view to Node -impl<'tree, D: Doc> From> for Node<'tree, D> { - fn from(node_match: NodeMatch<'tree, D>) -> Self { - node_match.0 - } -} - -/// `NodeMatch` is an immutable view to Node -impl<'tree, D: Doc> Deref for NodeMatch<'tree, D> { - type Target = Node<'tree, D>; - fn deref(&self) -> &Self::Target { - &self.0 - } -} - -/// `NodeMatch` is an immutable view to Node -impl<'tree, D: Doc> Borrow> for NodeMatch<'tree, D> { - fn borrow(&self) -> &Node<'tree, D> { - &self.0 - } -} - -#[cfg(test)] -mod test { - use super::*; - use crate::language::Tsx; - use crate::tree_sitter::{LanguageExt, StrDoc}; - - fn use_node(n: &Node>) -> String { - n.text().to_string() - } - - fn borrow_node<'a, D, B>(b: B) -> String - where - D: Doc + 'static, - B: Borrow>, - { - b.borrow().text().to_string() - } - - #[test] - fn test_node_match_as_node() { - let root = Tsx.ast_grep("var a = 1"); - let node = root.root(); - let src = node.text().to_string(); - let nm = NodeMatch::from(node); - let ret = use_node(&*nm); - assert_eq!(ret, src); - assert_eq!(use_node(&*nm), borrow_node(nm)); - } - - #[test] - fn test_node_env() { - let root = Tsx.ast_grep("var a = 1"); - let find = root.root().find("var $A = 1").expect("should find"); - let env = find.get_env(); - let node = env.get_match("A").expect("should find"); - assert_eq!(node.text(), "a"); - } - - #[test] - fn test_replace_by() { - let root = Tsx.ast_grep("var a = 1"); - let find = root.root().find("var $A = 1").expect("should find"); - let fixed = find.replace_by("var b = $A"); - assert_eq!(fixed.position, 0); - assert_eq!(fixed.deleted_length, 9); - assert_eq!(fixed.inserted_text, "var b = a".as_bytes()); - } -} diff --git a/crates/ast-engine/src/matchers/pattern.rs b/crates/ast-engine/src/matchers/pattern.rs index bd6a704..405df13 100644 --- a/crates/ast-engine/src/matchers/pattern.rs +++ b/crates/ast-engine/src/matchers/pattern.rs @@ -328,7 +328,7 @@ mod test { } fn test_match(s1: &str, s2: &str) { - let pattern = Pattern::new(s1, Tsx); + let pattern = Pattern::new(s1, &Tsx); let cand = pattern_node(s2); let cand = cand.root(); assert!( @@ -339,7 +339,7 @@ mod test { ); } fn test_non_match(s1: &str, s2: &str) { - let pattern = Pattern::new(s1, Tsx); + let pattern = Pattern::new(s1, &Tsx); let cand = pattern_node(s2); let cand = cand.root(); assert!( @@ -364,7 +364,7 @@ mod test { } fn match_env(goal_str: &str, cand: &str) -> RapidMap { - let pattern = 
Pattern::new(goal_str, Tsx); + let pattern = Pattern::new(goal_str, &Tsx); let cand = pattern_node(cand); let cand = cand.root(); let nm = pattern.find_node(cand).unwrap(); @@ -381,7 +381,7 @@ mod test { #[test] fn test_pattern_should_not_pollute_env() { // gh issue #1164 - let pattern = Pattern::new("const $A = 114", Tsx); + let pattern = Pattern::new("const $A = 114", &Tsx); let cand = pattern_node("const a = 514"); let cand = cand.root().child(0).unwrap(); let map = MetaVarEnv::new(); @@ -413,7 +413,7 @@ mod test { #[test] fn test_contextual_pattern() { - let pattern = Pattern::contextual("class A { $F = $I }", "public_field_definition", Tsx) + let pattern = Pattern::contextual("class A { $F = $I }", "public_field_definition", &Tsx) .expect("test"); let cand = pattern_node("class B { b = 123 }"); assert!(pattern.find_node(cand.root()).is_some()); @@ -423,7 +423,7 @@ mod test { #[test] fn test_contextual_match_with_env() { - let pattern = Pattern::contextual("class A { $F = $I }", "public_field_definition", Tsx) + let pattern = Pattern::contextual("class A { $F = $I }", "public_field_definition", &Tsx) .expect("test"); let cand = pattern_node("class B { b = 123 }"); let nm = pattern.find_node(cand.root()).expect("test"); @@ -435,7 +435,7 @@ mod test { #[test] fn test_contextual_unmatch_with_env() { - let pattern = Pattern::contextual("class A { $F = $I }", "public_field_definition", Tsx) + let pattern = Pattern::contextual("class A { $F = $I }", "public_field_definition", &Tsx) .expect("test"); let cand = pattern_node("let b = 123"); let nm = pattern.find_node(cand.root()); @@ -448,7 +448,7 @@ mod test { #[test] fn test_pattern_potential_kinds() { - let pattern = Pattern::new("const a = 1", Tsx); + let pattern = Pattern::new("const a = 1", &Tsx); let kind = get_kind("lexical_declaration"); let kinds = pattern.potential_kinds().expect("should have kinds"); assert_eq!(kinds.len(), 1); @@ -457,7 +457,7 @@ mod test { #[test] fn test_pattern_with_non_root_meta_var() { - let pattern = Pattern::new("const $A = $B", Tsx); + let pattern = Pattern::new("const $A = $B", &Tsx); let kind = get_kind("lexical_declaration"); let kinds = pattern.potential_kinds().expect("should have kinds"); assert_eq!(kinds.len(), 1); @@ -466,14 +466,14 @@ mod test { #[test] fn test_bare_wildcard() { - let pattern = Pattern::new("$A", Tsx); + let pattern = Pattern::new("$A", &Tsx); // wildcard should match anything, so kinds should be None assert!(pattern.potential_kinds().is_none()); } #[test] fn test_contextual_potential_kinds() { - let pattern = Pattern::contextual("class A { $F = $I }", "public_field_definition", Tsx) + let pattern = Pattern::contextual("class A { $F = $I }", "public_field_definition", &Tsx) .expect("test"); let kind = get_kind("public_field_definition"); let kinds = pattern.potential_kinds().expect("should have kinds"); @@ -484,7 +484,7 @@ mod test { #[test] fn test_contextual_wildcard() { let pattern = - Pattern::contextual("class A { $F }", "property_identifier", Tsx).expect("test"); + Pattern::contextual("class A { $F }", "property_identifier", &Tsx).expect("test"); let kind = get_kind("property_identifier"); let kinds = pattern.potential_kinds().expect("should have kinds"); assert_eq!(kinds.len(), 1); @@ -494,7 +494,7 @@ mod test { #[test] #[ignore] fn test_multi_node_pattern() { - let pattern = Pattern::new("a;b;c;", Tsx); + let pattern = Pattern::new("a;b;c;", &Tsx); let kinds = pattern.potential_kinds().expect("should have kinds"); assert_eq!(kinds.len(), 1); test_match("a;b;c", "a;b;c;"); 
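The mechanical change running through these test hunks is that `Pattern::new` and `Pattern::contextual` now borrow the language instead of taking it by value. A hedged sketch of the updated calling convention, using the same test-only imports as the surrounding module (exact public paths may differ for downstream users):

```rust,ignore
use crate::language::Tsx;
use crate::matchers::{MatcherExt, Pattern};
use crate::tree_sitter::LanguageExt;

// Both constructors take `&Tsx` now, so one language value can back many patterns.
let simple = Pattern::new("console.log($ARG)", &Tsx);
let contextual =
    Pattern::contextual("class A { $F = $I }", "public_field_definition", &Tsx)
        .expect("valid contextual pattern");

let root = Tsx.ast_grep("console.log(42)");
assert!(simple.find_node(root.root()).is_some());
assert!(contextual.find_node(root.root()).is_none());
```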
@@ -517,16 +517,16 @@ mod test { #[test] fn test_error_kind() { - let ret = Pattern::contextual("a", "property_identifier", Tsx); + let ret = Pattern::contextual("a", "property_identifier", &Tsx); assert!(ret.is_err()); - let ret = Pattern::new("123+", Tsx); + let ret = Pattern::new("123+", &Tsx); assert!(ret.has_error()); } #[test] fn test_bare_wildcard_in_context() { let pattern = - Pattern::contextual("class A { $F }", "property_identifier", Tsx).expect("test"); + Pattern::contextual("class A { $F }", "property_identifier", &Tsx).expect("test"); let cand = pattern_node("let b = 123"); // it should not match assert!(pattern.find_node(cand.root()).is_none()); @@ -534,24 +534,24 @@ mod test { #[test] fn test_pattern_fixed_string() { - let pattern = Pattern::new("class A { $F }", Tsx); + let pattern = Pattern::new("class A { $F }", &Tsx); assert_eq!(pattern.fixed_string(), "class"); let pattern = - Pattern::contextual("class A { $F }", "property_identifier", Tsx).expect("test"); + Pattern::contextual("class A { $F }", "property_identifier", &Tsx).expect("test"); assert!(pattern.fixed_string().is_empty()); } #[test] fn test_pattern_error() { - let pattern = Pattern::try_new("", Tsx); + let pattern = Pattern::try_new("", &Tsx); assert!(matches!(pattern, Err(PatternError::NoContent(_)))); - let pattern = Pattern::try_new("12 3344", Tsx); + let pattern = Pattern::try_new("12 3344", &Tsx); assert!(matches!(pattern, Err(PatternError::MultipleNode(_)))); } #[test] fn test_debug_pattern() { - let pattern = Pattern::new("var $A = 1", Tsx); + let pattern = Pattern::new("var $A = 1", &Tsx); assert_eq!( format!("{pattern:?}"), "[var, [Capture(\"A\", true), =, 1]]" @@ -559,7 +559,7 @@ mod test { } fn defined_vars(s: &str) -> Vec { - let pattern = Pattern::new(s, Tsx); + let pattern = Pattern::new(s, &Tsx); let mut vars: Vec<_> = pattern .defined_vars() .into_iter() @@ -590,7 +590,7 @@ mod test { #[test] fn test_contextual_pattern_vars() { let pattern = - Pattern::contextual("
", "jsx_attribute", Tsx).expect("correct"); + Pattern::contextual("
", "jsx_attribute", &Tsx).expect("correct"); assert_eq!(pattern.defined_vars(), ["A"].into_iter().collect()); } diff --git a/crates/ast-engine/src/matchers/text.rs b/crates/ast-engine/src/matchers/text.rs index 2111410..5d7460e 100644 --- a/crates/ast-engine/src/matchers/text.rs +++ b/crates/ast-engine/src/matchers/text.rs @@ -4,6 +4,40 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT +//! # Text-Based Pattern Matching +//! +//! Provides regex-based matchers for finding AST nodes by their text content. +//! Useful when you need to match nodes based on their actual text rather +//! than their structural properties. +//! +//! ## Core Types +//! +//! - [`RegexMatcher`] - Matches nodes whose text content matches a regex pattern +//! - [`RegexMatcherError`] - Errors from invalid regex patterns +//! +//! ## Example Usage +//! +//! ```rust,ignore +//! // Find all nodes containing specific text patterns +//! let number_matcher = RegexMatcher::try_new(r"\d+")?; // Numbers +//! let email_matcher = RegexMatcher::try_new(r"[\w\.-]+@[\w\.-]+\.\w+")?; // Emails +//! +//! // Find all numeric literals +//! let numbers: Vec<_> = root.find_all(&number_matcher).collect(); +//! +//! // Find specific variable names +//! let temp_vars = RegexMatcher::try_new(r"temp\w*")?; +//! let temp_variables: Vec<_> = root.find_all(&temp_vars).collect(); +//! ``` +//! +//! ## Use Cases +//! +//! Text matching complements structural patterns when you need to: +//! - Find nodes with specific naming patterns +//! - Locate hardcoded values or literals +//! - Search for code smells in text content +//! - Filter nodes by complex text criteria + use super::matcher::Matcher; use crate::Doc; use crate::Node; @@ -15,14 +49,46 @@ use thiserror::Error; use std::borrow::Cow; +/// Errors that can occur when creating a [`RegexMatcher`]. #[derive(Debug, Error)] pub enum RegexMatcherError { + /// The provided regex pattern is invalid. + /// + /// Common causes include unbalanced parentheses, invalid escape sequences, + /// or unsupported regex features. #[error("Parsing text matcher fails.")] Regex(#[from] RegexError), } +/// Matcher that finds AST nodes based on regex patterns applied to their text content. +/// +/// `RegexMatcher` enables flexible text-based searching within AST nodes. +/// It matches any node whose text content satisfies the provided regular expression. +/// +/// # Examples +/// +/// ```rust,ignore +/// // Match numeric literals +/// let numbers = RegexMatcher::try_new(r"^\d+$")?; +/// let numeric_nodes: Vec<_> = root.find_all(&numbers).collect(); +/// +/// // Find TODO comments +/// let todos = RegexMatcher::try_new(r"(?i)todo|fixme")?; +/// let todo_comments: Vec<_> = root.find_all(&todos).collect(); +/// +/// // Match specific naming patterns +/// let private_vars = RegexMatcher::try_new(r"^_\w+")?; +/// let private_variables: Vec<_> = root.find_all(&private_vars).collect(); +/// ``` +/// +/// # Performance Note +/// +/// Text matching requires extracting text from every tested node, which can be +/// slower than structural matching. Consider combining with other matchers +/// or using more specific patterns when possible. 
#[derive(Clone, Debug)] pub struct RegexMatcher { + /// Compiled regex pattern for matching node text regex: Regex, } diff --git a/crates/ast-engine/src/matchers/types.rs b/crates/ast-engine/src/matchers/types.rs index 5299aeb..8b89858 100644 --- a/crates/ast-engine/src/matchers/types.rs +++ b/crates/ast-engine/src/matchers/types.rs @@ -3,79 +3,279 @@ // SPDX-FileContributor: Adam Poulemanos // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT - -//! Types for Pattern and Pattern matching. +#![allow(dead_code, reason = "Some fields report they're dead if the `matching` feature is not enabled.")] +//! # Core Pattern Matching Types +//! +//! Fundamental types and traits for AST pattern matching operations. +//! +//! ## Key Types //! -//! Definitions for the globally important pattern matching types. -//! Allows their use outside the pattern matching feature flags (unimplemented). +//! - [`Matcher`] - Core trait for matching AST nodes +//! - [`Pattern`] - Structural pattern for matching AST shapes +//! - [`MatchStrictness`] - Controls how precisely patterns must match +//! - [`PatternNode`] - Internal representation of pattern structure +//! +//! ## Usage +//! +//! These types are available even without the `matching` feature flag enabled, +//! allowing API definitions that reference them without requiring full +//! implementation dependencies. use crate::Doc; -use crate::MetaVarEnv; -use crate::meta_var::MetaVariable; +use crate::meta_var::{MetaVariable, MetaVarEnv}; use crate::node::Node; use bit_set::BitSet; use std::borrow::Cow; use thiserror::Error; +/// Core trait for matching AST nodes against patterns. +/// +/// Implementors define how to match nodes, whether by structure, content, +/// kind, or other criteria. The matcher can also capture meta-variables +/// during the matching process. +/// +/// # Type Parameters +/// +/// The trait is generic over document types to support different source +/// encodings and language implementations. +/// +/// # Example Implementation +/// +/// ```rust,ignore +/// use thread_ast_engine::Matcher; +/// +/// struct SimpleKindMatcher { +/// target_kind: String, +/// } +/// +/// impl Matcher for SimpleKindMatcher { +/// fn match_node_with_env<'tree, D: Doc>( +/// &self, +/// node: Node<'tree, D>, +/// _env: &mut Cow>, +/// ) -> Option> { +/// if node.kind() == self.target_kind { +/// Some(node) +/// } else { +/// None +/// } +/// } +/// } +/// ``` pub trait Matcher { - /// Returns the node why the input is matched or None if not matched. - /// The return value is usually input node itself, but it can be different node. - /// For example `Has` matcher can return the child or descendant node. + /// Attempt to match a node, updating the meta-variable environment. + /// + /// Returns the matched node if successful, or `None` if the node doesn't match. + /// The returned node is usually the input node, but can be different for + /// matchers like `Has` that match based on descendants. + /// + /// # Parameters + /// + /// - `node` - The AST node to test for matching + /// - `env` - Meta-variable environment to capture variables during matching + /// + /// # Returns + /// + /// The matched node if successful, otherwise `None` fn match_node_with_env<'tree, D: Doc>( &self, _node: Node<'tree, D>, _env: &mut Cow>, ) -> Option>; - /// Returns a bitset for all possible target node kind ids. - /// Returns None if the matcher needs to try against all node kind. + /// Provide a hint about which node types this matcher can match. 
+ /// + /// Returns a bitset of node kind IDs that this matcher might match, + /// or `None` if it needs to test all node types. Used for optimization + /// to avoid testing matchers against incompatible nodes. + /// + /// # Returns + /// + /// - `Some(BitSet)` - Specific node kinds this matcher can match + /// - `None` - This matcher needs to test all node types fn potential_kinds(&self) -> Option { None } - /// `get_match_len` will skip trailing anonymous child node to exclude punctuation. - // This is not included in NodeMatch since it is only used in replace + /// Determine how much of a matched node should be replaced. + /// + /// Used during replacement to determine the exact span of text to replace. + /// Typically skips trailing punctuation or anonymous nodes. + /// + /// # Parameters + /// + /// - `node` - The matched node + /// + /// # Returns + /// + /// Number of bytes from the node's start position to replace, + /// or `None` to replace the entire node. fn get_match_len(&self, _node: Node<'_, D>) -> Option { None } } +/// Extension trait providing convenient utility methods for [`Matcher`] implementations. +/// +/// Automatically implemented for all types that implement [`Matcher`]. Provides +/// higher-level operations like finding nodes and working with meta-variable environments. +/// +/// # Important +/// +/// You should not implement this trait manually - it's automatically implemented +/// for all [`Matcher`] types. +/// +/// # Example +/// +/// ```rust,no_run +/// # use thread_ast_engine::Language; +/// # use thread_ast_engine::tree_sitter::LanguageExt; +/// # use thread_ast_engine::MatcherExt; +/// let ast = Language::Tsx.ast_grep("const x = 42;"); +/// let root = ast.root(); +/// +/// // Use MatcherExt methods +/// if let Some(node_match) = root.find("const $VAR = $VALUE") { +/// println!("Found constant declaration"); +/// } +/// ``` +pub trait MatcherExt: Matcher { + fn match_node<'tree, D: Doc>(&self, node: Node<'tree, D>) -> Option>; + + fn find_node<'tree, D: Doc>(&self, node: Node<'tree, D>) -> Option>; +} + +/// Result of a successful pattern match containing the matched node and captured variables. +/// +/// `NodeMatch` combines an AST node with the meta-variables captured during +/// pattern matching. It acts like a regular [`Node`] (through [`Deref`]) while +/// also providing access to captured variables through [`get_env`]. +/// +/// # Lifetime +/// +/// The lifetime `'t` ties the match to its source document, ensuring memory safety. +/// +/// # Usage Patterns +/// +/// ```rust,ignore +/// // Use as a regular node +/// let text = node_match.text(); +/// let position = node_match.start_pos(); +/// +/// // Access captured meta-variables +/// let env = node_match.get_env(); +/// let captured_name = env.get_match("VAR_NAME").unwrap(); +/// +/// // Generate replacement code +/// let edit = node_match.replace_by("new code with $VAR_NAME"); +/// ``` +/// +/// # Type Parameters +/// +/// - `'t` - Lifetime tied to the source document +/// - `D: Doc` - Document type containing the source and language info +#[derive(Clone)] +#[cfg_attr(not(feature = "matching"), allow(dead_code))] +pub struct NodeMatch<'t, D: Doc>(pub(crate) Node<'t, D>, pub(crate) MetaVarEnv<'t, D>); + + +/// Controls how precisely patterns must match AST structure. +/// +/// Different strictness levels allow patterns to match with varying degrees +/// of precision, from exact CST matching to loose structural matching. 
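Because `Pattern` exposes `strictness` as a public field (see the struct later in this file), the level can be adjusted per pattern. A hedged sketch using the test-only TSX language from elsewhere in this diff (exact import paths are assumptions):

```rust,ignore
use crate::language::Tsx;
use crate::matchers::{MatchStrictness, MatcherExt, Pattern};
use crate::tree_sitter::LanguageExt;

// Build a pattern with the default strictness, then relax it to AST-level
// matching so trivial syntax differences no longer matter.
let mut pattern = Pattern::new("let $NAME = $VALUE", &Tsx);
pattern.strictness = MatchStrictness::Ast;

let root = Tsx.ast_grep("let answer = 42;");
assert!(pattern.find_node(root.root()).is_some());
```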
+/// +/// # Variants +/// +/// - **`Cst`** - All nodes must match exactly (concrete syntax tree) +/// - **`Smart`** - Matches meaningful nodes, ignoring trivial syntax +/// - **`Ast`** - Only structural nodes matter (abstract syntax tree) +/// - **`Relaxed`** - Ignores comments and focuses on code structure +/// - **`Signature`** - Matches structure only, ignoring all text content +/// +/// # Example +/// +/// ```rust,ignore +/// // With Cst strictness, these would be different: +/// // "let x=42;" vs "let x = 42;" +/// // +/// // With Ast strictness, they match the same pattern: +/// // "let $VAR = $VALUE" +/// ``` #[derive(Clone, Debug)] pub enum MatchStrictness { - Cst, // all nodes are matched - Smart, // all nodes except source trivial nodes are matched. - Ast, // only ast nodes are matched - Relaxed, // ast-nodes excluding comments are matched - Signature, // ast-nodes excluding comments, without text + /// Match all nodes exactly (Concrete Syntax Tree) + Cst, + /// Match all nodes except trivial syntax elements + Smart, + /// Match only structural AST nodes (Abstract Syntax Tree) + Ast, + /// Match AST nodes while ignoring comments + Relaxed, + /// Match structure only, ignoring all text content + Signature, } +/// Structural pattern for matching AST nodes based on their shape and content. +/// +/// Patterns represent code structures with support for meta-variables (like `$VAR`) +/// that can capture parts of the matched code. They're built from source code strings +/// and compiled into efficient matching structures. +/// +/// # Example +/// +/// ```rust,ignore +/// // Pattern for variable declarations +/// let pattern = Pattern::new("let $NAME = $VALUE", language); +/// +/// // Can match: "let x = 42", "let result = calculate()", etc. +/// ``` #[derive(Clone)] pub struct Pattern { + /// The root pattern node containing the matching logic pub node: PatternNode, + /// Optional hint about the root node kind for optimization pub(crate) root_kind: Option, + /// How strictly the pattern should match pub strictness: MatchStrictness, } +/// Builder for constructing patterns from source code. +/// +/// Handles parsing pattern strings into [`Pattern`] structures, +/// with optional contextual information for more precise matching. #[derive(Clone, Debug)] pub struct PatternBuilder<'a> { + /// Optional CSS-like selector for contextual matching pub(crate) selector: Option<&'a str>, + /// The pattern source code pub(crate) src: Cow<'a, str>, } +/// Internal representation of a pattern's structure. +/// +/// Patterns are compiled into a tree of `PatternNode` elements that +/// efficiently represent the matching logic for different AST structures. #[derive(Clone)] pub enum PatternNode { + /// Meta-variable that captures matched content MetaVar { + /// The meta-variable specification (e.g., `$VAR`, `$$$ITEMS`) meta_var: MetaVariable, }, - /// Node without children. 
+ /// Leaf node with specific text content Terminal { + /// Expected text content text: String, + /// Whether this represents a named AST node is_named: bool, + /// Node type identifier kind_id: u16, }, - /// Non-Terminal Syntax Nodes are called Internal + /// Internal node with child patterns Internal { + /// Node type identifier kind_id: u16, + /// Child pattern nodes children: Vec, }, } diff --git a/crates/ast-engine/src/meta_var.rs b/crates/ast-engine/src/meta_var.rs index eea2156..abb0b10 100644 --- a/crates/ast-engine/src/meta_var.rs +++ b/crates/ast-engine/src/meta_var.rs @@ -3,17 +3,41 @@ // SPDX-FileContributor: Adam Poulemanos // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT +//! # Meta-variable Environment and Utilities +//! +//! This module provides types and functions for handling meta-variables in AST pattern matching. +//! Meta-variables allow patterns to flexibly match and capture code fragments, supporting single and multi-capture semantics. +//! +//! ## Key Components +//! +//! - [`MetaVarEnv`](crates/ast-engine/src/meta_var.rs:26): Stores meta-variable instantiations during pattern matching. +//! - [`MetaVariable`](crates/ast-engine/src/meta_var.rs:260): Enum representing different meta-variable forms (single, multi, dropped). +//! - `extract_meta_var`: Utility to parse meta-variable strings. +//! - Insertion, retrieval, and transformation APIs for meta-variable environments. +//! +//! ## Example +//! +//! ```rust,no_run +//! use thread_ast_engine::meta_var::{MetaVarEnv, MetaVariable, extract_meta_var}; +//! +//! let mut env = MetaVarEnv::new(); +//! env.insert("$A", node); +//! let meta = extract_meta_var("$A", '$'); +//! ``` +//! +//! See [`MetaVarEnv`](crates/ast-engine/src/meta_var.rs:48) for details on usage in AST matching and rewriting. 
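For the retrieval side of the environment, the `test_node_env` case elsewhere in this diff is the canonical shape; a condensed sketch (test-only imports, paths may differ for external users):

```rust,ignore
use crate::language::Tsx;
use crate::tree_sitter::LanguageExt;

let root = Tsx.ast_grep("var a = 1");
let found = root.root().find("var $A = 1").expect("pattern should match");

// The populated environment travels with the match; captures are looked up
// by name without the `$` sigil.
let env = found.get_env();
let captured = env.get_match("A").expect("capture $A");
assert_eq!(captured.text(), "a");
```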
#[cfg(feature = "matching")] use crate::match_tree::does_node_match_exactly; #[cfg(feature = "matching")] use crate::matcher::Matcher; use crate::source::Content; use crate::{Doc, Node}; +#[cfg(feature = "matching")] use std::borrow::Cow; use std::collections::HashMap; use std::hash::BuildHasherDefault; -use thread_utils::{map_with_capacity, RapidInlineHasher, RapidMap}; - +use thread_utils::{RapidInlineHasher, RapidMap, map_with_capacity}; +#[cfg(feature = "matching")] use crate::replacer::formatted_slice; pub type MetaVariableID = String; @@ -39,6 +63,7 @@ impl<'t, D: Doc> MetaVarEnv<'t, D> { } } + #[cfg(feature = "matching")] pub fn insert(&mut self, id: &str, ret: Node<'t, D>) -> Option<&mut Self> { if self.match_variable(id, &ret) { self.single_matched.insert(id.to_string(), ret); @@ -48,6 +73,7 @@ impl<'t, D: Doc> MetaVarEnv<'t, D> { } } + #[cfg(feature = "matching")] pub fn insert_multi(&mut self, id: &str, ret: Vec>) -> Option<&mut Self> { if self.match_multi_var(id, &ret) { self.multi_matched.insert(id.to_string(), ret); @@ -58,6 +84,7 @@ impl<'t, D: Doc> MetaVarEnv<'t, D> { } /// Insert without cloning the key if it's already owned + #[cfg(feature = "matching")] pub fn insert_owned(&mut self, id: String, ret: Node<'t, D>) -> Option<&mut Self> { if self.match_variable(&id, &ret) { self.single_matched.insert(id, ret); @@ -68,6 +95,7 @@ impl<'t, D: Doc> MetaVarEnv<'t, D> { } /// Insert multi without cloning the key if it's already owned + #[cfg(feature = "matching")] pub fn insert_multi_owned(&mut self, id: String, ret: Vec>) -> Option<&mut Self> { if self.match_multi_var(&id, &ret) { self.multi_matched.insert(id, ret); @@ -119,6 +147,8 @@ impl<'t, D: Doc> MetaVarEnv<'t, D> { single.chain(multi).chain(transformed) } + #[cfg(feature = "matching")] + #[must_use] fn match_variable(&self, id: &str, candidate: &Node<'t, D>) -> bool { if let Some(m) = self.single_matched.get(id) { return does_node_match_exactly(m, candidate); @@ -170,6 +200,7 @@ impl<'t, D: Doc> MetaVarEnv<'t, D> { true } + #[cfg(feature = "matching")] pub fn insert_transformation(&mut self, var: &MetaVariable, name: &str, slice: Underlying) { let node = match var { MetaVariable::Capture(v, _) => self.single_matched.get(v), @@ -196,6 +227,7 @@ impl<'t, D: Doc> MetaVarEnv<'t, D> { } } +#[cfg(feature = "matching")] impl MetaVarEnv<'_, D> { /// internal for readopt `NodeMatch` in pinned.rs /// readopt node and env when sending them to other threads @@ -315,13 +347,14 @@ pub(crate) const fn is_valid_meta_var_char(c: char) -> bool { is_valid_first_char(c) || c.is_ascii_digit() } -impl<'tree, D: Doc> From> for HashMap> +impl<'tree, D: Doc> From> + for HashMap> where D::Source: Content, { fn from(env: MetaVarEnv<'tree, D>) -> Self { let mut ret: Self = map_with_capacity( - env.single_matched.len() + env.multi_matched.len() + env.transformed_var.len() + env.single_matched.len() + env.multi_matched.len() + env.transformed_var.len(), ); for (id, node) in env.single_matched { ret.insert(id, node.text().into()); @@ -389,7 +422,7 @@ mod test { fn match_constraints(pattern: &str, node: &str) -> bool { let mut matchers = thread_utils::RapidMap::default(); - matchers.insert("A".to_string(), Pattern::new(pattern, Tsx)); + matchers.insert("A".to_string(), Pattern::new(pattern, &Tsx)); let mut env = MetaVarEnv::new(); let root = Tsx.ast_grep(node); let node = root.root().child(0).unwrap().child(0).unwrap(); diff --git a/crates/ast-engine/src/node.rs b/crates/ast-engine/src/node.rs index 187309e..2abf751 100644 --- 
a/crates/ast-engine/src/node.rs +++ b/crates/ast-engine/src/node.rs @@ -4,10 +4,41 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT +//! # AST Node Representation and Navigation +//! +//! Core types for representing and navigating Abstract Syntax Tree nodes. +//! +//! ## Key Types +//! +//! - [`Node`] - A single AST node with navigation and matching capabilities +//! - [`Root`] - The root of an AST tree, owns the source code and tree structure +//! - [`Position`] - Represents a position in source code (line/column) +//! +//! ## Usage +//! +//! ```rust,no_run +//! # use thread_ast_engine::Language; +//! # use thread_ast_engine::tree_sitter::LanguageExt; +//! # use thread_ast_engine::MatcherExt; +//! let ast = Language::Tsx.ast_grep("function foo() { return 42; }"); +//! let root_node = ast.root(); +//! +//! // Navigate the tree +//! for child in root_node.children() { +//! println!("Child kind: {}", child.kind()); +//! } +//! +//! // Find specific patterns +//! if let Some(func) = root_node.find("function $NAME() { $$$BODY }") { +//! println!("Found function: {}", func.get_env().get_match("NAME").unwrap().text()); +//! } +//! ``` + use crate::Doc; use crate::Language; #[cfg(feature = "matching")] use crate::matcher::{Matcher, MatcherExt, NodeMatch}; +#[cfg(feature = "matching")] use crate::replacer::Replacer; use crate::source::{Content, Edit as E, SgNode}; @@ -15,30 +46,49 @@ type Edit = E<::Source>; use std::borrow::Cow; -/// Represents a position in the source code. +/// Represents a position in source code. +/// +/// Positions use zero-based line and column numbers, where line 0 is the first line +/// and column 0 is the first character. Unlike tree-sitter's internal positions, +/// these are character-based rather than byte-based for easier human consumption. +/// +/// # Note +/// +/// Computing the character column from byte positions is an O(n) operation, +/// so avoid calling [`Position::column`] in performance-critical loops. +/// +/// # Example +/// +/// ```rust,no_run +/// # use thread_ast_engine::Language; +/// # use thread_ast_engine::tree_sitter::LanguageExt; +/// let ast = Language::Tsx.ast_grep("let x = 42;\nlet y = 24;"); +/// let root = ast.root(); /// -/// The line and column are zero-based, character offsets. -/// It is different from tree-sitter's position which is zero-based `byte` offsets. -/// Note, accessing `column` is O(n) operation. +/// let start_pos = root.start_pos(); +/// assert_eq!(start_pos.line(), 0); +/// ``` #[derive(Debug, Clone, Copy, PartialEq, Eq)] pub struct Position { - /// zero-based line offset. Text encoding does not matter. + /// Zero-based line number (line 0 = first line) line: usize, - /// zero-based BYTE offset instead of character offset + /// Zero-based byte offset within the line byte_column: usize, - /// byte offset of this position + /// Absolute byte offset from start of file byte_offset: usize, } impl Position { - #[must_use] pub const fn new(line: usize, byte_column: usize, byte_offset: usize) -> Self { + #[must_use] + pub const fn new(line: usize, byte_column: usize, byte_offset: usize) -> Self { Self { line, byte_column, byte_offset, } } - #[must_use] pub const fn line(&self) -> usize { + #[must_use] + pub const fn line(&self) -> usize { self.line } /// Returns the column in terms of characters. 
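Because `column` re-scans the line to count characters, the note above suggests computing it once per node rather than repeatedly in hot loops. A hedged sketch of reading positions, with imports following the surrounding doc examples:

```rust,ignore
use thread_ast_engine::Language;
use thread_ast_engine::tree_sitter::LanguageExt;

let ast = Language::Tsx.ast_grep("let x = 42;\nlet y = 24;");
let root = ast.root();

for child in root.children() {
    let pos = child.start_pos();
    // `line()` and `byte_point()` are cheap; `column(&child)` converts the byte
    // column to a character column and walks the line to do so (O(n)).
    println!("{} starts at {}:{}", child.kind(), pos.line(), pos.column(&child));
}
```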
@@ -47,13 +97,35 @@ impl Position { let source = node.get_doc().get_source(); source.get_char_column(self.byte_column, self.byte_offset) } - #[must_use] pub const fn byte_point(&self) -> (usize, usize) { + #[must_use] + pub const fn byte_point(&self) -> (usize, usize) { (self.line, self.byte_column) } } -/// Represents [`tree_sitter::Tree`] and owns source string -/// Note: Root is generic against [`Language`](crate::language::Language) +/// Root of an AST tree that owns the source code and parsed tree structure. +/// +/// Root acts as the entry point for all AST operations. It manages the document +/// (source code + parsed tree) and provides methods to get the root node and +/// perform tree-wide operations like replacements. +/// +/// # Generic Parameters +/// +/// - `D: Doc` - The document type that holds source code and language information +/// +/// # Example +/// +/// ```rust,no_run +/// # use thread_ast_engine::Language; +/// # use thread_ast_engine::tree_sitter::LanguageExt; +/// # use thread_ast_engine::MatcherExt; +/// let mut ast = Language::Tsx.ast_grep("let x = 42;"); +/// let root_node = ast.root(); +/// +/// // Perform tree-wide replacements +/// ast.replace("let $VAR = $VALUE", "const $VAR = $VALUE"); +/// println!("{}", ast.generate()); +/// ``` #[derive(Clone, Debug)] pub struct Root { pub(crate) doc: D, @@ -81,6 +153,7 @@ impl Root { Ok(self) } + #[cfg(feature = "matching")] pub fn replace>( &mut self, pattern: M, @@ -97,7 +170,7 @@ impl Root { } /// Adopt the `tree_sitter` as the descendant of the root and return the wrapped sg Node. - /// It assumes `inner` is the under the root and will panic at dev build if wrong node is used. + /// It assumes `inner` is under the root and will panic at dev build if wrong node is used. pub fn adopt<'r>(&'r self, inner: D::Node<'r>) -> Node<'r, D> { debug_assert!(self.check_lineage(&inner)); Node { inner, root: self } @@ -119,13 +192,48 @@ impl Root { } } -// why we need one more content? https://github.com/ast-grep/ast-grep/issues/1951 -/// 'r represents root lifetime +/// A single node in an Abstract Syntax Tree. +/// +/// Node represents a specific element in the parsed AST, such as a function declaration, +/// variable assignment, or expression. Each node knows its position in the source code, +/// its type (kind), and provides methods for navigation and pattern matching. +/// +/// # Lifetime +/// +/// The lifetime `'r` ties the node to its root AST, ensuring memory safety. +/// Nodes cannot outlive the Root that owns the underlying tree structure. 
+/// +/// # Example +/// +/// ```rust,no_run +/// # use thread_ast_engine::Language; +/// # use thread_ast_engine::tree_sitter::LanguageExt; +/// # use thread_ast_engine::matcher::MatcherExt; +/// let ast = Language::Tsx.ast_grep("function hello() { return 'world'; }"); +/// let root_node = ast.root(); +/// +/// // Check the node type +/// println!("Root kind: {}", root_node.kind()); +/// +/// // Navigate to children +/// for child in root_node.children() { +/// println!("Child: {} at {}:{}", child.kind(), +/// child.start_pos().line(), child.start_pos().column(&child)); +/// } +/// +/// // Find specific patterns +/// if let Some(return_stmt) = root_node.find("return $VALUE") { +/// let value = return_stmt.get_env().get_match("VALUE").unwrap(); +/// println!("Returns: {}", value.text()); +/// } +/// ``` #[derive(Clone, Debug)] pub struct Node<'r, D: Doc> { pub(crate) inner: D::Node<'r>, pub(crate) root: &'r Root, } + +/// Identifier for different AST node types (e.g., "function_declaration", "identifier") pub type KindId = u16; /// APIs for Node inspection @@ -198,6 +306,7 @@ impl<'r, D: Doc> Node<'r, D> { /** * Corresponds to inside/has/precedes/follows */ +#[cfg(feature = "matching")] impl Node<'_, D> { pub fn matches(&self, m: M) -> bool { m.match_node(self.clone()).is_some() @@ -323,11 +432,11 @@ impl<'r, D: Doc> Node<'r, D> { }) } - #[must_use] + #[cfg(feature = "matching")] pub fn find(&self, pat: M) -> Option> { pat.find_node(self.clone()) } - + #[cfg(feature = "matching")] pub fn find_all<'s, M: Matcher + 's>( &'s self, pat: M, @@ -346,6 +455,7 @@ impl<'r, D: Doc> Node<'r, D> { /// Tree manipulation API impl Node<'_, D> { + #[cfg(feature = "matching")] pub fn replace>(&self, matcher: M, replacer: R) -> Option> { let matched = matcher.find_node(self.clone())?; let edit = matched.make_edit(&matcher, &replacer); diff --git a/crates/ast-engine/src/ops.rs b/crates/ast-engine/src/ops.rs index 575a49b..b3a412f 100644 --- a/crates/ast-engine/src/ops.rs +++ b/crates/ast-engine/src/ops.rs @@ -456,7 +456,7 @@ mod test { } impl TsxMatcher for &str { fn t(self) -> Pattern { - Pattern::new(self, Tsx) + Pattern::new(self, &Tsx) } } diff --git a/crates/ast-engine/src/pinned.rs b/crates/ast-engine/src/pinned.rs index 88c4be3..c6fba8c 100644 --- a/crates/ast-engine/src/pinned.rs +++ b/crates/ast-engine/src/pinned.rs @@ -4,7 +4,46 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT +//! # Lifetime Extension for AST Nodes Across Thread and FFI Boundaries +//! +//! Enables safe passing of AST nodes across threads and FFI boundaries by extending +//! their lifetimes beyond the normal borrow checker constraints. +//! +//! ## The Problem +//! +//! Normally, AST nodes have lifetimes tied to their root document: +//! ```rust,ignore +//! let root = parse_code("let x = 42;"); +//! let node = root.find("$VAR").unwrap(); // node lifetime tied to root +//! // Can't send node to another thread without root +//! ``` +//! +//! ## The Solution +//! +//! [`PinnedNodeData`] keeps the root alive while allowing nodes to have `'static` lifetimes: +//! ```rust,ignore +//! let pinned = PinnedNodeData::new(root, |static_root| { +//! static_root.find("$VAR").unwrap() // Now has 'static lifetime +//! }); +//! // Can safely send `pinned` across threads +//! ``` +//! +//! ## Safety +//! +//! This module uses unsafe code to extend lifetimes, but maintains safety by: +//! - Keeping the root document alive as long as nodes exist +//! - Re-adopting nodes when accessing them to ensure validity +//! 
- Using tree-sitter's heap-allocated node pointers which remain stable +//! +//! ## Use Cases +//! +//! - **Threading**: Send AST analysis results between threads +//! - **FFI**: Pass nodes to JavaScript (NAPI) or Python (PyO3) +//! - **Async**: Store nodes across await points +//! - **Caching**: Keep processed nodes in long-lived data structures + use crate::Doc; +#[cfg(feature = "matching")] use crate::NodeMatch; use crate::node::{Node, Root}; @@ -23,18 +62,56 @@ use crate::node::{Node, Root}; // https://github.com/tree-sitter/tree-sitter/blob/20924fa4cdeb10d82ac308481e39bf8519334e55/lib/src/tree.c#L37-L39 // https://tree-sitter.github.io/tree-sitter/using-parsers#concurrency // -// So **as long as Root is not dropped, the Tree will not be freed. And Node will be valid.** -// -// PinnedNodeData provides a systematical way to keep Root live and `T` can be anything containing valid Nodes. -// Nodes' lifetime is 'static, meaning the Node is not borrow checked instead of living throughout the program. -// There are two ways to use PinnedNodeData -// 1. use it by borrowing. PinnedNodeData guarantees Root is alive and Node in T is valid. -// Notable example is sending Node across threads. -// 2. take its ownership. Users should take extra care to keep Root alive. -// Notable example is sending Root to JavaScript/Python heap. +/// Container that extends AST node lifetimes by keeping their root document alive. +/// +/// `PinnedNodeData` solves the problem of passing AST nodes across thread boundaries +/// or FFI interfaces where normal lifetime constraints are too restrictive. It combines +/// a root document with data containing nodes, ensuring the nodes remain valid. +/// +/// # Type Parameters +/// +/// - `D: Doc` - The document type (e.g., `StrDoc`) +/// - `T` - Data containing nodes with `'static` lifetimes +/// +/// # Safety Model +/// +/// The container uses unsafe code to extend node lifetimes, but maintains safety by: +/// - Keeping the root document alive to prevent tree deallocation +/// - Re-adopting nodes when accessed to ensure they point to valid memory +/// - Leveraging tree-sitter's stable heap-allocated node pointers +/// +/// # Usage Patterns +/// +/// ## 1. Borrowing Pattern (Recommended) +/// Use through references to guarantee safety: +/// ```rust,ignore +/// let pinned = PinnedNodeData::new(root, |static_root| { +/// static_root.find("pattern").unwrap() +/// }); +/// let node = pinned.get_data(); // Safe access +/// ``` +/// +/// ## 2. 
Ownership Pattern (Advanced) +/// Take ownership but ensure root stays alive: +/// ```rust,ignore +/// let (root, node_data) = pinned.into_raw(); +/// // You must keep `root` alive while using `node_data` +/// ``` +/// +/// # Thread Safety +/// +/// Safe to send across threads as long as the contained data is `Send`: +/// ```rust,ignore +/// std::thread::spawn(move || { +/// let node = pinned.get_data(); +/// // Process node in background thread +/// }); +/// ``` #[doc(hidden)] pub struct PinnedNodeData { + /// Root document kept alive to ensure node validity pin: Root, + /// Data containing nodes with extended lifetimes data: T, } @@ -89,6 +166,7 @@ unsafe impl NodeData for Node<'static, D> { } } +#[cfg(feature = "matching")] unsafe impl NodeData for NodeMatch<'static, D> { type Data = Self; fn get_data(&self) -> &Self::Data { @@ -106,6 +184,7 @@ unsafe impl NodeData for NodeMatch<'static, D> { } } +#[cfg(feature = "matching")] unsafe impl NodeData for Vec> { type Data = Self; fn get_data(&self) -> &Self::Data { diff --git a/crates/ast-engine/src/replacer.rs b/crates/ast-engine/src/replacer.rs index 62cb467..84f359c 100644 --- a/crates/ast-engine/src/replacer.rs +++ b/crates/ast-engine/src/replacer.rs @@ -4,6 +4,54 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT +//! # Code Replacement and Transformation +//! +//! Tools for replacing and transforming matched AST nodes with new content. +//! +//! ## Core Concepts +//! +//! - [`Replacer`] - Trait for generating replacement content from matched nodes +//! - Template-based replacement using meta-variables (e.g., `"let $VAR = $VALUE"`) +//! - Structural replacement using other AST nodes +//! - Automatic indentation handling to preserve code formatting +//! +//! ## Built-in Replacers +//! +//! Several types implement [`Replacer`] out of the box: +//! +//! - **`&str`** - Template strings with meta-variable substitution +//! - **[`Root`]** - Replace with entire AST trees +//! - **[`Node`]** - Replace with specific nodes +//! +//! ## Examples +//! +//! ### Template Replacement +//! +//! ```rust,no_run +//! # use thread_ast_engine::Language; +//! # use thread_ast_engine::tree_sitter::LanguageExt; +//! # use thread_ast_engine::matcher::MatcherExt; +//! let mut ast = Language::Tsx.ast_grep("var x = 42;"); +//! +//! // Replace using a template string +//! ast.replace("var $NAME = $VALUE", "const $NAME = $VALUE"); +//! println!("{}", ast.generate()); // "const x = 42;" +//! ``` +//! +//! ### Structural Replacement +//! +//! ```rust,no_run +//! # use thread_ast_engine::Language; +//! # use thread_ast_engine::tree_sitter::LanguageExt; +//! # use thread_ast_engine::matcher::MatcherExt; +//! let mut target = Language::Tsx.ast_grep("old_function();"); +//! let replacement = Language::Tsx.ast_grep("new_function(42)"); +//! +//! // Replace with another AST +//! target.replace("old_function()", replacement); +//! println!("{}", target.generate()); // "new_function(42);" +//! ``` + use crate::matcher::Matcher; use crate::meta_var::{MetaVariableID, Underlying, is_valid_meta_var_char}; use crate::{Doc, Node, NodeMatch, Root}; @@ -21,9 +69,60 @@ mod template; pub use crate::source::Content; pub use template::{TemplateFix, TemplateFixError}; -/// Replace meta variable in the replacer string +/// Generate replacement content for matched AST nodes. +/// +/// The `Replacer` trait defines how to transform a matched node into new content. 
+/// Implementations can use template strings with meta-variables, structural +/// replacement with other AST nodes, or custom logic. +/// +/// # Type Parameters +/// +/// - `D: Doc` - The document type containing source code and language information +/// +/// # Example Implementation +/// +/// ```rust,no_run +/// # use thread_ast_engine::replacer::Replacer; +/// # use thread_ast_engine::{Doc, NodeMatch}; +/// # use thread_ast_engine::meta_var::Underlying; +/// struct CustomReplacer; +/// +/// impl Replacer for CustomReplacer { +/// fn generate_replacement(&self, nm: &NodeMatch<'_, D>) -> Underlying { +/// // Custom replacement logic here +/// "new_code".as_bytes().to_vec() +/// } +/// } +/// ``` pub trait Replacer { + /// Generate replacement content for a matched node. + /// + /// Takes a [`NodeMatch`] containing the matched node and its captured + /// meta-variables, then returns the raw bytes that should replace the + /// matched content in the source code. + /// + /// # Parameters + /// + /// - `nm` - The matched node with captured meta-variables + /// + /// # Returns + /// + /// Raw bytes representing the replacement content fn generate_replacement(&self, nm: &NodeMatch<'_, D>) -> Underlying; + + /// Determine the exact range of source code to replace. + /// + /// By default, replaces the entire matched node's range. Some matchers + /// may want to replace only a portion of the matched content. + /// + /// # Parameters + /// + /// - `nm` - The matched node + /// - `matcher` - The matcher that found this node (may provide custom range info) + /// + /// # Returns + /// + /// Byte range in the source code to replace fn get_replaced_range(&self, nm: &NodeMatch<'_, D>, matcher: impl Matcher) -> Range { let range = nm.range(); if let Some(len) = matcher.get_match_len(nm.get_node().clone()) { diff --git a/crates/ast-engine/src/replacer/indent.rs b/crates/ast-engine/src/replacer/indent.rs index ec8bf9a..73040a3 100644 --- a/crates/ast-engine/src/replacer/indent.rs +++ b/crates/ast-engine/src/replacer/indent.rs @@ -6,118 +6,89 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT -/** - This module is for indentation-sensitive replacement. - - Ideally, structural search and replacement should all be based on AST. - But this means our changed AST need to be pretty-printed by structural rules, - which we don't have enough resource to support. An indentation solution is used. - - The algorithm is quite complicated, uncomprehensive, sluggish and buggy. - But let's walk through it by example. - - consider this code - ```ignore - if (true) { - a( - 1 - + 2 - + 3 - ) - } - ``` - - and this pattern and replacement - - ```ignore - // pattern - a($B) - // replacement - c( - $B - ) - ``` - - We need to compute the relative indentation of the captured meta-var. - When we insert the meta-var into replacement, keep the relative indent intact, - while also respecting the replacement indent. - Finally, the whole replacement should replace the matched node - in a manner that maintains the indentation of the source. - - We need to consider multiple indentations. - Key concepts here: - * meta-var node: in this case `$B` in pattern/replacement, or `1+2+3` in source. - * matched node: in this case `a($B)` in pattern, `a(1 + 2 + 3)` in source - * meta-var source indentation: `$B` matches `1+2+3`, the first line's indentation in source code is 4. - * meta-var replacement indentation: in this case 2 - * matched node source indentation: in this case 2 - - ## Extract Meta-var with de-indent - 1. 
Initial meta-var node B text: - The meta-var source indentation for `$B` is 4. - However, meta-var node does not have the first line indentation. - ```ignore - 1 - + 2 - + 3 - ``` - 2. Deindent meta-var node B, except first line: - De-indenting all lines following the first line by 4 spaces gives us this relative code layout. - - ```ignore - 1 - + 2 - + 3 - ``` - - ## Insert meta-var into replacement with re-indent - - 3. Re-indent by meta-var replacement indentation. - meta-var node $B occurs in replace with first line indentation of 2. - We need to re-indent the meta-var code before replacement, except the first line - ```ignore - 1 - + 2 - + 3 - ``` - - 4. Insert meta-var code in to replacement - ```ignore - c( - 1 - + 2 - + 3 - ) - ``` - - ## Insert replacement into source with re-indent - - 5. Re-indent the replaced template code except first line - The whole matched node first line indentation is 2. - We need to reindent the replacement code by 2, except the first line. - ```ignore - c( - 1 - + 2 - + 3 - ) - ``` - - 6. Inserted replacement code to original tree - - ```ignore - if (true) { - c( - 1 - + 2 - + 3 - ) - } - ``` - - The steps 3,4 and steps 5,6 are similar. We can define a `replace_with_indent` to it. - Following the same path, we can define a `extract_with_deindent` for steps 1,2 -*/ +//! # Indentation-Preserving Code Replacement +//! +//! Handles automatic indentation adjustment during code replacement to maintain +//! proper formatting when inserting multi-line code snippets. +//! +//! ## The Challenge +//! +//! When replacing AST nodes with new code that contains meta-variables, we need to: +//! 1. Preserve the relative indentation within captured variables +//! 2. Adjust indentation to match the replacement context +//! 3. Maintain overall source code formatting +//! +//! ## Algorithm Overview +//! +//! The indentation algorithm works in three phases: +//! +//! ### 1. Extract with De-indent +//! Extract captured meta-variables and normalize their indentation by removing +//! the original context indentation (except from the first line). +//! +//! ### 2. Insert with Re-indent +//! Insert the normalized meta-variable content into the replacement template, +//! applying the replacement context's indentation. +//! +//! ### 3. Final Re-indent +//! Adjust the entire replacement to match the original matched node's indentation +//! in the source code. +//! +//! ## Example Walkthrough +//! +//! **Original Code:** +//! ```ignore +//! if (true) { +//! a( +//! 1 +//! + 2 +//! + 3 +//! ) +//! } +//! ``` +//! +//! **Pattern:** `a($B)` +//! **Replacement:** `c(\n $B\n)` +//! +//! **Step 1 - Extract `$B` (indented at 4 spaces):** +//! ```ignore +//! 1 +//! + 2 // Relative indent preserved +//! + 3 +//! ``` +//! +//! **Step 2 - Insert into replacement (2 space context):** +//! ```ignore +//! c( +//! 1 +//! + 2 // 2 + 2 = 4 spaces total +//! + 3 +//! ) +//! ``` +//! +//! **Step 3 - Final indent (match original 2 space context):** +//! ```ignore +//! if (true) { +//! c( +//! 1 +//! + 2 +//! + 3 +//! ) +//! } +//! ``` +//! +//! ## Key Types +//! +//! - [`DeindentedExtract`] - Represents extracted content with indentation info +//! - [`extract_with_deindent`] - Extracts and normalizes meta-variable content +//! - [`indent_lines`] - Applies indentation to multi-line content +//! +//! ## Limitations +//! +//! - Only supports space-based indentation (tabs not fully supported) +//! - Assumes well-formed input indentation +//! - Performance overhead for large code blocks +//! 
- Complex algorithm with edge cases use crate::source::Content; use std::borrow::Cow; use std::cmp::Ordering; @@ -134,15 +105,51 @@ fn get_space() -> C::Underlying { const MAX_LOOK_AHEAD: usize = 512; -/// Represents how we de-indent matched meta var. +/// Extracted content with indentation information for later re-indentation. +/// +/// Represents the result of extracting a meta-variable's content from source code, +/// along with the indentation context needed for proper re-insertion. pub enum DeindentedExtract<'a, C: Content> { - /// If meta-var is only one line, no need to de-indent/re-indent + /// Single-line content that doesn't require indentation adjustment. + /// + /// Contains just the raw content bytes since there are no line breaks + /// to worry about for indentation purposes. SingleLine(&'a [C::Underlying]), - /// meta-var's has multiple lines, may need re-indent + + /// Multi-line content with original indentation level recorded. + /// + /// Contains the content bytes and the number of spaces that were used + /// for indentation in the original context. The first line's indentation + /// is not included in the content. + /// + /// # Fields + /// - Content bytes with relative indentation preserved + /// - Original indentation level (number of spaces) MultiLine(&'a [C::Underlying], usize), } -/// Returns [`DeindentedExtract`] for later de-indent/re-indent. +/// Extract content from source code and prepare it for indentation-aware replacement. +/// +/// Analyzes the content at the given range and determines whether it needs +/// indentation processing. For multi-line content, calculates the original +/// indentation level for later re-indentation. +/// +/// # Parameters +/// +/// - `content` - Source content to extract from +/// - `range` - Byte range of the content to extract +/// +/// # Returns +/// +/// [`DeindentedExtract`] containing the content and indentation information +/// +/// # Example +/// +/// ```rust,ignore +/// let source = " if (true) {\n console.log('test');\n }"; +/// let extract = extract_with_deindent(&source, 2..source.len()); +/// // Returns MultiLine with 2-space indentation context +/// ``` pub fn extract_with_deindent( content: &C, range: Range, @@ -210,7 +217,7 @@ where let mut ret = vec![]; let space = get_space::(); let leading: Vec<_> = std::iter::repeat_n(space, indent).collect(); - // first line never got indent + // first line wasn't indented, so we don't add leading spaces if let Some(line) = lines.next() { ret.extend(line.iter().cloned()); } @@ -251,7 +258,7 @@ pub fn get_indent_at_offset(src: &[C::Underlying]) -> usize { } // NOTE: we assume input is well indented. -// following line's should have fewer indentation than initial line +// following lines should have fewer indentations than initial line fn remove_indent(indent: usize, src: &[C::Underlying]) -> Vec { let indentation: Vec<_> = std::iter::repeat_n(get_space::(), indent) .collect(); diff --git a/crates/ast-engine/src/replacer/structural.rs b/crates/ast-engine/src/replacer/structural.rs index 0907eb5..4abb604 100644 --- a/crates/ast-engine/src/replacer/structural.rs +++ b/crates/ast-engine/src/replacer/structural.rs @@ -4,17 +4,104 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT +//! # Structural Code Replacement Engine +//! +//! Generates replacement code by traversing replacement templates and substituting +//! meta-variables with captured content from pattern matches. +//! +//! ## Core Concept +//! +//! 
Structural replacement uses AST-based templates rather than simple string substitution. +//! The replacement template is parsed into an AST, then meta-variables in that AST +//! are replaced with content captured during pattern matching. +//! +//! ## Process Overview +//! +//! 1. **Parse replacement template** - Convert replacement string to AST +//! 2. **Traverse AST nodes** - Visit each node in the replacement template +//! 3. **Identify meta-variables** - Find nodes that match meta-variable patterns +//! 4. **Substitute content** - Replace meta-variables with captured content +//! 5. **Generate output** - Combine unchanged and replaced content into final result +//! +//! ## Example +//! +//! **Pattern:** `function $NAME($$$PARAMS) { $$$BODY }` +//! **Replacement template:** `async function $NAME($$$PARAMS) { $$$BODY }` +//! **Captured variables:** +//! - `$NAME` → `"calculateSum"` +//! - `$$$PARAMS` → `"a, b"` +//! - `$$$BODY` → `"return a + b;"` +//! +//! **Result:** `async function calculateSum(a, b) { return a + b; }` +//! +//! ## Key Functions +//! +//! - [`gen_replacement`] - Main entry point for generating replacement content +//! - [`collect_edits`] - Traverse replacement template and collect substitution edits +//! - [`merge_edits_to_vec`] - Combine original content with edits to produce final result +//! +//! ## Algorithm Details +//! +//! Uses a post-order depth-first traversal to visit all nodes in the replacement +//! template. When a meta-variable is found, it's replaced with the corresponding +//! captured content. The traversal stops at nodes that match meta-variables to +//! avoid processing their children unnecessarily. +//! +//! ## Advantages +//! +//! - **Syntax-aware** - Respects language syntax and structure +//! - **Precise** - Only replaces intended meta-variables, not similar text +//! - **Efficient** - Single-pass traversal with minimal memory allocation +//! - **Language-agnostic** - Works with any language that has AST support + use super::{Edit, Underlying}; use crate::language::Language; use crate::meta_var::MetaVarEnv; use crate::source::{Content, SgNode}; use crate::{Doc, Node, NodeMatch, Root}; +/// Generate replacement content by substituting meta-variables in a template AST. +/// +/// Takes a replacement template (parsed as an AST) and substitutes any meta-variables +/// found in it with content captured during pattern matching. +/// +/// # Parameters +/// +/// - `root` - The replacement template parsed as an AST +/// - `nm` - Node match containing captured meta-variables +/// +/// # Returns +/// +/// Raw bytes representing the final replacement content +/// +/// # Example +/// +/// ```rust,ignore +/// // Template: "async function $NAME() { $$$BODY }" +/// // Variables: $NAME="test", $$$BODY="return 42;" +/// // Result: "async function test() { return 42; }" +/// let replacement = gen_replacement(&template_root, &node_match); +/// ``` pub fn gen_replacement(root: &Root, nm: &NodeMatch) -> Underlying { let edits = collect_edits(root, nm.get_env(), nm.lang()); merge_edits_to_vec(edits, root) } +/// Traverse the replacement template AST and collect edits for meta-variable substitution. +/// +/// Performs a post-order depth-first traversal of the replacement template, +/// identifying nodes that represent meta-variables and creating edit operations +/// to replace them with captured content. 
+/// +/// # Parameters +/// +/// - `root` - Root of the replacement template AST +/// - `env` - Meta-variable environment with captured content +/// - `lang` - Language implementation for meta-variable extraction +/// +/// # Returns +/// +/// Vector of edit operations to apply for meta-variable substitution fn collect_edits(root: &Root, env: &MetaVarEnv, lang: &D::Lang) -> Vec> { let mut node = root.root(); let root_id = node.node_id(); diff --git a/crates/ast-engine/src/replacer/template.rs b/crates/ast-engine/src/replacer/template.rs index d7718fc..248a495 100644 --- a/crates/ast-engine/src/replacer/template.rs +++ b/crates/ast-engine/src/replacer/template.rs @@ -188,7 +188,7 @@ if (true) { $B )"; let mut src = Tsx.ast_grep(src); - let pattern = Pattern::new(pattern, Tsx); + let pattern = Pattern::new(pattern, &Tsx); let success = src.replace(pattern, template).expect("should replace"); assert!(success); let expect = r"if (true) { diff --git a/crates/ast-engine/src/source.rs b/crates/ast-engine/src/source.rs index cfdd742..7081551 100644 --- a/crates/ast-engine/src/source.rs +++ b/crates/ast-engine/src/source.rs @@ -4,33 +4,94 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT -//! This module defines the `Doc` and `Content` traits to abstract away source code encoding issues. +//! # Document and Content Abstraction //! -//! ast-grep supports three kinds of encoding: utf-8 for CLI, utf-16 for nodeJS napi and `Vec` for wasm. -//! Different encoding will produce different tree-sitter Node's range and position. +//! Core traits for abstracting source code documents and their encoding across different platforms. //! -//! The `Content` trait is defined to abstract different encoding. -//! It is used as associated type bound `Source` in the `Doc` trait. -//! Its associated type `Underlying` represents the underlying type of the content, e.g. `Vec`, `Vec`. +//! ## Multi-Platform Support //! -//! `Doc` is a trait that defines a document that can be parsed by Tree-sitter. -//! It has a `Source` associated type bounded by `Content` that represents the source code of the document, -//! and a `Lang` associated type that represents the language of the document. +//! thread-ast-engine supports multiple text encodings to work across different environments: +//! - **UTF-8** - Standard for CLI applications and most Rust code +//! - **UTF-16** - Required for Node.js NAPI bindings +//! - **`Vec`** - Used in WASM environments for JavaScript interop +//! +//! Different encodings affect how byte positions and ranges are calculated in tree-sitter nodes, +//! so this abstraction ensures consistent behavior across platforms. +//! +//! ## Key Concepts +//! +//! ### Documents ([`Doc`]) +//! Represents a complete source code document with its language and parsing information. +//! Provides methods to access the source text, perform edits, and get AST nodes. +//! +//! ### Content ([`Content`]) +//! Abstracts the underlying text representation (bytes, UTF-16 code units, etc.). +//! Handles encoding/decoding operations needed for text manipulation and replacement. +//! +//! ### Node Interface ([`SgNode`]) +//! Generic interface for AST nodes that works across different parser backends. +//! Provides navigation, introspection, and traversal methods. +//! +//! ## Example Usage +//! +//! ```rust,ignore +//! // Documents abstract over different source encodings +//! let doc = StrDoc::new("const x = 42;", Language::JavaScript); +//! let root = doc.root_node(); +//! +//! 
// Content trait handles encoding differences transparently +//! let source_bytes = doc.get_source().get_range(0..5); // "const" +//! ``` use crate::{Position, language::Language, node::KindId}; use std::borrow::Cow; use std::ops::Range; +/// Represents an edit operation on source code. +/// +/// Edits specify where in the source to make changes and what new content +/// to insert. Used for incremental parsing and code transformation. +/// +/// # Type Parameters +/// +/// - `S: Content` - The content type (determines encoding) +/// +/// # Example +/// +/// ```rust,ignore +/// let edit = Edit { +/// position: 5, // Start at byte position 5 +/// deleted_length: 3, // Delete 3 bytes +/// inserted_text: "new".as_bytes().to_vec(), // Insert "new" +/// }; +/// ``` // https://github.com/tree-sitter/tree-sitter/blob/e4e5ffe517ca2c668689b24cb17c51b8c6db0790/cli/src/parse.rs #[derive(Debug, Clone)] pub struct Edit { + /// Byte position where the edit starts pub position: usize, + /// Number of bytes to delete from the original content pub deleted_length: usize, + /// New content to insert (in the content's underlying representation) pub inserted_text: Vec, } -/// NOTE: Some method names are the same as tree-sitter's methods. -/// Fully Qualified Syntax may needed +/// Generic interface for AST nodes across different parser backends. +/// +/// `SgNode` (SourceGraph Node) provides a consistent API for working with +/// AST nodes regardless of the underlying parser implementation. Supports +/// navigation, introspection, and traversal operations. +/// +/// # Lifetime +/// +/// The lifetime `'r` ties the node to its root document, ensuring memory safety. +/// +/// # Note +/// +/// Some method names match tree-sitter's API. Use fully qualified syntax +/// if there are naming conflicts with tree-sitter imports. +/// +/// See: pub trait SgNode<'r>: Clone { fn parent(&self) -> Option; fn children(&self) -> impl ExactSizeIterator; @@ -131,27 +192,95 @@ pub trait SgNode<'r>: Clone { fn child_by_field_id(&self, field_id: u16) -> Option; } +/// Represents a source code document with its language and parsed AST. +/// +/// `Doc` provides the core interface for working with parsed source code documents. +/// It combines the source text, language information, and AST representation in +/// a single abstraction that supports editing and node operations. +/// +/// # Type Parameters +/// +/// - `Source: Content` - The text representation (String, UTF-16, etc.) +/// - `Lang: Language` - The programming language implementation +/// - `Node: SgNode` - The AST node implementation +/// +/// # Example +/// +/// ```rust,ignore +/// // Documents provide access to source, language, and AST +/// let doc = StrDoc::new("const x = 42;", JavaScript); +/// +/// // Access different aspects of the document +/// let source = doc.get_source(); // Get source text +/// let lang = doc.get_lang(); // Get language info +/// let root = doc.root_node(); // Get AST root +/// +/// // Extract text from specific nodes +/// let node_text = doc.get_node_text(&some_node); +/// ``` pub trait Doc: Clone + 'static { + /// The source code representation (String, UTF-16, etc.) 
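+    /// The choice of source type determines the [`Content::Underlying`] unit
+    /// (for example, `u8` for UTF-8 CLI use or `u16` for the Node.js NAPI bindings).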
type Source: Content; + /// The programming language implementation type Lang: Language; + /// The AST node type for this document type Node<'r>: SgNode<'r>; + + /// Get the language implementation for this document fn get_lang(&self) -> &Self::Lang; + + /// Get the source code content fn get_source(&self) -> &Self::Source; + + /// Apply an edit to the document, updating both source and AST fn do_edit(&mut self, edit: &Edit) -> Result<(), String>; + + /// Get the root AST node fn root_node(&self) -> Self::Node<'_>; + + /// Extract the text content of a specific AST node fn get_node_text<'a>(&'a self, node: &Self::Node<'a>) -> Cow<'a, str>; } +/// Abstracts source code text representation across different encodings. +/// +/// `Content` allows the same AST operations to work with different text encodings +/// (UTF-8, UTF-16, etc.) by providing encoding/decoding operations and position +/// calculations. Essential for cross-platform support. +/// +/// # Type Parameters +/// +/// - `Underlying` - The basic unit type (u8 for UTF-8, u16 for UTF-16, etc.) +/// +/// # Example +/// +/// ```rust,ignore +/// // Content trait abstracts encoding differences +/// let content = "Hello, world!"; +/// let bytes = content.get_range(0..5); // [72, 101, 108, 108, 111] for UTF-8 +/// let column = content.get_char_column(0, 7); // Character position +/// ``` pub trait Content: Sized { + /// The underlying data type (u8, u16, char, etc.) type Underlying: Clone + PartialEq; + + /// Get a slice of the underlying data for the given byte range fn get_range(&self, range: Range) -> &[Self::Underlying]; - /// Used for string replacement. We need this for - /// indentation and deindentation. + + /// Convert a string to this content's underlying representation. + /// + /// Used during text replacement to ensure proper encoding. fn decode_str(src: &str) -> Cow<'_, [Self::Underlying]>; - /// Used for string replacement. We need this for - /// transformation. + + /// Convert underlying data back to a string. + /// + /// Used to extract text content after transformations. fn encode_bytes(bytes: &[Self::Underlying]) -> Cow<'_, str>; - /// Get the character column at the given position + + /// Calculate the character column position at a given byte offset. + /// + /// Handles Unicode properly by computing actual character positions + /// rather than byte positions. fn get_char_column(&self, column: usize, offset: usize) -> usize; } diff --git a/crates/ast-engine/src/tree_sitter/mod.rs b/crates/ast-engine/src/tree_sitter/mod.rs index 2652bdb..7e426dd 100644 --- a/crates/ast-engine/src/tree_sitter/mod.rs +++ b/crates/ast-engine/src/tree_sitter/mod.rs @@ -4,32 +4,113 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT +//! # Tree-sitter Integration and AST Backend +//! +//! Core integration layer between thread-ast-engine and the tree-sitter parsing library. +//! Provides the foundational types and functionality for parsing source code into ASTs, +//! editing trees incrementally, and bridging tree-sitter concepts with thread-ast-engine APIs. +//! +//! ## Key Components +//! +//! - [`StrDoc`] - Document type that combines source code with parsed tree-sitter trees +//! - [`LanguageExt`] - Extension trait for languages that work with tree-sitter +//! - [`TSParseError`] - Error types for tree-sitter parsing failures +//! - Tree editing and incremental parsing support +//! - Language injection support for multi-language documents +//! +//! ## Core Concepts +//! +//! ### Documents and Parsing +//! +//! 
[`StrDoc`] represents a parsed document containing both source code and its tree-sitter AST. +//! It handles incremental parsing when the source is modified, automatically updating the tree +//! structure while preserving unchanged portions for performance. +//! +//! ### Language Extensions +//! +//! [`LanguageExt`] extends the base [`Language`] trait with tree-sitter specific functionality: +//! - Getting tree-sitter language objects for parsing +//! - Creating AST-grep instances +//! - Handling language injections (like JavaScript in HTML) +//! +//! ### Tree Editing +//! +//! Supports incremental tree editing through tree-sitter's edit API, allowing efficient +//! updates when source code changes without full re-parsing. +//! +//! ## Example Usage +//! +//! ```rust,no_run +//! # use thread_ast_engine::tree_sitter::{StrDoc, LanguageExt}; +//! # use thread_ast_engine::Language; +//! # struct Tsx; +//! # impl Language for Tsx { +//! # fn kind_to_id(&self, _: &str) -> u16 { 0 } +//! # fn field_to_id(&self, _: &str) -> Option { None } +//! # fn build_pattern(&self, _: &thread_ast_engine::PatternBuilder) -> Result { todo!() } +//! # } +//! # impl LanguageExt for Tsx { +//! # fn get_ts_language(&self) -> thread_ast_engine::tree_sitter::TSLanguage { todo!() } +//! # } +//! +//! // Create a document from source code +//! let doc = StrDoc::new("let x = 42;", Tsx); +//! +//! // Access the parsed tree +//! let root = doc.root_node(); +//! println!("Root kind: {}", root.kind()); +//! +//! // Create an AST-grep instance for pattern matching +//! let ast_grep = Tsx.ast_grep("function foo() { return 42; }"); +//! let root_node = ast_grep.root(); +//! ``` + pub mod traversal; use crate::node::Root; + +use crate::AstGrep; +#[cfg(feature = "matching")] +use crate::Matcher; +#[cfg(feature = "matching")] use crate::replacer::Replacer; use crate::source::{Content, Doc, Edit, SgNode}; -use crate::{AstGrep, Matcher}; use crate::{Language, Position, node::KindId}; use std::borrow::Cow; use std::num::NonZero; use thiserror::Error; +#[cfg(feature = "matching")] use thread_utils::RapidMap; pub use traversal::{TsPre, Visitor}; pub use tree_sitter::Language as TSLanguage; use tree_sitter::{InputEdit, LanguageError, Node, Parser, Point, Tree}; pub use tree_sitter::{Point as TSPoint, Range as TSRange}; -/// Represents tree-sitter related error +/// Errors that can occur during tree-sitter parsing operations. +/// +/// Tree-sitter parsing can fail for several reasons, from language compatibility +/// issues to timeout problems. These errors provide information about what +/// went wrong during the parsing process. #[derive(Debug, Error)] pub enum TSParseError { + /// The language grammar is incompatible with the parser. + /// + /// Occurs when trying to assign a language that the parser can't handle, + /// typically due to version mismatches between the tree-sitter library + /// and the language grammar. #[error("incompatible `Language` is assigned to a `Parser`.")] Language(#[from] LanguageError), - /// A general error when tree sitter fails to parse in time. It can be caused by - /// the following reasons but tree-sitter does not provide error detail. - /// * The timeout set with [`Parser::set_timeout_micros`] expired - /// * The cancellation flag set with [`Parser::set_cancellation_flag`] was flipped - /// * The parser has not yet had a language assigned with [`Parser::set_language`] + + /// Tree-sitter failed to parse the input within the configured constraints. 
+ /// + /// Can be caused by several conditions: + /// * Parsing timeout exceeded (see [`Parser::set_timeout_micros`]) + /// * Cancellation flag was triggered (see [`Parser::set_cancellation_flag`]) + /// * No language was assigned to the parser (see [`Parser::set_language`]) + /// * The input was too complex or malformed for the parser to handle + /// + /// Tree-sitter doesn't provide detailed error information, so this covers + /// all general parsing failures. #[error("general error when tree-sitter fails to parse.")] TreeUnavailable, } @@ -48,10 +129,41 @@ fn parse_lang( } } +/// Document type that combines source code with its parsed tree-sitter AST. +/// +/// `StrDoc` represents a complete parsed document, holding both the original +/// source code and the tree-sitter AST. It supports incremental parsing, +/// meaning when edits are made to the source, only the affected parts of +/// the tree are re-parsed for better performance. +/// +/// # Type Parameters +/// +/// - `L: LanguageExt` - The language implementation that provides tree-sitter integration +/// +/// # Example +/// +/// ```rust,no_run +/// # use thread_ast_engine::tree_sitter::StrDoc; +/// # struct JavaScript; +/// # impl thread_ast_engine::Language for JavaScript { +/// # fn kind_to_id(&self, _: &str) -> u16 { 0 } +/// # fn field_to_id(&self, _: &str) -> Option { None } +/// # fn build_pattern(&self, _: &thread_ast_engine::PatternBuilder) -> Result { todo!() } +/// # } +/// # impl thread_ast_engine::tree_sitter::LanguageExt for JavaScript { +/// # fn get_ts_language(&self) -> thread_ast_engine::tree_sitter::TSLanguage { todo!() } +/// # } +/// let doc = StrDoc::new("const x = 42;", JavaScript); +/// let root = doc.root_node(); +/// println!("AST root: {}", root.kind()); +/// ``` #[derive(Clone, Debug)] pub struct StrDoc { + /// The source code text pub src: String, + /// Language implementation for parsing and node operations pub lang: L, + /// The parsed tree-sitter AST pub tree: Tree, } @@ -269,25 +381,103 @@ pub fn perform_edit(tree: &mut Tree, input: &mut S, edit: &Edit TSLanguage { +/// tree_sitter_javascript::LANGUAGE.into() +/// } +/// } +/// ``` pub trait LanguageExt: Language { - /// Create an [`AstGrep`] instance for the language + /// Create an [`AstGrep`] instance for parsing and pattern matching. + /// + /// Convenience method that parses the source code and returns an [`AstGrep`] + /// instance ready for pattern matching and tree manipulation. + /// + /// # Parameters + /// + /// - `source` - Source code to parse + /// + /// # Example + /// + /// ```rust,ignore + /// let ast = JavaScript.ast_grep("const x = 42;"); + /// let root = ast.root(); + /// ``` fn ast_grep>(&self, source: S) -> AstGrep> { AstGrep::new(source, self.clone()) } - /// tree sitter language to parse the source + /// Get the tree-sitter language object for parsing. + /// + /// Returns the tree-sitter language grammar that this language uses + /// for parsing source code. This is the core integration point between + /// thread-ast-engine and tree-sitter. + /// + /// # Returns + /// + /// The tree-sitter [`TSLanguage`] object for this language fn get_ts_language(&self) -> TSLanguage; + /// List of languages that can be injected into this language. + /// + /// For languages that support embedding other languages (like HTML with CSS/JavaScript), + /// returns the names of languages that can be injected. Returns `None` if this + /// language doesn't support injections. 
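+    ///
+    /// # Example
+    ///
+    /// A hedged sketch using the bundled `Html` language from `thread-language`,
+    /// which injects CSS and JavaScript among other languages:
+    ///
+    /// ```rust,ignore
+    /// let injectable = Html.injectable_languages().unwrap();
+    /// assert!(injectable.contains(&"js"));
+    /// assert!(injectable.contains(&"css"));
+    /// ```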
+ /// + /// # Returns + /// + /// Array of injectable language names, or `None` if no injections supported fn injectable_languages(&self) -> Option<&'static [&'static str]> { None } - /// get injected language regions in the root document. e.g. get `JavaScripts` in HTML - /// it will return a list of tuples of (language, regions). - /// The first item is the embedded region language, e.g. javascript - /// The second item is a list of regions in `tree_sitter`. - /// [also see](https://tree-sitter.github.io/tree-sitter/using-parsers#multi-language-documents) + /// Extract language injection regions from a parsed document. + /// + /// Analyzes the AST to find regions where other languages are embedded. + /// For example, finds JavaScript code blocks within HTML ` + +"#; + +let tree = html.ast_grep(source); +let injections = html.extract_injections(tree.root()); +// injections contains JavaScript and CSS code ranges +``` + +## Architecture + +### Core Modules + +- [`lib.rs`](src/lib.rs) - Main module with language definitions and [`SupportLang`](src/lib.rs) enum +- [`parsers.rs`](src/parsers.rs) - Tree-sitter parser initialization and caching +- [`html.rs`](src/html.rs) - Special HTML implementation with language injection support + +### Language Implementation Patterns + +The crate uses two macros to implement languages: + +1. **`impl_lang!`** - For standard languages that accept `$` in identifiers +2. **`impl_lang_expando!`** - For languages requiring custom expando characters + +Both macros generate the same [`Language`](src/lib.rs) and [`LanguageExt`](src/lib.rs) trait implementations but with different pattern preprocessing behavior. + +## Performance + +- **Cached Parsers** - Tree-sitter languages are initialized once and cached using [`OnceLock`](src/parsers.rs) +- **Fast Path Optimizations** - Common file extensions and language names use fast-path matching +- **Zero-Cost Abstractions** - Language traits compile to direct function calls + +## Examples + +### File Type Detection + +```rust +use thread_language::SupportLang; + +// Get file types for a language +let types = SupportLang::Rust.file_types(); +// Use with ignore crate for file filtering +``` + +### Pattern Building + +```rust +use thread_language::JavaScript; +use thread_ast_engine::{Language, PatternBuilder}; + +let js = JavaScript; +let builder = PatternBuilder::new("console.log($MSG)"); +let pattern = js.build_pattern(&builder).unwrap(); +``` + +## Contributing + +When adding a new language: + +1. Add the tree-sitter dependency to `Cargo.toml` +2. Add the parser function to [`parsers.rs`](src/parsers.rs) +3. Choose the appropriate macro (`impl_lang!` or `impl_lang_expando!`) in [`lib.rs`](src/lib.rs) +4. Add the language to [`SupportLang`](src/lib.rs) enum and related functions +5. Add tests in a separate module file + +## License + +Licensed under AGPL-3.0-or-later AND MIT. See license files for details. 
diff --git a/crates/language/src/html.rs b/crates/language/src/html.rs index ad04c4f..69d22cb 100644 --- a/crates/language/src/html.rs +++ b/crates/language/src/html.rs @@ -6,13 +6,51 @@ use super::pre_process_pattern; use thread_ast_engine::Language; +#[cfg(feature = "matching")] use thread_ast_engine::matcher::{Pattern, PatternBuilder, PatternError}; -use thread_ast_engine::tree_sitter::{LanguageExt, StrDoc, TSLanguage, TSRange}; -use thread_ast_engine::{Doc, Node, matcher::KindMatcher}; +#[cfg(feature = "matching")] +use thread_ast_engine::tree_sitter::{StrDoc, TSRange}; +use thread_ast_engine::tree_sitter::{LanguageExt, TSLanguage}; +#[cfg(feature = "matching")] +use thread_ast_engine::matcher::KindMatcher; +#[cfg(feature = "matching")] +use thread_ast_engine::{Doc, Node}; +#[cfg(feature = "html-embedded")] use thread_utils::RapidMap; -// tree-sitter-html uses locale dependent iswalnum for tagName -// https://github.com/tree-sitter/tree-sitter-html/blob/b5d9758e22b4d3d25704b72526670759a9e4d195/src/scanner.c#L194 +/// HTML language implementation with language injection capabilities. +/// +/// Uses `z` as the expando character for metavariables since HTML attributes +/// and tag names have specific character restrictions. +/// +/// ## Language Injection +/// +/// Automatically detects and extracts embedded languages: +/// - **JavaScript** in ` +/// +/// +/// "#; +/// +/// let tree = html.ast_grep(source); +/// let injections = html.extract_injections(tree.root()); +/// // injections contains JavaScript, CSS, and TypeScript ranges +/// ``` +/// +/// ## Note +/// tree-sitter-html uses locale-dependent `iswalnum` for tag names. +/// See: #[derive(Clone, Copy, Debug)] pub struct Html; impl Language for Html { @@ -30,6 +68,7 @@ impl Language for Html { .field_id_for_name(field) .map(|f| f.get()) } + #[cfg(feature = "matching")] fn build_pattern(&self, builder: &PatternBuilder) -> Result { builder.build(|src| StrDoc::try_new(src, *self)) } @@ -41,6 +80,7 @@ impl LanguageExt for Html { fn injectable_languages(&self) -> Option<&'static [&'static str]> { Some(&["css", "js", "ts", "tsx", "scss", "less", "stylus", "coffee"]) } + #[cfg(feature = "html-embedded")] fn extract_injections( &self, root: Node>, @@ -118,6 +158,7 @@ impl LanguageExt for Html { } } +#[cfg(feature = "html-embedded")] fn find_lang(node: &Node) -> Option { let html = node.lang(); let attr_matcher = KindMatcher::new("attribute", html); @@ -132,7 +173,7 @@ fn find_lang(node: &Node) -> Option { Some(val.text().to_string()) }) } - +#[cfg(feature = "matching")] fn node_to_range(node: &Node) -> TSRange { let r = node.range(); let start = node.start_pos(); diff --git a/crates/language/src/lib.rs b/crates/language/src/lib.rs index 1dc93b8..c8dceaf 100644 --- a/crates/language/src/lib.rs +++ b/crates/language/src/lib.rs @@ -4,38 +4,87 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT -//! This module defines the supported programming languages for ast-grep. +//! Language definitions and tree-sitter parsers for Thread AST analysis. //! -//! It provides a set of customized languages with expando_char / pre_process_pattern, -//! and a set of stub languages without preprocessing. -//! A rule of thumb: if your language does not accept identifiers like `$VAR`. -//! You need use `impl_lang_expando!` macro and a standalone file for testing. -//! Otherwise, you can define it as a stub language using `impl_lang!`. -//! To see the full list of languages, visit `` +//! 
Provides unified language support through consistent [`Language`] and [`LanguageExt`] traits +//! across 24+ programming languages. Each language can be feature-gated individually or included +//! in groups. +//! +//! ## Language Categories +//! +//! ### Standard Languages +//! Languages that accept `$` as a valid identifier character and use default pattern processing: +//! - [`Bash`], [`Java`], [`JavaScript`], [`Json`], [`Lua`], [`Scala`], [`TypeScript`], [`Tsx`], [`Yaml`] +//! +//! ### Custom Pattern Languages +//! Languages requiring special metavariable handling with custom expando characters: +//! - [`C`] (`µ`), [`Cpp`] (`µ`), [`CSharp`] (`µ`), [`Css`] (`_`), [`Elixir`] (`µ`) +//! - [`Go`] (`µ`), [`Haskell`] (`µ`), [`Html`] (`z`), [`Kotlin`] (`µ`), [`Php`] (`µ`) +//! - [`Python`] (`µ`), [`Ruby`] (`µ`), [`Rust`] (`µ`), [`Swift`] (`µ`) +//! +//! ## Usage +//! +//! ```rust +//! use thread_language::{SupportLang, Rust}; +//! use thread_ast_engine::{Language, LanguageExt}; +//! +//! // Runtime language selection +//! let lang = SupportLang::from_path("main.rs").unwrap(); +//! let tree = lang.ast_grep("fn main() {}"); +//! +//! // Compile-time language selection +//! let rust = Rust; +//! let tree = rust.ast_grep("fn main() {}"); +//! ``` +//! +//! ## Implementation Details +//! +//! Languages are implemented using two macros: +//! - [`impl_lang!`] - Standard languages accepting `$` in identifiers +//! - [`impl_lang_expando!`] - Languages requiring custom expando characters for metavariables pub mod parsers; +#[cfg(feature = "bash")] mod bash; +#[cfg(feature = "cpp")] mod cpp; +#[cfg(feature = "csharp")] mod csharp; +#[cfg(feature = "css")] mod css; +#[cfg(feature = "elixir")] mod elixir; +#[cfg(feature = "go")] mod go; +#[cfg(feature = "haskell")] mod haskell; +#[cfg(feature = "html")] mod html; +#[cfg(feature = "json")] mod json; +#[cfg(feature = "kotlin")] mod kotlin; +#[cfg(feature = "lua")] mod lua; +#[cfg(feature = "php")] mod php; #[cfg(feature = "profiling")] pub mod profiling; +#[cfg(feature = "python")] mod python; +#[cfg(feature = "ruby")] mod ruby; +#[cfg(feature = "rust")] mod rust; +#[cfg(feature = "scala")] mod scala; +#[cfg(feature = "swift")] mod swift; +#[cfg(feature = "yaml")] mod yaml; - +#[cfg(feature = "html")] pub use html::Html; +#[cfg(feature = "matching")] use thread_ast_engine::{Pattern, PatternBuilder, PatternError}; use ignore::types::{Types, TypesBuilder}; @@ -46,15 +95,33 @@ use std::fmt; use std::fmt::{Display, Formatter}; use std::path::Path; use std::str::FromStr; +#[cfg(feature = "matching")] use thread_ast_engine::Node; use thread_ast_engine::meta_var::MetaVariable; -use thread_ast_engine::tree_sitter::{StrDoc, TSLanguage, TSRange}; +use thread_ast_engine::tree_sitter::TSLanguage; +#[cfg(feature = "matching")] +use thread_ast_engine::tree_sitter::{StrDoc, TSRange}; +#[cfg(feature = "matching")] use thread_utils::RapidMap; pub use thread_ast_engine::language::Language; pub use thread_ast_engine::tree_sitter::LanguageExt; -/// this macro implements bare-bone methods for a language +/// Implements standard [`Language`] and [`LanguageExt`] traits for languages that accept `$` in identifiers. +/// +/// Used for languages like JavaScript, Python, and Rust where `$` can appear in variable names +/// and doesn't require special preprocessing for metavariables. 
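+///
+/// # Example
+///
+/// A representative invocation (the real ones appear further down in this module):
+///
+/// ```rust,ignore
+/// impl_lang!(JavaScript, language_javascript);
+/// ```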
+/// +/// # Parameters +/// - `$lang` - The language struct name (e.g., `JavaScript`) +/// - `$func` - The parser function name from [`parsers`] module (e.g., `language_javascript`) +/// +/// # Generated Implementation +/// Creates a zero-sized struct with [`Language`] and [`LanguageExt`] implementations that: +/// - Map node kinds and field names to tree-sitter IDs +/// - Build patterns using the language's parser +/// - Use default metavariable processing (no expando character substitution) +#[allow(unused_macros)] macro_rules! impl_lang { ($lang: ident, $func: ident) => { #[derive(Clone, Copy, Debug)] @@ -69,6 +136,7 @@ macro_rules! impl_lang { .field_id_for_name(field) .map(|f| f.get()) } + #[cfg(feature = "matching")] fn build_pattern(&self, builder: &PatternBuilder) -> Result { builder.build(|src| StrDoc::try_new(src, self.clone())) } @@ -81,6 +149,31 @@ macro_rules! impl_lang { }; } +/// Preprocesses pattern strings by replacing `$` with the language's expando character. +/// +/// Languages that don't accept `$` in identifiers need metavariables like `$VAR` converted +/// to use a different character. This function efficiently replaces `$` symbols that precede +/// uppercase letters, underscores, or appear in triple sequences (`$$$`). +/// +/// # Parameters +/// - `expando` - The character to replace `$` with (e.g., `µ` for most languages, `_` for CSS) +/// - `query` - The pattern string containing `$` metavariables +/// +/// # Returns +/// - `Cow::Borrowed` if no replacement is needed (fast path) +/// - `Cow::Owned` if replacement occurred +/// +/// # Examples +/// ```rust +/// # use thread_language::pre_process_pattern; +/// // Python doesn't accept $ in identifiers, so use µ +/// let result = pre_process_pattern('µ', "def $FUNC($ARG): pass"); +/// assert_eq!(result, "def µFUNC(µARG): pass"); +/// +/// // No change needed +/// let result = pre_process_pattern('µ', "def hello(): pass"); +/// assert_eq!(result, "def hello(): pass"); +/// ``` fn pre_process_pattern(expando: char, query: &str) -> std::borrow::Cow<'_, str> { // Fast path: check if any processing is needed let has_dollar = query.as_bytes().contains(&b'$'); @@ -141,8 +234,35 @@ fn pre_process_pattern(expando: char, query: &str) -> std::borrow::Cow<'_, str> std::borrow::Cow::Owned(ret) } -/// this macro will implement expando_char and pre_process_pattern -/// use this if your language does not accept $ as valid identifier char +/// Implements [`Language`] and [`LanguageExt`] traits for languages requiring custom expando characters. +/// +/// Used for languages that don't accept `$` in identifiers and need metavariables like `$VAR` +/// converted to use a different character (e.g., `µVAR`, `_VAR`). 
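+///
+/// # Invocation
+///
+/// A representative invocation (taken from further down in this module):
+///
+/// ```rust,ignore
+/// impl_lang_expando!(Python, language_python, 'µ');
+/// ```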
+/// +/// # Parameters +/// - `$lang` - The language struct name (e.g., `Python`) +/// - `$func` - The parser function name from [`parsers`] module (e.g., `language_python`) +/// - `$char` - The expando character to use instead of `$` (e.g., `'µ'`) +/// +/// # Generated Implementation +/// Creates a zero-sized struct with [`Language`] and [`LanguageExt`] implementations that: +/// - Map node kinds and field names to tree-sitter IDs +/// - Build patterns using the language's parser +/// - Preprocess patterns by replacing `$` with the expando character +/// - Provide the expando character via [`Language::expando_char`] +/// +/// # Examples +/// ```rust +/// # use thread_language::Python; +/// # use thread_ast_engine::Language; +/// let python = Python; +/// assert_eq!(python.expando_char(), 'µ'); +/// +/// // Pattern gets automatically preprocessed +/// let pattern = "def $FUNC($ARG): pass"; +/// let processed = python.pre_process_pattern(pattern); +/// assert_eq!(processed, "def µFUNC(µARG): pass"); +/// ``` macro_rules! impl_lang_expando { ($lang: ident, $func: ident, $char: expr) => { #[derive(Clone, Copy, Debug)] @@ -163,6 +283,7 @@ macro_rules! impl_lang_expando { fn pre_process_pattern<'q>(&self, query: &'q str) -> std::borrow::Cow<'q, str> { pre_process_pattern(self.expando_char(), query) } + #[cfg(feature = "matching")] fn build_pattern(&self, builder: &PatternBuilder) -> Result { builder.build(|src| StrDoc::try_new(src, self.clone())) } @@ -216,92 +337,180 @@ macro_rules! impl_alias { /// Generates as convenience conversions between the lang types /// and `SupportedType`. macro_rules! impl_aliases { - ($($lang:ident => $as:expr),* $(,)?) => { - $(impl_alias!($lang => $as);)* + ($($lang:ident, $feature:literal => $as:expr),* $(,)?) => { + $(#[cfg(feature = $feature)] + impl_alias!($lang => $as); + )* const fn alias(lang: SupportLang) -> &'static [&'static str] { match lang { - $(SupportLang::$lang => $lang::ALIAS),* + $( + #[cfg(feature = $feature)] + SupportLang::$lang => $lang::ALIAS, + )* } } }; } - /* Customized Language with expando_char / pre_process_pattern */ + // https://en.cppreference.com/w/cpp/language/identifiers // Due to some issues in the tree-sitter parser, it is not possible to use // unicode literals in identifiers for C/C++ parsers +#[cfg(feature = "c")] impl_lang_expando!(C, language_c, 'µ'); +#[cfg(feature = "cpp")] impl_lang_expando!(Cpp, language_cpp, 'µ'); + // https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/lexical-structure#643-identifiers // all letter number is accepted // https://www.compart.com/en/unicode/category/Nl +#[cfg(feature = "csharp")] impl_lang_expando!(CSharp, language_c_sharp, 'µ'); + // https://www.w3.org/TR/CSS21/grammar.html#scanner +#[cfg(feature = "css")] impl_lang_expando!(Css, language_css, '_'); + // https://github.com/elixir-lang/tree-sitter-elixir/blob/a2861e88a730287a60c11ea9299c033c7d076e30/grammar.js#L245 +#[cfg(feature = "elixir")] impl_lang_expando!(Elixir, language_elixir, 'µ'); + // we can use any Unicode code point categorized as "Letter" // https://go.dev/ref/spec#letter +#[cfg(feature = "go")] impl_lang_expando!(Go, language_go, 'µ'); + // GHC supports Unicode syntax per // https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/unicode_syntax.html // and the tree-sitter-haskell grammar parses it too. 
+#[cfg(feature = "haskell")] impl_lang_expando!(Haskell, language_haskell, 'µ'); + // https://github.com/fwcd/tree-sitter-kotlin/pull/93 +#[cfg(feature = "kotlin")] impl_lang_expando!(Kotlin, language_kotlin, 'µ'); + // PHP accepts unicode to be used as some name not var name though +#[cfg(feature = "php")] impl_lang_expando!(Php, language_php, 'µ'); + // we can use any char in unicode range [:XID_Start:] // https://docs.python.org/3/reference/lexical_analysis.html#identifiers // see also [PEP 3131](https://peps.python.org/pep-3131/) for further details. +#[cfg(feature = "python")] impl_lang_expando!(Python, language_python, 'µ'); + // https://github.com/tree-sitter/tree-sitter-ruby/blob/f257f3f57833d584050336921773738a3fd8ca22/grammar.js#L30C26-L30C78 +#[cfg(feature = "ruby")] impl_lang_expando!(Ruby, language_ruby, 'µ'); + // we can use any char in unicode range [:XID_Start:] // https://doc.rust-lang.org/reference/identifiers.html +#[cfg(feature = "rust")] impl_lang_expando!(Rust, language_rust, 'µ'); + //https://docs.swift.org/swift-book/documentation/the-swift-programming-language/lexicalstructure/#Identifiers +#[cfg(feature = "swift")] impl_lang_expando!(Swift, language_swift, 'µ'); // Stub Language without preprocessing // Language Name, tree-sitter-name, alias, extension +#[cfg(feature = "bash")] impl_lang!(Bash, language_bash); +#[cfg(feature = "java")] impl_lang!(Java, language_java); +#[cfg(feature = "javascript")] impl_lang!(JavaScript, language_javascript); +#[cfg(feature = "json")] impl_lang!(Json, language_json); +#[cfg(feature = "lua")] impl_lang!(Lua, language_lua); +#[cfg(feature = "scala")] impl_lang!(Scala, language_scala); +#[cfg(feature = "tsx")] impl_lang!(Tsx, language_tsx); +#[cfg(feature = "typescript")] impl_lang!(TypeScript, language_typescript); +#[cfg(feature = "yaml")] impl_lang!(Yaml, language_yaml); // See ripgrep for extensions // https://github.com/BurntSushi/ripgrep/blob/master/crates/ignore/src/default_types.rs -/// Represents all built-in languages. +/// Runtime language selection enum supporting all built-in languages. +/// +/// Provides a unified interface for working with any supported language at runtime. +/// Each variant corresponds to a specific programming language implementation. 
+/// +/// # Language Detection +/// ```rust,ignore +/// use thread_language::SupportLang; +/// use std::path::Path; +/// +/// // Detect from file extension +/// let lang = SupportLang::from_path("main.rs").unwrap(); +/// assert_eq!(lang, SupportLang::Rust); +/// +/// // Parse from string +/// let lang: SupportLang = "javascript".parse().unwrap(); +/// assert_eq!(lang, SupportLang::JavaScript); +/// ``` +/// +/// # Usage with AST Analysis +/// ```rust +/// use thread_language::SupportLang; +/// use thread_ast_engine::{Language, LanguageExt}; +/// +/// let lang = SupportLang::Rust; +/// let tree = lang.ast_grep("fn main() {}"); +/// let pattern = lang.build_pattern(&pattern_builder).unwrap(); +/// ``` #[derive(Clone, Copy, Debug, PartialEq, Eq, Serialize, Hash)] pub enum SupportLang { + #[cfg(feature = "bash")] Bash, + #[cfg(feature = "c")] C, + #[cfg(feature = "cpp")] Cpp, + #[cfg(feature = "csharp")] CSharp, + #[cfg(feature = "css")] Css, + #[cfg(feature = "go")] Go, + #[cfg(feature = "elixir")] Elixir, + #[cfg(feature = "haskell")] Haskell, + #[cfg(feature = "html")] Html, + #[cfg(feature = "java")] Java, + #[cfg(feature = "javascript")] JavaScript, + #[cfg(feature = "json")] Json, + #[cfg(feature = "kotlin")] Kotlin, + #[cfg(feature = "lua")] Lua, + #[cfg(feature = "php")] Php, + #[cfg(feature = "python")] Python, + #[cfg(feature = "ruby")] Ruby, + #[cfg(feature = "rust")] Rust, + #[cfg(feature = "scala")] Scala, + #[cfg(feature = "swift")] Swift, + #[cfg(feature = "tsx")] Tsx, + #[cfg(feature = "typescript")] TypeScript, + #[cfg(feature = "yaml")] Yaml, } @@ -309,8 +518,52 @@ impl SupportLang { pub const fn all_langs() -> &'static [SupportLang] { use SupportLang::*; &[ - Bash, C, Cpp, CSharp, Css, Elixir, Go, Haskell, Html, Java, JavaScript, Json, Kotlin, - Lua, Php, Python, Ruby, Rust, Scala, Swift, Tsx, TypeScript, Yaml, + #[cfg(feature = "bash")] + Bash, + #[cfg(feature = "c")] + C, + #[cfg(feature = "cpp")] + Cpp, + #[cfg(feature = "csharp")] + CSharp, + #[cfg(feature = "css")] + Css, + #[cfg(feature = "elixir")] + Elixir, + #[cfg(feature = "go")] + Go, + #[cfg(feature = "haskell")] + Haskell, + #[cfg(feature = "html")] + Html, + #[cfg(feature = "java")] + Java, + #[cfg(feature = "javascript")] + JavaScript, + #[cfg(feature = "json")] + Json, + #[cfg(feature = "kotlin")] + Kotlin, + #[cfg(feature = "lua")] + Lua, + #[cfg(feature = "php")] + Php, + #[cfg(feature = "python")] + Python, + #[cfg(feature = "ruby")] + Ruby, + #[cfg(feature = "rust")] + Rust, + #[cfg(feature = "scala")] + Scala, + #[cfg(feature = "swift")] + Swift, + #[cfg(feature = "tsx")] + Tsx, + #[cfg(feature = "typescript")] + TypeScript, + #[cfg(feature = "yaml")] + Yaml, ] } @@ -328,6 +581,7 @@ impl fmt::Display for SupportLang { #[derive(Debug)] pub enum SupportLangErr { LanguageNotSupported(String), + LanguageNotEnabled(String) } impl Display for SupportLangErr { @@ -335,6 +589,7 @@ impl Display for SupportLangErr { use SupportLangErr::*; match self { LanguageNotSupported(lang) => write!(f, "{lang} is not supported!"), + LanguageNotEnabled(lang) => write!(f, "{lang} is available but not enabled. You need to enable the feature flag for this language.") } } } @@ -390,29 +645,29 @@ impl Visitor<'_> for AliasVisitor { } impl_aliases! 
{ - Bash => &["bash"], - C => &["c"], - Cpp => &["cc", "c++", "cpp", "cxx"], - CSharp => &["cs", "csharp"], - Css => &["css"], - Elixir => &["ex", "elixir"], - Go => &["go", "golang"], - Haskell => &["hs", "haskell"], - Html => &["html"], - Java => &["java"], - JavaScript => &["javascript", "js", "jsx"], - Json => &["json"], - Kotlin => &["kotlin", "kt"], - Lua => &["lua"], - Php => &["php"], - Python => &["py", "python"], - Ruby => &["rb", "ruby"], - Rust => &["rs", "rust"], - Scala => &["scala"], - Swift => &["swift"], - TypeScript => &["ts", "typescript"], - Tsx => &["tsx"], - Yaml => &["yaml", "yml"], + Bash, "bash" => &["bash"], + C, "c" => &["c"], + Cpp, "cpp" => &["cc", "c++", "cpp", "cxx"], + CSharp, "csharp" => &["cs", "csharp"], + Css, "css" => &["css"], + Elixir, "elixir" => &["ex", "elixir"], + Go, "go" => &["go", "golang"], + Haskell, "haskell" => &["hs", "haskell"], + Html, "html" => &["html"], + Java, "java" => &["java"], + JavaScript, "javascript" => &["javascript", "js", "jsx"], + Json, "json" => &["json"], + Kotlin, "kotlin" => &["kotlin", "kt"], + Lua, "lua" => &["lua"], + Php, "php" => &["php"], + Python, "python" => &["py", "python"], + Ruby, "ruby" => &["rb", "ruby"], + Rust, "rust" => &["rs", "rust"], + Scala, "scala" => &["scala"], + Swift, "swift" => &["swift"], + TypeScript, "typescript" => &["ts", "typescript"], + Tsx, "tsx" => &["tsx"], + Yaml, "yaml" => &["yaml", "yml"], } /// Implements the language names and aliases. @@ -421,28 +676,51 @@ impl FromStr for SupportLang { fn from_str(s: &str) -> Result { // Fast path: try exact matches first (most common case) match s { + #[cfg(feature = "bash")] "bash" => return Ok(SupportLang::Bash), + #[cfg(feature = "c")] "c" => return Ok(SupportLang::C), + #[cfg(feature = "cpp")] "cpp" | "c++" => return Ok(SupportLang::Cpp), + #[cfg(feature = "csharp")] "cs" | "csharp" => return Ok(SupportLang::CSharp), + #[cfg(feature = "css")] "css" => return Ok(SupportLang::Css), + #[cfg(feature = "elixir")] "elixir" | "ex" => return Ok(SupportLang::Elixir), + #[cfg(feature = "go")] "go" | "golang" => return Ok(SupportLang::Go), + #[cfg(feature = "haskell")] "haskell" | "hs" => return Ok(SupportLang::Haskell), + #[cfg(feature = "html")] "html" => return Ok(SupportLang::Html), + #[cfg(feature = "java")] "java" => return Ok(SupportLang::Java), + #[cfg(feature = "javascript")] "javascript" | "js" => return Ok(SupportLang::JavaScript), + #[cfg(feature = "json")] "json" => return Ok(SupportLang::Json), + #[cfg(feature = "kotlin")] "kotlin" | "kt" => return Ok(SupportLang::Kotlin), + #[cfg(feature = "lua")] "lua" => return Ok(SupportLang::Lua), + #[cfg(feature = "php")] "php" => return Ok(SupportLang::Php), + #[cfg(feature = "python")] "python" | "py" => return Ok(SupportLang::Python), + #[cfg(feature = "ruby")] "ruby" | "rb" => return Ok(SupportLang::Ruby), + #[cfg(feature = "rust")] "rust" | "rs" => return Ok(SupportLang::Rust), + #[cfg(feature = "scala")] "scala" => return Ok(SupportLang::Scala), + #[cfg(feature = "swift")] "swift" => return Ok(SupportLang::Swift), + #[cfg(feature = "typescript")] "typescript" | "ts" => return Ok(SupportLang::TypeScript), + #[cfg(feature = "tsx")] "tsx" => return Ok(SupportLang::Tsx), + #[cfg(feature = "yaml")] "yaml" | "yml" => return Ok(SupportLang::Yaml), _ => {} // Fall through to case-insensitive search } @@ -463,28 +741,51 @@ macro_rules! 
execute_lang_method { ($me: path, $method: ident, $($pname:tt),*) => { use SupportLang as S; match $me { + #[cfg(feature = "bash")] S::Bash => Bash.$method($($pname,)*), + #[cfg(feature = "c")] S::C => C.$method($($pname,)*), + #[cfg(feature = "cpp")] S::Cpp => Cpp.$method($($pname,)*), + #[cfg(feature = "csharp")] S::CSharp => CSharp.$method($($pname,)*), + #[cfg(feature = "css")] S::Css => Css.$method($($pname,)*), + #[cfg(feature = "elixir")] S::Elixir => Elixir.$method($($pname,)*), + #[cfg(feature = "go")] S::Go => Go.$method($($pname,)*), + #[cfg(feature = "haskell")] S::Haskell => Haskell.$method($($pname,)*), + #[cfg(feature = "html")] S::Html => Html.$method($($pname,)*), + #[cfg(feature = "json")] S::Java => Java.$method($($pname,)*), + #[cfg(feature = "javascript")] S::JavaScript => JavaScript.$method($($pname,)*), + #[cfg(feature = "json")] S::Json => Json.$method($($pname,)*), + #[cfg(feature = "kotlin")] S::Kotlin => Kotlin.$method($($pname,)*), + #[cfg(feature = "lua")] S::Lua => Lua.$method($($pname,)*), + #[cfg(feature = "php")] S::Php => Php.$method($($pname,)*), + #[cfg(feature = "python")] S::Python => Python.$method($($pname,)*), + #[cfg(feature = "ruby")] S::Ruby => Ruby.$method($($pname,)*), + #[cfg(feature = "rust")] S::Rust => Rust.$method($($pname,)*), + #[cfg(feature = "scala")] S::Scala => Scala.$method($($pname,)*), + #[cfg(feature = "swift")] S::Swift => Swift.$method($($pname,)*), + #[cfg(feature = "typescript")] S::Tsx => Tsx.$method($($pname,)*), + #[cfg(feature = "typescript")] S::TypeScript => TypeScript.$method($($pname,)*), + #[cfg(feature = "yaml")] S::Yaml => Yaml.$method($($pname,)*), } } @@ -504,6 +805,7 @@ impl Language for SupportLang { impl_lang_method!(meta_var_char, () => char); impl_lang_method!(expando_char, () => char); impl_lang_method!(extract_meta_var, (source: &str) => Option); + #[cfg(feature = "matching")] impl_lang_method!(build_pattern, (builder: &PatternBuilder) => Result); fn pre_process_pattern<'q>(&self, query: &'q str) -> Cow<'q, str> { execute_lang_method! 
{ self, pre_process_pattern, query } @@ -513,6 +815,7 @@ impl Language for SupportLang { } } +#[cfg(feature = "matching")] impl LanguageExt for SupportLang { impl_lang_method!(get_ts_language, () => TSLanguage); impl_lang_method!(injectable_languages, () => Option<&'static [&'static str]>); @@ -521,6 +824,7 @@ impl LanguageExt for SupportLang { root: Node>, ) -> RapidMap> { match self { + #[cfg(feature = "html-embedded")] SupportLang::Html => Html.extract_injections(root), _ => RapidMap::default(), } @@ -530,30 +834,53 @@ impl LanguageExt for SupportLang { const fn extensions(lang: SupportLang) -> &'static [&'static str] { use SupportLang::*; match lang { + #[cfg(feature = "bash")] Bash => &[ "bash", "bats", "cgi", "command", "env", "fcgi", "ksh", "sh", "tmux", "tool", "zsh", ], + #[cfg(feature = "c")] C => &["c", "h"], + #[cfg(feature = "cpp")] Cpp => &["cc", "hpp", "cpp", "c++", "hh", "cxx", "cu", "ino"], + #[cfg(feature = "csharp")] CSharp => &["cs"], + #[cfg(feature = "css")] Css => &["css", "scss"], + #[cfg(feature = "elixir")] Elixir => &["ex", "exs"], + #[cfg(feature = "go")] Go => &["go"], + #[cfg(feature = "haskell")] Haskell => &["hs"], + #[cfg(feature = "html")] Html => &["html", "htm", "xhtml"], + #[cfg(feature = "java")] Java => &["java"], + #[cfg(feature = "javascript")] JavaScript => &["cjs", "js", "mjs", "jsx"], + #[cfg(feature = "json")] Json => &["json"], + #[cfg(feature = "kotlin")] Kotlin => &["kt", "ktm", "kts"], + #[cfg(feature = "lua")] Lua => &["lua"], + #[cfg(feature = "php")] Php => &["php"], + #[cfg(feature = "python")] Python => &["py", "py3", "pyi", "bzl"], + #[cfg(feature = "ruby")] Ruby => &["rb", "rbw", "gemspec"], + #[cfg(feature = "rust")] Rust => &["rs"], + #[cfg(feature = "scala")] Scala => &["scala", "sc", "sbt"], + #[cfg(feature = "swift")] Swift => &["swift"], + #[cfg(feature = "typescript")] TypeScript => &["ts", "cts", "mts"], + #[cfg(feature = "tsx")] Tsx => &["tsx"], + #[cfg(feature = "yaml")] Yaml => &["yaml", "yml"], } } @@ -566,18 +893,35 @@ fn from_extension(path: &Path) -> Option { // Fast path: try most common extensions first match ext { - "rs" => return Some(SupportLang::Rust), - "js" | "mjs" | "cjs" => return Some(SupportLang::JavaScript), - "ts" | "cts" | "mts" => return Some(SupportLang::TypeScript), - "tsx" => return Some(SupportLang::Tsx), - "py" | "py3" | "pyi" => return Some(SupportLang::Python), - "java" => return Some(SupportLang::Java), + #[cfg(feature = "bash")] + "bash" | "sh" | ".bashrc" | "bash_aliases" | "bats" | "cgi" | "command" | "env" | "fcgi" | "ksh" | "tmux" | "tool" | "zsh" | "bash_logout" | "bash_profile" | "profile" | "login" | "logout" => { + return Some(SupportLang::Bash) + } + #[cfg(feature = "c")] + "c" | "h" => return Some(SupportLang::C), + #[cfg(feature = "csharp")] "cpp" | "cc" | "cxx" => return Some(SupportLang::Cpp), - "c" => return Some(SupportLang::C), + #[cfg(feature = "css")] + "css" => return Some(SupportLang::Css), + #[cfg(feature = "go")] "go" => return Some(SupportLang::Go), + #[cfg(feature = "html")] "html" | "htm" => return Some(SupportLang::Html), - "css" => return Some(SupportLang::Css), + #[cfg(feature = "java")] + "java" => return Some(SupportLang::Java), + #[cfg(feature = "javascript")] + "js" | "mjs" | "cjs" => return Some(SupportLang::JavaScript), + #[cfg(feature = "json")] "json" => return Some(SupportLang::Json), + #[cfg(feature = "python")] + "py" | "py3" | "pyi" => return Some(SupportLang::Python), + #[cfg(feature = "rust")] + "rs" => return Some(SupportLang::Rust), + #[cfg(feature = 
"typescript")] + "ts" | "cts" | "mts" => return Some(SupportLang::TypeScript), + #[cfg(feature = "tsx")] + "tsx" => return Some(SupportLang::Tsx), + #[cfg(feature = "yaml")] "yaml" | "yml" => return Some(SupportLang::Yaml), _ => {} } @@ -624,7 +968,7 @@ mod test { pub fn test_match_lang(query: &str, source: &str, lang: impl LanguageExt) { let cand = lang.ast_grep(source); - let pattern = Pattern::new(query, lang); + let pattern = Pattern::new(query, &lang); assert!( pattern.find_node(cand.root()).is_some(), "goal: {pattern:?}, candidate: {}", @@ -634,7 +978,7 @@ mod test { pub fn test_non_match_lang(query: &str, source: &str, lang: impl LanguageExt) { let cand = lang.ast_grep(source); - let pattern = Pattern::new(query, lang); + let pattern = Pattern::new(query, &lang); assert!( pattern.find_node(cand.root()).is_none(), "goal: {pattern:?}, candidate: {}", diff --git a/crates/language/src/parsers.rs b/crates/language/src/parsers.rs index 0e4dd25..aeb09a4 100644 --- a/crates/language/src/parsers.rs +++ b/crates/language/src/parsers.rs @@ -4,11 +4,40 @@ // // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT -//! This mod maintains a list of tree-sitter parsers crate. -//! When feature flag `builtin-parser` is on, this mod will import all dependent crates. -//! However, tree-sitter bs cannot be compiled by wasm-pack. -//! In this case, we can use a blank implementation by turning feature flag off. -//! And use other implementation. +//! Tree-sitter parser initialization and caching for all supported languages. +//! +//! Provides cached, zero-cost access to tree-sitter language parsers. Each parser is initialized once and cached using +//! [`std::sync::OnceLock`] for thread-safe, lazy initialization. +//! +//! ## Feature Flags +//! +//! ### `builtin-parser` +//! When enabled (default), imports all tree-sitter parser crates and provides +//! full parser functionality. Disable for WebAssembly builds where tree-sitter +//! cannot be compiled. +//! +//! ### `napi-lang` +//! Enables NAPI-compatible parsers (CSS, HTML, JavaScript, TypeScript) for +//! Node.js environments. +//! +//! ## Parser Functions +//! +//! Each language has a corresponding `language_*()` function that returns a +//! cached [`TSLanguage`] instance: +//! +//! ```rust +//! use thread_language::parsers::{language_rust, language_javascript}; +//! +//! let rust_lang = language_rust(); +//! let js_lang = language_javascript(); +//! ``` +//! +//! ## Caching Strategy +//! +//! Parsers use [`std::sync::OnceLock`] for optimal performance: +//! - First call initializes the parser +//! - Subsequent calls return the cached instance +//! - Thread-safe with no synchronization overhead after initialization #[cfg(feature = "builtin-parser")] macro_rules! 
into_lang { @@ -51,154 +80,181 @@ use std::sync::OnceLock; use thread_ast_engine::tree_sitter::TSLanguage; // Cached language instances for zero-cost repeated access +#[cfg(feature = "bash")] static BASH_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "c")] static C_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "cpp")] static CPP_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "csharp")] static CSHARP_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "css")] static CSS_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "elixir")] static ELIXIR_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "go")] static GO_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "haskell")] static HASKELL_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "html")] static HTML_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "java")] static JAVA_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "javascript")] static JAVASCRIPT_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "json")] static JSON_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "kotlin")] static KOTLIN_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "lua")] static LUA_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "php")] static PHP_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "python")] static PYTHON_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "ruby")] static RUBY_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "rust")] static RUST_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "scala")] static SCALA_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "swift")] static SWIFT_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "tsx")] static TSX_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "typescript")] static TYPESCRIPT_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "yaml")] static YAML_LANG: OnceLock = OnceLock::new(); +#[cfg(feature = "bash")] pub fn language_bash() -> TSLanguage { BASH_LANG .get_or_init(|| into_lang!(tree_sitter_bash)) .clone() } - +#[cfg(feature = "c")] pub fn language_c() -> TSLanguage { C_LANG.get_or_init(|| into_lang!(tree_sitter_c)).clone() } - +#[cfg(feature = "cpp")] pub fn language_cpp() -> TSLanguage { CPP_LANG.get_or_init(|| into_lang!(tree_sitter_cpp)).clone() } - +#[cfg(feature = "csharp")] pub fn language_c_sharp() -> TSLanguage { CSHARP_LANG .get_or_init(|| into_lang!(tree_sitter_c_sharp)) .clone() } - +#[cfg(feature = "css")] pub fn language_css() -> TSLanguage { CSS_LANG .get_or_init(|| into_napi_lang!(tree_sitter_css::LANGUAGE)) .clone() } - +#[cfg(feature = "elixir")] pub fn language_elixir() -> TSLanguage { ELIXIR_LANG .get_or_init(|| into_lang!(tree_sitter_elixir)) .clone() } - +#[cfg(feature = "go")] pub fn language_go() -> TSLanguage { GO_LANG.get_or_init(|| into_lang!(tree_sitter_go)).clone() } +#[cfg(feature = "haskell")] pub fn language_haskell() -> TSLanguage { HASKELL_LANG .get_or_init(|| into_lang!(tree_sitter_haskell)) .clone() } - +#[cfg(feature = "html")] pub fn language_html() -> TSLanguage { HTML_LANG .get_or_init(|| into_napi_lang!(tree_sitter_html::LANGUAGE)) .clone() } +#[cfg(feature = "java")] pub fn language_java() -> TSLanguage { JAVA_LANG .get_or_init(|| into_lang!(tree_sitter_java)) .clone() } - +#[cfg(feature = "javascript")] pub fn language_javascript() -> TSLanguage { JAVASCRIPT_LANG .get_or_init(|| into_napi_lang!(tree_sitter_javascript::LANGUAGE)) .clone() } - +#[cfg(feature = "json")] pub fn language_json() -> TSLanguage { JSON_LANG .get_or_init(|| into_lang!(tree_sitter_json)) .clone() } - +#[cfg(feature = "kotlin")] pub 
fn language_kotlin() -> TSLanguage { KOTLIN_LANG .get_or_init(|| into_lang!(tree_sitter_kotlin)) .clone() } +#[cfg(feature = "lua")] pub fn language_lua() -> TSLanguage { LUA_LANG.get_or_init(|| into_lang!(tree_sitter_lua)).clone() } - +#[cfg(feature = "php")] pub fn language_php() -> TSLanguage { PHP_LANG .get_or_init(|| into_lang!(tree_sitter_php, LANGUAGE_PHP_ONLY)) .clone() } - +#[cfg(feature = "python")] pub fn language_python() -> TSLanguage { PYTHON_LANG .get_or_init(|| into_lang!(tree_sitter_python)) .clone() } - +#[cfg(feature = "ruby")] pub fn language_ruby() -> TSLanguage { RUBY_LANG .get_or_init(|| into_lang!(tree_sitter_ruby)) .clone() } - +#[cfg(feature = "rust")] pub fn language_rust() -> TSLanguage { RUST_LANG .get_or_init(|| into_lang!(tree_sitter_rust)) .clone() } - +#[cfg(feature = "scala")] pub fn language_scala() -> TSLanguage { SCALA_LANG .get_or_init(|| into_lang!(tree_sitter_scala)) .clone() } - +#[cfg(feature = "swift")] pub fn language_swift() -> TSLanguage { SWIFT_LANG .get_or_init(|| into_lang!(tree_sitter_swift)) .clone() } - +#[cfg(feature = "tsx")] pub fn language_tsx() -> TSLanguage { TSX_LANG .get_or_init(|| into_napi_lang!(tree_sitter_typescript::LANGUAGE_TSX)) .clone() } - +#[cfg(feature = "typescript")] pub fn language_typescript() -> TSLanguage { TYPESCRIPT_LANG .get_or_init(|| into_napi_lang!(tree_sitter_typescript::LANGUAGE_TYPESCRIPT)) .clone() } - +#[cfg(feature = "yaml")] pub fn language_yaml() -> TSLanguage { YAML_LANG .get_or_init(|| into_lang!(tree_sitter_yaml)) diff --git a/crates/language/src/profiling.rs b/crates/language/src/profiling.rs index 97ea62e..fff35b0 100644 --- a/crates/language/src/profiling.rs +++ b/crates/language/src/profiling.rs @@ -5,6 +5,11 @@ // SPDX-License-Identifier: AGPL-3.0-or-later AND MIT //! Memory profiling utilities for performance analysis +//! +//! This module is behind the "profiling" feature. +//! It's not intended for external use. +//! It's not in 'benches` because it needs access to the private API. +//! 
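The imports that follow (`GlobalAlloc`, `Layout`, `System`, `AtomicUsize`, `Ordering`) point to an allocation-counting global allocator. As a rough sketch of that pattern only — the names `CountingAlloc`, `ALLOCATED`, and `live_bytes` are hypothetical, not the actual `profiling.rs` implementation — a wrapper over the system allocator can track live bytes with relaxed atomics:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counting allocator; not the real profiling.rs code.
struct CountingAlloc;

static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Delegate to the system allocator, then record the allocation size.
        let ptr = System.alloc(layout);
        if !ptr.is_null() {
            ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout);
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// To use the wrapper, register it as the global allocator and read the
// counter before and after the code under measurement:
//
// #[global_allocator]
// static GLOBAL: CountingAlloc = CountingAlloc;
//
// fn live_bytes() -> usize {
//     ALLOCATED.load(Ordering::Relaxed)
// }
```

Because the counter uses `Ordering::Relaxed`, each allocation pays only a single atomic add, and keeping the module behind the `profiling` feature leaves ordinary builds unaffected.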
use std::alloc::{GlobalAlloc, Layout, System}; use std::sync::atomic::{AtomicUsize, Ordering}; diff --git a/crates/language/src/ruby.rs b/crates/language/src/ruby.rs index 5b444c3..da3d44b 100644 --- a/crates/language/src/ruby.rs +++ b/crates/language/src/ruby.rs @@ -22,7 +22,7 @@ fn test_ruby_pattern() { // https://github.com/ast-grep/ast-grep/issues/713 #[test] fn test_ruby_tree_sitter_panic() { - let pattern = Pattern::new("Foo::barbaz", Ruby); + let pattern = Pattern::new("Foo::barbaz", &Ruby); assert_eq!(pattern.fixed_string(), "barbaz"); } diff --git a/crates/rule-engine/serialization_analysis/analyze_serialization b/crates/rule-engine/serialization_analysis/analyze_serialization deleted file mode 100755 index 3ac5c40..0000000 Binary files a/crates/rule-engine/serialization_analysis/analyze_serialization and /dev/null differ diff --git a/crates/rule-engine/src/fixer.rs b/crates/rule-engine/src/fixer.rs index 4ca7b9e..e6d9f81 100644 --- a/crates/rule-engine/src/fixer.rs +++ b/crates/rule-engine/src/fixer.rs @@ -323,7 +323,7 @@ mod test { }; let fixer = parse(config)?; let grep = TypeScript::Tsx.ast_grep("var a = { b: 123, }"); - let matcher = KindMatcher::new("pair", TypeScript::Tsx); + let matcher = KindMatcher::new("pair", &TypeScript::Tsx); let node = grep.root().find(&matcher).expect("should found"); let edit = node.make_edit(&matcher, &fixer); let text = String::from_utf8_lossy(&edit.inserted_text); diff --git a/crates/rule-engine/src/label.rs b/crates/rule-engine/src/label.rs index 157884d..0c43632 100644 --- a/crates/rule-engine/src/label.rs +++ b/crates/rule-engine/src/label.rs @@ -118,7 +118,7 @@ mod tests { #[test] fn test_get_labels_from_config_single() { let doc = TypeScript::Tsx.ast_grep("let foo = 42;"); - let pattern = Pattern::try_new("let $A = $B;", TypeScript::Tsx).unwrap(); + let pattern = Pattern::try_new("let $A = $B;", &TypeScript::Tsx).unwrap(); let m = doc.root().find(pattern).unwrap(); let mut config = thread_utils::RapidMap::default(); config.insert( @@ -137,7 +137,7 @@ mod tests { #[test] fn test_get_labels_from_config_multiple() { let doc = TypeScript::Tsx.ast_grep("let foo = 42, bar = 99;"); - let pattern = Pattern::try_new("let $A = $B, $C = $D;", TypeScript::Tsx).unwrap(); + let pattern = Pattern::try_new("let $A = $B, $C = $D;", &TypeScript::Tsx).unwrap(); let m = doc.root().find(pattern).unwrap(); let mut config = thread_utils::RapidMap::default(); config.insert( @@ -155,7 +155,7 @@ mod tests { #[test] fn test_get_default_labels() { let doc = TypeScript::Tsx.ast_grep("let foo = 42;"); - let pattern = Pattern::try_new("let $A = $B;", TypeScript::Tsx).unwrap(); + let pattern = Pattern::try_new("let $A = $B;", &TypeScript::Tsx).unwrap(); let m = doc.root().find(pattern).unwrap(); let labels = get_default_labels(&m); assert!(!labels.is_empty()); diff --git a/crates/rule-engine/src/rule/referent_rule.rs b/crates/rule-engine/src/rule/referent_rule.rs index 235fe6b..f48450d 100644 --- a/crates/rule-engine/src/rule/referent_rule.rs +++ b/crates/rule-engine/src/rule/referent_rule.rs @@ -271,7 +271,7 @@ mod test { fn test_success_rule() -> Result { let registration = RuleRegistration::default(); let rule = ReferentRule::try_new("test".into(), ®istration)?; - let pattern = Rule::Pattern(Pattern::new("some", TS::Tsx)); + let pattern = Rule::Pattern(Pattern::new("some", &TS::Tsx)); let ret = registration.insert_local("test", pattern); assert!(ret.is_ok()); assert!(rule.potential_kinds().is_some()); diff --git a/crates/rule-engine/src/rule/relational_rule.rs 
b/crates/rule-engine/src/rule/relational_rule.rs index 663aded..bba5a7b 100644 --- a/crates/rule-engine/src/rule/relational_rule.rs +++ b/crates/rule-engine/src/rule/relational_rule.rs @@ -300,13 +300,13 @@ mod test { } fn make_rule(target: &str, relation: Rule) -> impl Matcher { - o::All::new(vec![Rule::Pattern(Pattern::new(target, TS::Tsx)), relation]) + o::All::new(vec![Rule::Pattern(Pattern::new(target, &TS::Tsx)), relation]) } #[test] fn test_precedes_operator() { let precedes = Precedes { - later: Rule::Pattern(Pattern::new("var a = 1", TS::Tsx)), + later: Rule::Pattern(Pattern::new("var a = 1", &TS::Tsx)), stop_by: StopBy::End, }; let rule = make_rule("var b = 2", Rule::Precedes(Box::new(precedes))); @@ -334,7 +334,7 @@ mod test { #[test] fn test_precedes_immediate() { let precedes = Precedes { - later: Rule::Pattern(Pattern::new("var a = 1", TS::Tsx)), + later: Rule::Pattern(Pattern::new("var a = 1", &TS::Tsx)), stop_by: StopBy::Neighbor, }; let rule = make_rule("var b = 2", Rule::Precedes(Box::new(precedes))); @@ -363,7 +363,7 @@ mod test { #[test] fn test_follows_operator() { let follows = Follows { - former: Rule::Pattern(Pattern::new("var b = 2", TS::Tsx)), + former: Rule::Pattern(Pattern::new("var b = 2", &TS::Tsx)), stop_by: StopBy::End, }; let rule = make_rule("var a = 1", Rule::Follows(Box::new(follows))); @@ -394,7 +394,7 @@ mod test { #[test] fn test_follows_immediate() { let follows = Follows { - former: Rule::Pattern(Pattern::new("var b = 2", TS::Tsx)), + former: Rule::Pattern(Pattern::new("var b = 2", &TS::Tsx)), stop_by: StopBy::Neighbor, }; let rule = make_rule("var a = 1", Rule::Follows(Box::new(follows))); @@ -426,7 +426,7 @@ mod test { fn test_has_rule() { let has = Has { stop_by: StopBy::End, - inner: Rule::Pattern(Pattern::new("var a = 1", TS::Tsx)), + inner: Rule::Pattern(Pattern::new("var a = 1", &TS::Tsx)), field: None, }; let rule = make_rule("function test() { $$$ }", Rule::Has(Box::new(has))); @@ -455,9 +455,9 @@ mod test { let has = Has { stop_by: StopBy::Rule(Rule::Kind(KindMatcher::new( "function_declaration", - TS::Tsx, + &TS::Tsx, ))), - inner: Rule::Pattern(Pattern::new("var a = 1", TS::Tsx)), + inner: Rule::Pattern(Pattern::new("var a = 1", &TS::Tsx)), field: None, }; let rule = make_rule("function test() { $$$ }", Rule::Has(Box::new(has))); @@ -482,9 +482,9 @@ mod test { let has = Has { stop_by: StopBy::Rule(Rule::Kind(KindMatcher::new( "function_declaration", - TS::Tsx, + &TS::Tsx, ))), - inner: Rule::Pattern(Pattern::new("function inner() {$$$}", TS::Tsx)), + inner: Rule::Pattern(Pattern::new("function inner() {$$$}", &TS::Tsx)), field: None, }; let rule = make_rule("function test() { $$$ }", Rule::Has(Box::new(has))); @@ -509,13 +509,13 @@ mod test { fn test_has_immediate() { let has = Has { stop_by: StopBy::Neighbor, - inner: Rule::Pattern(Pattern::new("var a = 1", TS::Tsx)), + inner: Rule::Pattern(Pattern::new("var a = 1", &TS::Tsx)), field: None, }; let rule = o::All::new(vec![ - Rule::Pattern(Pattern::new("{ $$$ }", TS::Tsx)), + Rule::Pattern(Pattern::new("{ $$$ }", &TS::Tsx)), Rule::Inside(Box::new(Inside { - outer: Rule::Pattern(Pattern::new("function test() { $$$ }", TS::Tsx)), + outer: Rule::Pattern(Pattern::new("function test() { $$$ }", &TS::Tsx)), stop_by: StopBy::Neighbor, field: None, })), @@ -546,7 +546,7 @@ mod test { fn test_inside_rule() { let inside = Inside { stop_by: StopBy::End, - outer: Rule::Pattern(Pattern::new("function test() { $$$ }", TS::Tsx)), + outer: Rule::Pattern(Pattern::new("function test() { $$$ }", &TS::Tsx)), 
field: None, }; let rule = make_rule("var a = 1", Rule::Inside(Box::new(inside))); @@ -575,9 +575,9 @@ mod test { let inside = Inside { stop_by: StopBy::Rule(Rule::Kind(KindMatcher::new( "function_declaration", - TS::Tsx, + &TS::Tsx, ))), - outer: Rule::Pattern(Pattern::new("function test() { $$$ }", TS::Tsx)), + outer: Rule::Pattern(Pattern::new("function test() { $$$ }", &TS::Tsx)), field: None, }; let rule = make_rule("var a = 1", Rule::Inside(Box::new(inside))); @@ -606,9 +606,9 @@ mod test { let inside = Inside { stop_by: StopBy::Neighbor, outer: Rule::All(o::All::new(vec![ - Rule::Pattern(Pattern::new("{ $$$ }", TS::Tsx)), + Rule::Pattern(Pattern::new("{ $$$ }", &TS::Tsx)), Rule::Inside(Box::new(Inside { - outer: Rule::Pattern(Pattern::new("function test() { $$$ }", TS::Tsx)), + outer: Rule::Pattern(Pattern::new("function test() { $$$ }", &TS::Tsx)), stop_by: StopBy::Neighbor, field: None, })), @@ -641,7 +641,7 @@ mod test { fn test_inside_field() { let inside = Inside { stop_by: StopBy::End, - outer: Rule::Kind(KindMatcher::new("for_statement", TS::Tsx)), + outer: Rule::Kind(KindMatcher::new("for_statement", &TS::Tsx)), field: TS::Tsx.field_to_id("condition"), }; let rule = make_rule("a = 1", Rule::Inside(Box::new(inside))); @@ -653,11 +653,11 @@ mod test { fn test_has_field() { let has = Has { stop_by: StopBy::End, - inner: Rule::Pattern(Pattern::new("a = 1", TS::Tsx)), + inner: Rule::Pattern(Pattern::new("a = 1", &TS::Tsx)), field: TS::Tsx.field_to_id("condition"), }; let rule = o::All::new(vec![ - Rule::Kind(KindMatcher::new("for_statement", TS::Tsx)), + Rule::Kind(KindMatcher::new("for_statement", &TS::Tsx)), Rule::Has(Box::new(has)), ]); test_found(&["for (;a = 1;) {}"], &rule); @@ -683,24 +683,24 @@ mod test { #[test] fn test_defined_vars() { let precedes = Precedes { - later: Rule::Pattern(Pattern::new("var a = $A", TS::Tsx)), - stop_by: StopBy::Rule(Rule::Pattern(Pattern::new("var b = $B", TS::Tsx))), + later: Rule::Pattern(Pattern::new("var a = $A", &TS::Tsx)), + stop_by: StopBy::Rule(Rule::Pattern(Pattern::new("var b = $B", &TS::Tsx))), }; assert_eq!(precedes.defined_vars(), ["A", "B"].into_iter().collect()); let follows = Follows { - former: Rule::Pattern(Pattern::new("var a = 123", TS::Tsx)), - stop_by: StopBy::Rule(Rule::Pattern(Pattern::new("var b = $B", TS::Tsx))), + former: Rule::Pattern(Pattern::new("var a = 123", &TS::Tsx)), + stop_by: StopBy::Rule(Rule::Pattern(Pattern::new("var b = $B", &TS::Tsx))), }; assert_eq!(follows.defined_vars(), ["B"].into_iter().collect()); let inside = Inside { - stop_by: StopBy::Rule(Rule::Pattern(Pattern::new("var $C", TS::Tsx))), - outer: Rule::Pattern(Pattern::new("var a = $A", TS::Tsx)), + stop_by: StopBy::Rule(Rule::Pattern(Pattern::new("var $C", &TS::Tsx))), + outer: Rule::Pattern(Pattern::new("var a = $A", &TS::Tsx)), field: TS::Tsx.field_to_id("condition"), }; assert_eq!(inside.defined_vars(), ["A", "C"].into_iter().collect()); let has = Has { - stop_by: StopBy::Rule(Rule::Kind(KindMatcher::new("for_statement", TS::Tsx))), - inner: Rule::Pattern(Pattern::new("var a = $A", TS::Tsx)), + stop_by: StopBy::Rule(Rule::Kind(KindMatcher::new("for_statement", &TS::Tsx))), + inner: Rule::Pattern(Pattern::new("var a = $A", &TS::Tsx)), field: TS::Tsx.field_to_id("condition"), }; assert_eq!(has.defined_vars(), ["A"].into_iter().collect()); diff --git a/crates/rule-engine/src/rule_core.rs b/crates/rule-engine/src/rule_core.rs index 0f7702f..711bb8f 100644 --- a/crates/rule-engine/src/rule_core.rs +++ b/crates/rule-engine/src/rule_core.rs @@ 
-372,7 +372,7 @@ transform: "A".to_string(), Rule::Regex(RegexMatcher::try_new("a").unwrap()), ); - let rule = RuleCore::new(Rule::Pattern(Pattern::new("$A", TypeScript::Tsx))) + let rule = RuleCore::new(Rule::Pattern(Pattern::new("$A", &TypeScript::Tsx))) .with_matchers(constraints); let grep = TypeScript::Tsx.ast_grep("a"); assert!(grep.root().find(&rule).is_some()); diff --git a/hk.pkl b/hk.pkl index ca6fbd8..8abed11 100644 --- a/hk.pkl +++ b/hk.pkl @@ -39,7 +39,6 @@ local linters = new Mapping { fix = "./scripts/update-licenses.py add {{ files }}" } - // check hk.pkl (and any others) ["pkl"] = new Step { glob = "*.pkl" diff --git a/mise.toml b/mise.toml index 33a0fa4..db7745c 100644 --- a/mise.toml +++ b/mise.toml @@ -13,6 +13,8 @@ cargo-binstall = "latest" "cargo:cargo-edit" = "latest" "cargo:cargo-generate" = "latest" "cargo:cargo-nextest" = "latest" +"cargo:cargo-release" = "latest" +"cargo:cargo-smart-release" = "latest" "cargo:cargo-watch" = "latest" gh = "latest" gitsign = "latest" @@ -20,6 +22,7 @@ hk = "latest" node = "24" "npm:wasm-opt" = "latest" "npm:wasm-pack" = "latest" +"pipx:SuperClaude" = "latest" "pipx:reuse" = "latest" pkl = "latest" ripgrep = "latest" @@ -44,6 +47,9 @@ chmod +x scripts/* &>/dev/null && mise run activate && mise run install-tools && mise run update-tools +alias jj='jj --no-pager' +alias git='git --no-pager' +alias claude='claude --dangerously-skip-permissions' """ # deactivate/unhook when you leave leave = """eval "$(mise deactivate)" &/dev/null""" diff --git a/sbom.spdx b/sbom.spdx index 1c08b8e..58a9804 100644 --- a/sbom.spdx +++ b/sbom.spdx @@ -2,133 +2,178 @@ SPDXVersion: SPDX-2.1 DataLicense: CC0-1.0 SPDXID: SPDXRef-DOCUMENT DocumentName: thread -DocumentNamespace: http://spdx.org/spdxdocs/spdx-v2.1-bdaf1bc4-27a6-497c-91b3-84bb446f3f79 +DocumentNamespace: http://spdx.org/spdxdocs/spdx-v2.1-57b0f83e-8b15-4b0f-966d-445319e3f8c0 Creator: Person: Adam Poulemanos () Creator: Organization: Knitli Inc. () Creator: Tool: reuse-5.0.2 -Created: 2025-07-14T01:39:20Z +Created: 2025-07-19T20:22:05Z CreatorComment: This document was created automatically using available reuse information consistent with REUSE. 
-Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c09350ad7681b31e00d4cbe9347d9a28 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9772cb77ce5f8e5ab40c5d0d883c6b6a +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-07e72f6ebe551d6b95993c9c9a521efd +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c54a7e1da625b12bea9d5d2e49d246ca +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-be45515061eb2b0ca15134f996f70d05 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-f97dbf2efbcd421cf45c55fdb45ac4c6 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-cc7a2a5256f01f6b53ffbf2c91d8816d -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-21b9569e1356654b65ce429b761504cc -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-717f431b53324ab5982503ba0942204c -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-f242c4117b8e90f50a4d999c8f6c41d3 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c8f20abbd94580a0c18d89b6dd006e95 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2a025576974c7d06cd76b3e6f9a03eb2 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-bb3756cee51bf678543b049afb283a49 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-30a1d7c15b8accb80dd84e03cfbac958 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2da833ce75133e5a7f1fe03c4277501f -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3d234cbb2af38a2552a260838c2f2780 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b40d0890d8726673b522cad1983fa61d +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-470aec490d4744118f0e42c8e2440f45 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b52bef1803120595f4b2a7829dcbb9ff Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-78c5a17a25c0fbe9a485604f75a75a88 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-254efab25cd56c3d35894c02bf061106 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-f49a1411a64e66a5b47a9f313f58203d +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7c64483b50d50057e9bf774832a41335 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-bfa8b0f656a6947307a8a9ee5571b8d4 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3b4809ddb478fee87ed9acc56cd80263 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c788f66f6d7bce45f2995c46adb99977 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c16667d35b7d7d38b7560050130cd6aa Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7383628c997a024f664b4622800b3266 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-0fcae69dc9c3a42d7d0f5ac5ec0d4e73 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b8aca7d6890642118a77e49b86db1b02 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2a39efeba1882ae4dea0a92c6ca89dc7 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-39099658cc1b787334fbcc7ed50ef03b +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-a1e1f4aabd2e746a94db5b19ceab994b +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5824608efe5377128ba30a46450c059b Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-da4f36c220bf6db4007063fdd168f1f7 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4ec674bfa203a164ea31b2464eafb449 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-8a6591f856789d9609a1dbf8700d8f27 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-440a0faf4d201fe3c28a5b2a3eff13e3 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-aed67bd6c3c2f6a3ffea77891c719508 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3dcad423b391e453fbcd509041b97d5c -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-09fa6f8a26e96de897874878768a974a -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5e236e0e972d31b24852404c91041914 -Relationship: SPDXRef-DOCUMENT DESCRIBES 
SPDXRef-e5a2d0f29ecfa6b574bbacdb4fe20afe -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-678e829f8b6e1ea82c4598ec9fe3fa77 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3e193753a71b49d2d17a6013aca71a00 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-a7144ef2367f48f40b80a010550cbfa4 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9f11ae50ad1f27f036c90be0bdeef272 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-079ca752921b2b370f642dd3673a3dad -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-f73073cd1916b86a614c6401c1c6be54 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-e4755e8a20b532662c4530fbc225e898 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-8d9a2705429af553ea97908a15b1a250 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-eca11699fdb87d6c1440c09cb46cd048 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-a6bf022467040f75b7cf2db6a7f24fad -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c96a8cba67b92fac68d6f5e206270b4c -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-0f021921fd74ba2de2a831053b5a9646 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-69ac961743696227e7411fc5ca2e6a94 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-99407358b674bbb340970ee47b97c40d +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-10d5611a1e0c8e3e02bd3d4b6324160e +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-0617cdfaf5e8a6a2fbc107f258595637 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-511280701d0ea20f85dc8a0164648145 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-ec6b164aa3b721341d579b6978719839 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-6f3e55dd5b25c4af7dbd70921c9abd52 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9e78a25e26d11febe57b342f5650eb32 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b2f41e5acab68a9ce200ea2ee88723d8 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-143d447ff55195e81166cd9f44ba8244 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9fc7f2e7b5aada08f2b3557fed51a3ff +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-991c6b69c972b880a69ad024f0c07311 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-cc42a83a476dead4a95ad10c60c332a8 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-01713a9784b65de30ec6e463ae5cb2a8 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b1621c440e11b332e6a1c7cea66d04d1 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-82236b51eee80388f254120dde3e41b3 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-d21244bb838c7d9a2e5de8619a6377fd +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-28b06523c48f3ba2a481874bcf9f78e9 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c375f56ad1017c32d8cd14b6bb931155 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-26feb5cd4a18c478407544ca536efbe6 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-d6903737be443edab488390cde72ef44 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-8a5a7f1116a67058386473b2917e3764 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-01e2dccd15ff532c7ffc6fde1f9d610a +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-d44b27a20b1566215674619b967ea82f +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-378498ea06c3736f1e80991eb40e05f7 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-e556605ef942814f1aa6257de95a6f55 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-09f48fed9cb0332237f44fa8dfa5ebe0 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-ec9301d771546992ab95448a1b9f6c4e Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-6f08941627643ff7977c7a9a5310735c -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-8d363400b7e6b0d2caf830a079503a3f -Relationship: 
SPDXRef-DOCUMENT DESCRIBES SPDXRef-38f6c247a07e71a3e1a4c768ad7f9309 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-02449f9481b452608b5c7e583570efd0 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-449bbe9e330b8492ee9368aac216056e -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-14ccdccbf1f49876d41aebffa5d6f489 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-100a39ce86f3adc287535c4d31e2ed57 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b9cf4bbbc385f28b19a397c567b22ca2 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-01ebf8805fb04c95006afb70a074f4bf -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-e0a8d8b7c0cef31c000331c66a98db64 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5149c394d8e1c6866a99fdb403e2b3e1 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7d06bafb476dca352c5f6373fe6bdc07 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2b16ccc2ab42f014960c65bbc80e7b85 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-ff05fc4fdc5336e80d443fd4d8c02b92 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3a0d04868bd59307158fa676595721be -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4fe6c42670dd239542cdc3dac7bd82dd -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-e301ecf3829e863cb4f05049e91152aa -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-d360182a7086a20b3c0705c58463e3b6 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7d48e8f79f50a633a04cbf4594fcb5b1 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3cf739e8034d59665604a60997ca06c9 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-1453f5fb98775d83908543158e864718 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-0cd6be9db8cfe7338445e9d77500e13b +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-1dbadf0d5f299a87e67e13e50912ea46 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-6a7d9b1bf74e92a66496575c971496c9 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-93cc0cca6ac0a415a8e94ed725ee0ac6 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9adcbf86a0c18c69d64450486315c9c3 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-8ce21b9450fb9727e9e910389d6eccb0 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-fe05fb9fad746d3907eea8f5ebf0e52e +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-84dd26a32e71e8598737af625401f1c9 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4755e858a3ddf7999bf59f40c48821c6 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-87aabc808822ef9f16e6eb7ec57c4225 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-62bef916fc2e3aca8bc6f2582e01a3cb +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9242e18eec18fbbe46872b994d521352 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-1f3ad30e3477c1e63ce14c2b0491f134 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4679e2fb2736dfbe261dbc7014925aaa +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9f316d46be3893ad33252a9f85f0cd69 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-1a0b0ab05a8a32eb52ae983d7993792b +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-03dbb7df859510a45820e8afcb4db8b8 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-cc74a7e5dcaa4e3e6e59af5747b774c8 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5f68a60d241551b9478e8da4e1947f32 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5cf6bde490d11af95e99fe591950a539 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4e5549d97f78322abe3bf02fa034442f +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-88ddbbb712f12d1ad192aee361c9f00b +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c2ece0178066fe3eb3590b9967240cc9 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4f4f1b13f17307595fe3fe7d84552320 
Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c63352adae2e4c79a3b366b724c62636 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3de79ae5595ae3fff8fb9a3943888bc7 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-37778e089c4f835cc37a80d4e0033e33 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-8a13b12f73326f068db26c486b0b53b9 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-6f1c4f174deef93c39f8e3513e31a2ab -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-fd89fb00e1677ed98e349ce1ee67cd47 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7184ddfe7e64d5a27c5ce6e6e2481cad -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-865a5c4b0712c0fab675e38cae3522fa -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-d042521c12b5e973e7d11f9257d66bc0 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-992b79db975e1d3dd355d6fb729e3127 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4438ac8225780467f1f6af36cac3c607 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-e9240ed0fd5d93b5e591c23bfe7828ee -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-d41de76e0684cf1b8272f4524d01d712 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7d89758e1293be0feef9f86cc906aaf4 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c62ba560541c1b3e24aaf44e75124458 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c584db239ad3a674d3d80e685a810051 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b59c4db430766b146a4d16788cee7744 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-8d9d11f1a601311bb2d47b9789518d10 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-1018ceacc606f9bbb70eff0af5af576d -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-da48ef7cd1a99ab090dcbff5fbadedbb -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-f9e22dfe0b5443081b3325e5d29ab5a4 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-92c37420e3cb62fb03ea8771867d17d8 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5809c17d3c8ce2fdfe390918384ff879 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7cd1d69437809fba1e7da6f80d225dff -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-84cf09fe3ed0320d1e4514ff60bd281a -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-f46cf71b99032f5ccf7acc5458518a42 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4eb0e1712f8060bad99f9cf87751bae9 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9b0f570ff5d7001399b0082cfd36655b +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-44e3265b570a8ea06c6de3c1c19a88e8 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-f0bcf2e352906ea7e459f2904978e7e1 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5d82d5b7b5fcb8e574ff8f9e9f6cd950 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-72421e01ae5225be857aec8880eff7ff +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4344b19edab200ad510e59f6f65d9e67 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-6b093b0c4568d88fbe435cb2d5f8a6cf +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b4984edc7c211bd6548a399e64225b7c +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-69b1eaf75e7587a9ff31c790e499773f +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-8c69dae73521d0cae6103c91eca41537 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-78ba32a9a61842d00cd60c7aa7b53870 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-d84cd3748a1af00f29c0574be5dbf285 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-84b279b1a28794e6812ca95ba3c3b32b +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-755c900c2c113574ce12387662134517 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5b1d40119bd22e0bdda07ff21d0bfcb4 +Relationship: SPDXRef-DOCUMENT DESCRIBES 
SPDXRef-71503a120709919546f858fabb02fef5 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3b391b3b3ace0233c17487d0c8c59bc3 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4ff01a81cb40fa09167b85fcdb7d717c +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-aaeebd3424e75edc252e0fc0f9c40357 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-16c322de0974dc8ede200f764998850d +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c8716e9c443b191bdda41f18123231bc +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c20e2a59852c3e934e7c5219f37f164d +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-84c5799fc4644e42e9377609b1a0d8ba +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-d4e5aee67a46f52b20e9ddc3cbf7f8a1 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-66c8e5b8b74181dab0efa93fedf04775 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3e20833918d83e1501367c81f699cd28 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-43b6916f4130a2f307648cbd8780c6ca +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2a1b8523c7ed302d1ae757565e9833ba +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-f8fc8dfa9cd986e616d1261fa6e3b60b +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-78101a273943c2ee663817abf1cec511 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9fb9f55a41c065aac1ce6ce1c46a6548 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-46d9043e3c5e09e750583361293dc3e3 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-387aea8823d88cdf386814687489a8a9 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-67c0c3ec0d27bbefa006f4f6b4435aaa +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2ec964f49a3ef9ff1163a8f86f6abd52 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-8f82799968b369568ab71617b47acae1 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-1186548580f204803d45c200407cf83e +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-fbc59c602fbeb91a3ad6eb814d67fcbe +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-ab578ef52433772de1a1ac40c24c5dd7 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7f63724cdb7306f17cb7ebb13b9696cf +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2f42843e8bad608efebd3fe792506733 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-cf45e0b9fb205344b522fc26a8298235 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5ee3f66b8956f2ce1eff477aa68edd88 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-d979566a50fe94181ba534401c2c62b1 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3c88c068e0fd6eb9b98ec09e04413a5b +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4e5f3eee8ceebc5edde78a3911cfdb49 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-402f57903a72a929777f9ecb50757632 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7b2801debffe14ddad59a2b912c1b3d1 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-cce5ffe10aba663335680e4e43bd9288 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-5aa13c18af5ef3dec3ecab0f8c63bb7a +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-c4c348a439df60cdf4aff38a25210bc1 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-a25112f597e8755c458099641b73386d Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-a52bee01e6a3136dfb1aa4c5801ca671 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-db4b8d6674824e5018c48e38aff38022 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-3748ffc1bb58ea2ea7cf599ef81e64a7 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-979fb0d254aeba65211578ff8b35684d -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-640c584f4da01eb49b738f7c45c188af -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-49d2aa98c1d7212438eee6fd73d05d7f +Relationship: 
SPDXRef-DOCUMENT DESCRIBES SPDXRef-cd66635fb95e3bdc6e6138501afb03cc +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-ba1acab5d40bff5ac621372263b6d331 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b6ae9ad907495912115b7c8f9d53809e +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2c0d6c272508977525c03e1c21ab05dc +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-a2d59c457404f3e2c7adf8057a6e3767 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-9f2ead9ce46a115b43f8e59e3f8daf88 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-7d9220e1bfa8d6cd26e5486c4d0116d1 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2b3c6dc79aaa8ab187f18575424cec72 Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-b0c7afe8a516b792025a21fac26f330d Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-00ab9cf365c27b94b2da4a73ab9274ba -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-acdd7bc65db8153d946983a5cd0906b5 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-0fa26d060b21f2196307d9d54cc6b9d4 -Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-45d1bf4a69990c4f8b0526410fd5fc08 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-fd90e7c132390ec22cac057ad5f86804 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-2a14fe7d658a46cff7436cfe88998325 +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-4939545db8b1a8a0b923d19c81ab970d +Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-29201f21040fd5b280c7cd4c8a504dda + +FileName: ./.gitattributes +SPDXID: SPDXRef-07e72f6ebe551d6b95993c9c9a521efd +FileChecksum: SHA1: 6456f547856c9ccc390f83394754d1318932b448 +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./.github/actionlint.yml +SPDXID: SPDXRef-c54a7e1da625b12bea9d5d2e49d246ca +FileChecksum: SHA1: 2be288649886e73b96f3e60f9f9f3b9dc04fde19 +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./.github/chatmodes/analyze.chatmode.md -SPDXID: SPDXRef-c09350ad7681b31e00d4cbe9347d9a28 -FileChecksum: SHA1: 861098d2e2826f0c3fa82f390f5191b68657bd24 +SPDXID: SPDXRef-be45515061eb2b0ca15134f996f70d05 +FileChecksum: SHA1: fc16b6888e7fefccbdb7130836fd7170eb5fd0e7 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./.github/dependabot.yml -SPDXID: SPDXRef-9772cb77ce5f8e5ab40c5d0d883c6b6a -FileChecksum: SHA1: c48fceac5d24c9f1cf826c6e9bba141ad70e3735 +SPDXID: SPDXRef-f97dbf2efbcd421cf45c55fdb45ac4c6 +FileChecksum: SHA1: 2169a4f3f33aca28f9128fdd26c0ceb9ae703cf9 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -143,24 +188,24 @@ LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./.github/workflows/ci.yml -SPDXID: SPDXRef-21b9569e1356654b65ce429b761504cc -FileChecksum: SHA1: abfddbf281da1261272cffb71e0b5389facca5a9 +SPDXID: SPDXRef-c8f20abbd94580a0c18d89b6dd006e95 +FileChecksum: SHA1: b790dad3a272abeede72eeb7668ceb6761de8afc LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./.github/workflows/cla.yml -SPDXID: SPDXRef-717f431b53324ab5982503ba0942204c -FileChecksum: SHA1: 456e15b358b7b603e3ef4c8e044badff5f18241d +SPDXID: SPDXRef-2a025576974c7d06cd76b3e6f9a03eb2 +FileChecksum: SHA1: 49de827f0cbe38e4d48c753281faa497bf1a9a26 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./.gitignore -SPDXID: SPDXRef-f242c4117b8e90f50a4d999c8f6c41d3 -FileChecksum: SHA1: 7462ac10548581cc76e63cf5c585755fbce3149b +SPDXID: SPDXRef-bb3756cee51bf678543b049afb283a49 +FileChecksum: SHA1: 4075a806092f7757ec209c0e234028b2c91a6cc9 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -175,16 +220,16 @@ LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./.vscode/settings.json -SPDXID: SPDXRef-2da833ce75133e5a7f1fe03c4277501f -FileChecksum: SHA1: c595166ccf40a441ad0371b9f32fae9923466b12 +SPDXID: SPDXRef-b40d0890d8726673b522cad1983fa61d +FileChecksum: SHA1: 9b31788fc27afb2274b52ca4f6cfa207f64774b9 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./.yamlfmt.yml -SPDXID: SPDXRef-3d234cbb2af38a2552a260838c2f2780 -FileChecksum: SHA1: 85fb0f5c06af8d78d149d41445566a9bed4ba3fe +SPDXID: SPDXRef-470aec490d4744118f0e42c8e2440f45 +FileChecksum: SHA1: e12c7340cebc58a4a60beeac59e2c0aa610f6186 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -207,16 +252,16 @@ LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./Cargo.lock -SPDXID: SPDXRef-254efab25cd56c3d35894c02bf061106 -FileChecksum: SHA1: eac1a8443fb125367ad3746046045a967694ea70 +SPDXID: SPDXRef-7c64483b50d50057e9bf774832a41335 +FileChecksum: SHA1: f062c7049a6dfc29818c3007d8d4e47851839a0e LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./Cargo.toml -SPDXID: SPDXRef-f49a1411a64e66a5b47a9f313f58203d -FileChecksum: SHA1: 0ed26e100efc39f4d4520b37c06bd840ccf1ccd6 +SPDXID: SPDXRef-bfa8b0f656a6947307a8a9ee5571b8d4 +FileChecksum: SHA1: ba77c3c6e4b8be699267b760e210530a76820234 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -231,8 +276,8 @@ LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./README.md -SPDXID: SPDXRef-c788f66f6d7bce45f2995c46adb99977 -FileChecksum: SHA1: 32c0f8f702ff587817c9b143cb1dcbd9bd0196bc +SPDXID: SPDXRef-c16667d35b7d7d38b7560050130cd6aa +FileChecksum: SHA1: cefb1614025e52e89680215c1ed5198f8d35d07f LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -247,8 +292,8 @@ LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./_typos.toml -SPDXID: SPDXRef-0fcae69dc9c3a42d7d0f5ac5ec0d4e73 -FileChecksum: SHA1: 45f3331f8e49d6c4691bbe1ceffb5493e1390b17 +SPDXID: SPDXRef-b8aca7d6890642118a77e49b86db1b02 +FileChecksum: SHA1: 62cd32ccdbfbe23cb82baed05b12cf0fce04a4c0 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -263,8 +308,16 @@ LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/ast-engine/Cargo.toml -SPDXID: SPDXRef-39099658cc1b787334fbcc7ed50ef03b -FileChecksum: SHA1: f4052fe44bd26ee3b42fe7fca74c6d14ab2ca540 +SPDXID: SPDXRef-a1e1f4aabd2e746a94db5b19ceab994b +FileChecksum: SHA1: 6fc3aad9796b65042b08f54b76dbbb2dfb46fb75 +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/ast-engine/README.md +SPDXID: SPDXRef-5824608efe5377128ba30a46450c059b +FileChecksum: SHA1: 435d13dbfedf1660d5d2ec6dc842517e79247876 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -278,9 +331,16 @@ LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. +FileName: ./crates/ast-engine/benches/performance_improvements.rs +SPDXID: SPDXRef-10d5611a1e0c8e3e02bd3d4b6324160e +FileChecksum: SHA1: 61ae466bb6ecbc71d7d26cdd837a5fba2f744d1e +LicenseConcluded: AGPL-3.0-or-later +LicenseInfoInFile: AGPL-3.0-or-later +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + FileName: ./crates/ast-engine/src/language.rs -SPDXID: SPDXRef-4ec674bfa203a164ea31b2464eafb449 -FileChecksum: SHA1: 082758427677ecf4d213c96a37111cf471aa547f +SPDXID: SPDXRef-0617cdfaf5e8a6a2fbc107f258595637 +FileChecksum: SHA1: f4b2a189dd277fa0277fdca20e73cca36e3e51ed LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -288,8 +348,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/lib.rs -SPDXID: SPDXRef-8a6591f856789d9609a1dbf8700d8f27 -FileChecksum: SHA1: 8dfafad95d23f9d63c457b1facb80461962be64a +SPDXID: SPDXRef-511280701d0ea20f85dc8a0164648145 +FileChecksum: SHA1: 7f17156e3c72e7cb869a03ec0f4df11683dc973a LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -297,8 +357,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/match_tree/match_node.rs -SPDXID: SPDXRef-440a0faf4d201fe3c28a5b2a3eff13e3 -FileChecksum: SHA1: 470546cd5b13e2df533408f9d0efe96ced9cf837 +SPDXID: SPDXRef-ec6b164aa3b721341d579b6978719839 +FileChecksum: SHA1: b94adcabaff30b55e024df24c86cdf0a72d0012e LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -306,8 +366,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/match_tree/mod.rs -SPDXID: SPDXRef-aed67bd6c3c2f6a3ffea77891c719508 -FileChecksum: SHA1: 9ad496d56b4dbdf8c617a778e4800cb81459875f +SPDXID: SPDXRef-6f3e55dd5b25c4af7dbd70921c9abd52 +FileChecksum: SHA1: 14c60649c88759a2cfc0d17c559f5f3b158fbd3d LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -315,8 +375,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/ast-engine/src/match_tree/strictness.rs -SPDXID: SPDXRef-3dcad423b391e453fbcd509041b97d5c -FileChecksum: SHA1: 36ada28965ca4e4babefb2ceb733c7b77ea38102 +SPDXID: SPDXRef-9e78a25e26d11febe57b342f5650eb32 +FileChecksum: SHA1: b722fc3717e4971bfcf5ed1702c179e86e5fdbfe LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -324,53 +384,69 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/matcher.rs -SPDXID: SPDXRef-09fa6f8a26e96de897874878768a974a -FileChecksum: SHA1: 044f31fbf693c33f08e143776eb27842cab36ab0 +SPDXID: SPDXRef-b2f41e5acab68a9ce200ea2ee88723d8 +FileChecksum: SHA1: cb61df836b9eb4df491995a37c519b9d9c414a8f LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> SPDX-FileCopyrightText: 2025 Knitli Inc. -FileName: ./crates/ast-engine/src/matcher/kind.rs -SPDXID: SPDXRef-5e236e0e972d31b24852404c91041914 -FileChecksum: SHA1: cbf8fbabf090ddb5945adbd93f0db726738e07d3 +FileName: ./crates/ast-engine/src/matchers/kind.rs +SPDXID: SPDXRef-143d447ff55195e81166cd9f44ba8244 +FileChecksum: SHA1: c86a66f7a3758c7460e4d2a891f942a1d06d385c LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> SPDX-FileCopyrightText: 2025 Knitli Inc. -FileName: ./crates/ast-engine/src/matcher/node_match.rs -SPDXID: SPDXRef-e5a2d0f29ecfa6b574bbacdb4fe20afe -FileChecksum: SHA1: 0d88702211c643dc5758c184448bd2ffac74c0f1 +FileName: ./crates/ast-engine/src/matchers/mod.rs +SPDXID: SPDXRef-9fc7f2e7b5aada08f2b3557fed51a3ff +FileChecksum: SHA1: 1e3a28399ef3ea31826ba5cf811a83e8cf8df067 +LicenseConcluded: AGPL-3.0-or-later +LicenseInfoInFile: AGPL-3.0-or-later +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/ast-engine/src/matchers/node_match.rs +SPDXID: SPDXRef-991c6b69c972b880a69ad024f0c07311 +FileChecksum: SHA1: 0b1ed8fbdd0ba2809c6436572ccafed574a53f12 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> SPDX-FileCopyrightText: 2025 Knitli Inc. -FileName: ./crates/ast-engine/src/matcher/pattern.rs -SPDXID: SPDXRef-678e829f8b6e1ea82c4598ec9fe3fa77 -FileChecksum: SHA1: 4e5a27f9051b73d4b9acf433f71d40ce16d0539d +FileName: ./crates/ast-engine/src/matchers/pattern.rs +SPDXID: SPDXRef-cc42a83a476dead4a95ad10c60c332a8 +FileChecksum: SHA1: 6bd4e09da650b62c4811231d8fbe1fb5710c9ca7 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> SPDX-FileCopyrightText: 2025 Knitli Inc. 
-FileName: ./crates/ast-engine/src/matcher/text.rs -SPDXID: SPDXRef-3e193753a71b49d2d17a6013aca71a00 -FileChecksum: SHA1: 5f6ac26ed9667c8f40e668b498ce8e5616b6dfd8 +FileName: ./crates/ast-engine/src/matchers/text.rs +SPDXID: SPDXRef-01713a9784b65de30ec6e463ae5cb2a8 +FileChecksum: SHA1: 3961bdef1fe8816e77be7fc99f020d014c6a4250 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> SPDX-FileCopyrightText: 2025 Knitli Inc. +FileName: ./crates/ast-engine/src/matchers/types.rs +SPDXID: SPDXRef-b1621c440e11b332e6a1c7cea66d04d1 +FileChecksum: SHA1: 37fe02f426016e4407a682d5b5136a39ef79fce3 +LicenseConcluded: AGPL-3.0-or-later AND MIT +LicenseInfoInFile: AGPL-3.0-or-later +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> +SPDX-FileCopyrightText: 2025 Knitli Inc. + FileName: ./crates/ast-engine/src/meta_var.rs -SPDXID: SPDXRef-a7144ef2367f48f40b80a010550cbfa4 -FileChecksum: SHA1: 0b28bb5f587a960c757864401e3f8ec9c693e994 +SPDXID: SPDXRef-82236b51eee80388f254120dde3e41b3 +FileChecksum: SHA1: 61b1e099ca5826bba0ea5dec4453bc88dbd5e885 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -378,8 +454,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/node.rs -SPDXID: SPDXRef-9f11ae50ad1f27f036c90be0bdeef272 -FileChecksum: SHA1: d7e3ca0e6cc67f20c424aaa03f1b18bf7434a9c0 +SPDXID: SPDXRef-d21244bb838c7d9a2e5de8619a6377fd +FileChecksum: SHA1: aa0e9a9f6f52c69040b75ecc07508e2365f7e4ee LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -387,8 +463,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/ops.rs -SPDXID: SPDXRef-079ca752921b2b370f642dd3673a3dad -FileChecksum: SHA1: c1559f4bc58a3613ace4f22229a42ed07fe2d98a +SPDXID: SPDXRef-28b06523c48f3ba2a481874bcf9f78e9 +FileChecksum: SHA1: 745443b40b1c0ee5482d263fecfb36c3e2ffa99b LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -396,8 +472,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/pinned.rs -SPDXID: SPDXRef-f73073cd1916b86a614c6401c1c6be54 -FileChecksum: SHA1: 6c3f40112ec87e6e8a8dd398801371bed77d7d45 +SPDXID: SPDXRef-c375f56ad1017c32d8cd14b6bb931155 +FileChecksum: SHA1: 8cab1668d6f88f5ff9af57030f6c9ffd258a24ec LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -405,8 +481,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/ast-engine/src/replacer.rs -SPDXID: SPDXRef-e4755e8a20b532662c4530fbc225e898 -FileChecksum: SHA1: e67dcc6408c5d3c424311f9cc9d9fce38dc71b2c +SPDXID: SPDXRef-26feb5cd4a18c478407544ca536efbe6 +FileChecksum: SHA1: 4a244a268c7b4f4128ae4607c62918c53e0ece9c LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -414,8 +490,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/replacer/indent.rs -SPDXID: SPDXRef-8d9a2705429af553ea97908a15b1a250 -FileChecksum: SHA1: 27ba9f2a32da151baa8f4d9de5eb07dc27371093 +SPDXID: SPDXRef-d6903737be443edab488390cde72ef44 +FileChecksum: SHA1: 1feca68b1a3a43c7c72ba60b0f1295f87b6a0ee8 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -423,8 +499,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/replacer/structural.rs -SPDXID: SPDXRef-eca11699fdb87d6c1440c09cb46cd048 -FileChecksum: SHA1: 18052197b6e0a771c8c421573b8108a7ed96e26b +SPDXID: SPDXRef-8a5a7f1116a67058386473b2917e3764 +FileChecksum: SHA1: 2cde4c088be5f128f1833a0329349f115d2c8b6e LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -432,8 +508,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/replacer/template.rs -SPDXID: SPDXRef-a6bf022467040f75b7cf2db6a7f24fad -FileChecksum: SHA1: fd09fc7b00e8279746417c560a3e40fc688e03b8 +SPDXID: SPDXRef-01e2dccd15ff532c7ffc6fde1f9d610a +FileChecksum: SHA1: ea927fc52f716c1b7440cc847a174d794ec8a923 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -441,8 +517,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/source.rs -SPDXID: SPDXRef-c96a8cba67b92fac68d6f5e206270b4c -FileChecksum: SHA1: cf89065168462ce594c1a33be4aa900168937faa +SPDXID: SPDXRef-d44b27a20b1566215674619b967ea82f +FileChecksum: SHA1: a646365ec895b187e18a08f8acb494651c1ca6b0 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -450,8 +526,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/ast-engine/src/tree_sitter/mod.rs -SPDXID: SPDXRef-0f021921fd74ba2de2a831053b5a9646 -FileChecksum: SHA1: e6f0e5e4492cbb9a1ea6ed45d52dc8ddf6184169 +SPDXID: SPDXRef-378498ea06c3736f1e80991eb40e05f7 +FileChecksum: SHA1: cb8a97a5e7f2a2c45ed4660599b8108ef7c6ea7a LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -459,8 +535,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/ast-engine/src/tree_sitter/traversal.rs -SPDXID: SPDXRef-69ac961743696227e7411fc5ca2e6a94 -FileChecksum: SHA1: 297876778fc6fcafe560759e37dfdcd055b07809 +SPDXID: SPDXRef-e556605ef942814f1aa6257de95a6f55 +FileChecksum: SHA1: 1712ce876b9f51e7f5047a477a3436caed3cacab LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -468,8 +544,16 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/Cargo.toml -SPDXID: SPDXRef-99407358b674bbb340970ee47b97c40d -FileChecksum: SHA1: 2d16d75e9b73c842a089e0e50ddc59405f825042 +SPDXID: SPDXRef-09f48fed9cb0332237f44fa8dfa5ebe0 +FileChecksum: SHA1: 51afc75835902170180785253d188cc07551d5bb +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/language/README.md +SPDXID: SPDXRef-ec9301d771546992ab95448a1b9f6c4e +FileChecksum: SHA1: 435d13dbfedf1660d5d2ec6dc842517e79247876 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -483,9 +567,18 @@ LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. +FileName: ./crates/language/benches/performance.rs +SPDXID: SPDXRef-1dbadf0d5f299a87e67e13e50912ea46 +FileChecksum: SHA1: e7ff347781b9b83648504b0f937865f18d7bd32d +LicenseConcluded: AGPL-3.0-or-later AND MIT +LicenseInfoInFile: AGPL-3.0-or-later +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> +SPDX-FileCopyrightText: 2025 Knitli Inc. + FileName: ./crates/language/src/bash.rs -SPDXID: SPDXRef-8d363400b7e6b0d2caf830a079503a3f -FileChecksum: SHA1: f6032441cdc06ad020c543da11ed928a17e3e30b +SPDXID: SPDXRef-6a7d9b1bf74e92a66496575c971496c9 +FileChecksum: SHA1: 8072953096f1ca0dcae055fca596159f21bc8e76 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -493,8 +586,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/cpp.rs -SPDXID: SPDXRef-38f6c247a07e71a3e1a4c768ad7f9309 -FileChecksum: SHA1: 71c803a8e2a86eec2190692d18c3628cacc1141c +SPDXID: SPDXRef-93cc0cca6ac0a415a8e94ed725ee0ac6 +FileChecksum: SHA1: 10bd6ccfe574caa4dd739f734ebad4d6badb086d LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -502,8 +595,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/csharp.rs -SPDXID: SPDXRef-02449f9481b452608b5c7e583570efd0 -FileChecksum: SHA1: 7efeb46943df29ad1c143f75228da5bdf655d17a +SPDXID: SPDXRef-9adcbf86a0c18c69d64450486315c9c3 +FileChecksum: SHA1: 32f02da0da4204b0089939c940ad8e711794ba59 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -511,8 +604,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/language/src/css.rs -SPDXID: SPDXRef-449bbe9e330b8492ee9368aac216056e -FileChecksum: SHA1: 74a9a06ce769907841924042a39cf8885a36d6f0 +SPDXID: SPDXRef-8ce21b9450fb9727e9e910389d6eccb0 +FileChecksum: SHA1: 4c9ab64b4977931ea9995194cd2aa0a251a9c3fc LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -520,8 +613,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/elixir.rs -SPDXID: SPDXRef-14ccdccbf1f49876d41aebffa5d6f489 -FileChecksum: SHA1: a5bd1b34e4431fb00f54c52d00caf67dd4ad1aba +SPDXID: SPDXRef-fe05fb9fad746d3907eea8f5ebf0e52e +FileChecksum: SHA1: 15a09881e448b5f2d56dc7c4d1bf558da0ef27a1 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -529,8 +622,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/go.rs -SPDXID: SPDXRef-100a39ce86f3adc287535c4d31e2ed57 -FileChecksum: SHA1: 1aa88793ae58baef1c1e825118ea8cda1d48678e +SPDXID: SPDXRef-84dd26a32e71e8598737af625401f1c9 +FileChecksum: SHA1: 71ac17064b16d4db11a6983ee1910992ec186228 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -538,8 +631,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/haskell.rs -SPDXID: SPDXRef-b9cf4bbbc385f28b19a397c567b22ca2 -FileChecksum: SHA1: c49d9ffcf483b2550c4216051c137daf2caae915 +SPDXID: SPDXRef-4755e858a3ddf7999bf59f40c48821c6 +FileChecksum: SHA1: ab8e3fdfba48f455fe452e85066b92bb3e9ffef8 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -547,8 +640,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/html.rs -SPDXID: SPDXRef-01ebf8805fb04c95006afb70a074f4bf -FileChecksum: SHA1: d65a2b124845c1cc2bebc64012efa788adbbb93b +SPDXID: SPDXRef-87aabc808822ef9f16e6eb7ec57c4225 +FileChecksum: SHA1: c49d87934ad12e94039ee59f7a000f82a71f775b LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -556,8 +649,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/json.rs -SPDXID: SPDXRef-e0a8d8b7c0cef31c000331c66a98db64 -FileChecksum: SHA1: b67a3b0df7087430397b79523fbd1aad26b0cdc7 +SPDXID: SPDXRef-62bef916fc2e3aca8bc6f2582e01a3cb +FileChecksum: SHA1: bc22290cd59cbca72ebc4d7ad42101f851fb188d LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -565,8 +658,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/kotlin.rs -SPDXID: SPDXRef-5149c394d8e1c6866a99fdb403e2b3e1 -FileChecksum: SHA1: d5d19506d172c8f683abc81514e61a4f2bad54ab +SPDXID: SPDXRef-9242e18eec18fbbe46872b994d521352 +FileChecksum: SHA1: 3b303f089d0c743802c0e0a653468d17ed6d4e75 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -574,8 +667,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/language/src/lib.rs -SPDXID: SPDXRef-7d06bafb476dca352c5f6373fe6bdc07 -FileChecksum: SHA1: 236802e1012e19faf3659de807a8ccfd04507e99 +SPDXID: SPDXRef-1f3ad30e3477c1e63ce14c2b0491f134 +FileChecksum: SHA1: d8e20e50f91cb1a6792af5c4bfc2ba478303cd5b LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -583,8 +676,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/lua.rs -SPDXID: SPDXRef-2b16ccc2ab42f014960c65bbc80e7b85 -FileChecksum: SHA1: 0563649957fe370d9645c05b922a0693baac001e +SPDXID: SPDXRef-4679e2fb2736dfbe261dbc7014925aaa +FileChecksum: SHA1: 7384ddc3ee2af2941fcd15c6a7a7fee662ba10c5 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -592,8 +685,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/parsers.rs -SPDXID: SPDXRef-ff05fc4fdc5336e80d443fd4d8c02b92 -FileChecksum: SHA1: 0724d08aa1d6eca5124d7a57ae94aa71a0a1b530 +SPDXID: SPDXRef-9f316d46be3893ad33252a9f85f0cd69 +FileChecksum: SHA1: a163dc163c7c86ad7514e5202d3c30cddd2a77e4 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -601,17 +694,26 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/php.rs -SPDXID: SPDXRef-3a0d04868bd59307158fa676595721be -FileChecksum: SHA1: 766232ae563e1238d2ee7ab8bc31d286cd3cf236 +SPDXID: SPDXRef-1a0b0ab05a8a32eb52ae983d7993792b +FileChecksum: SHA1: 864c85dc910714892ed2e4f060697ec4808cb6c3 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> SPDX-FileCopyrightText: 2025 Knitli Inc. +FileName: ./crates/language/src/profiling.rs +SPDXID: SPDXRef-03dbb7df859510a45820e8afcb4db8b8 +FileChecksum: SHA1: 6d7bb03592543231f225f031b6916621386e9f97 +LicenseConcluded: AGPL-3.0-or-later AND MIT +LicenseInfoInFile: AGPL-3.0-or-later +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> +SPDX-FileCopyrightText: 2025 Knitli Inc. + FileName: ./crates/language/src/python.rs -SPDXID: SPDXRef-4fe6c42670dd239542cdc3dac7bd82dd -FileChecksum: SHA1: 26b1bc6f9a6e27f59f0695988c25d514bb701da0 +SPDXID: SPDXRef-cc74a7e5dcaa4e3e6e59af5747b774c8 +FileChecksum: SHA1: 64a41c9190f076c447a9fa714bb7c5f11131645d LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -619,8 +721,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/ruby.rs -SPDXID: SPDXRef-e301ecf3829e863cb4f05049e91152aa -FileChecksum: SHA1: 8d973d35aa6a18c81bdc49e6bc58c9ce1db4cc97 +SPDXID: SPDXRef-5f68a60d241551b9478e8da4e1947f32 +FileChecksum: SHA1: d8216d76760e924e5cd2154c4693b6fb53bd0262 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -628,8 +730,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/language/src/rust.rs -SPDXID: SPDXRef-d360182a7086a20b3c0705c58463e3b6 -FileChecksum: SHA1: b83564f8fb64d2ba47f35af947f7b59962b28cf1 +SPDXID: SPDXRef-5cf6bde490d11af95e99fe591950a539 +FileChecksum: SHA1: bbab11a186618a91081d7dfe3236535fce3e5658 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -637,8 +739,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/scala.rs -SPDXID: SPDXRef-7d48e8f79f50a633a04cbf4594fcb5b1 -FileChecksum: SHA1: edf5fc8676806bfc3bc0553a599fcd5f730c2496 +SPDXID: SPDXRef-4e5549d97f78322abe3bf02fa034442f +FileChecksum: SHA1: ff26a8d00beeeb1f0a1c5bf36c2b338215743e60 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -646,8 +748,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/swift.rs -SPDXID: SPDXRef-3cf739e8034d59665604a60997ca06c9 -FileChecksum: SHA1: f424dc137132086bc093fc57a4ea79a9588bdbcc +SPDXID: SPDXRef-88ddbbb712f12d1ad192aee361c9f00b +FileChecksum: SHA1: 6da351622739120e60c7d3fa2bde984ea67d3a5e LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -655,8 +757,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/language/src/yaml.rs -SPDXID: SPDXRef-1453f5fb98775d83908543158e864718 -FileChecksum: SHA1: b443dd552fb8867179d7c18bb04cd89ee6215003 +SPDXID: SPDXRef-c2ece0178066fe3eb3590b9967240cc9 +FileChecksum: SHA1: 87cccb3eb69636563e81e616d5f1244247e65813 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -664,8 +766,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/Cargo.toml -SPDXID: SPDXRef-0cd6be9db8cfe7338445e9d77500e13b -FileChecksum: SHA1: dd9e5959bc4bdb867cf51fd3051615f4c09ba03e +SPDXID: SPDXRef-4f4f1b13f17307595fe3fe7d84552320 +FileChecksum: SHA1: 940d954b44ec7580dd82ae67fae05f81af8b62a4 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -679,9 +781,105 @@ LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. +FileName: ./crates/rule-engine/benches/README.md +SPDXID: SPDXRef-44e3265b570a8ea06c6de3c1c19a88e8 +FileChecksum: SHA1: 1225527dd65a326592c10946e4ecbe10b92a5831 +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/benches/ast_grep_comparison.rs +SPDXID: SPDXRef-f0bcf2e352906ea7e459f2904978e7e1 +FileChecksum: SHA1: 892939018a7d87efac60d0a08f5ecaf49da8570c +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/benches/comparison_benchmarks.rs +SPDXID: SPDXRef-5d82d5b7b5fcb8e574ff8f9e9f6cd950 +FileChecksum: SHA1: 33bb7572656b5ac35e73c90543adbebddd4632dc +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. 
+ +FileName: ./crates/rule-engine/benches/rule.yml +SPDXID: SPDXRef-72421e01ae5225be857aec8880eff7ff +FileChecksum: SHA1: 5e67b80b45f899ff3788cb5897c1d73c70a0666c +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/benches/rule_engine_benchmarks.rs +SPDXID: SPDXRef-4344b19edab200ad510e59f6f65d9e67 +FileChecksum: SHA1: 714042067ff7314471d9c5755e60be86e0beef0f +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/benches/simple_benchmarks.rs +SPDXID: SPDXRef-6b093b0c4568d88fbe435cb2d5f8a6cf +FileChecksum: SHA1: a54f0749f0583d632572fea1c9b6455184aea0b0 +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/serialization_analysis/README_SERIALIZATION_ANALYSIS.md +SPDXID: SPDXRef-b4984edc7c211bd6548a399e64225b7c +FileChecksum: SHA1: fbee6747ceaabff74d7f889bcf034c04d1e4aea3 +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/serialization_analysis/SERIALIZATION_ANALYSIS_REPORT.md +SPDXID: SPDXRef-69b1eaf75e7587a9ff31c790e499773f +FileChecksum: SHA1: 748032a5d6360a760aa20288b339256ce10f8c4a +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/serialization_analysis/analyze_serialization.rs +SPDXID: SPDXRef-8c69dae73521d0cae6103c91eca41537 +FileChecksum: SHA1: 6800ac1f36fe9fd1c362e8120fa68537e78f5fc4 +LicenseConcluded: AGPL-3.0-or-later AND MIT +LicenseInfoInFile: AGPL-3.0-or-later +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> +SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/serialization_analysis/separation_helper.sh +SPDXID: SPDXRef-78ba32a9a61842d00cd60c7aa7b53870 +FileChecksum: SHA1: 96bf1a6a35c1dbbf1004fc3822fb7afb0cd11ad1 +LicenseConcluded: AGPL-3.0-or-later +LicenseInfoInFile: AGPL-3.0-or-later +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/serialization_analysis/serialization_analysis.yml +SPDXID: SPDXRef-d84cd3748a1af00f29c0574be5dbf285 +FileChecksum: SHA1: 8207a9d42e00fb6879ddb1903a048fcff095f570 +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/serialization_analysis/serialization_analysis_report.md +SPDXID: SPDXRef-84b279b1a28794e6812ca95ba3c3b32b +FileChecksum: SHA1: 634c2ec44bfddca7ae800e3546e82166f77b5c84 +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. 
+ FileName: ./crates/rule-engine/src/check_var.rs -SPDXID: SPDXRef-3de79ae5595ae3fff8fb9a3943888bc7 -FileChecksum: SHA1: b4fe5353a4d440043ef69d91174a4c27bd7f1611 +SPDXID: SPDXRef-755c900c2c113574ce12387662134517 +FileChecksum: SHA1: 3098b53eb8d5c7354ca25d7b1be92e13d1a8ed80 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -689,8 +887,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/combined.rs -SPDXID: SPDXRef-37778e089c4f835cc37a80d4e0033e33 -FileChecksum: SHA1: 39634d3e682d89c08654fa8988bf938f74ee50b0 +SPDXID: SPDXRef-5b1d40119bd22e0bdda07ff21d0bfcb4 +FileChecksum: SHA1: e4a6fe5eb3ffee7a9479ba5d894dc87731c1d7a1 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -698,8 +896,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/fixer.rs -SPDXID: SPDXRef-8a13b12f73326f068db26c486b0b53b9 -FileChecksum: SHA1: 312a9d8fd7ac6c7d357c6f1105a4cd67326e6507 +SPDXID: SPDXRef-71503a120709919546f858fabb02fef5 +FileChecksum: SHA1: 07dbdedfc81ad0c4df220c8c98b0bff4cd7abfe6 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -707,8 +905,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/label.rs -SPDXID: SPDXRef-6f1c4f174deef93c39f8e3513e31a2ab -FileChecksum: SHA1: 5f812130fcb5ad071b9817f36f8cf2fd791a7742 +SPDXID: SPDXRef-3b391b3b3ace0233c17487d0c8c59bc3 +FileChecksum: SHA1: 1d7f4a305c09ec5a6aba6e0afda7d7dbbe0770c8 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -716,8 +914,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/lib.rs -SPDXID: SPDXRef-fd89fb00e1677ed98e349ce1ee67cd47 -FileChecksum: SHA1: 929c5540082e807f52371d1170e4617043383203 +SPDXID: SPDXRef-4ff01a81cb40fa09167b85fcdb7d717c +FileChecksum: SHA1: 4f896496007a9ca62f4fde23d258e688cf91103e LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -725,8 +923,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/maybe.rs -SPDXID: SPDXRef-7184ddfe7e64d5a27c5ce6e6e2481cad -FileChecksum: SHA1: 2f6337afa60f87909bc166c52725e8d4b84324e3 +SPDXID: SPDXRef-aaeebd3424e75edc252e0fc0f9c40357 +FileChecksum: SHA1: 68c65672b6ac761ebca03d7eed41f54651222477 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -734,8 +932,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/rule-engine/src/rule/deserialize_env.rs -SPDXID: SPDXRef-865a5c4b0712c0fab675e38cae3522fa -FileChecksum: SHA1: 085123a0a5bc43f264addab2338b4a6ae8761503 +SPDXID: SPDXRef-16c322de0974dc8ede200f764998850d +FileChecksum: SHA1: 93c3b5b7dbc2f0c6dcd647206916c13e9294d660 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -743,8 +941,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/rule/mod.rs -SPDXID: SPDXRef-d042521c12b5e973e7d11f9257d66bc0 -FileChecksum: SHA1: 9cd0e8f8a51c94936f0ef8a1d84d3bf0cb84f5e0 +SPDXID: SPDXRef-c8716e9c443b191bdda41f18123231bc +FileChecksum: SHA1: f0d36f0f03354b98c229b598e5be2f098ab5f488 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -752,8 +950,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/rule/nth_child.rs -SPDXID: SPDXRef-992b79db975e1d3dd355d6fb729e3127 -FileChecksum: SHA1: e5ad145f16d75eb582f5952a7436fb416a69260a +SPDXID: SPDXRef-c20e2a59852c3e934e7c5219f37f164d +FileChecksum: SHA1: dd7fd3b130a2e7ecf73e293d2cd89df9effee25b LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -761,8 +959,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/rule/range.rs -SPDXID: SPDXRef-4438ac8225780467f1f6af36cac3c607 -FileChecksum: SHA1: 5ff31e3bd3195b380d6bde980ef759ce4d14b3e1 +SPDXID: SPDXRef-84c5799fc4644e42e9377609b1a0d8ba +FileChecksum: SHA1: 583ebcd5014fbf31e3d7d04ac39c63d10467fd40 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -770,8 +968,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/rule/referent_rule.rs -SPDXID: SPDXRef-e9240ed0fd5d93b5e591c23bfe7828ee -FileChecksum: SHA1: 461bd7135890f5c500ed3008adc2d1f3d460624d +SPDXID: SPDXRef-d4e5aee67a46f52b20e9ddc3cbf7f8a1 +FileChecksum: SHA1: 3410d941b6caf35ae06bbceee30aeb8cf6a5fb11 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -779,8 +977,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/rule/relational_rule.rs -SPDXID: SPDXRef-d41de76e0684cf1b8272f4524d01d712 -FileChecksum: SHA1: c8a6c7c77765af4c5dc010296126f665f2f02a4b +SPDXID: SPDXRef-66c8e5b8b74181dab0efa93fedf04775 +FileChecksum: SHA1: bba0c9184158934a041ef3e5bfd33b4b97ca5a03 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -788,8 +986,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/rule-engine/src/rule/stop_by.rs -SPDXID: SPDXRef-7d89758e1293be0feef9f86cc906aaf4 -FileChecksum: SHA1: 606dbbbec37dcad9bbe77513e0bf045b71038326 +SPDXID: SPDXRef-3e20833918d83e1501367c81f699cd28 +FileChecksum: SHA1: 6a6725ef5b74d5f31c7d31472f5670328b6a5f68 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -797,8 +995,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/rule_collection.rs -SPDXID: SPDXRef-c62ba560541c1b3e24aaf44e75124458 -FileChecksum: SHA1: 97d19958210ceb8ec8d7c1daf43e1a9140d38b0c +SPDXID: SPDXRef-43b6916f4130a2f307648cbd8780c6ca +FileChecksum: SHA1: 4c76b67eb0864f671163b577a8adbc8465f0b329 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -806,8 +1004,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/rule_config.rs -SPDXID: SPDXRef-c584db239ad3a674d3d80e685a810051 -FileChecksum: SHA1: 15514a63ab2cd47cd5782193d4513f8f1ed462b2 +SPDXID: SPDXRef-2a1b8523c7ed302d1ae757565e9833ba +FileChecksum: SHA1: 67f9fd8817f84d7bb3728efddcd3ce073f313d87 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -815,8 +1013,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/rule_core.rs -SPDXID: SPDXRef-b59c4db430766b146a4d16788cee7744 -FileChecksum: SHA1: 229fdeae7376f2e4d0ed5c24b3e8b6d02bf7a4f4 +SPDXID: SPDXRef-f8fc8dfa9cd986e616d1261fa6e3b60b +FileChecksum: SHA1: 9b25ea6eaa10a0daf31a2a580ee17cba89d1edba LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -824,8 +1022,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/transform/mod.rs -SPDXID: SPDXRef-8d9d11f1a601311bb2d47b9789518d10 -FileChecksum: SHA1: 1172df792c087a0f35060ee833abdf8f8274b324 +SPDXID: SPDXRef-78101a273943c2ee663817abf1cec511 +FileChecksum: SHA1: 3c419cb6a73997b0655c8d461d5ddf2d024f8781 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -833,8 +1031,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/transform/parse.rs -SPDXID: SPDXRef-1018ceacc606f9bbb70eff0af5af576d -FileChecksum: SHA1: 82bab17cf05ded5eb40b2c39f8052e45bc9fe77f +SPDXID: SPDXRef-9fb9f55a41c065aac1ce6ce1c46a6548 +FileChecksum: SHA1: c393f511bc1aa4efd4713751b74c96c64f8bc95e LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -842,8 +1040,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/rule-engine/src/transform/rewrite.rs -SPDXID: SPDXRef-da48ef7cd1a99ab090dcbff5fbadedbb -FileChecksum: SHA1: cf53f6b262b9cca03a0e994447b934e264103255 +SPDXID: SPDXRef-46d9043e3c5e09e750583361293dc3e3 +FileChecksum: SHA1: 171e04949d7ca93d3775813351f47bb760e0740e LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -851,8 +1049,8 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/transform/string_case.rs -SPDXID: SPDXRef-f9e22dfe0b5443081b3325e5d29ab5a4 -FileChecksum: SHA1: bd670c800289077f66be1c525d1f9bdc585f18be +SPDXID: SPDXRef-387aea8823d88cdf386814687489a8a9 +FileChecksum: SHA1: 92daabff413359fe9e0ce87049bbbdd63f98c918 LicenseConcluded: AGPL-3.0-or-later AND MIT LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT @@ -860,53 +1058,107 @@ FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883 SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/rule-engine/src/transform/trans.rs -SPDXID: SPDXRef-92c37420e3cb62fb03ea8771867d17d8 -FileChecksum: SHA1: a6023fd2159d94a1d3fc945b6646b60347da031e -LicenseConcluded: MIT +SPDXID: SPDXRef-67c0c3ec0d27bbefa006f4f6b4435aaa +FileChecksum: SHA1: 413ec7958fc17b88173eab82b74cd9e0b4f3206a +LicenseConcluded: AGPL-3.0-or-later AND MIT +LicenseInfoInFile: AGPL-3.0-or-later LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2022 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> SPDX-FileCopyrightText: 2025 Knitli Inc. +FileName: ./crates/rule-engine/test_data/sample_javascript.js +SPDXID: SPDXRef-2ec964f49a3ef9ff1163a8f86f6abd52 +FileChecksum: SHA1: d854ee84a37715307da6c512a34befb6e5476aad +LicenseConcluded: AGPL-3.0-or-later +LicenseInfoInFile: AGPL-3.0-or-later +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/test_data/sample_python.py +SPDXID: SPDXRef-8f82799968b369568ab71617b47acae1 +FileChecksum: SHA1: 1f864a1e0c3abf6f98819594e3a1013d4409e671 +LicenseConcluded: AGPL-3.0-or-later +LicenseInfoInFile: AGPL-3.0-or-later +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/test_data/sample_rust.rs +SPDXID: SPDXRef-1186548580f204803d45c200407cf83e +FileChecksum: SHA1: a50d6b36c13cdb2048f21dece72581b9a13bd957 +LicenseConcluded: AGPL-3.0-or-later AND MIT +LicenseInfoInFile: AGPL-3.0-or-later +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com> +SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/rule-engine/test_data/sample_typescript.ts +SPDXID: SPDXRef-fbc59c602fbeb91a3ad6eb814d67fcbe +FileChecksum: SHA1: 1c9a937682da21649671c999cce7c17154fdef45 +LicenseConcluded: AGPL-3.0-or-later +LicenseInfoInFile: AGPL-3.0-or-later +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + FileName: ./crates/services/Cargo.toml -SPDXID: SPDXRef-5809c17d3c8ce2fdfe390918384ff879 -FileChecksum: SHA1: aba99c2c38dc7adf6f050377c9d71e353d392dd3 +SPDXID: SPDXRef-ab578ef52433772de1a1ac40c24c5dd7 +FileChecksum: SHA1: b3d9cdc7bed48fef1637fb679b7ef8752cf724fe +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. 
+ +FileName: ./crates/services/README.md +SPDXID: SPDXRef-7f63724cdb7306f17cb7ebb13b9696cf +FileChecksum: SHA1: 435d13dbfedf1660d5d2ec6dc842517e79247876 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/services/src/lib.rs -SPDXID: SPDXRef-7cd1d69437809fba1e7da6f80d225dff -FileChecksum: SHA1: dd903c10449b6fff6e59b9a6e8ab1963a8328014 +SPDXID: SPDXRef-2f42843e8bad608efebd3fe792506733 +FileChecksum: SHA1: 27ad6389ddf9c9d2c5f18b2bc301bb5cf7ea553e LicenseConcluded: AGPL-3.0-or-later LicenseInfoInFile: AGPL-3.0-or-later FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/utils/Cargo.toml -SPDXID: SPDXRef-84cf09fe3ed0320d1e4514ff60bd281a -FileChecksum: SHA1: 9de4654c60ab672b4f8cf80a65c806c0cb4e418b +SPDXID: SPDXRef-cf45e0b9fb205344b522fc26a8298235 +FileChecksum: SHA1: 914a2208eed4703895b2f6faf0e1fa88aac938a0 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. -FileName: ./crates/utils/src/fastmap.rs -SPDXID: SPDXRef-f46cf71b99032f5ccf7acc5458518a42 -FileChecksum: SHA1: ff11312dd50a0c6c50042dbdea6ef1ce939b77f9 +FileName: ./crates/utils/README.md +SPDXID: SPDXRef-5ee3f66b8956f2ce1eff477aa68edd88 +FileChecksum: SHA1: fb4293c22fd8b65392c8d505673b5b2728023fda +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/utils/src/hash_help.rs +SPDXID: SPDXRef-d979566a50fe94181ba534401c2c62b1 +FileChecksum: SHA1: ba564844435251db7a3f8f818f816f9b6ffbc7e9 LicenseConcluded: AGPL-3.0-or-later LicenseInfoInFile: AGPL-3.0-or-later FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/utils/src/lib.rs -SPDXID: SPDXRef-4eb0e1712f8060bad99f9cf87751bae9 -FileChecksum: SHA1: 4e236c852a8bfcb6f3a551745ea1d8b1dc6f0b91 +SPDXID: SPDXRef-3c88c068e0fd6eb9b98ec09e04413a5b +FileChecksum: SHA1: fcda5df421f821dc6d46767a5a87d68b14d1bda7 +LicenseConcluded: AGPL-3.0-or-later +LicenseInfoInFile: AGPL-3.0-or-later +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. + +FileName: ./crates/utils/src/simd.rs +SPDXID: SPDXRef-4e5f3eee8ceebc5edde78a3911cfdb49 +FileChecksum: SHA1: 1b2f5b34a284a426f86e155ff09988f369bd65b8 LicenseConcluded: AGPL-3.0-or-later LicenseInfoInFile: AGPL-3.0-or-later FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/wasm/.appveyor.yml -SPDXID: SPDXRef-9b0f570ff5d7001399b0082cfd36655b -FileChecksum: SHA1: 8c050d78b239c9199f30f0cce8654c654ad6a6bb +SPDXID: SPDXRef-402f57903a72a929777f9ecb50757632 +FileChecksum: SHA1: b93862a2f2a88005e06f200aae67b526c84c670a LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -921,16 +1173,16 @@ LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/wasm/.travis.yml -SPDXID: SPDXRef-cce5ffe10aba663335680e4e43bd9288 -FileChecksum: SHA1: 848f0cc8834d6f91c2890da82f16b1a6770c36b2 +SPDXID: SPDXRef-c4c348a439df60cdf4aff38a25210bc1 +FileChecksum: SHA1: bba68b8c29c494b923867326281d6733d3c13c32 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. 
FileName: ./crates/wasm/Cargo.toml -SPDXID: SPDXRef-5aa13c18af5ef3dec3ecab0f8c63bb7a -FileChecksum: SHA1: 840c119fc76390ff5e47b4b984b1ebe3c2cb9221 +SPDXID: SPDXRef-a25112f597e8755c458099641b73386d +FileChecksum: SHA1: 89f7e275a698f4d680c536bb6e4b51b3ba201999 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -945,37 +1197,37 @@ LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/wasm/src/lib.rs -SPDXID: SPDXRef-db4b8d6674824e5018c48e38aff38022 -FileChecksum: SHA1: d65ab95dc716267b64e1025dd96e790748cfd701 +SPDXID: SPDXRef-cd66635fb95e3bdc6e6138501afb03cc +FileChecksum: SHA1: ea3e48f14cd350c8deec7b46cbfc1089d23a694d LicenseConcluded: AGPL-3.0-or-later LicenseInfoInFile: AGPL-3.0-or-later FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/wasm/src/utils.rs -SPDXID: SPDXRef-3748ffc1bb58ea2ea7cf599ef81e64a7 -FileChecksum: SHA1: bd41aa3d055fa8a40cd35fa5c581521caee92772 +SPDXID: SPDXRef-ba1acab5d40bff5ac621372263b6d331 +FileChecksum: SHA1: a289cd51675dd0c9d5619eb8dd776d23bb572164 LicenseConcluded: AGPL-3.0-or-later LicenseInfoInFile: AGPL-3.0-or-later FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./crates/wasm/tests/web.rs -SPDXID: SPDXRef-979fb0d254aeba65211578ff8b35684d -FileChecksum: SHA1: 5b6592815f627b5026bb3c4e160a749811cf0765 +SPDXID: SPDXRef-b6ae9ad907495912115b7c8f9d53809e +FileChecksum: SHA1: f3c60e53827ea1d1b31dfd8b0f9849aaf7018597 LicenseConcluded: AGPL-3.0-or-later LicenseInfoInFile: AGPL-3.0-or-later FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./deny.toml -SPDXID: SPDXRef-640c584f4da01eb49b738f7c45c188af -FileChecksum: SHA1: fdbfd3d32e793170213fe35830100cbcd7f16300 +SPDXID: SPDXRef-2c0d6c272508977525c03e1c21ab05dc +FileChecksum: SHA1: 4ff7f872f977d2215a579a962eb6b24f6384de95 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./hk.pkl -SPDXID: SPDXRef-49d2aa98c1d7212438eee6fd73d05d7f -FileChecksum: SHA1: 3f1091c14bba2e45bcb0ea6e47676d168f96fc90 +SPDXID: SPDXRef-a2d59c457404f3e2c7adf8057a6e3767 +FileChecksum: SHA1: 243a977f13a98def08e0fdd6c6eb33308fae4db4 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT @@ -1021,25 +1273,32 @@ LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./scripts/update-licenses.py -SPDXID: SPDXRef-acdd7bc65db8153d946983a5cd0906b5 -FileChecksum: SHA1: d7dc344e802e29297dee75494fd470743913e2d8 +SPDXID: SPDXRef-fd90e7c132390ec22cac057ad5f86804 +FileChecksum: SHA1: beb16286a14d168af6112cd3a4d41718620657d1 LicenseConcluded: AGPL-3.0-or-later LicenseInfoInFile: AGPL-3.0-or-later FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./xtask/Cargo.toml -SPDXID: SPDXRef-0fa26d060b21f2196307d9d54cc6b9d4 -FileChecksum: SHA1: 3f709c1f4bc707432c9c9b7339f6602819f522c3 +SPDXID: SPDXRef-2a14fe7d658a46cff7436cfe88998325 +FileChecksum: SHA1: ba0dcaaef669d3e8dc7accbdfdc08ed66d148362 +LicenseConcluded: Apache-2.0 OR MIT +LicenseInfoInFile: Apache-2.0 +LicenseInfoInFile: MIT +FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. 
+ +FileName: ./xtask/README.md +SPDXID: SPDXRef-4939545db8b1a8a0b923d19c81ab970d +FileChecksum: SHA1: 4a26700a3c5803115c3c3ef82f46454141948047 LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. FileName: ./xtask/src/main.rs -SPDXID: SPDXRef-45d1bf4a69990c4f8b0526410fd5fc08 -FileChecksum: SHA1: d4f88da0a6c7927efa956c0fc26293357c899477 -LicenseConcluded: AGPL-3.0-or-later AND (Apache-2.0 OR MIT) -LicenseInfoInFile: AGPL-3.0-or-later +SPDXID: SPDXRef-29201f21040fd5b280c7cd4c8a504dda +FileChecksum: SHA1: 8e1c00c48a78fe252bd804ced5afbc0a80a0f07b +LicenseConcluded: Apache-2.0 OR MIT LicenseInfoInFile: Apache-2.0 LicenseInfoInFile: MIT FileCopyrightText: SPDX-FileCopyrightText: 2025 Knitli Inc. diff --git a/scripts/README-llm-edit.md b/scripts/README-llm-edit.md new file mode 100644 index 0000000..791936a --- /dev/null +++ b/scripts/README-llm-edit.md @@ -0,0 +1,124 @@ +# Multi-File Output System - llm-edit.sh + +## Overview + +This system enables Claude to deliver multiple files in a single JSON payload. The JSON is processed by a bash script that writes all files in parallel with stylized output. + +## How to Use + +When the user needs multiple files generated as a single output, follow these instructions: + +1. Understand the user's request for multiple files +2. Format your response as a valid JSON object following the schema below +3. Inform the user they can save this output to a file and process it with the llm-edit.sh script + +### JSON Schema for Multi-File Output + +```json "example editing schema" +{ + "files": [ + { + "file_name": "path/to/file1.extension", + "file_type": "text", + "file_content": "The content of the first file" + }, + { + "file_name": "path/to/file2.extension", + "file_type": "text", + "file_content": "The content of the second file" + }, + { + "file_name": "path/to/binary_file.bin", + "file_type": "binary", + "file_content": "base64_encoded_content_here" + } + ] +} +``` + +### Field Definitions + +- `file_name`: The path where the file should be written (including filename and extension) + - IMPORTANT: Always use project-relative paths (e.g., "src/main/java/...") or absolute paths + - Files will be written to exactly the location specified - no test directories are used + - For tool creation, always use actual project paths, not test directories +- `file_type`: Either "text" (default) or "binary" for base64-encoded content +- `file_content`: The actual content of the file (base64 encoded for binary files) + +### Important Rules + +1. ALWAYS validate the JSON before providing it to ensure it's properly formatted +2. ALWAYS ensure all file paths are properly escaped +3. For binary files, encode the content as base64 and specify "binary" as the file_type +4. NEVER include explanatory text or markdown outside the JSON structure +5. When asked to generate multiple files, ALWAYS use this format unless explicitly directed otherwise + +## How Users Can Process the Output + +Instruct users to: + +1. Save the JSON output to a file (e.g., `files.json`), or request to use the tool after the user reviews the file. +2. 
Run the llm-edit.sh script: + + ```bash + ./llm-edit.sh files.json + ``` + +## Script Features + +The llm-edit.sh script includes the following enhancements: + +- Stylized output with color-coded and emoji status indicators +- Compact progress display with timestamp and elapsed time +- Green circle (🟢) for success items +- White circle (⚪) for neutral items +- Red circle (🔴) for error conditions +- Calendar emoji (📅) for timestamps +- Clock emoji (⏱️) for elapsed time display +- Support for both text and binary files +- Parallel extraction for improved performance +- Detailed error reporting and logging options +- Verbose mode for detailed progress tracking + +### Advanced Usage Options + +```bash +# Basic usage +./llm-edit.sh files.json + +# Verbose output with detailed progress +./llm-edit.sh files.json --verbose + +# Log details to a file for debugging +./llm-edit.sh files.json --log-to-file logs/extraction.log + +# Write results to a file (silent mode) +./llm-edit.sh files.json --output-file results.md + +# Suppress all console output +./llm-edit.sh files.json --silent + +# Disable compact output format +./llm-edit.sh files.json --no-compact +``` + +## Example Response + +When asked to generate multiple files, your entire response should be a valid JSON object like this: + +```json +{ + "files": [ + { + "file_name": "example.py", + "file_type": "text", + "file_content": "def hello_world():\n print(\"Hello, world!\")\n\nif __name__ == \"__main__\":\n hello_world()" + }, + { + "file_name": "README.md", + "file_type": "text", + "file_content": "# Example Project\n\nThis is an example README file." + } + ] +} +``` diff --git a/scripts/get-langs.sh b/scripts/get-langs.sh index b924358..2db3136 100755 --- a/scripts/get-langs.sh +++ b/scripts/get-langs.sh @@ -113,7 +113,7 @@ error_exit() { # Get URL for languages in misc. repositories (REPO array) get_repo() { local lang="$1" - local repo="${REPO[$lang]}" + local repo="${REPO[lang]}" if [[ -z "$repo" ]]; then error_exit "No repository found for language: $lang" fi @@ -190,7 +190,7 @@ main() { continue fi repo_url=$(get_repo "$lang") - branch=${BRANCH[$lang]:-main} + branch=${BRANCH[lang]:-main} cmd=$(get_cmd "$lang" "$repo_url" "$ARG" "$branch") echo "executing command: $cmd" eval "$cmd" || { diff --git a/scripts/llm-edit.sh b/scripts/llm-edit.sh new file mode 100644 index 0000000..df6f5af --- /dev/null +++ b/scripts/llm-edit.sh @@ -0,0 +1,611 @@ +#!/bin/bash +# Heavily based on a script by @inventorblack, and +# shared on [ClaudeLog](https://claudelog.com/multi-file-system/) +# SPDX-FileCopyrightText: 2025 Knitli Inc. 
+# SPDX-FileContributor: Adam Poulemanos +# +# SPDX-License-Identifier: MIT OR Apache-2.0 + +# shellcheck disable=SC2317,SC2034 +# llm-edit.sh - Script to parse JSON files payload and write files in parallel +# Usage: ./llm-edit.sh [--verbose] [--log-to-file ] + +set -e # Exit on error + +# Default settings +VERBOSE=false +LOG_TO_FILE=false +LOG_FILE="" +CLAUDE_OUTPUT=true # Set to true by default to use the new styling +OUTPUT_FILE="" +SILENT=false +COMPACT=true # Enable compact output by default + +# ANSI color codes +WHITE='\033[1;37m' # Bright white +GRAY='\033[0;37m' # Gray +GREEN='\033[0;32m' # Green +RED='\033[0;31m' # Red +NC='\033[0m' # No Color (reset) + +# Icon settings +SUCCESS_ICON="🟢" +NEUTRAL_ICON="⚪" +ERROR_ICON="🔴" +INFO_ICON="ℹ️" +CLOCK_ICON="⏱️" +DATE_ICON="📅" +SIMPLE_CHECK="✓" + +# Process command line arguments +process_args() { + while [[ $# -gt 0 ]]; do + case "$1" in + --verbose) + VERBOSE=true + COMPACT=false # Disable compact output in verbose mode + shift + ;; + --log-to-file) + LOG_TO_FILE=true + LOG_FILE="$2" + shift 2 + ;; + --claude-output) + CLAUDE_OUTPUT=true + shift + ;; + --output-file) + OUTPUT_FILE="$2" + SILENT=true # When output file is specified, run silently by default + shift 2 + ;; + --silent) + SILENT=true + shift + ;; + --no-compact) + COMPACT=false + shift + ;; + --no-color) + COLOR=false + shift + ;; + -h|--help) + print_usage + exit 0 + ;; + *) + if [[ -z "$JSON_FILE" ]]; then + JSON_FILE="$1" + shift + else + echo "Error: Unknown argument: $1" + print_usage + exit 1 + fi + ;; + esac + done + + # Validate required arguments + if [[ -z "$JSON_FILE" ]]; then + echo "Error: JSON input file is required." + print_usage + exit 1 + fi + + # Check if the file exists + if [[ ! -f "$JSON_FILE" ]]; then + echo "Error: File $JSON_FILE does not exist." 
+ exit 1 + fi + + # Set up logging + if [[ "$LOG_TO_FILE" = true && -n "$LOG_FILE" ]]; then + # Create log directory if it doesn't exist + LOG_DIR=$(dirname "$LOG_FILE") + mkdir -p "$LOG_DIR" + # Initialize log file + echo "--- Multi-File Extraction Log $(date) ---" > "$LOG_FILE" + fi + + # Set up output file if requested + if [[ -n "$OUTPUT_FILE" ]]; then + # Create output directory if it doesn't exist + OUTPUT_DIR=$(dirname "$OUTPUT_FILE") + mkdir -p "$OUTPUT_DIR" + # Initialize output file with the new format (no colors) + echo "Multi-File Extraction Results ${DATE_ICON} $(date)" > "$OUTPUT_FILE" + fi +} + +# Print usage information +print_usage() { + cat << EOF +Usage: $0 [--verbose] [--log-to-file ] [--claude-output] [--output-file ] [--silent] [--no-compact] + +Arguments: + Path to the JSON file containing file data + --verbose Show detailed output during extraction (disables compact mode) + --log-to-file Write detailed log to specified file + --claude-output Format output for Claude to render (styled output) + --output-file Write formatted output to file (for Claude to read later, implies --silent) + --silent Suppress all console output except errors + --no-compact Disable compact output format + --help, -h Show this help message + +Examples: + $0 tool_data.json # Extract files with minimal output + $0 tool_data.json --verbose # Extract with detailed progress + $0 tool_data.json --log-to-file logs/extraction.log # Log details to file + $0 tool_data.json --claude-output # Format output for Claude rendering + $0 tool_data.json --output-file results.md # Write results to file for Claude (silent mode) + $0 tool_data.json --silent # Run without any console output + +JSON Format: + { "files": [ { "file_name": "path/to/file", "file_type": "text", "file_content": "content" } ] } +EOF +} + +# Log messages to file if enabled +log_to_file() { + if [[ "$LOG_TO_FILE" = true && -n "$LOG_FILE" ]]; then + echo "$(date +"%Y-%m-%d %H:%M:%S") - $1" >> "$LOG_FILE" + fi +} + +# Write to output file if enabled +write_output() { + if [[ -n "$OUTPUT_FILE" ]]; then + echo "$1" >> "$OUTPUT_FILE" + fi + + # Only print to stdout if not in silent mode + if [[ "$SILENT" = false ]]; then + echo "$1" + fi +} + +# Write colored output to terminal (no files) +write_colored_output() { + if [[ "$SILENT" = false ]]; then + echo -e "$1" # -e flag enables interpretation of backslash escapes for colors + fi +} + +# Global variables for output +declare -a NUMBER_OUTPUTS +declare -a SIMPLE_OUTPUTS +number_items=0 +simple_items=0 + +# Output functions with the new formatting style +print_section() { + local text="$1" + log_to_file "[SECTION] $text" + + # For non-compact mode + if [[ "$COMPACT" = false ]]; then + write_output "**$text**" + fi + # For compact mode, we don't need section titles +} + +# Functions for different types of output +print_number_item() { + local text="$1" + local icon="$2" + log_to_file "[NUMBER_ITEM] $text" + + if [[ "$COMPACT" = false ]]; then + write_colored_output "${GRAY}${icon} $text${NC}" + else + # Store items with numbers for later output on a single line + number_items=$((number_items + 1)) + NUMBER_OUTPUTS[number_items]="${icon} ${text}" + fi +} + +print_simple_item() { + local text="$1" + log_to_file "[SIMPLE_ITEM] $text" + + if [[ "$COMPACT" = false ]]; then + write_colored_output "${GRAY}${SIMPLE_CHECK} $text${NC}" + else + # Store simple items for later output on a single line + simple_items=$((simple_items + 1)) + SIMPLE_OUTPUTS[simple_items]="${SIMPLE_CHECK} $text" + fi +} + 
+print_success() { + local text="$1" + log_to_file "[SUCCESS] $text" + + # Check if text contains numbers + if [[ "$text" =~ [0-9] ]]; then + print_number_item "$text" "$SUCCESS_ICON" + else + print_simple_item "$text" + fi +} + +print_warning() { + local text="$1" + log_to_file "[WARNING] $text" + + # Check if text contains numbers + if [[ "$text" =~ [0-9] ]]; then + print_number_item "$text" "$NEUTRAL_ICON" + else + print_simple_item "$text" + fi +} + +print_error() { + local text="$1" + log_to_file "[ERROR] $text" + + # Check if text contains numbers + if [[ "$text" =~ [0-9] ]]; then + print_number_item "$text" "$ERROR_ICON" + else + print_simple_item "$text" + fi +} + +print_info() { + local text="$1" + log_to_file "[INFO] $text" + + # Check if text contains numbers + if [[ "$text" =~ [0-9] ]]; then + print_number_item "$text" "$INFO_ICON" + else + print_simple_item "$text" + fi +} + +# Logical color-coding for file numbers +print_file_count() { + local count="$1" + local description="$2" + log_to_file "[COUNT] $description: $count" + + # Format with colored numbers and add to number items + local formatted_text="${description}: ${count}" + + # Use green circle for count items + print_number_item "$formatted_text" "$SUCCESS_ICON" +} + +print_file_warning() { + local count="$1" + local description="$2" + log_to_file "[COUNT_WARNING] $description: $count" + + # Format with colored numbers and add to number items + local formatted_text="${description}: ${count}" + + # Use white circle for warning items + print_number_item "$formatted_text" "$NEUTRAL_ICON" +} + +print_file_error() { + local count="$1" + local description="$2" + log_to_file "[COUNT_ERROR] $description: $count" + + # Format with colored numbers and add to number items + local formatted_text="${description}: ${count}" + + # Use red circle for error items, but only if count > 0 + if [[ "$count" -gt 0 ]]; then + print_number_item "$formatted_text" "$ERROR_ICON" + else + print_number_item "$formatted_text" "$NEUTRAL_ICON" + fi +} + +print_verbose() { + local text="$1" + log_to_file "[VERBOSE] $text" + + if [[ "$VERBOSE" = true ]]; then + write_output "⚪ $text" + fi +} + +# Function to decode base64 content safely +decode_base64() { + local content="$1" + if [[ "$OSTYPE" == "darwin"* ]]; then + # macOS version + echo "$content" | base64 -D + else + # Linux version + echo "$content" | base64 -d + fi +} + +# Function to format elapsed time +format_elapsed_time() { + local elapsed=$1 + + # Format elapsed time + if [[ $elapsed -lt 60 ]]; then + echo "${elapsed}s" + else + mins=$((elapsed / 60)) + secs=$((elapsed % 60)) + echo "${mins}m ${secs}s" + fi +} + +# Print all compact outputs +print_compact_output() { + if [[ "$COMPACT" = true && "$SILENT" = false ]]; then + # Calculate elapsed time + end_time=$(date +%s) + elapsed=$((end_time - start_time)) + elapsed_str=$(format_elapsed_time $elapsed) + + # Print the timestamp header with white title, colored date, and elapsed time + write_colored_output "${WHITE}Multi-File Extraction Results ${DATE_ICON} $(date) ${CLOCK_ICON} ${elapsed_str}${NC}" + + # Combine all number items on a single line if there are any + if [[ $number_items -gt 0 ]]; then + local number_line="" + for i in $(seq 1 $number_items); do + if [[ -n "$number_line" ]]; then + number_line="${number_line} ${NUMBER_OUTPUTS[i]}" + else + number_line="${NUMBER_OUTPUTS[i]}" + fi + done + write_colored_output "${GRAY}${number_line}${NC}" + fi + + # Combine all simple items on a single line if there are any + if [[ $simple_items 
-gt 0 ]]; then + local simple_line="" + for i in $(seq 1 $simple_items); do + if [[ -n "$simple_line" ]]; then + simple_line="${simple_line} ${SIMPLE_OUTPUTS[i]}" + else + simple_line="${SIMPLE_OUTPUTS[i]}" + fi + done + write_colored_output "${GRAY}${simple_line}${NC}" + fi + + # Final success line - always in green + if [[ "$CREATED_COUNT" -eq "$FILE_COUNT" ]]; then + write_colored_output "${GREEN}${SUCCESS_ICON} Extraction completed successfully!${NC}" + else + write_colored_output "${RED}${ERROR_ICON} Extraction completed with issues${NC}" + fi + fi +} + +# Check if jq is installed +check_dependencies() { + print_section "Checking Dependencies" + if ! command -v jq &> /dev/null; then + print_error "This script requires 'jq' to be installed." + echo "Please install it with: sudo apt-get install jq" + exit 1 + fi + print_success "All dependencies satisfied" +} + +# Parse JSON file and prepare extraction +prepare_extraction() { + print_section "Preparing Extraction" + + # Get the number of files to create + FILE_COUNT=$(jq '.files | length' "$JSON_FILE") + if [[ "$FILE_COUNT" -eq 0 ]]; then + print_warning "No files found in the JSON payload." + exit 0 + fi + + print_file_count "$FILE_COUNT" "Found files to extract" + + # Display file summary in verbose mode + if [[ "$VERBOSE" = true ]]; then + for i in $(seq 0 $((FILE_COUNT - 1))); do + FILE_NAME=$(jq -r ".files[$i].file_name" "$JSON_FILE") + FILE_TYPE=$(jq -r ".files[$i].file_type // \"text\"" "$JSON_FILE") + print_verbose "File $((i + 1))/$FILE_COUNT: $FILE_NAME ($FILE_TYPE)" + done + fi + + # Create temporary directory for extraction scripts + TEMP_DIR=$(mktemp -d) + trap 'rm -rf "$TEMP_DIR"' EXIT + + print_success "Extraction prepared successfully" +} + +# Create individual extraction scripts +create_extraction_scripts() { + print_section "Creating Extraction Scripts" + + for i in $(seq 0 $((FILE_COUNT - 1))); do + FILE_INFO=$(jq -c ".files[$i]" "$JSON_FILE") + + FILE_NAME=$(echo "$FILE_INFO" | jq -r '.file_name') + FILE_TYPE=$(echo "$FILE_INFO" | jq -r '.file_type // "text"') + + # Create a separate file to store the content to avoid shell interpretation issues + CONTENT_FILE="$TEMP_DIR/content_$i.txt" + jq -r '.file_content' <<< "$FILE_INFO" > "$CONTENT_FILE" + + # Create directory if it doesn't exist + DIR_NAME=$(dirname "$FILE_NAME") + + # Create extraction script that uses the content file + cat > "$TEMP_DIR/extract_$i.sh" << EOF +#!/bin/bash +# Create directory structure +mkdir -p "$DIR_NAME" + +# Check if file content is base64 encoded +if [[ "$FILE_TYPE" == "binary" ]]; then + # Handle binary file + cat "$CONTENT_FILE" | base64 -d > "$FILE_NAME" + echo "EXTRACTED|binary|$FILE_NAME" +else + # Handle text file - direct copy without shell interpretation + cat "$CONTENT_FILE" > "$FILE_NAME" + echo "EXTRACTED|text|$FILE_NAME" +fi +EOF + + chmod +x "$TEMP_DIR/extract_$i.sh" + + # Log verbose progress + print_verbose "Created extraction script for: $FILE_NAME" + done + + print_success "All extraction scripts created successfully" +} + +# Execute all extraction scripts in parallel and capture output +execute_extraction() { + print_section "Extracting Files in Parallel" + + # Create a place to store extraction results + RESULTS_FILE="$TEMP_DIR/extraction_results.txt" + touch "$RESULTS_FILE" + + # Execute all extraction scripts in parallel and capture their output + find "$TEMP_DIR" -name "extract_*.sh" -print0 | + xargs -0 -P 8 -I {} bash -c "{} >> $RESULTS_FILE 2>/dev/null" + + # Process results + extract_count=0 + + # Display 
extraction results based on verbosity + if [[ "$VERBOSE" = true ]]; then + while IFS= read -r line; do + if [[ "$line" == EXTRACTED* ]]; then + IFS='|' read -r _ type file_path <<< "$line" + extract_count=$((extract_count + 1)) + print_success "Extracted $type file: $file_path" + fi + done < "$RESULTS_FILE" + else + # Just count the extracted files for non-verbose mode + extract_count=$(grep -c "EXTRACTED" "$RESULTS_FILE") + fi +} + +# Verify all files were created correctly +verify_extraction() { + print_section "Verifying Extraction" + + CREATED_COUNT=0 + FAILED_FILES=() + + for i in $(seq 0 $((FILE_COUNT - 1))); do + FILE_NAME=$(jq -r ".files[$i].file_name" "$JSON_FILE") + if [[ -f "$FILE_NAME" ]]; then + CREATED_COUNT=$((CREATED_COUNT + 1)) + print_verbose "Verified: $FILE_NAME" + else + FAILED_FILES+=("$FILE_NAME") + print_verbose "Missing: $FILE_NAME" + fi + done + + # Log results to file regardless of verbosity + log_to_file "Files processed: $FILE_COUNT" + log_to_file "Files created: $CREATED_COUNT" + log_to_file "Files failed: $((FILE_COUNT - CREATED_COUNT))" + + if [[ ${#FAILED_FILES[@]} -gt 0 ]]; then + log_to_file "Failed files:" + for file in "${FAILED_FILES[@]}"; do + log_to_file " $file" + done + fi +} + +# Print summary statistics at the end +print_summary() { + print_section "Extraction Summary" + print_file_count "$FILE_COUNT" "Files processed" + print_file_count "$CREATED_COUNT" "Files created" + print_file_error "$((FILE_COUNT - CREATED_COUNT))" "Files failed" + + if [[ "$COMPACT" = false ]]; then + # Calculate and display elapsed time for non-compact mode + end_time=$(date +%s) + elapsed=$((end_time - start_time)) + elapsed_str=$(format_elapsed_time $elapsed) + write_colored_output "${GRAY}${CLOCK_ICON} Completed in: ${WHITE}${elapsed_str}${NC}" + + if [[ "$CREATED_COUNT" -eq "$FILE_COUNT" ]]; then + write_colored_output "${GREEN}${SUCCESS_ICON} Extraction completed successfully!${NC}" + else + write_colored_output "${RED}${ERROR_ICON} Extraction completed with issues${NC}" + if [[ "$VERBOSE" = false && ${#FAILED_FILES[@]} -gt 0 ]]; then + print_info "Run with --verbose flag to see details of failed files" + fi + fi + + if [[ "$LOG_TO_FILE" = true && -n "$LOG_FILE" ]]; then + print_info "Full extraction log available at: $LOG_FILE" + fi + + # If output file was used but not in silent mode, print its location + if [[ -n "$OUTPUT_FILE" && "$SILENT" = false ]]; then + echo "Results saved to: $OUTPUT_FILE" + fi + fi +} + +# Main function +main() { + # Record start time + start_time=$(date +%s) + + # Process and validate arguments + process_args "$@" + + # Only add timestamp header if not in compact mode + if [[ "$COMPACT" = false && "$SILENT" = false ]]; then + # Calculate and display elapsed time for non-compact mode at the end + end_time=$(date +%s) + elapsed=$((end_time - start_time)) + elapsed_str=$(format_elapsed_time $elapsed) + write_colored_output "${WHITE}Multi-File Extraction Results ${DATE_ICON} $(date) ${CLOCK_ICON} ${elapsed_str}${NC}" + fi + + check_dependencies + prepare_extraction + create_extraction_scripts + execute_extraction + verify_extraction + print_summary + + # Print compact output if enabled + if [[ "$COMPACT" = true ]]; then + print_compact_output + fi + + # If we're in silent mode but have an output file, return the path as the only output + if [[ "$SILENT" = true && -n "$OUTPUT_FILE" ]]; then + echo "$OUTPUT_FILE" + fi + + exit 0 +} + +# Execute the main function with all arguments +main "$@" diff --git a/scripts/update-licenses.py 
b/scripts/update-licenses.py index 95ce21a..44789f9 100755 --- a/scripts/update-licenses.py +++ b/scripts/update-licenses.py @@ -1,4 +1,4 @@ -#!/usr/bin/env -S uv run --all-extras -s +#!/usr/bin/env -S uv run -s # /// script # requires-python = ">=3.11" # dependencies = ["rignore", "cyclopts"] @@ -9,8 +9,12 @@ # # SPDX-License-Identifier: AGPL-3.0-or-later -"""Update licenses for files in the repository.""" +"""Update licenses for files in the repository. +TODO: Add interactive prompt for contributors. +""" + +import json import subprocess import sys from pathlib import Path @@ -71,6 +75,12 @@ def years() -> str: "--skip-existing" ] +CHECK_CMD = [ + "reuse", + "lint", + "-j", +] + # Collect non-code paths that are not in the AST-Grep or code paths # Some of these are shell scripts, so technically code, but we treat them as non-code for license purposes. NON_CODE_EXTS = { @@ -140,7 +150,6 @@ def years() -> str: "Herrington Darkholme <2883231+HerringtonDarkholme@users.noreply.github.com>" ) - class PathsForProcessing(NamedTuple): """Paths for processing.""" ast_grep_paths: list[Path] @@ -203,6 +212,27 @@ def filter_path(paths: tuple[Path] | None = None, path: Path | None = None) -> b return path.is_file() and not path.is_symlink() return path in paths and path.is_file() and not path.is_symlink() +def get_files_with_missing() -> list[Path] | None: + """Get files with missing licenses.""" + try: + result = subprocess.run( + CHECK_CMD, + capture_output=True, + text=True, + check=True + ) + output = json.loads(result.stdout.strip()) + non_compliant_report = output.get("non_compliant", {}) + missing_files = non_compliant_report.get("missing_copyright_info", []) + non_compliant_report.get("missing_licensing_info", []) + if not missing_files: + print("No files with missing licenses found.") + return None + print(f"Found {len(missing_files)} files with missing licenses.") + return sorted({BASE_PATH / file for file in missing_files}) + except subprocess.CalledProcessError as e: + print(f"Error checking files: {e}") + return None + def get_empty_lists() -> tuple[list, list, list]: """Get empty lists for AST-Grep paths, code paths, and non-code paths.""" return [], [], [] @@ -243,6 +273,20 @@ def update_all(*, contributors: Annotated[list[str], CONTRIBUTORS] = DEFAULT_CON except Exception as e: print(f"Error updating licenses: {e}") +@app.command(help="Add licenses for only those files missing license information in the repository. Will check every file in the repository and add license information if it's missing.") +def missing(*, contributors: Annotated[list[str], CONTRIBUTORS] = DEFAULT_CONTRIBUTORS) -> None: + """Add licenses for only those files missing license information in the repository.""" + missing_files = get_files_with_missing() + if not missing_files: + print("No files with missing licenses found.") + return + path_obj = sort_paths(missing_files) + BASE_CMD.extend(process_contributors(contributors)) + try: + path_obj.process_with_cmd(BASE_CMD) + except Exception as e: + print(f"Error updating licenses: {e}") + @app.command(help="Update licenses for staged files in the repository. Will only check files that are staged for commit.") def staged(*, contributors: Annotated[list[str], CONTRIBUTORS] = DEFAULT_CONTRIBUTORS) -> None: """Update licenses for staged files in the repository."""
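
Note on the new `missing` subcommand in `scripts/update-licenses.py`: the helper `get_files_with_missing()` relies on the JSON report from `reuse lint -j` exposing a `non_compliant` object with `missing_copyright_info` and `missing_licensing_info` lists. The sketch below is illustrative only — the sample report literal and the relative base path are hypothetical, and the report shape is assumed from the keys the patched function reads, not taken from the `reuse` documentation.

```python
import json
from pathlib import Path

# Hypothetical excerpt of a `reuse lint -j` report, limited to the keys
# that get_files_with_missing() in the patch actually inspects.
sample_report = json.loads("""
{
  "non_compliant": {
    "missing_copyright_info": ["crates/utils/src/simd.rs"],
    "missing_licensing_info": ["crates/services/README.md", "crates/utils/src/simd.rs"]
  }
}
""")

base_path = Path(".")  # assumption: repository root, mirroring the script's BASE_PATH

non_compliant = sample_report.get("non_compliant", {})
missing = (
    non_compliant.get("missing_copyright_info", [])
    + non_compliant.get("missing_licensing_info", [])
)

# Deduplicate and sort, as the patch does, so a file listed in both
# categories is annotated only once.
missing_paths = sorted({base_path / name for name in missing})
print(missing_paths)
```

If the report matches this shape, `./scripts/update-licenses.py missing` resolves the offending paths and hands them to the script's existing annotation command via `path_obj.process_with_cmd(BASE_CMD)`; if `reuse lint -j` emits different key names, only `get_files_with_missing()` would need to change.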