diffract is an OCaml-based library and CLI tool for parsing source files using tree-sitter and performing pattern matching, structural diffing, and semantic transformations (find-and-replace at the AST level).
- OCaml (5.4+): Core logic and CLI.
- Dune: Build system.
- Tree-sitter: Used for generating concrete syntax trees (CSTs) for various languages.
- Ctypes / C FFI: For bridging OCaml with the C-based tree-sitter library and custom C helpers (grammar language functions are statically linked, not dynamically loaded).
- Alcotest: For unit testing.
- Bechamel: For benchmarking.
- `lib/`: Core library containing the matching engine, tree representation, and FFI bindings.
  - `tree.ml`: Pure OCaml representation of the syntax tree to avoid FFI overhead during traversal.
  - `match_parse.ml`: Handles pattern parsing, including metavariables and ellipsis expansion.
  - `match_engine.ml`: Core structural matching algorithms (strict, field, partial).
  - `match_search.ml`: Implements search, nested pattern contexts, indexing, and formatting.
  - `match_transform.ml`: Computes and applies edits from match results (semantic patches).
  - `languages.ml`: Static registry mapping language names to grammar functions statically linked into the binary.
  - `tree_sitter_helper.c`: C wrappers to handle `TSNode` struct-by-value issues and tree-sitter memory management.
- `bin/`: CLI entry point (`main.ml`).
- `grammars/`: Contains scripts and infrastructure to build language-specific grammars.
- `tests/`: Comprehensive test suite using Alcotest.
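
To make the roles of `tree.ml` and `match_engine.ml` more concrete, here is a minimal sketch of structural matching with metavariables over a pure OCaml tree. The `tree` and `pattern` types, the `match_tree` function, and the metavariable names are all hypothetical and much simpler than the library's real definitions. The sketch only covers exact, child-by-child matching and does not handle ellipsis, field, or partial matching.

```ocaml
(* Hypothetical, simplified stand-in for the tree type in tree.ml. *)
type tree = { kind : string; text : string; children : tree list }

(* Hypothetical pattern type: a metavariable binds any subtree, while a
   node pattern must match the node kind and its children one-to-one. *)
type pattern =
  | Metavar of string
  | Node of string * pattern list

type bindings = (string * tree) list

(* Match [p] against [t], threading the bindings collected so far.
   A metavariable seen twice must bind subtrees with identical text. *)
let rec match_tree (p : pattern) (t : tree) (env : bindings) : bindings option =
  match p with
  | Metavar name -> (
      match List.assoc_opt name env with
      | None -> Some ((name, t) :: env)
      | Some bound -> if bound.text = t.text then Some env else None)
  | Node (kind, ps) ->
      if kind <> t.kind || List.length ps <> List.length t.children then None
      else
        List.fold_left2
          (fun acc p c ->
            match acc with None -> None | Some env -> match_tree p c env)
          (Some env) ps t.children

let leaf kind text = { kind; text; children = [] }

let () =
  let call =
    {
      kind = "call";
      text = "f(x, x)";
      children =
        [ leaf "identifier" "f"; leaf "identifier" "x"; leaf "identifier" "x" ];
    }
  in
  (* Matches calls whose two arguments are textually identical. *)
  let pat = Node ("call", [ Metavar "F"; Metavar "A"; Metavar "A" ]) in
  match match_tree pat call [] with
  | Some env ->
      List.iter (fun (k, v) -> Printf.printf "%s = %s\n" k v.text) (List.rev env)
  | None -> print_endline "no match"
```

Because the tree is an ordinary OCaml value, the matcher is a plain recursive function and never calls into C while traversing.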
Before building the OCaml project, you must build the grammar libraries:
`cd grammars && ./build-grammars.sh && cd ..`

- Build project: `dune build`
- Run tests: `dune test`
- Format code: `dune fmt`
- Run CLI: `dune exec diffract -- [args]` (or use the built binary in `_build/default/bin/main.exe`)
Dependencies are managed via opam. Key dependencies include ctypes, ctypes-foreign, cmdliner, yojson, and alcotest.
- Formatting: Strictly enforced via `ocamlformat`. Run `dune fmt` before committing.
- Architecture: Follow the split between pure OCaml logic (`tree.ml`) and FFI-based nodes (`node.ml`).
- Naming: Module-level matching logic is prefixed with `match_` (e.g., `match_engine.ml`, `match_parse.ml`).
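
The split between pure logic and FFI-based nodes can be sketched as a one-time conversion from a foreign node into an OCaml value. Everything in this sketch is an assumption made for illustration: the `NODE` signature, the `Convert` functor, and the `Mock` module are hypothetical and do not reflect the actual interfaces of `node.ml` or `tree.ml`.

```ocaml
(* Hypothetical interface standing in for FFI-backed nodes. In a real
   binding, each of these calls would cross into C. *)
module type NODE = sig
  type t

  val kind : t -> string
  val child_count : t -> int
  val child : t -> int -> t
end

(* Simplified stand-in for the pure representation. *)
type tree = { kind : string; children : tree list }

(* Walk the foreign tree once and copy it into plain OCaml values. *)
module Convert (N : NODE) = struct
  let rec of_node (n : N.t) : tree =
    let children =
      List.init (N.child_count n) (fun i -> of_node (N.child n i))
    in
    { kind = N.kind n; children }
end

(* Mock node implementation so the sketch runs without tree-sitter. *)
module Mock = struct
  type t = M of string * t list

  let kind (M (k, _)) = k
  let child_count (M (_, cs)) = List.length cs
  let child (M (_, cs)) i = List.nth cs i
end

module C = Convert (Mock)

(* Pure traversal: no foreign calls are needed after conversion. *)
let rec count_nodes (t : tree) : int =
  List.fold_left (fun acc c -> acc + count_nodes c) 1 t.children

let () =
  let foreign =
    Mock.M ("binary_expression", [ Mock.M ("number", []); Mock.M ("number", []) ])
  in
  Printf.printf "%d nodes\n" (count_nodes (C.of_node foreign))
```

Keeping the FFI surface behind a small signature like this means matching code can be tested against mock trees, while only the conversion step depends on tree-sitter.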
- New features should include tests in `tests/`.
- Tests are organized by module (e.g., `test_node.ml`, `test_match.ml`).
- The main test entry point is `tests/test_runner.ml`.
To add a new language:
- Add a C wrapper in `lib/tree_sitter_helper.c` and an `external` binding + entry in `lib/languages.ml`.
- Add copy rule and `(foreign_archives ...)` entry in `lib/dune`.
- Add compilation commands to `grammars/build-grammars.sh` and run it.
- Rebuild with `dune build`.
- Add relevant test cases in `tests/test_match.ml`.
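
As a rough illustration of the registry part of the steps above, the following sketch shows what a static name-to-grammar table might look like. The `language` type, the `tree_sitter_foo` and `tree_sitter_bar` functions, and `find_language` are hypothetical placeholders; the actual contents of `lib/languages.ml` may differ. In the real library each grammar function would be an `external` declaration bound to a C wrapper rather than a plain OCaml function.

```ocaml
(* Hypothetical stand-in for an opaque tree-sitter language handle. *)
type language = { name : string }

(* Placeholders for grammar functions. In the real library these would be
   [external] declarations resolved against statically linked C code, e.g.
     external tree_sitter_foo : unit -> language = "<C wrapper name>"
   where the wrapper name is project-specific and not shown here. *)
let tree_sitter_foo () = { name = "foo" }
let tree_sitter_bar () = { name = "bar" }

(* Static registry: supporting a new language adds one entry here. *)
let registry : (string * (unit -> language)) list =
  [ ("foo", tree_sitter_foo); ("bar", tree_sitter_bar) ]

(* Look up a grammar by name; [None] means the language was not linked in. *)
let find_language (lang : string) : language option =
  Option.map (fun get -> get ()) (List.assoc_opt lang registry)

let () =
  match find_language "bar" with
  | Some l -> Printf.printf "found %s\n" l.name
  | None -> print_endline "unknown language"
```

A static table like this fails at link time if a grammar archive is missing, rather than at run time as a dynamically loaded grammar would.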
- `README.md`: General overview and usage examples.
- `docs/internals.md`: Deep dive into architecture and FFI implementation.
- `docs/patterns.md`: Documentation for the pattern syntax and transformation rules.
- `lib/diffract.mli`: Public API for the library.
- `dune-project`: Project metadata and dependency definitions.