|
1 | 1 | # C++ bindings for `lib-ruby-parser` |
2 | 2 |
|
3 | | -tldr; You can find examples in `test/test.cpp`. Valgrind and ASAN give no errors. |
| 3 | +[Documentation](https://lib-ruby-parser.github.io/cpp-bindings/) |
4 | 4 |
|
5 | | -## API |
| 5 | +All classes/methods are defined under the `lib_ruby_parser` namespace. The API mostly mirrors the Rust version.
6 | 6 |
|
7 | | -All classes/methods are defined in the `lib_ruby_parser` namespace. |
| 7 | +A pre-compiled library and header file are available on [Releases](https://github.com/lib-ruby-parser/cpp-bindings/releases). Supported platforms:
8 | 8 |
|
9 | | -1. `ParserResult::from_source(Bytes source, ParserOptions options)` |
| 9 | ++ `x86_64-apple-darwin` |
| 10 | ++ `x86_64-unknown-linux-gnu` |
| 11 | ++ `x86_64-pc-windows-msvc` |
| 12 | ++ `x86_64-pc-windows-gnu` |
10 | 13 |
|
11 | | - Parses given input into `ParserResult`, has the following fields: |
12 | | - ```cpp |
13 | | - // AST |
14 | | - std::unique_ptr<Node> ast; |
| 14 | +## Basic API |
15 | 15 |
|
16 | | - // List of tokens
17 | | - std::vector<Token> tokens; |
| 16 | +```cpp |
| 17 | +// Configure parsing options |
| 18 | +lib_ruby_parser::ParserOptions options( |
| 19 | + /* 1. filename */ |
| 20 | + lib_ruby_parser::String::Copied("(eval)"), |
18 | 21 |
|
19 | | - // List of diagnostic messages |
20 | | - std::vector<Diagnostic> diagnostics; |
| 22 | + /* 2. decoder */ |
| 23 | + lib_ruby_parser::MaybeDecoder(lib_ruby_parser::Decoder(nullptr)), |
21 | 24 |
|
22 | | - // List of comments |
23 | | - std::vector<Comment> comments; |
| 25 | + /* 3. token_rewriter */ |
| 26 | + lib_ruby_parser::MaybeTokenRewriter(lib_ruby_parser::TokenRewriter(nullptr)), |
24 | 27 |
|
25 | | - // List of magic comments |
26 | | - std::vector<MagicComment> magic_comments; |
| 28 | + /* 4. record_tokens */ |
| 29 | + true); |
27 | 30 |
|
28 | | - // Decoded input |
29 | | - Input input; |
30 | | - ``` |
| 31 | +// Setup input to parse |
| 32 | +lib_ruby_parser::ByteList input = lib_ruby_parser::ByteList::Copied("2 + 3", 5); |
31 | 33 |
|
32 | | -2. `Node::is<T>` where `T` is one of the ~100 node types. |
| 34 | +lib_ruby_parser::ParserResult result = lib_ruby_parser::parse( |
| 35 | + std::move(input), |
| 36 | + std::move(options)); |
33 | 37 |
|
34 | | - ```cpp |
35 | | - ast.is<Args>() |
36 | | - // => true |
| 38 | +assert_eq(result.ast->tag, lib_ruby_parser::Node::Tag::SEND); |
| 39 | +assert_eq(result.tokens.len, 4); // tINT tPLUS tINT EOF |
| 40 | +assert_eq(result.comments.len, 0); |
| 41 | +assert_eq(result.magic_comments.len, 0); |
| 42 | +assert_byte_list(result.input.bytes, "2 + 3"); |
| 43 | +``` |
37 | 44 |
|
38 | | - ast.is<Defs>() |
39 | | - // => false |
40 | | - ``` |
| 45 | +`ParserResult` contains the following fields: |
41 | 46 |
|
42 | | -3. `Node::get<T>` where `T` is one of the ~100 node locs |
| 47 | +1. `Node* ast` - potentially nullable AST, tagged enum
| 48 | +2. `TokenList tokens` - list of tokens |
| 49 | +3. `DiagnosticList diagnostics` - list of diagnostics |
| 50 | +4. `CommentList comments` - list of comments |
| 51 | +5. `MagicCommentList magic_comments` - list of magic comments |
| 52 | +6. `DecodedInput input` - decoded input |
43 | 53 |
|
44 | | - ```cpp |
45 | | - Args *args = ast.get<Args>() |
46 | | - ``` |
| 54 | +All node classes fully match node structs of the original Rust implementation. You can check [full documentation](https://docs.rs/lib-ruby-parser) (`nodes` module) |
47 | 55 |
|
48 | | -4. All node classes fully match node structs of the original Rust implementation. You can check [full documentation](https://docs.rs/lib-ruby-parser) (`nodes` module) |
49 | | - |
50 | | -5. `Token` has the following fields and methods: |
51 | | - |
52 | | - ```cpp |
53 | | - std::string token_value; |
54 | | - std::unique_ptr<Loc> loc; // has numeric "begin" and "end" fields |
55 | | - std::string name(); |
56 | | - ``` |
57 | | - |
58 | | - Also it has a numeric `token_type` field that probably could be used for fast comparison. It is used to get `name()`, so it's different for different token types. |
59 | | -
|
60 | | -6. `Diagnostic` has the following fields: |
61 | | -
|
62 | | - ```cpp |
63 | | - ErrorLevel level; // enum with WARNING and ERROR values |
64 | | - std::unique_ptr<DiagnosticMessage> message; |
65 | | - std::unique_ptr<Loc> loc; |
66 | | - ``` |
67 | | -
|
68 | | - can be rendered either using `render_message()` or `render(const Bytes &)` |
69 | | -
|
70 | | -7. `Comment` has the following fields: |
71 | | -
|
72 | | - ```cpp |
73 | | - CommentType kind; // enum with INLINE, DOCUMENT and UNKNOWN values |
74 | | - std::unique_ptr<Loc> location; |
75 | | - ``` |
76 | | -
|
77 | | -8. `Loc` has the following fields and methods: |
78 | | -
|
79 | | - ```cpp |
80 | | - uint32_t begin; |
81 | | - uint32_t end; |
82 | | - std::string source(Input &input); |
83 | | - ``` |
84 | | -
|
85 | | - `input` is what you get from `ParserResult::from`. It can be different from your original source if it has magic encoding comment. |
86 | | -
|
87 | | -9. `MagicComment` has the following fields: |
88 | | -
|
89 | | - ```cpp |
90 | | - MagicCommentKind kind; // enum with ENCODING, FROZEN_STRING_LITERAL, WARN_INDENT values |
91 | | -
|
92 | | - // location of key/value |
93 | | - // "# encoding: utf-8" |
94 | | - // ~~~~~~~~ key |
95 | | - // ~~~~~ value |
96 | | - std::unique_ptr<Loc> key_l; |
97 | | - std::unique_ptr<Loc> value_l; |
98 | | - ``` |
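The updated README describes `ParserResult::ast` as a potentially nullable pointer to a tagged enum, so callers presumably need a null check before reading `tag`. The following is a minimal sketch of that pattern only. It does not include the real library headers: the `sketch` namespace, the two-variant `Tag` enum, and the `describe_root` helper are hypothetical stand-ins. Only the names `ast`, `tag`, and `Node::Tag::SEND` are taken from the README.

```cpp
#include <string>

// Hypothetical stand-ins for illustration; NOT the real lib_ruby_parser types.
namespace sketch {

struct Node {
    // The real bindings expose many more variants; two are enough for the sketch.
    enum class Tag { SEND, INT };
    Tag tag;
};

struct ParserResult {
    Node* ast;  // may be null, as the README notes
};

// Hypothetical helper: returns a short label for the root node,
// checking for a null AST before dispatching on the tag.
inline std::string describe_root(const ParserResult& result) {
    if (result.ast == nullptr) {
        return "no ast";
    }
    switch (result.ast->tag) {
        case Node::Tag::SEND: return "send";
        case Node::Tag::INT:  return "int";
    }
    return "unknown";
}

}  // namespace sketch
```

With the real bindings, the same shape would likely apply to `result.ast` returned by `lib_ruby_parser::parse`, but field layout and ownership should be confirmed against the generated documentation.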