Merged
75 changes: 57 additions & 18 deletions cfgrammar/src/lib/yacc/parser.rs
@@ -382,34 +382,46 @@ impl YaccParser {
update_yacc_kind: bool,
) -> Result<usize, YaccGrammarError> {
// Compares haystack converted to lowercase to needle (assumed to be lowercase).
fn starts_with_lower(needle: &'static str, haystack: &'_ str) -> bool {
fn starts_with_lower(needle: &'_ str, haystack: &'_ str) -> bool {
if let Some((prefix, _)) = haystack.split_at_checked(needle.len()) {
prefix.to_lowercase() == needle
} else {
false
}
}
const ACTION_KINDS: [(&str, YaccOriginalActionKind); 3] = [
("noaction", YaccOriginalActionKind::NoAction),
("useraction", YaccOriginalActionKind::UserAction),
("genericparsetree", YaccOriginalActionKind::GenericParseTree),
];

const YACC_KINDS: [(&str, YaccKind); 5] = [
("grmtools", YaccKind::Grmtools),
(
"original(noaction)",
YaccKind::Original(YaccOriginalActionKind::NoAction),
),
(
"original(useraction)",
YaccKind::Original(YaccOriginalActionKind::UserAction),
),
(
"original(genericparsetree)",
YaccKind::Original(YaccOriginalActionKind::GenericParseTree),
),
("Eco", YaccKind::Eco),
let mut yacc_kinds = vec![
("grmtools".to_string(), YaccKind::Grmtools),
("yacckind::grmtools".to_string(), YaccKind::Grmtools),
// Needles must be lowercase: `starts_with_lower` lowercases only the haystack.
("eco".to_string(), YaccKind::Eco),
("yacckind::eco".to_string(), YaccKind::Eco),
];
for (name, action_kind) in ACTION_KINDS {
let yk = "YaccKind".to_lowercase();
let ak = "YaccOriginalActionKind".to_lowercase();
yacc_kinds.push((format!("original({name})"), YaccKind::Original(action_kind)));
yacc_kinds.push((
format!("{yk}::original({name})"),
YaccKind::Original(action_kind),
));
yacc_kinds.push((
format!("{yk}::original({ak}::{name})"),
YaccKind::Original(action_kind),
));
yacc_kinds.push((
format!("original({ak}::{name})"),
YaccKind::Original(action_kind),
));
}
let j = self.parse_ws(i, false)?;
let s = &self.src[i..];
for (kind_name, kind) in YACC_KINDS {
if starts_with_lower(kind_name, s) {
for (kind_name, kind) in yacc_kinds {
if starts_with_lower(&kind_name, s) {
if update_yacc_kind {
self.yacc_kind = Some(kind);
}
@@ -2764,4 +2776,31 @@ B";
";
parse(YaccKind::Original(YaccOriginalActionKind::NoAction), src).unwrap();
}

#[test]
fn test_grmtools_section_yacckinds() {
let srcs = [
"%grmtools{yacckind Original(NoAction)}
%%
Start: ;",
"%grmtools{yacckind YaccKind::Original(GenericParseTree)}
%%
Start: ;",
"%grmtools{yacckind YaccKind::Original(yaccoriginalactionkind::useraction)}
%actiontype ()
%%
Start: ;",
"%grmtools{yacckind Original(YACCOriginalActionKind::NoAction)}
%%
Start: ;",
"%grmtools{yacckind YaccKind::Grmtools}
%%
Start -> () : ;",
];
for src in srcs {
YaccParser::new(YaccKindResolver::NoDefault, src.to_string())
.parse()
.unwrap();
}
}
}
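
As a standalone illustration of the matching approach in this hunk, the sketch below builds the accepted `yacckind` spellings and performs a case-insensitive prefix match against them. `Kind`, `Action`, `accepted_names`, and `lookup` are hypothetical stand-ins for this example rather than the cfgrammar types, and `str::get` is used here in place of `split_at_checked`. All needles are kept lowercase because the comparison lowercases only the input.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Action {
    NoAction,
    UserAction,
    GenericParseTree,
}

// Hypothetical stand-in for the grammar kind, not the cfgrammar type.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Grmtools,
    Eco,
    Original(Action),
}

/// Returns true if `haystack` starts with `needle`, ignoring case.
/// `needle` is assumed to be lowercase already.
fn starts_with_lower(needle: &str, haystack: &str) -> bool {
    // `get` yields None when the range is out of bounds or splits a character.
    haystack
        .get(..needle.len())
        .map_or(false, |prefix| prefix.to_lowercase() == needle)
}

/// Builds every accepted spelling, with and without the optional path prefixes.
fn accepted_names() -> Vec<(String, Kind)> {
    let mut names = vec![
        ("grmtools".to_string(), Kind::Grmtools),
        ("yacckind::grmtools".to_string(), Kind::Grmtools),
        ("eco".to_string(), Kind::Eco),
        ("yacckind::eco".to_string(), Kind::Eco),
    ];
    let actions = [
        ("noaction", Action::NoAction),
        ("useraction", Action::UserAction),
        ("genericparsetree", Action::GenericParseTree),
    ];
    for (name, action) in actions {
        for outer in ["", "yacckind::"] {
            for inner in ["", "yaccoriginalactionkind::"] {
                names.push((
                    format!("{outer}original({inner}{name})"),
                    Kind::Original(action),
                ));
            }
        }
    }
    names
}

/// Finds the first accepted spelling that prefixes `src`.
fn lookup(src: &str) -> Option<Kind> {
    accepted_names()
        .into_iter()
        .find(|(name, _)| starts_with_lower(name, src))
        .map(|(_, kind)| kind)
}

fn main() {
    for src in [
        "YaccKind::Original(GenericParseTree)}",
        "Original(YACCOriginalActionKind::NoAction)}",
        "unknown",
    ] {
        println!("{src:?} -> {:?}", lookup(src));
    }
}
```

Generating the spellings in nested loops keeps the table in sync when an action kind is added, at the cost of rebuilding a small `Vec` on every lookup.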
2 changes: 2 additions & 0 deletions doc/src/SUMMARY.md
@@ -4,10 +4,12 @@
- [Quickstart Guide](quickstart.md)
- [Lexing](lexing.md)
- [Lex compatibility](lexcompatibility.md)
- [Extensions](lexextensions.md)
- [Hand-written lexers](manuallexer.md)
- [Start States](start_states.md)
- [Parsing](parsing.md)
- [Yacc compatibility](yacccompatibility.md)
- [Extensions](yaccextensions.md)
- [Return types and action code](actioncode.md)
- [grmtools parsing idioms](parsing_idioms.md)
- [Error recovery](errorrecovery.md)
3 changes: 2 additions & 1 deletion doc/src/lexcompatibility.md
@@ -38,7 +38,8 @@ There are several major differences between Lex and grmtools:
and ASCII escape sequences. `\\` `\a` `\f` `\n` `\r` `\t` `\v`.

Lex also interprets the escape sequence `\b` as `backspace`. While regex treats `\b`
as a word boundary subsequently grmtools will too.
as a word boundary, grmtools does too. The Lex behavior can be enabled
using the [`posix_escapes`](lexextensions.md) flag.

Additional escape sequences supported by regex:

64 changes: 64 additions & 0 deletions doc/src/lexextensions.md
@@ -0,0 +1,64 @@
# Lex extensions

Flags can be specified at compile time through `LexFlags` or at `.l` file parse time using
a `%grmtools{ }` section. At compile time these flags can be enabled using
[`CTLexerBuilder`](https://docs.rs/lrlex/latest/lrlex/struct.CTLexerBuilder.html) methods.

Flags commonly affect the parsing of the lex file or the interpretation of regular expressions,
or set limits.

Boolean flags are specified by name and can be negated by prefixing them with `!`.
Other flags take their value immediately after the flag name.


## Example

```
%grmtools {
allow_wholeline_comments
!octal
size_limit 1024
}
%%
. "rule"
```
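
To make the syntax rules above concrete, here is a minimal sketch of how a single line of a `%grmtools` section could be interpreted. `FlagValue` and `parse_flag_line` are hypothetical names used for illustration and do not reflect lrlex's actual parser. The sketch also treats any flag without a value as boolean, which is a simplification.

```rust
// Hypothetical value type for this sketch, not an lrlex type.
#[derive(Debug, PartialEq)]
enum FlagValue {
    Bool(bool),
    Number(u64),
}

/// Parses one flag line such as `octal`, `!octal` or `size_limit 1024`.
/// Returns None when the line does not fit either form.
fn parse_flag_line(line: &str) -> Option<(String, FlagValue)> {
    let line = line.trim();
    if let Some(name) = line.strip_prefix('!') {
        // A leading `!` negates a boolean flag and takes no value.
        let name = name.trim();
        if name.is_empty() || name.contains(char::is_whitespace) {
            return None;
        }
        return Some((name.to_string(), FlagValue::Bool(false)));
    }
    let mut parts = line.split_whitespace();
    let name = parts.next()?;
    match (parts.next(), parts.next()) {
        // A bare name enables a boolean flag.
        (None, _) => Some((name.to_string(), FlagValue::Bool(true))),
        // A name followed by exactly one token is a numeric flag.
        (Some(value), None) => value
            .parse::<u64>()
            .ok()
            .map(|n| (name.to_string(), FlagValue::Number(n))),
        // Anything longer is rejected.
        _ => None,
    }
}

fn main() {
    for line in ["allow_wholeline_comments", "!octal", "size_limit 1024"] {
        println!("{line:?} -> {:?}", parse_flag_line(line));
    }
}
```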


## List of flags:

| Flag | Value | Required | Regex[^regex] |
|-------------------------------|-------|----------|---------------|
| `posix_escapes`[^†] | bool | &cross; | &cross; |
| `allow_wholeline_comments`[^‡] | bool | &cross; | &cross; |
| `case_insensitive` | bool | &cross; | &checkmark; |
| `dot_matches_new_line` | bool | &cross; | &checkmark; |
| `multi_line` | bool | &cross; | &checkmark; |
| `octal` | bool | &cross; | &checkmark; |
| `swap_greed` | bool | &cross; | &checkmark; |
| `ignore_whitespace` | bool | &cross; | &checkmark; |
| `unicode` | bool | &cross; | &checkmark; |
| `size_limit` | usize | &cross; | &checkmark; |
| `dfa_size_limit` | usize | &cross; | &checkmark; |
| `nest_limit` | u32 | &cross; | &checkmark; |

[^†]: Enables compatibility with POSIX escape sequences.
[^‡]: Enables Rust-style `// comments` at the start of lines, which requires escaping `/` when it is used in a regex.
[^regex]: &checkmark; The flag is passed directly to `regex::RegexBuilder`.


## Flags affecting Posix compatibility

As discussed in [Lex compatibility](lexcompatibility.md), the default behaviors of grmtools and Rust's regex
library differ from those of POSIX lex.

The following flags change the behavior to match POSIX lex more closely.

```
%grmtools {
!dot_matches_new_line
posix_escapes
}
%%
...
```
17 changes: 17 additions & 0 deletions doc/src/yaccextensions.md
@@ -0,0 +1,17 @@
# Yacc Extensions

A `.y` file begins with a `%grmtools{}` section. By default this section is required,
but a `YaccKindResolver` can be used to set a default `yacckind` or to force one.

| Flag | Value | Required |
|------------|---------------------------------------------|--------------|
| `yacckind` | [YaccKind](yacccompatibility.md#yacckinds) | &checkmark; |


## Example

```
%grmtools{yacckind Grmtools}
%%
Start: ;
```
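
The interaction between the `%grmtools` section and a resolver can be sketched as follows. This is a simplified model under stated assumptions, not cfgrammar's code: `Kind`, `Resolver`, and `resolve` are hypothetical names, and the semantics shown (a forced kind always wins, a default applies only when the file declares nothing, and a missing declaration is otherwise an error) are one reading of "set or forced" above.

```rust
// Hypothetical stand-ins for this sketch, not the cfgrammar types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Grmtools,
    Eco,
}

#[derive(Clone, Copy, Debug)]
enum Resolver {
    /// The file must declare its own kind.
    NoDefault,
    /// Used only when the file declares nothing.
    Default(Kind),
    /// Overrides whatever the file declares.
    Force(Kind),
}

/// Picks the effective kind from the resolver and the kind declared in the file, if any.
fn resolve(resolver: Resolver, declared: Option<Kind>) -> Result<Kind, &'static str> {
    match (resolver, declared) {
        (Resolver::Force(kind), _) => Ok(kind),
        (_, Some(kind)) => Ok(kind),
        (Resolver::Default(kind), None) => Ok(kind),
        (Resolver::NoDefault, None) => Err("missing yacckind"),
    }
}

fn main() {
    println!("{:?}", resolve(Resolver::NoDefault, Some(Kind::Grmtools)));
    println!("{:?}", resolve(Resolver::Default(Kind::Eco), None));
    println!("{:?}", resolve(Resolver::Force(Kind::Eco), Some(Kind::Grmtools)));
    println!("{:?}", resolve(Resolver::NoDefault, None));
}
```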
14 changes: 13 additions & 1 deletion lrlex/src/lib/ctbuilder.rs
@@ -680,6 +680,18 @@ where
self
}

/// Enables `// comment` style parsing according to `flag`.
/// When enabled comments can appear at the beginning of a line,
/// and regular expressions with the `/` character should be escaped via `\/`.
///
/// The default value is `false`.
///
/// Setting this flag will override the same flag within a `%grmtools` section.
pub fn allow_wholeline_comments(mut self, flag: bool) -> Self {
self.force_lex_flags.allow_wholeline_comments = Some(flag);
self
}

/// Sets the `regex::RegexBuilder` option of the same name.
/// The default value is `true`.
///
@@ -698,7 +710,7 @@ where
self
}

/// Sets the `regex::RegexBuilder` option of the same name.
/// Enables posix lex compatible escape sequences according to `flag`.
/// The default value is `false`.
///
/// Setting this flag will override the same flag within a `%grmtools` section.