nbnf

A parser generator based on nom, with syntax inspired by EBNF and regex.

Syntax overview

A grammar is a series of rules containing expressions. Whitespace is insignificant, C-like comments (including nested block comments) are allowed, and every rule must end with a semicolon:

// foo
rule = ...; // bar
rule2 =
    /* /*
        baz
        qux
    */ */
    ...
    ...;
...

A rule generates a parser function as Rust code, and so its name must be a valid Rust identifier. The output type of the generated function can be specified, defaulting to &str if omitted:

rule<Output> = ...;

Any valid Rust code denoting a type is permitted between the chevrons.

The input type can be specified as well, also defaulting to &str, but doing so requires the output type to be specified too:

rule<Input><Output> = ...;

Expressions can invoke any parser function defined in Rust, with other rules simply being resolved as symbols in the same enclosing module:

top = inner external_rule nbnf::nom::combinator::eof;
inner = ...;

A literal Rust expression can also be inserted, e.g. to invoke parametric parsers:

two_chars = <nbnf::nom::bytes::complete::take(2usize)>;

Rules can match literal chars, strings, or regex-like character ranges, all of which support Rust-like escapes:

top = 'a' "bc" [de-g] '\x2A' "\"\0\r\n\t\x7F\u{FF}";
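
To make the matching behavior concrete, here is a hand-written, std-only stand-in for the first three items of the rule above (`'a' "bc" [de-g]`). It is only a sketch of the semantics, not the code nbnf generates; the names `in_class` and `literals_sketch` are hypothetical, and output values are ignored in favor of returning just the remaining input.

```rust
/// Hypothetical helper mirroring the class `[de-g]`:
/// the char 'd', or anything in the range 'e' through 'g'.
fn in_class(c: char) -> bool {
    matches!(c, 'd' | 'e'..='g')
}

/// Hand-written stand-in for `'a' "bc" [de-g]`.
/// Returns the remaining input on success, `None` on a mismatch.
fn literals_sketch(input: &str) -> Option<&str> {
    let rest = input.strip_prefix('a')?; // literal char
    let rest = rest.strip_prefix("bc")?; // literal string
    let mut chars = rest.chars();
    let c = chars.next()?; // the class consumes exactly one char
    in_class(c).then(|| chars.as_str())
}
```

The escapes in the grammar follow Rust's own, so `'\x2A'` denotes the same char as `'*'`.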

Expressions can be grouped with parentheses, and alternated with a slash:

top = ('a' 'b') / ('c' 'd');

Expressions can be repeated with regex-like syntax:

r0 = 'a'?;      // zero or one
r1 = 'b'*;      // zero or more
r2 = 'c'+;      // one or more
r3 = 'd'{2};    // exactly two
r4 = 'e'{2,};   // at least two
r5 = 'f'{,2};   // at most two
r6 = 'g'{2,4};  // between two and four
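
All of these forms reduce to a minimum count and an optional maximum. The std-only sketch below models that for a single repeated char; it assumes greedy matching (as in regex), which the text above does not state explicitly, and `repeat_char` is a hypothetical name rather than anything nbnf emits.

```rust
/// Greedily consumes leading copies of `c`, stopping at `max` if one is given.
/// Succeeds with `(rest, count)` only if at least `min` copies were found.
fn repeat_char(input: &str, c: char, min: usize, max: Option<usize>) -> Option<(&str, usize)> {
    let mut rest = input;
    let mut count = 0;
    // keep going while there is no upper bound, or it has not been reached
    while max.map_or(true, |m| count < m) {
        match rest.strip_prefix(c) {
            Some(next) => {
                rest = next;
                count += 1;
            }
            None => break,
        }
    }
    (count >= min).then_some((rest, count))
}
```

Under this model `'a'?` is `(0, Some(1))`, `'b'*` is `(0, None)`, `'c'+` is `(1, None)`, `'d'{2}` is `(2, Some(2))`, and `'f'{,2}` is `(0, Some(2))`.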

Expressions can be tagged with various modifiers, wrapping them in combinators:

!! (cut) prevents backtracking, e.g. when you know no other expressions can match

json_object_pair<(String, Json)> = string !!(-':' json_value);

! (not) matches only when the expression does not match, consuming no input

ident = -![0-9] ~[a-zA-Z0-9_]+;

~ (recognize) will discard the output and instead yield the portion of the input that was matched

r1<(i32, f64)> = ...;
r2<&str> = ~r1;
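
As a rough illustration of how not and recognize combine, the `ident` rule above can be written by hand with std alone. This is a sketch of the described behavior, not nbnf's generated output; `ident_sketch` is a hypothetical name, and it assumes `~` applies to the whole repeated class.

```rust
/// Hand-written stand-in for `ident = -![0-9] ~[a-zA-Z0-9_]+;`.
/// Returns `(rest, matched)` on success.
fn ident_sketch(input: &str) -> Option<(&str, &str)> {
    // `![0-9]`: fail if the next char is a digit; no input is consumed either way
    if input.starts_with(|c: char| c.is_ascii_digit()) {
        return None;
    }
    // `[a-zA-Z0-9_]+`: find where the run of identifier chars ends
    let end = input
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    if end == 0 {
        return None; // `+` requires at least one char
    }
    // `~` (recognize): yield the matched slice of the input itself
    let (matched, rest) = input.split_at(end);
    Some((rest, matched))
}
```

Because the not check consumes nothing, the first char of the identifier is still part of the recognized slice.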

Expressions can be discarded from output by prefixing them with -:

string<&str> = -'"' ~(string_char+) -'"';

For this particular grammar, forgoing the discards would require a tuple as the return type because the quote chars are included:

string<(char, &str, char)> = ...;

The empty string can be matched with &, allowing various interesting grammar constructs:

parens = ~('(' parens ')') / ~&;

Types and output values can be massaged in a few ways by passing any valid Rust expression:

@<...> (value) discards output and instead returns the given literal

token<Token> =
    ... /
    '/'@<Token::Slash> /
    ...;

|<...> (map) runs a mapping function over the output

object<HashMap> =
    -'{' object_pair+ -'}'
    |<HashMap::from_iter>;

|?<...> (map_opt) runs a mapping function returning Option over the output

even_int<i32> =
    int
    |?<|v| (v & 1 == 0).then_some(v)>;

|!<...> (map_res) runs a mapping function returning Result over the output

number<i32> =
    ~([0-9]+)
    |!<i32::from_str>;

||<...> (no corresponding nom combinator) wraps the expression in arbitrary Rust code, which should contain a placeholder $expr (explained below)

comma = ",";
pairs =
    ("foo" "bar")
    ||<nbnf::nom::multi::separated_list1(comma, $expr)>;
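
The difference between the map variants is what happens when the mapping function declines a value. The std-only sketch below models map_res and map_opt as described above, using `Option` in place of nom's result type; `number_sketch` and `even_int_sketch` are hypothetical names, and `number_sketch` stands in for the unspecified `int` rule.

```rust
use std::str::FromStr;

/// Hand-written stand-in for `number<i32> = ~([0-9]+) |!<i32::from_str>;`.
fn number_sketch(input: &str) -> Option<(&str, i32)> {
    // `~([0-9]+)`: take the leading run of digits as a slice
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return None;
    }
    let (digits, rest) = input.split_at(end);
    // map_res: an `Err` from the mapping function becomes a parse failure
    let value = i32::from_str(digits).ok()?;
    Some((rest, value))
}

/// Stand-in for `even_int<i32> = int |?<|v| (v & 1 == 0).then_some(v)>;`.
fn even_int_sketch(input: &str) -> Option<(&str, i32)> {
    let (rest, v) = number_sketch(input)?;
    // map_opt: a `None` from the mapping function becomes a parse failure
    let v = (v & 1 == 0).then_some(v)?;
    Some((rest, v))
}
```

Note that a digit run too large for `i32` fails the whole rule here, since `i32::from_str` returns an error on overflow.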

Certain behavior can be modified with pragma directives:

#input <ty> allows specifying the default input type of all following rules

#input <&[u8]>
binary_rule<()> = b"foo"@<()>;

#output <ty> similarly allows specifying the default output type
#error <ty> allows setting the error type passed to IResult, e.g. to use VerboseError

// note that the type should not include generics, the input type is substituted per-rule
#error <nom_language::error::VerboseError>

rule = ...;
// generates `fn rule(input: &str) -> IResult<&str, &str, VerboseError<&str>>`

#placeholder <name> <expr> allows defining new placeholders (explained below), and overriding those built into nbnf

#placeholder myparsers my_lib::parsers
rule = $myparsers::parser;

Each pragma also allows clearing user-defined values:

// default input type is reset to `&str`
#input $reset
// likewise for `#output`/`#error`

// placeholder `foo` is reset (to default, if any)
#placeholder foo $reset

// all user-defined placeholders are reset
#placeholder $reset

Placeholders are syntax that allows arbitrary substitutions. nbnf has a few predefined placeholders that can be overridden to alter generated parsers:

$nom defaults to nbnf::nom, and is used by the generator to qualify foundational parsers. Overriding it can be used to e.g. swap nom out for winnow.

#placeholder nom winnow
// subsequent parsers now use winnow

$complete_or_streaming defaults to complete and is used by the generator to qualify foundational parsers that come in complete or streaming variants (see nom docs for more info)
$expr is only defined in the wrapping code of wrap syntax (||<...>) and expands to the expression being wrapped

rule = inner||<foo($expr)>; // expands to `foo(inner)`
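
Placeholder expansion can be pictured as a lookup-and-replace over `$name` tokens. The toy model below is only that: whether nbnf substitutes on text or on tokens is not stated here, so treat `expand` (a hypothetical function) as an illustration of the idea rather than the actual mechanism.

```rust
use std::collections::HashMap;

/// Replaces each `$name` in `template` with its value from `placeholders`.
/// Names that are not defined are left untouched.
fn expand(template: &str, placeholders: &HashMap<&str, &str>) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(pos) = rest.find('$') {
        out.push_str(&rest[..pos]);
        let after = &rest[pos + 1..];
        // assumption: a placeholder name is a run of identifier chars
        let end = after
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
            .unwrap_or(after.len());
        let name = &after[..end];
        match placeholders.get(name) {
            Some(value) => out.push_str(value),
            None => {
                out.push('$');
                out.push_str(name);
            }
        }
        rest = &after[end..];
    }
    out.push_str(rest);
    out
}
```

With `expr` bound to `inner`, expanding `foo($expr)` yields `foo(inner)`, matching the comment in the rule above.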

Example Usage

The main entrypoint is nbnf::nbnf, a proc macro that expands to parsers generated from the given grammar. Note that the grammar must be passed as a string (preferably a raw string), because some expressions that are valid in a grammar are not valid Rust tokens (e.g. the unbalanced quote in [^"]).

use nbnf::nbnf;

nbnf!(r#"
    top = ~('a' top 'b') / ~&;
"#);

fn main() {
    let input = "aabbc";
    let (rest, output) = top.parse(input).unwrap();
    assert_eq!(rest, "c");
    assert_eq!(output, "aabb");
}
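
For readers who want to see what the `top` rule is doing without the macro, here is a hand-written, std-only equivalent. It is a sketch of the rule's semantics (ordered choice, with the empty match as fallback), not the code the macro expands to; `top_len` and `balanced_sketch` are hypothetical names.

```rust
/// Number of bytes matched by `('a' top 'b') / &` at the start of `input`.
fn top_len(input: &str) -> usize {
    // first alternative: 'a' top 'b'
    if let Some(after_a) = input.strip_prefix('a') {
        let inner = top_len(after_a);
        if after_a[inner..].starts_with('b') {
            return 1 + inner + 1;
        }
    }
    // second alternative: `&`, the empty match, which always succeeds
    0
}

/// Mirrors the `~` (recognize) wrappers: returns `(rest, matched)`.
fn balanced_sketch(input: &str) -> (&str, &str) {
    let (matched, rest) = input.split_at(top_len(input));
    (rest, matched)
}
```

On the input `"aabbc"` this returns `("c", "aabb")`, the same values the assertions above check for.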
