Megaparsec Discussion #1511

louthy · 2025-11-26T12:22:46Z

louthy
Nov 26, 2025
Maintainer

I mentioned in this thread that I am working on a new Megaparsec library. It's a clone (ish) of the Haskell Megaparsec library. You can follow the progress on the v5-megaparsec branch.

The reasoning for implementing it rather than upgrading the existing LanguageExt.Parsec library is this:

The original Parsec library was built around delegates. We can't make delegates derive from K<F, A>, so I can't implement the new traits functionality with Parsec in its current form.
The original Parser<T> and Parser<I, O> types can't support bespoke error values and are not monad transformers, so they can't be combined with other monadic functionality, like IO.
The original can't support bespoke streams, so only the sanctioned PString is allowed.

Fixing the issues above would create a number of breaking changes. Although v5 is a breaking-change release, I figured a fresh parsing library, one that takes on a lot of the performance improvements of the Megaparsec architecture, would allow a slower migration over time rather than a big-bang breaking change.

There is now a full abstraction away from the underlying monad, with MonadParsecT<MP, E, S, T, M>. That means any type can become a parser.

MP is the 'self trait type'
E is the error-type
S is the stream-type
T is the token that the stream yields
M is the transformer lifted monad

The S stream type is constrained to be a TokenStream<S, T>. That means there's a trait type dedicated to bespoke streams:

public interface TokenStream<TOKENS, TOKEN>
    where TOKENS : TokenStream<TOKENS, TOKEN>
{
    public static abstract TOKENS TokenToChunk(in TOKEN token);
    public static abstract TOKENS TokensToChunk(in ReadOnlySpan<TOKEN> token);
    public static abstract ReadOnlySpan<TOKEN> ChunkToTokens(in TOKENS tokens);
    public static abstract int ChunkLength(in TOKENS tokens);
    public static abstract bool Take1(in TOKENS stream, out TOKEN head, out TOKENS tail);
    public static abstract bool Take(int amount, in TOKENS stream, out TOKENS head, out TOKENS tail);
    public static abstract void TakeWhile(Func<TOKEN, bool> predicate, in TOKENS stream, out TOKENS head, out TOKENS tail);
}

This is a trait especially designed for extremely fast 'to the metal' parsing of tokens. The TOKENS and TOKEN type parameters mean no boxing, and the use of in and out arguments keeps value copying to a minimum.

Methods like TakeWhile and Take(n, ...) also allow the likes of satisfy and other core token-wrangling combinators to be implemented as extremely low-level for-loops, which should give a significant performance boost.
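As a rough illustration of the trait described above, here is a minimal sketch of a stream type that implements it. CharSlice, StreamOps and CountWhile are hypothetical names invented for this example and are not part of the library. The post doesn't spell out the semantics of Take, so the sketch assumes the behaviour of Haskell Megaparsec's takeN_ (fail only when tokens are requested from an empty stream). It needs C# 11 or later for static abstract interface members.

```csharp
using System;

// The trait from the post, repeated so this sketch compiles on its own.
public interface TokenStream<TOKENS, TOKEN>
    where TOKENS : TokenStream<TOKENS, TOKEN>
{
    public static abstract TOKENS TokenToChunk(in TOKEN token);
    public static abstract TOKENS TokensToChunk(in ReadOnlySpan<TOKEN> token);
    public static abstract ReadOnlySpan<TOKEN> ChunkToTokens(in TOKENS tokens);
    public static abstract int ChunkLength(in TOKENS tokens);
    public static abstract bool Take1(in TOKENS stream, out TOKEN head, out TOKENS tail);
    public static abstract bool Take(int amount, in TOKENS stream, out TOKENS head, out TOKENS tail);
    public static abstract void TakeWhile(Func<TOKEN, bool> predicate, in TOKENS stream, out TOKENS head, out TOKENS tail);
}

// Hypothetical stream: a window (start + length) over a string.
// Slicing only creates a new struct value; the string itself is never copied.
public readonly struct CharSlice : TokenStream<CharSlice, char>
{
    readonly string source;
    readonly int start;
    readonly int length;

    public CharSlice(string source, int start, int length)
    {
        this.source = source;
        this.start = start;
        this.length = length;
    }

    public CharSlice(string source) : this(source, 0, source.Length) { }

    public override string ToString() =>
        source is null ? "" : source.Substring(start, length);

    public static CharSlice TokenToChunk(in char token) =>
        new(token.ToString());

    public static CharSlice TokensToChunk(in ReadOnlySpan<char> tokens) =>
        new(tokens.ToString());

    public static ReadOnlySpan<char> ChunkToTokens(in CharSlice tokens) =>
        tokens.source.AsSpan(tokens.start, tokens.length);

    public static int ChunkLength(in CharSlice tokens) =>
        tokens.length;

    public static bool Take1(in CharSlice stream, out char head, out CharSlice tail)
    {
        if (stream.length == 0) { head = default; tail = stream; return false; }
        head = stream.source[stream.start];
        tail = new CharSlice(stream.source, stream.start + 1, stream.length - 1);
        return true;
    }

    // Assumed semantics (not stated in the post): take up to `amount` tokens,
    // failing only when tokens are requested from an empty stream.
    public static bool Take(int amount, in CharSlice stream, out CharSlice head, out CharSlice tail)
    {
        if (amount <= 0) { head = new CharSlice(stream.source, stream.start, 0); tail = stream; return true; }
        if (stream.length == 0) { head = stream; tail = stream; return false; }
        var n = Math.Min(amount, stream.length);
        head = new CharSlice(stream.source, stream.start, n);
        tail = new CharSlice(stream.source, stream.start + n, stream.length - n);
        return true;
    }

    // A plain loop over a span: no enumerator and no intermediate string.
    public static void TakeWhile(Func<char, bool> predicate, in CharSlice stream, out CharSlice head, out CharSlice tail)
    {
        var span = ChunkToTokens(in stream);
        var n = 0;
        while (n < span.Length && predicate(span[n])) n++;
        head = new CharSlice(stream.source, stream.start, n);
        tail = new CharSlice(stream.source, stream.start + n, stream.length - n);
    }
}

public static class StreamOps
{
    // Generic code reaches the stream through the trait. S is resolved
    // statically, so a struct stream is never boxed behind an interface.
    public static int CountWhile<S, T>(S stream, Func<T, bool> predicate)
        where S : TokenStream<S, T>
    {
        S.TakeWhile(predicate, in stream, out var head, out _);
        return S.ChunkLength(in head);
    }
}
```

Calling StreamOps.CountWhile<CharSlice, char>(new CharSlice("123abc"), char.IsDigit) walks the three leading digits in a single loop. A real satisfy combinator would wrap the same Take1 or TakeWhile call in the parser type; that part is left out because the parser API isn't shown in the post.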

The new PString, which is a struct that holds a string value with a start position and length, can also do splicing without any memory allocation. Arr<A> has also been upgraded to do the same for generic token arrays.

There are so many more options for 'to the metal' optimisations that I couldn't do with the old approach. The trait system means we don't need to box everything, and the new ReadOnlySpan type (and the supporting keywords) means much less copying and heap allocation.

Finally, because the MonadParsecT<MP, E, S, T, M> trait and the main implementation ParsecT<E, S, T, M, A> have so many generic parameters, I'm building the parsers to mostly live within a single Module type, which should mean you can do this:

using static LanguageExt.Megaparsec.Module<Error, PString, char, IO>;

That will bring in all of the predefined parsers so they can be used without generic arguments. It gives us the flexibility of bespoke error types, bespoke stream types, bespoke token types, and the ability to lift any other monad into the stack, without having to compromise on ease of use.
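To show the language mechanism behind this, here is a small sketch of importing a closed generic static class with using static. Everything in the Sketch namespace (Module<E, T>, Matcher<E, T>, satisfy, token, Usage) is a hypothetical stand-in with only two type parameters; it is not the real LanguageExt.Megaparsec.Module, whose members aren't shown in the post.

```csharp
using System;
using System.Collections.Generic;
using static Sketch.Module<string, char>;   // fix the generic arguments once

namespace Sketch
{
    // Hypothetical result type: a token predicate tagged with an error type E.
    public sealed record Matcher<E, T>(Func<T, bool> Predicate);

    // Hypothetical module: the type parameters live on the class,
    // so the members themselves are not generic.
    public static class Module<E, T>
    {
        public static Matcher<E, T> satisfy(Func<T, bool> predicate) =>
            new(predicate);

        public static Matcher<E, T> token(T expected) =>
            new(t => EqualityComparer<T>.Default.Equals(t, expected));
    }

    public static class Usage
    {
        // No generic arguments at the call sites: E = string and T = char
        // were already chosen by the using static directive above.
        public static readonly Matcher<string, char> digit = satisfy(char.IsDigit);
        public static readonly Matcher<string, char> comma = token(',');
    }
}
```

The design choice is that the cost of naming every type parameter is paid once per file, in the using static line, rather than at every call.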

Currently, these are all just words; the project is still very much a work in progress. But I was asked a question about it on another thread, so I thought I'd give a bit more detail.

Replies: 0 comments
