Megaparsec Discussion #1511
louthy
announced in
Announcements
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
I mentioned in this thread that I am working on a new Megaparsec library. It's a clone (ish) of the Haskell Megaparsec library. You can follow the progress on the
v5-megaparsecbranch.The reasoning for implementing it rather than upgrading the existing
LanguageExt.Parseclibrary is this:Parseclibrary was built around delegates. We can't make delegates derive fromK<F, A>and so i can implement the new traits functionality withParsecin its current formParser<T>andParser<I, O>types can't support bespoke error-values and are not monad-transformers, so can't be combined with other monadic functionality, likeIO.PStringis allowedTo add support to fix the issues above would create a number of breaking changes. And although
v5is a breaking-change release, I figured a fresh parsing library, that takes in a lot of the performance improvements of theMegaparsecarchitecture, would allow a slower migration over time rather than a big-bang breaking change.There is now a full abstraction away from the underlying monad, with
MonadParsecT<MP, E, S, T, M>. That means any type can become a parser.MPis the 'self trait type'Eis the error-typeSis the stream-typeTis the token that the stream yieldsMis the transformer lifted monadThe
Sstream type is constrained to be aTokenStream<S, T>. That means there's a trait type dedicated to bespoke streams:This is a trait especially designed for extremely fast 'to the metal' parsing of tokens. The
TOKENSandTOKENparameters guarantees no boxing and the use ofinandoutarguments guarantees that the minimum amount of value copying happens.Methods like
TakeWhileandTake(n, ...)also allow for the likes ofsatisfyand other core token-wrangling combinators to be implemented as extremely low-level for-loops, guaranteeing a significant performance boost.The new
PString, which is astructthat holds astringvalue with a start position and length, can also do splicing without any memory allocation.Arr<A>has also been upgraded to do the same for generic token arrays.There are so many more options to do 'to the metal' optimisatons that I couldn't do with the old approach. The trait system means we don't need to box everything and the new
ReadOnlySpantype (and the supporting keywords) means much less copying and heap allocation.Finally, because
MonadParsecT<MP, E, S, T, M>trait and the main implementationParsecT<E, S, T, M, A>have so many generic parameters, I'm building the parsers to, mostly, be within a singleModuletype, which should mean you can do this:Which will bring in all of the predefined parsers so they can be used without generic arguments. That gives us the flexibility of having bespoke error types, bespoke stream types, bespoke token types, and the ability lift any other monad into the stack; without having to compromise on the ease-of-use.
Currently, these are all just words, the project is still very-WIP. But I was asked a question about it on another thread, so I thought I'd give a bit more detail.
Beta Was this translation helpful? Give feedback.
All reactions