  -- Portability : GHC
  --
  -- See the general notes about parsing in the "Streamly.Data.Parser" module.
- -- This module implements a using Continuation Passing Style (CPS) wrapper over
- -- the "Streamly.Data.Parser" module. It is as fast or faster than attoparsec.
- --
- -- Streamly parsers support all operations offered by popular Haskell parser
- -- libraries. They operate on a generic input type, support streaming, and are
- -- faster.
+ -- This (ParserK) module implements a Continuation Passing Style (CPS) wrapper
+ -- over the fused "Streamly.Data.Parser" module. It is a faster CPS parser than
+ -- attoparsec.
  --
  -- The 'ParserK' type represents a stream-consumer as a composition of function
  -- calls, therefore, a function call overhead is incurred at each composition.
- -- It is reasonably fast in general but may be a few times slower than a fused
- -- parser represented by the 'Streamly.Data.Parser.Parser' type. However, it
+ -- It is reasonably fast in general but may be a few times slower than the
+ -- fused 'Streamly.Data.Parser.Parser' type. However, unlike fused parsers, it
  -- allows for scalable dynamic composition, especially, 'ParserK' can be used
  -- in recursive calls. Operations like 'splitWith' on 'ParserK' type have
  -- linear (O(n)) performance with respect to the number of compositions.
  --
- -- 'ParserK' is preferred over 'Streamly.Data.Parser.Parser' when extensive
- -- applicative, alternative and monadic composition is required, or when
- -- recursive or dynamic composition of parsers is required. 'ParserK' also
- -- allows efficient parsing of a stream of arrays, it can also break the input
- -- stream into a parse result and remaining stream so that the stream can be
- -- parsed independently in segments.
- --
- -- == Using ParserK
- --
- -- All the parsers from the "Streamly.Data.Parser" module can be converted to
- -- ParserK using the 'Streamly.Data.Array.parserK',
- -- 'Streamly.Internal.Data.ParserK.parserK', and
- -- 'Streamly.Internal.Data.Array.Generic.parserK' combinators.
- --
- -- 'Streamly.Data.Array.parse' runs a parser on a stream of unboxed
- -- arrays, this is the preferred and most efficient way to parse chunked input.
- -- The more general 'Streamly.Data.Array.parseBreak' function returns
- -- the remaining stream as well along with the parse result. There are
- -- 'Streamly.Internal.Data.Array.Generic.parse',
- -- 'Streamly.Internal.Data.Array.Generic.parseBreak' as well to run
- -- parsers on boxed arrays. 'Streamly.Internal.Data.StreamK.parse',
- -- 'Streamly.Internal.Data.StreamK.parseBreak' run parsers on a stream of
- -- individual elements instead of stream of arrays.
+ -- 'ParserK' is preferred over the fused 'Streamly.Data.Parser.Parser' when
+ -- extensive applicative, alternative and monadic composition is required, or
+ -- when recursive or dynamic composition of parsers is required. 'ParserK' also
+ -- allows efficient parsing of a stream of byte arrays, it can also break the
+ -- input stream into a parse result and the remaining stream so that the stream
+ -- can be parsed independently in segments.
  --
- -- == Monadic Composition
+ -- == How to parse a stream?
  --
- -- Monad composition can be used for lookbehind parsers, we can dynamically
- -- compose new parsers based on the results of the previously parsed values.
+ -- All the fused parsers from the "Streamly.Data.Parser" module can be
+ -- converted to the CPS ParserK, for use with different types of parser
+ -- drivers, using
+ -- the @parserK@ combinators - Streamly.Data.Array.'Streamly.Data.Array.parserK',
+ -- Streamly.Data.StreamK.'Streamly.Data.StreamK.parserK', and
+ -- Streamly.Data.Array.Generic.'Streamly.Data.Array.Generic.parserK'.
+ --
+ -- To parse a stream of unboxed arrays, use
+ -- Streamly.Data.Array.'Streamly.Data.Array.parse' for running the parser, this
+ -- is the preferred and most efficient way to parse chunked input. The
+ -- Streamly.Data.Array.'Streamly.Data.Array.parseBreak' function returns the
+ -- remaining stream as well along with the parse result.
+ --
+ -- To parse a stream of boxed arrays, use
+ -- Streamly.Data.Array.Generic.'Streamly.Data.Array.Generic.parse' or
+ -- Streamly.Data.Array.Generic.'Streamly.Data.Array.Generic.parseBreak' to run
+ -- the parser.
+ --
+ -- To parse a stream of individual elements, use
+ -- Streamly.Data.StreamK.'Streamly.Data.StreamK.parse' and
+ -- Streamly.Data.StreamK.'Streamly.Data.StreamK.parseBreak' to run the parser.
  --
- -- If we have to parse "a9" or "9a" but not "99" or "aa" we can use the
- -- following non-monadic, backtracking parser:
+ -- == Applicative Composition
  --
- -- >>> digits p1 p2 = ((:) <$> p1 <*> ((:) <$> p2 <*> pure []))
+ -- Applicative parsers are simpler but we cannot use lookbehind as we can in
+ -- the monadic parsers.
+ --
+ -- If we have to parse "9a" or "a9" but not "99" or "aa" we can use the
+ -- following non-monadic (Applicative), backtracking parser:
+ --
+ -- >>> -- parse p1 : p2 : []
+ -- >>> token p1 p2 = ((:) <$> p1 <*> ((:) <$> p2 <*> pure []))
  -- >>> :{
  -- backtracking :: Monad m => ParserK Char m String
- -- backtracking = ParserK.parserK $
- --     digits (Parser.satisfy isDigit) (Parser.satisfy isAlpha)
+ -- backtracking = StreamK.parserK $
+ --     token (Parser.satisfy isDigit) (Parser.satisfy isAlpha) -- e.g. "9a"
  --     <|>
- --     digits (Parser.satisfy isAlpha) (Parser.satisfy isDigit)
+ --     token (Parser.satisfy isAlpha) (Parser.satisfy isDigit) -- e.g. "a9"
  -- :}
  --
- -- We know that if the first parse resulted in a digit at the first place then
- -- the second parse is going to fail. However, we waste that information and
- -- parse the first character again in the second parse only to know that it is
- -- not an alphabetic char. By using lookbehind in a 'Monad' composition we can
- -- avoid redundant work:
+ -- == Monadic Composition
+ --
+ -- Monad composition can be used to implement lookbehind parsers, we can dynamically
+ -- compose new parsers based on the results of the previously parsed values.
+ --
+ -- In the previous example, we know that if the first parse resulted in a digit
+ -- at the first place then the second parse is going to fail. However, we
+ -- waste that information and parse the first character again in the second
+ -- parse only to know that it is not an alphabetic char. By using lookbehind
+ -- in a 'Monad' composition we can avoid redundant work:
  --
  -- >>> data DigitOrAlpha = Digit Char | Alpha Char
  --
  -- >>> :{
  -- lookbehind :: Monad m => ParserK Char m String
  -- lookbehind = do
- --     x1 <- ParserK.parserK $
+ --     x1 <- StreamK.parserK $
  --          Digit <$> Parser.satisfy isDigit
  --          <|> Alpha <$> Parser.satisfy isAlpha
  --     -- Note: the parse depends on what we parsed already
- --     x2 <- ParserK.parserK $
+ --     x2 <- StreamK.parserK $
  --          case x1 of
  --              Digit _ -> Parser.satisfy isAlpha
  --              Alpha _ -> Parser.satisfy isDigit
@@ -105,11 +115,8 @@ module Streamly.Data.ParserK
        ParserK

      -- * Parsers
-     -- ** Conversions
-     , parserK
-     -- , toParser

-     -- ** Without Input
+     -- -- ** Without Input
      , fromPure
      , fromEffect
      , die
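
The difference between the applicative (backtracking) and monadic (lookbehind) parsers described in the doc comment can be tried out without streamly installed. The sketch below is only an illustration under stated assumptions: it uses `Text.ParserCombinators.ReadP` from `base` as a stand-in for `ParserK`, so none of it is the streamly API. The `run` helper is a hypothetical name introduced here, and `ReadP`'s `<|>` explores both alternatives rather than backtracking on failure, which gives the same accepted inputs for this grammar.

```haskell
import Control.Applicative ((<|>))
import Data.Char (isAlpha, isDigit)
import Text.ParserCombinators.ReadP (ReadP, eof, readP_to_S, satisfy)

-- Same shape as the 'token' helper in the doc comment: parse p1 : p2 : []
token :: ReadP Char -> ReadP Char -> ReadP String
token p1 p2 = (:) <$> p1 <*> ((:) <$> p2 <*> pure [])

-- Applicative version: two independent alternatives, so the first
-- character is examined once per alternative.
backtracking :: ReadP String
backtracking =
        token (satisfy isDigit) (satisfy isAlpha) -- e.g. "9a"
    <|> token (satisfy isAlpha) (satisfy isDigit) -- e.g. "a9"

data DigitOrAlpha = Digit Char | Alpha Char

-- Monadic version: the second parser is chosen from the first result,
-- so the first character is examined only once.
lookbehind :: ReadP String
lookbehind = do
    x1 <- Digit <$> satisfy isDigit <|> Alpha <$> satisfy isAlpha
    case x1 of
        Digit c -> (\d -> [c, d]) <$> satisfy isAlpha
        Alpha c -> (\d -> [c, d]) <$> satisfy isDigit

-- Hypothetical helper (not part of any library): succeed only when the
-- whole input is consumed with exactly one result.
run :: ReadP a -> String -> Maybe a
run p s = case readP_to_S (p <* eof) s of
    [(x, "")] -> Just x
    _         -> Nothing

main :: IO ()
main = mapM_ report ["9a", "a9", "99", "aa"]
  where
    report s = print (s, run backtracking s, run lookbehind s)
```

Both parsers accept "9a" and "a9" and reject "99" and "aa"; they differ only in how much work is repeated on the first character.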