Commit db3f4f3

Update the module, function docs, README
1 parent 9d5f763 commit db3f4f3

3 files changed, 76 additions and 27 deletions

README.md

Lines changed: 21 additions & 7 deletions
@@ -1,12 +1,26 @@
 # streamly-text

-Library for streamly and text interoperation.
+Efficient interoperability between
+[streamly](https://hackage.haskell.org/package/streamly) arrays and
+[text](https://hackage.haskell.org/package/text).

-This library is to enable interoperation of streamly with existing code that
-uses `Text`.
+The strict `Text` type is equivalent to UTF-8 encoded `Array Word8` in Streamly
+and lazy `Text` type is equivalent to a stream of `Array Word8`.

-The package provides APIs to interconvert between strict `Text` and streamly
-`Array Word8` and between lazy `Text` and stream of `Array Word8`.
+A `Char` stream can be converted to UTF-8 encoded `Word8` stream using
+`encodeUtf8` from `Streamly.Unicode.Stream` which in turn can be written as
+`Array Word8`, and a stream of UTF-8 encoded `Word8` or `Array Word8` can be
+decoded using `decodeUtf8` or `decodeUtf8Chunks`.

-The interconversion in the case of strict `Text` and streamly `Array Word8` has
-no overhead.
+This library provides zero-overhead and streaming conversions between
+the `Text` type and `streamly` Array types, making it easier to use
+Array and Array stream based functions on `Text`.
+
+## Features
+
+- **Strict `Text` ↔ `Array Word8`**
+  Convert between strict `Text` and `streamly`’s `Word8` stream or
+  `Array Word8` without any overhead.
+
+- **Lazy `Text` ↔ Stream of `Array Word8`**
+  Convert between lazy `Text` and a stream of `Array Word8`.
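
The strict conversion described in the README can be sketched as follows. This is a minimal illustration, not code from the commit. It assumes `toArray :: Text -> Array Word8` and `unsafeFromArray :: Array Word8 -> Text`, which is how the lazy module in this same commit uses them, and a `text` version that stores UTF-8 internally. The names `sample`, `roundTrip`, `byteCount`, and `decodeViaArray` are hypothetical.

```haskell
import Data.Functor.Identity (runIdentity)
import Data.Text (Text)
import qualified Data.Text as Text
import qualified Streamly.Compat.Text as StrictText
import qualified Streamly.Data.Array as Array
import qualified Streamly.Data.Stream as Stream
import qualified Streamly.Unicode.Stream as Unicode

-- Hypothetical sample value used only for illustration.
sample :: Text
sample = Text.pack "streamly λ"

-- Strict Text -> Array Word8 -> strict Text.
-- unsafeFromArray is safe here because the bytes came from a valid Text.
roundTrip :: Text -> Text
roundTrip = StrictText.unsafeFromArray . StrictText.toArray

-- Number of UTF-8 bytes backing the Text (not the number of characters).
byteCount :: Text -> Int
byteCount = Array.length . StrictText.toArray

-- Read the bytes of the array as a stream and decode them back to characters.
decodeViaArray :: Text -> String
decodeViaArray t =
    runIdentity
        . Stream.toList
        . Unicode.decodeUtf8
        $ Stream.unfold Array.reader (StrictText.toArray t)
```

In GHCi, `roundTrip sample == sample` and `decodeViaArray sample == Text.unpack sample` are the properties one would expect to hold under these assumptions.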

src/Streamly/Compat/Text.hs

Lines changed: 20 additions & 5 deletions
@@ -3,14 +3,29 @@
 {-# LANGUAGE MagicHash #-}
 {-# LANGUAGE BangPatterns #-}

+-- | Efficient interoperability between
+-- <https://hackage.haskell.org/package/streamly streamly> arrays and
+-- <https://hackage.haskell.org/package/text text>.
+--
+-- The strict 'Text' type is equivalent to a UTF-8 encoded 'Array' 'Word8' in
+-- streamly. A 'Char' stream can be converted to a UTF-8 encoded 'Word8' stream
+-- using 'Streamly.Unicode.Stream.encodeUtf8', which in turn can
+-- be written as 'Array' 'Word8'. A stream of UTF-8 encoded 'Word8' or
+-- 'Array' 'Word8' can be decoded using 'Streamly.Unicode.Stream.decodeUtf8' or
+-- 'Streamly.Unicode.Stream.decodeUtf8Chunks', respectively.
+--
+-- This module provides zero-overhead conversion between strict 'Text'
+-- and streamly’s 'Word8' streams or 'Array' 'Word8'.
+
 module Streamly.Compat.Text
-    ( toArray
-    , unsafeFromArray
+    (
+    -- * Construction
+      unsafeFromArray
+    , unsafeCreate

+    -- * Elimination
+    , toArray
     , reader
-
-    -- , unsafeCreateOf
-    , unsafeCreate
     )
 where
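
The construction path mentioned in the module documentation (a `Char` stream encoded to UTF-8 bytes, collected into an `Array Word8`, then viewed as `Text`) might look roughly like this. It is a sketch under assumptions: `unsafeFromArray :: Array Word8 -> Text` is inferred from its use in the lazy module, the bytes are collected through an intermediate list with `Array.fromList` only to keep the example independent of the array-building fold's name in a given streamly version, and `charsToText` is a hypothetical helper.

```haskell
import Data.Functor.Identity (runIdentity)
import Data.Text (Text)
import qualified Data.Text as Text
import qualified Streamly.Compat.Text as StrictText
import qualified Streamly.Data.Array as Array
import qualified Streamly.Data.Stream as Stream
import qualified Streamly.Unicode.Stream as Unicode

-- Encode a list of characters to UTF-8 bytes with streamly, pack the bytes
-- into an Array Word8, and reinterpret that array as a strict Text.
-- The array holds the output of encodeUtf8, so it is UTF-8 encoded as
-- unsafeFromArray requires.
charsToText :: String -> Text
charsToText cs =
    let bytes =
            runIdentity
                . Stream.toList
                . Unicode.encodeUtf8
                $ Stream.fromList cs
     in StrictText.unsafeFromArray (Array.fromList bytes)
```

For well-formed input, `charsToText "héllo"` is expected to equal `Text.pack "héllo"`. A production version would more likely fold the byte stream directly into an array instead of going through a list.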

src/Streamly/Compat/Text/Lazy.hs

Lines changed: 35 additions & 15 deletions
@@ -1,12 +1,29 @@
 {-# LANGUAGE CPP #-}

+-- | Efficient interoperability between
+-- <https://hackage.haskell.org/package/streamly streamly> arrays and
+-- <https://hackage.haskell.org/package/text text>.
+--
+-- The lazy 'Text' type is equivalent to a UTF-8 encoded stream of 'Array
+-- Word8' in streamly. A 'Char' stream can be converted to a UTF-8 encoded
+-- 'Word8' stream using 'Streamly.Unicode.Stream.encodeUtf8', which in turn can
+-- be written as 'Array' 'Word8'. A stream of UTF-8 encoded 'Word8' or 'Array'
+-- 'Word8' can be decoded using 'Streamly.Unicode.Stream.decodeUtf8' or
+-- 'Streamly.Unicode.Stream.decodeUtf8Chunks', respectively.
+--
+-- This module provides zero-overhead conversion between lazy 'Text' and
+-- streamly’s 'Array Word8' or 'Word8' streams.
+
 module Streamly.Compat.Text.Lazy
-    ( chunkReader
-    , reader
+    (
+    -- * Construction
+      unsafeFromChunksIO
+    , unsafeFromChunks

+    -- * Elimination
+    , reader
     , toChunks
-    , unsafeFromChunks
-    , unsafeFromChunksIO
+    , chunkReader
     )
 where

@@ -33,7 +50,7 @@ import Prelude hiding (read)
 #define UNFOLD_EACH Unfold.many
 #endif

--- | Unfold a lazy 'Text' to a stream of 'Array' 'Words'.
+-- | Unfold a lazy 'Text' to a stream of 'Array Word8'.
 {-# INLINE chunkReader #-}
 chunkReader :: Monad m => Unfold m Text (Array Word8)
 chunkReader = Unfold step seed
@@ -42,26 +59,25 @@ chunkReader = Unfold step seed
     step (Chunk bs bl) = return $ Yield (Strict.toArray bs) bl
     step Empty = return Stop

--- | Unfold a lazy 'Text' to a stream of Word8
+-- | Unfold a lazy 'Text' to a stream of 'Word8'.
 {-# INLINE reader #-}
 reader :: Monad m => Unfold m Text Word8
 reader = UNFOLD_EACH Array.reader chunkReader

 -- XXX Should this be called readChunks?
--- | Convert a lazy 'Text' to a serial stream of 'Array' 'Word8'.
+-- | Convert a lazy 'Text' to a stream of 'Array Word8'.
 {-# INLINE toChunks #-}
 toChunks :: Monad m => Text -> Stream m (Array Word8)
 toChunks = Stream.unfold chunkReader

--- | Convert a serial stream of 'Array' 'Word8' to a lazy 'Text'.
+-- | IMPORTANT NOTE: This function is lazy only for lazy monads (e.g.
+-- Identity). For strict monads (e.g. /IO/) it consumes the entire input before
+-- generating the output. For /IO/ monad use 'unsafeFromChunksIO' instead.
 --
--- This function is unsafe: the caller must ensure that each 'Array' 'Word8'
--- element in the stream is a valid UTF-8 encoding.
+-- Convert a stream of 'Array' 'Word8' to a lazy 'Text'.
 --
--- IMPORTANT NOTE: This function is lazy only for lazy monads
--- (e.g. Identity). For strict monads (e.g. /IO/) it consumes the entire input
--- before generating the output. For /IO/ monad please use unsafeFromChunksIO
--- instead.
+-- Unsafe because the caller must ensure that each 'Array Word8'
+-- in the stream is UTF-8 encoded and terminates at Char boundary.
 --
 -- For strict monads like /IO/ you could create a newtype wrapper to make the
 -- monad bind operation lazy and lift the stream to that type using hoist, then
@@ -80,6 +96,7 @@ toChunks = Stream.unfold chunkReader
 -- @
 --
 -- /unsafeFromChunks/ can then be used as,
+--
 -- @
 -- {-# INLINE unsafeFromChunksIO #-}
 -- unsafeFromChunksIO :: Stream IO (Array Word8) -> IO Text
@@ -89,8 +106,11 @@ toChunks = Stream.unfold chunkReader
 unsafeFromChunks :: Monad m => Stream m (Array Word8) -> m Text
 unsafeFromChunks = Stream.foldr chunk Empty . fmap Strict.unsafeFromArray

--- | Convert a serial stream of 'Array' 'Word8' to a lazy 'Text' in the
+-- | Convert a stream of 'Array Word8' to a lazy 'Text' in the
 -- /IO/ monad.
+--
+-- Unsafe because the caller must ensure that each 'Array Word8'
+-- in the stream is UTF-8 encoded and terminates at Char boundary.
 {-# INLINE unsafeFromChunksIO #-}
 unsafeFromChunksIO :: Stream IO (Array Word8) -> IO Text
 unsafeFromChunksIO =
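
To make the lazy-module documentation above concrete, here is a small usage sketch built only from the signatures visible in this diff (`toChunks`, `unsafeFromChunks`, `unsafeFromChunksIO`). The helper names `chunkSizes`, `roundTripIO`, and `roundTripPure` are hypothetical, and the sketch assumes the module's `Text` is the lazy `Text` from `Data.Text.Lazy`. The chunks come straight from a valid lazy `Text`, so the UTF-8 and `Char`-boundary preconditions of the unsafe functions hold.

```haskell
import Data.Functor.Identity (runIdentity)
import qualified Data.Text.Lazy as Lazy
import qualified Streamly.Compat.Text.Lazy as LazyText
import qualified Streamly.Data.Array as Array
import qualified Streamly.Data.Stream as Stream

-- Byte length of each chunk of a lazy Text, in order.
chunkSizes :: Lazy.Text -> IO [Int]
chunkSizes = Stream.toList . fmap Array.length . LazyText.toChunks

-- Lazy Text -> stream of arrays -> lazy Text, in IO.
-- unsafeFromChunksIO is the variant the docs recommend for the IO monad.
roundTripIO :: Lazy.Text -> IO Lazy.Text
roundTripIO = LazyText.unsafeFromChunksIO . LazyText.toChunks

-- The same round trip in a lazy monad (Identity), where the docs say
-- unsafeFromChunks produces its output lazily.
roundTripPure :: Lazy.Text -> Lazy.Text
roundTripPure = runIdentity . LazyText.unsafeFromChunks . LazyText.toChunks
```

Both round trips are expected to return a value equal to the input, and the sum of `chunkSizes` should equal the total number of UTF-8 bytes in the input.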
