Is your feature request related to a problem? Please describe.
Tokenization is currently implemented as unary only. We would like to support a bidirectional streaming case in which chunks of text are aggregated before tokenization/splitting.
Describe the solution you'd like
Implement the bidirectional streaming path on the regex sentence splitter: buffer incoming chunks, emit complete sentences as boundaries are found, and carry any trailing partial sentence over to the next chunk.
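A minimal sketch of the intended buffering behavior, independent of any particular streaming framework. The function name and the sentence-boundary regex are illustrative assumptions, not part of the existing codebase:

```python
import re
from typing import Iterable, Iterator

# Assumed boundary pattern: split after ., !, or ? followed by whitespace.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")

def split_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Aggregate streamed text chunks and yield complete sentences.

    Chunks are appended to a buffer; whenever the buffer contains one
    or more sentence boundaries, the complete sentences are yielded and
    the trailing (possibly partial) sentence is retained for the next
    chunk. Any leftover text is flushed when the stream ends.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _BOUNDARY.split(buffer)
        # Every part except the last ends at a boundary; the last may be partial.
        for sentence in parts[:-1]:
            yield sentence
        buffer = parts[-1]
    if buffer.strip():
        yield buffer
```

For example, a sentence split across chunk boundaries is reassembled before being emitted:

```python
list(split_stream(["Hello wor", "ld. How a", "re you? Fin", "e."]))
# → ["Hello world.", "How are you?", "Fine."]
```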