Is your feature request related to a problem? Please describe.
Tokenization is currently implemented as unary only. We would like to support a bidirectional streaming case in which chunks of text are aggregated before tokenization/splitting.
Describe the solution you'd like
Implement the bidirectional streaming path on the regex sentence splitter: buffer incoming chunks, emit complete sentences as boundaries are found, and carry any trailing partial sentence over to the next chunk.
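A minimal sketch of the intended buffering behavior, independent of any particular streaming framework. The function name and the sentence-boundary regex are illustrative assumptions, not part of the existing codebase:

```python
import re
from typing import Iterable, Iterator

# Assumed boundary pattern: split after ., !, or ? followed by whitespace.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")

def split_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Aggregate streamed text chunks and yield complete sentences.

    Chunks are appended to a buffer; whenever the buffer contains one
    or more sentence boundaries, the complete sentences are yielded and
    the trailing (possibly partial) sentence is retained for the next
    chunk. Any leftover text is flushed when the stream ends.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _BOUNDARY.split(buffer)
        # Every part except the last ends at a boundary; the last may be partial.
        for sentence in parts[:-1]:
            yield sentence
        buffer = parts[-1]
    if buffer.strip():
        yield buffer
```

For example, a sentence split across chunk boundaries is reassembled before being emitted:

```python
list(split_stream(["Hello wor", "ld. How a", "re you? Fin", "e."]))
# → ["Hello world.", "How are you?", "Fine."]
```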