We are using the MeCab tokenizer to split Japanese sentences into individual words.
Issue:
The word
食べてしまいます
gets split into several fragments (with the default IPA dictionary, typically 食べ / て / しまい / ます), which is rather difficult for readers to understand.
Ideally, we should use a parser that understands Japanese conjugation at a higher level, so that a conjugated verb phrase is presented as one word.
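As a stopgap before switching parsers, one option is to post-process MeCab's output and glue conjugation fragments back onto the preceding verb. The sketch below is a minimal, hypothetical illustration: the token list and POS labels are assumed (IPA-dictionary-style tags, not actual output captured from our pipeline), and the merge rules cover only the common cases (auxiliary verbs, conjunctive particles, non-independent verbs).

```python
# Hypothetical post-processing sketch: merge MeCab-style conjugation
# fragments back into the preceding verb. POS labels follow IPA dictionary
# conventions (pos1 = coarse tag, pos2 = subtype); the example tokens below
# are an assumed segmentation, not captured pipeline output.

def merge_conjugations(tokens):
    """Merge auxiliaries and conjunctive fragments into the previous word.

    tokens: list of (surface, pos1, pos2) triples.
    Returns a list of merged surface strings.
    """
    # Fragments that should attach to the word before them:
    #   助動詞            = auxiliary verb (e.g. ます)
    #   助詞/接続助詞      = conjunctive particle (e.g. て)
    #   動詞/非自立        = non-independent verb (e.g. しまい)
    glue_pairs = {("助詞", "接続助詞"), ("動詞", "非自立")}
    merged = []
    for surface, pos1, pos2 in tokens:
        if merged and (pos1 == "助動詞" or (pos1, pos2) in glue_pairs):
            merged[-1] += surface  # attach fragment to the previous word
        else:
            merged.append(surface)
    return merged

# Assumed IPA-dictionary segmentation of 食べてしまいます:
tokens = [
    ("食べ", "動詞", "自立"),      # verb stem
    ("て", "助詞", "接続助詞"),    # conjunctive particle
    ("しまい", "動詞", "非自立"),  # non-independent verb
    ("ます", "助動詞", "*"),       # polite auxiliary
]
print(merge_conjugations(tokens))  # → ['食べてしまいます']
```

This only papers over the symptom; a parser with real knowledge of conjugation would handle edge cases (negation, potential forms, compound auxiliaries) that a rule table like this will miss.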