Releases: meilisearch/charabia

Charabia v0.9.9

24 Nov 13:34
5b8f34a

Changes

Thanks again to @ManyTheFish, @curquiza, and @dependabot[bot]! 🎉

Charabia v0.9.8

06 Nov 17:18
8d0c0f8

Changes

Thanks again to @JinheLin, @Kerollmops, @ManyTheFish, @dependabot[bot], and @meili-bors[bot]! 🎉

Charabia v0.9.7

21 Aug 07:31
70fd954

Changes

  • Fix jieba-rs requirement (#348) @agourlay
  • Add Persian language support with normalization and segmentation (#350) @ja7ad

Thanks again to @ManyTheFish, @agourlay, @ja7ad! 🎉

Charabia v0.9.6

03 Jun 08:32
9fc2185

Changes

  • Update Lindera to 0.43.0 (#344) @mosuka
  • feat: Fine-tuning of German dictionary for some sports-related composite words (#345) @luflow

Thanks again to @luflow and @mosuka! 🎉

Charabia v0.9.5

21 May 09:13
aecce7f

Changes

  • Hotfix: update Lindera to 0.42.3, which removes the native TLS dependency

Thanks again to @ManyTheFish! 🎉

Charabia v0.9.4

13 May 10:03
a77d5ac

Changes

Thanks again to @HDT3213, @Kerollmops, @ManyTheFish, @Nickersoft, and @slatian! 🎉

Charabia v0.9.3

24 Mar 12:43
e95cc9e

Changes

Thanks again to @Kerollmops, @ManyTheFish, @NarHakobyan, @curquiza, @dependabot[bot], @meili-bors[bot], and @mosuka! 🎉

Charabia v0.9.2

27 Nov 09:58
93a22f0

Changes

Thanks again to @ManyTheFish, @PedroTurik, @dependabot, @dependabot[bot], @dqkqd, @meili-bors[bot], and @tats-u! 🎉

Charabia v0.9.1

19 Sep 10:01
2d90e4c

Changes

  • Add Turkish normalizer (#305) @tkhshtsh0917
  • feat: Adds German compound words decomposition with new segmenter (#303) @luflow
  • German: Adds some more test cases and updates dictionary (#306) @luflow

Thanks again to @ManyTheFish, @luflow, @meili-bors[bot], and @tkhshtsh0917! 🎉

Charabia v0.9.0

25 Jul 13:52
9854134

Changes

(BREAKING) Simplify lang detection (#299) @ManyTheFish

  • The Language allow_list changes from a HashMap<Script, Vec<Language>> to a slice of Language: &[Language].
  • Add the tokenize_with_allow_list method to the Tokenizer, which accepts a Language allow list at call time so the tokenizer does not have to be rebuilt.
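
For anyone migrating an existing allow list, the shape change can be sketched as follows. This is a minimal illustration, not charabia's own code: the Script and Language enums are small local stand-ins for the crate's types, and flatten_allow_list and is_allowed are hypothetical helpers. It only shows how a per-script map collapses into the flat slice the new API expects.

```rust
use std::collections::HashMap;

// Local stand-ins for illustration only. These are NOT charabia's own
// `Script` and `Language` enums, which have many more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
enum Script {
    Latin,
    Cj,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
enum Language {
    Eng,
    Fra,
    Jpn,
}

/// Hypothetical migration helper: flattens the old per-script map into a
/// flat list, sorted and de-duplicated so the result is deterministic.
fn flatten_allow_list(old: &HashMap<Script, Vec<Language>>) -> Vec<Language> {
    let mut flat: Vec<Language> = old.values().flatten().copied().collect();
    flat.sort();
    flat.dedup();
    flat
}

/// Hypothetical stand-in for a function that takes the new shape,
/// a plain slice of languages.
fn is_allowed(allow_list: &[Language], candidate: Language) -> bool {
    allow_list.contains(&candidate)
}
```

A map holding Latin → [Fra, Eng] and Cj → [Jpn] flattens to [Eng, Fra, Jpn], and a borrow of that vector (&flat) is what a slice-based parameter would receive.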

Add math symbols to default separators (#301) @phillitrOSU

Adds all math symbols from https://www.compart.com/en/unicode/category/Sm to the default separator list.
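
The effect on word boundaries can be sketched with a toy splitter. This is an illustration under assumptions, not charabia's implementation: MATH_SEPARATORS is a small hand-picked subset of the Sm category rather than the full list the release adds, split_words is a hypothetical helper, and it simply drops separators, which is a simplification.

```rust
// A tiny, hand-picked subset of Unicode "Sm" (Symbol, math) characters.
// Illustrative only: this is not charabia's actual separator table.
const MATH_SEPARATORS: &[char] = &['+', '<', '=', '>', '|', '~', '¬', '±', '×', '÷'];

/// Hypothetical helper: splits `text` on whitespace and on the math
/// symbols listed above, discarding empty pieces.
fn split_words(text: &str) -> Vec<&str> {
    text.split(|c: char| c.is_whitespace() || MATH_SEPARATORS.contains(&c))
        .filter(|piece| !piece.is_empty())
        .collect()
}
```

With this sketch, "a+b=c" splits into "a", "b", and "c" instead of staying a single word.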

Thanks again to @ManyTheFish, @meili-bors[bot], and @phillitrOSU! 🎉