Releases: Michael-JB/bm25

v2.3.2

07 Sep 09:24

Changed

  • Bump rayon from 1.10.0 to 1.11.0
  • Bump stop-words from 0.8.1 to 0.9.0

Full Changelog: v2.3.1...v2.3.2

v2.3.1

02 Aug 08:41

Changed

  • Bump cached from 0.55.1 to 0.56.0

Full Changelog: v2.3.0...v2.3.1

v2.3.0

28 Jun 10:36

Fixed

  • Fix negative scoring of high-frequency terms. Scores returned by this version
    will differ from the previous version, hence this is a minor version bump
    rather than a patch. This closes the bug raised in #20. Thank you to hwiorn
    for this contribution!
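
The release note does not say how the fix was implemented. As background, one well-known way BM25 scores turn negative for high-frequency terms is the classic IDF formula, which drops below zero once a term appears in more than half of the documents. The sketch below contrasts it with a common non-negative variant. The function names are hypothetical, and whether this crate's fix takes this form is an assumption.

```rust
/// Classic Robertson/Sparck Jones IDF: ln((N - n + 0.5) / (n + 0.5)).
/// Becomes negative when the term occurs in more than half of the documents.
fn idf_classic(total_docs: f64, docs_with_term: f64) -> f64 {
    ((total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5)).ln()
}

/// Common variant that adds 1 inside the logarithm, so the argument is
/// always greater than 1 and the result stays positive for 0 <= n <= N.
fn idf_non_negative(total_docs: f64, docs_with_term: f64) -> f64 {
    (1.0 + (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5)).ln()
}
```

With 10 documents and a term present in 9 of them, `idf_classic(10.0, 9.0)` is negative while `idf_non_negative(10.0, 9.0)` is still positive, so a match on a very common term no longer lowers a document's score.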

Changed

  • Bump deunicode from 1.6.0 to 1.6.2

Full Changelog: v2.2.1...v2.3.0

v2.2.1

03 Mar 19:45

Changed

  • Bump stop-words from 0.8.0 to 0.8.1
  • Bump whichlang from 0.1.0 to 0.1.1
  • Bump cached from 0.54.0 to 0.55.1

Full Changelog: v2.2.0...v2.2.1

v2.2.0

15 Dec 11:14

Changed

  • Use unicode-segmentation for better word splitting. Decimal numbers and words with apostrophes
    no longer generate multiple tokens. This is a (minor) breaking change for the default tokenizer.
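
To illustrate the behavioural difference, here is a minimal std-only sketch. It is not the unicode-segmentation crate and not this crate's tokenizer: `naive_tokens` and `boundary_aware_tokens` are hypothetical helpers, and the second one only approximates two of the UAX #29 word-boundary rules (apostrophes between letters, and a period or comma between digits).

```rust
/// Naive splitting: every non-alphanumeric character is a separator.
fn naive_tokens(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|s| !s.is_empty())
        .map(str::to_string)
        .collect()
}

/// Simplified boundary-aware splitting: an apostrophe between two letters
/// and a '.' or ',' between two ASCII digits stay inside the token.
fn boundary_aware_tokens(text: &str) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    let mut tokens = Vec::new();
    let mut current = String::new();
    for (i, &c) in chars.iter().enumerate() {
        let prev = if i > 0 { Some(chars[i - 1]) } else { None };
        let next = chars.get(i + 1).copied();
        // Does this punctuation character join its two neighbours?
        let joins = match (prev, next) {
            (Some(p), Some(n)) => {
                (c == '\'' && p.is_alphabetic() && n.is_alphabetic())
                    || ((c == '.' || c == ',') && p.is_ascii_digit() && n.is_ascii_digit())
            }
            _ => false,
        };
        if c.is_alphanumeric() || joins {
            current.push(c);
        } else if !current.is_empty() {
            // Separator reached: emit the token collected so far.
            tokens.push(std::mem::take(&mut current));
        }
    }
    if !current.is_empty() {
        tokens.push(current);
    }
    tokens
}
```

For the input `It's 3.14`, the naive splitter yields four tokens (`It`, `s`, `3`, `14`), while the boundary-aware one yields two (`It's`, `3.14`). This is the kind of change in token output that makes the release a minor breaking change for indexed data.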

Added

  • DefaultTokenizerBuilder now implements Default.

Full Changelog: v2.1.1...v2.2.0

v2.1.1

14 Dec 10:18

Added

  • SearchResult now implements Clone.
  • Add WebAssembly bm25-demo to README.
  • Miscellaneous documentation improvements.

Full Changelog: v2.1.0...v2.1.1

v2.1.0

16 Nov 20:45

Added

  • Customisation of the DefaultTokenizer. You can now enable/disable normalization, stemming
    and stop word removal via the new DefaultTokenizer::builder().

Changed

  • DefaultTokenizer now normalizes unicode. This makes search more lenient for languages with
    non-ASCII characters. Note that this is a breaking change for the default tokenizer. If you
    require the behaviour of the previous version, you can create your default tokenizer with the
    new builder: DefaultTokenizer::builder().normalization(false).build().
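
The following std-only sketch shows the shape of such a builder and what a normalization toggle changes. `SketchTokenizer`, `SketchTokenizerBuilder`, and `fold_to_ascii` are hypothetical stand-ins written for this example, not the crate's types, and the toy folding table covers only a few Latin letters. Only the `builder().normalization(false).build()` call chain is taken from the release note.

```rust
/// Hypothetical stand-in for a configurable tokenizer (not the bm25 type).
#[derive(Debug, Clone)]
struct SketchTokenizer {
    normalization: bool,
}

/// Hypothetical builder; normalization is on by default, as in the note above.
#[derive(Debug, Clone)]
struct SketchTokenizerBuilder {
    normalization: bool,
}

impl Default for SketchTokenizerBuilder {
    fn default() -> Self {
        Self { normalization: true }
    }
}

impl SketchTokenizerBuilder {
    /// Enable or disable unicode normalization.
    fn normalization(mut self, enabled: bool) -> Self {
        self.normalization = enabled;
        self
    }

    fn build(self) -> SketchTokenizer {
        SketchTokenizer { normalization: self.normalization }
    }
}

impl SketchTokenizer {
    fn builder() -> SketchTokenizerBuilder {
        SketchTokenizerBuilder::default()
    }

    /// Lowercase whitespace-separated words, folding accents if enabled.
    fn tokenize(&self, text: &str) -> Vec<String> {
        text.split_whitespace()
            .map(|word| {
                let lower = word.to_lowercase();
                if self.normalization { fold_to_ascii(&lower) } else { lower }
            })
            .collect()
    }
}

/// Toy accent folding for a handful of precomposed Latin letters only.
fn fold_to_ascii(word: &str) -> String {
    word.chars()
        .map(|c| match c {
            'á' | 'à' | 'â' | 'ä' => 'a',
            'é' | 'è' | 'ê' | 'ë' => 'e',
            'í' | 'ì' | 'î' | 'ï' => 'i',
            'ó' | 'ò' | 'ô' | 'ö' => 'o',
            'ú' | 'ù' | 'û' | 'ü' => 'u',
            'ñ' => 'n',
            'ç' => 'c',
            other => other,
        })
        .collect()
}
```

With the default settings, `Café Crème` tokenizes to `cafe` and `creme`, so an unaccented query still matches. With `.normalization(false)`, the tokens keep their accents, which corresponds to the pre-2.1.0 behaviour described above.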

Full Changelog: v2.0.1...v2.1.0

v2.0.1

11 Nov 16:20

Fixed

  • Remove unintentionally re-exposed stop-words crate feature.

Full Changelog: v2.0.0...v2.0.1

v2.0.0

10 Nov 11:31

Changed

  • Introduce TokenEmbedder::EmbeddingSpace to decouple the output of TokenEmbedder from Self.
    This lets you customise the output of your TokenEmbedder without changing its type.
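
The idea can be sketched with two locally defined traits. These are illustrative only: `EmbedToSelf`, `EmbedToSpace`, `LengthEmbedder`, `Fnv1aEmbedder`, and `embed_all` are hypothetical names, and the crate's actual `TokenEmbedder` signature may differ. The point is that an associated type lets several embedder types share one output type, which a `-> Self` return cannot express.

```rust
/// Before (sketch): the output is tied to the implementing type itself.
trait EmbedToSelf {
    fn embed(token: &str) -> Self;
}

impl EmbedToSelf for u32 {
    // Only one embedding strategy can ever produce a u32 this way.
    fn embed(token: &str) -> u32 {
        token.len() as u32
    }
}

/// After (sketch): the output is an associated type, independent of Self.
trait EmbedToSpace {
    type EmbeddingSpace;
    fn embed(token: &str) -> Self::EmbeddingSpace;
}

/// Hypothetical embedder: maps a token to its byte length.
struct LengthEmbedder;

impl EmbedToSpace for LengthEmbedder {
    type EmbeddingSpace = u32;
    fn embed(token: &str) -> u32 {
        token.len() as u32
    }
}

/// Hypothetical embedder: maps a token to its 32-bit FNV-1a hash.
struct Fnv1aEmbedder;

impl EmbedToSpace for Fnv1aEmbedder {
    type EmbeddingSpace = u32;
    fn embed(token: &str) -> u32 {
        let mut hash: u32 = 0x811c_9dc5; // FNV-1a 32-bit offset basis
        for byte in token.bytes() {
            hash ^= byte as u32;
            hash = hash.wrapping_mul(0x0100_0193); // FNV 32-bit prime
        }
        hash
    }
}

/// Generic code picks the strategy via the type parameter; the output
/// type comes from the embedder's associated type.
fn embed_all<E: EmbedToSpace>(tokens: &[&str]) -> Vec<E::EmbeddingSpace> {
    tokens.iter().map(|token| E::embed(token)).collect()
}
```

Here `embed_all::<LengthEmbedder>(&["a", "abc"])` and `embed_all::<Fnv1aEmbedder>(&["a", "abc"])` both return a `Vec<u32>`, but with different values. Swapping the embedding strategy leaves the downstream output type unchanged.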

Full Changelog: v1.0.1...v2.0.0

v1.0.1

10 Nov 10:38

Fixed

  • Correctly embed the README in the crate documentation. docs.rs should now display the README
    correctly.

Full Changelog: v1.0.0...v1.0.1