Releases · Michael-JB/bm25 · GitHub

07 Sep 09:24

Michael-JB

v2.3.2 Latest


Changed

Bump rayon from 1.10.0 to 1.11.0
Bump stop-words from 0.8.1 to 0.9.0

Full Changelog: v2.3.1...v2.3.2


02 Aug 08:41

Michael-JB

v2.3.1

Changed

Bump cached from 0.55.1 to 0.56.0

Full Changelog: v2.3.0...v2.3.1


28 Jun 10:36

Michael-JB

v2.3.0

Fixed

Fix negative scoring of high-frequency terms. Scores returned by this version
will differ from the previous version, hence this is a minor version bump
rather than a patch. This closes the bug raised in #20. Thank you to hwiorn
for this contribution!
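For context, here is a minimal sketch of how a BM25 term weight can turn negative for high-frequency terms. It assumes the classic Robertson-Sparck Jones IDF, which drops below zero once a term appears in more than half of the documents, and contrasts it with the variant that adds 1 inside the logarithm (the form Lucene uses). Whether this crate adopted exactly that variant is an assumption; the function names below are hypothetical and are not part of the bm25 API.

```rust
/// Classic IDF: ln((N - n + 0.5) / (n + 0.5)).
/// Negative when the term occurs in more than half of the documents.
fn idf_classic(total_docs: f64, docs_with_term: f64) -> f64 {
    ((total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5)).ln()
}

/// Variant with 1 added inside the logarithm. The argument of ln is always
/// greater than 1, so the result is always positive.
fn idf_non_negative(total_docs: f64, docs_with_term: f64) -> f64 {
    (1.0 + (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5)).ln()
}

fn main() {
    let total_docs = 10.0;
    // A rare term, a term in half the corpus, and a very common term.
    for docs_with_term in [1.0, 5.0, 9.0] {
        println!(
            "n = {}: classic = {:.3}, non-negative = {:.3}",
            docs_with_term,
            idf_classic(total_docs, docs_with_term),
            idf_non_negative(total_docs, docs_with_term)
        );
    }
}
```

Because the two formulas differ for every term, not only for frequent ones, switching between them changes all scores, which is consistent with the note above that scores differ from the previous version.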

Changed

Bump deunicode from 1.6.0 to 1.6.2

Full Changelog: v2.2.1...v2.3.0


03 Mar 19:45

Michael-JB

v2.2.1

Changed

Bump stop-words from 0.8.0 to 0.8.1
Bump whichlang from 0.1.0 to 0.1.1
Bump cached from 0.54.0 to 0.55.1

Full Changelog: v2.2.0...v2.2.1


15 Dec 11:14

Michael-JB

v2.2.0

Changed

Use unicode-segmentation for better word splitting. Decimal numbers and words with apostrophes
no longer generate multiple tokens. This is a (minor) breaking change for the default tokenizer.
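To illustrate the kind of fragmentation described here, the following std-only sketch splits text on every non-alphanumeric character. It is a hypothetical naive tokenizer, not the crate's previous implementation. Under the Unicode word-boundary rules (UAX #29) that unicode-segmentation implements, "3.14" and "don't" each remain a single word instead.

```rust
/// Hypothetical naive tokenizer: split on every character that is not
/// alphanumeric and drop the empty fragments.
fn naive_tokens(text: &str) -> Vec<&str> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|token| !token.is_empty())
        .collect()
}

fn main() {
    // The decimal point and the apostrophe both act as separators here,
    // so each of these inputs yields more tokens than words.
    println!("{:?}", naive_tokens("pi is 3.14"));
    println!("{:?}", naive_tokens("don't stop"));
}
```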

Added

DefaultTokenizerBuilder now implements Default.

Full Changelog: v2.1.1...v2.2.0


14 Dec 10:18

Michael-JB

v2.1.1

Added

SearchResult now implements Clone.
Add WebAssembly bm25-demo to README.
Miscellaneous documentation improvements.

Full Changelog: v2.1.0...v2.1.1


16 Nov 20:45

Michael-JB

v2.1.0

Added

Customisation of the DefaultTokenizer. You can now enable/disable normalization, stemming
and stop word removal via the new DefaultTokenizer::builder().

Changed

DefaultTokenizer now normalizes unicode. This makes search more lenient for languages with
non-ASCII characters. Note that this is a breaking change for the default tokenizer. If you
require the behaviour of the previous version, you can create your default tokenizer with the
new builder: DefaultTokenizer::builder().normalization(false).build().

Full Changelog: v2.0.1...v2.1.0


11 Nov 16:20

Michael-JB

v2.0.1

Fixed

Remove unintentionally re-exposed stop-words crate feature.

Full Changelog: v2.0.0...v2.0.1


10 Nov 11:31

Michael-JB

v2.0.0

Changed

Introduces TokenEmbedder::EmbeddingSpace to decouple the output of TokenEmbedder from Self.
This lets you customise the output of your TokenEmbedder without changing its type.
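The decoupling can be sketched with an associated type. The trait below is a simplified stand-in written for illustration and should not be read as the crate's exact definition; ByteSumEmbedder and its embed logic are hypothetical. The point is that the embedder type and the type it produces are separate, so the output can change without the embedder being renamed or retyped.

```rust
/// Simplified stand-in for an embedder trait with an associated output type.
trait TokenEmbedder {
    /// The type of value a token is mapped to.
    type EmbeddingSpace;

    fn embed(token: &str) -> Self::EmbeddingSpace;
}

/// Hypothetical embedder: a unit struct whose output is a u32, not Self.
struct ByteSumEmbedder;

impl TokenEmbedder for ByteSumEmbedder {
    type EmbeddingSpace = u32;

    /// Maps a token to the sum of its UTF-8 byte values.
    fn embed(token: &str) -> Self::EmbeddingSpace {
        token.bytes().map(u32::from).sum()
    }
}

fn main() {
    let index: u32 = ByteSumEmbedder::embed("ab");
    println!("{}", index);
}
```

Changing `type EmbeddingSpace = u32;` to another type alters what `embed` returns while `ByteSumEmbedder` itself stays the same type.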

Full Changelog: v1.0.1...v2.0.0


10 Nov 10:38

Michael-JB

v1.0.1

Fixed

Correctly embed the README in the crate documentation. docs.rs should now display the README
correctly.

Full Changelog: v1.0.0...v1.0.1
