03 Mar 15:09

maxbachmann

Release 1.1.2

Fixed

Fix reference counting in process.extract (see #81)

23 Feb 14:04

maxbachmann

Release 1.1.1

Fixed

Fix result conversion in process.extract (see #79)

21 Feb 18:43

maxbachmann

Release 1.1.0

Changed

string_metric.normalized_levenshtein now supports all weights
When different weights are used for Insertion and Deletion, the strings are no longer swapped inside the Levenshtein implementation, so different weights for Insertion and Deletion are now supported.
Replaced the C++ implementation with a Cython implementation. This has the following advantages:
- The implementation is less error-prone, since many of the complex parts are handled by Cython
- slightly faster than the previous implementation (up to 10% for some parts)
- about 33% smaller binary size
- reduced compile time
Added a **kwargs argument to process.extract/extractOne/extract_iter that is passed to the scorer
Added a max argument to the Hamming distance
Added support for the whole Unicode range to utils.default_process
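
To illustrate why swapping the strings is unsafe once Insertion and Deletion cost differently, here is a minimal pure-Python sketch of a weighted Levenshtein distance based on the classic Wagner-Fischer recurrence. It is not the library's implementation; the function name weighted_levenshtein is hypothetical, and the weight order (insertion, deletion, substitution) is assumed to match the one used by string_metric.levenshtein.

```python
def weighted_levenshtein(s1, s2, weights=(1, 1, 1)):
    """Cost of turning s1 into s2 (illustrative sketch, not RapidFuzz code)."""
    insertion, deletion, substitution = weights
    # previous[j] is the cost of turning the processed prefix of s1 into s2[:j]
    previous = [j * insertion for j in range(len(s2) + 1)]
    for i, ch1 in enumerate(s1, start=1):
        current = [i * deletion]
        for j, ch2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + deletion,      # delete ch1 from s1
                current[j - 1] + insertion,  # insert ch2 into s1
                previous[j - 1] + (0 if ch1 == ch2 else substitution),
            ))
        previous = current
    return previous[-1]


# With asymmetric weights the direction matters, so the arguments
# cannot be swapped without also swapping the two weights.
forward = weighted_levenshtein("abc", "abcde", weights=(3, 1, 1))
backward = weighted_levenshtein("abcde", "abc", weights=(3, 1, 1))
```

Here forward pays for two insertions and backward for two deletions, so the two calls return different costs.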

Performance

Replaced the Wagner-Fischer algorithm in the normal Levenshtein distance with a bit-parallel implementation
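
As a rough idea of what a bit-parallel Levenshtein distance looks like, the sketch below follows the well-known Myers/Hyyrö formulation, using Python's arbitrary-size integers as bit vectors. Whether the release uses exactly this variant is an assumption, and the function name bitparallel_levenshtein is hypothetical. Each character of s2 updates a whole DP column with a handful of bit operations instead of one cell at a time.

```python
def bitparallel_levenshtein(s1, s2):
    """Uniform-cost Levenshtein distance (illustrative sketch, not RapidFuzz code)."""
    if not s1:
        return len(s2)
    m = len(s1)
    mask = (1 << m) - 1   # keeps every bit vector exactly m bits wide
    last = 1 << (m - 1)   # bit that corresponds to the last character of s1

    # match[ch] has bit i set when s1[i] == ch
    match = {}
    for i, ch in enumerate(s1):
        match[ch] = match.get(ch, 0) | (1 << i)

    vp, vn = mask, 0      # vertical +1 / -1 differences of the current column
    dist = m              # distance between s1 and the empty prefix of s2
    for ch in s2:
        x = match.get(ch, 0) | vn
        d0 = ((((x & vp) + vp) ^ vp) | x) & mask
        hp = (vn | ~(d0 | vp)) & mask   # horizontal +1 differences
        hn = d0 & vp                    # horizontal -1 differences
        if hp & last:
            dist += 1
        if hn & last:
            dist -= 1
        hp = ((hp << 1) | 1) & mask
        hn = (hn << 1) & mask
        vp = (hn | ~(d0 | hp)) & mask
        vn = hp & d0
    return dist
```

A native implementation would typically work on fixed-width machine words; the explicit mask above stands in for that fixed width.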

19 Feb 14:35

maxbachmann

Release 1.0.2

Fixed

The bit-parallel LCS algorithm in fuzz.partial_ratio did not find the longest common substring properly in some cases.
The old algorithm is used again until this bug is fixed.

17 Feb 22:34

maxbachmann

Release 1.0.1

Changed

string_metric.normalized_levenshtein now supports the weights (1, 1, N) with N >= 1

Performance

The Levenshtein distance with the weights (1, 1, >2) now uses the same implementation as the weights (1, 1, 2), since a Substitution that costs more than Insertion + Deletion has no effect: it can always be replaced by one Deletion plus one Insertion
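
The equivalence can be checked with a small pure-Python sketch. Both helper names (levenshtein_11n and indel_distance) are hypothetical and this is not the library's code. Once a substitution costs at least 2, the distance equals the pure insertion/deletion distance, which can be derived from the length of the longest common subsequence.

```python
def levenshtein_11n(s1, s2, substitution):
    """Levenshtein distance with weights (1, 1, substitution); plain DP sketch."""
    previous = list(range(len(s2) + 1))
    for i, ch1 in enumerate(s1, start=1):
        current = [i]
        for j, ch2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,       # deletion
                current[j - 1] + 1,    # insertion
                previous[j - 1] + (0 if ch1 == ch2 else substitution),
            ))
        previous = current
    return previous[-1]


def indel_distance(s1, s2):
    """Insertions and deletions only, via the longest common subsequence."""
    previous = [0] * (len(s2) + 1)
    for ch1 in s1:
        current = [0]
        for j, ch2 in enumerate(s2, start=1):
            if ch1 == ch2:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return len(s1) + len(s2) - 2 * previous[-1]


# Any substitution weight >= 2 gives the same result as weight 2.
results = {n: levenshtein_11n("kitten", "sitting", n) for n in (2, 3, 10)}
```

All entries in results agree with indel_distance("kitten", "sitting"), while a substitution weight of 1 gives a smaller distance.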

Fixed

Fixed an uninitialized variable in the bit-parallel Levenshtein distance with the weights (1, 1, 1)

12 Feb 15:57

maxbachmann

Release 1.0.0

Changed

All normalized string_metrics can now be used as scorers for process.extract/extractOne
The implementation of the C++ wrapper was completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
Increased test coverage, which already helped to fix some bugs and will help to prevent regressions in the future
Improved docstrings of functions

Performance

Added a bit-parallel implementation of the Levenshtein distance for the weights (1, 1, 1) and (1, 1, 2).
Added a specialized implementation of the Levenshtein distance for cases with a small maximum edit distance, which is even faster than the bit-parallel implementation.
Improved performance of fuzz.partial_ratio
-> Since fuzz.ratio and fuzz.partial_ratio are used in most scorers, this improves the overall performance.
Improved performance of process.extract and process.extractOne

Deprecated

The rapidfuzz.levenshtein module is now deprecated and will be removed in v2.0.0
Its functions now live in rapidfuzz.string_metric. The functions distance, normalized_distance, weighted_distance and weighted_normalized_distance are combined into levenshtein and normalized_levenshtein.

Added

Added a normalized version of the Hamming distance as string_metric.normalized_hamming
Added process.extract_iter, a generator that yields the similarity of all elements that have a similarity >= score_cutoff
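
The two additions can be pictured with the following pure-Python sketch. It is only an approximation of the behaviour described above, not the library's code: the 0 to 100 score range, the ValueError for strings of different length, and the (choice, score, index) tuple layout are assumptions made for illustration.

```python
def normalized_hamming(s1, s2):
    """Hamming-based similarity between 0 and 100 (illustrative sketch)."""
    if len(s1) != len(s2):
        # Assumption: unequal lengths are rejected rather than padded
        raise ValueError("Hamming distance requires strings of equal length")
    if not s1:
        return 100.0
    distance = sum(a != b for a, b in zip(s1, s2))
    return 100.0 - 100.0 * distance / len(s1)


def extract_iter(query, choices, scorer, score_cutoff=0):
    """Lazily yield (choice, score, index) for choices scoring >= score_cutoff."""
    for index, choice in enumerate(choices):
        score = scorer(query, choice)
        if score >= score_cutoff:
            yield choice, score, index


choices = ["kathrin", "kerstin", "karolin"]
matches = list(extract_iter("karolin", choices, normalized_hamming, score_cutoff=60))
```

Because extract_iter is a generator, no result list is built unless the caller asks for one, which keeps memory use flat for large choice collections.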

Fixed

Fixed multiple bugs in extractOne when used with a scorer that is not from RapidFuzz
Fixed a bug in token_ratio
Fixed a bug in result normalization that caused a division by zero

31 Dec 00:27

maxbachmann

Release 0.14.2

Fixed

UTF-8 usage in the copyright header caused problems with Python 2.7 on some platforms (see #70)

13 Dec 15:59

maxbachmann

Release 0.14.1

Fixed

When a custom processor like lambda s: s was used with any of the methods inside fuzz.*, it always returned a score of 100. This release fixes this and adds better test coverage to prevent this bug in the future.

09 Dec 00:19

maxbachmann

Release 0.14.0

Added

Added a Hamming distance metric in the levenshtein module

Performance

Improved performance of default_process by using a lookup table
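
The lookup-table idea can be sketched in pure Python as follows. This is not the library's implementation: the function name default_process_ascii is hypothetical, the table only covers ASCII, and the preprocessing rules (lowercase, replace non-alphanumeric characters with a space, trim surrounding whitespace) are assumed here for illustration.

```python
# Built once at import time: what every ASCII code point maps to.
_ASCII_TABLE = {
    code: (chr(code).lower() if chr(code).isalnum() else " ")
    for code in range(128)
}


def default_process_ascii(text):
    """Preprocess text with one table lookup per character (illustrative sketch)."""
    # str.translate leaves code points missing from the table unchanged
    return text.translate(_ASCII_TABLE).strip()


cleaned = default_process_ascii("  Hello, World! ")
```

The per-character decision is made once while building the table, so processing a string reduces to indexing into it.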

30 Nov 17:20

maxbachmann

Release 0.13.4

Fixed

Added a missing virtual destructor whose absence caused a segmentation fault on macOS
