Releases: rapidfuzz/RapidFuzz

Release 1.1.2

03 Mar 15:09

Fixed

  • Fix reference counting in process.extract (see #81)

Release 1.1.1

23 Feb 14:04

Fixed

  • Fix result conversion in process.extract (see #79)

Release 1.1.0

21 Feb 18:43
5383d28

Changed

  • string_metric.normalized_levenshtein now supports all weights
  • When different weights are used for insertion and deletion, the strings are no longer swapped inside the Levenshtein implementation, so different insertion and deletion weights are now supported.
  • Replaced the C++ implementation with a Cython implementation. This has the following advantages:
    • The implementation is less error-prone, since Cython handles much of the complex work
    • slightly faster than the previous implementation (up to 10% for some parts)
    • about 33% smaller binary size
    • reduced compile time
  • Added **kwargs argument to process.extract/extractOne/extract_iter that is passed to the scorer
  • Add max argument to hamming distance
  • Add support for whole Unicode range to utils.default_process
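
To illustrate why insertion and deletion weights have to be tracked separately, here is a minimal pure-Python sketch of a weighted Levenshtein distance. It is not the RapidFuzz implementation; the function name weighted_levenshtein and the weight order (insertion, deletion, substitution) are assumptions made for this example.

```python
def weighted_levenshtein(s1, s2, weights=(1, 1, 1)):
    """Cost of turning s1 into s2.

    weights is assumed to be (insertion, deletion, substitution).
    Hypothetical helper for illustration, not the library function.
    """
    insert_cost, delete_cost, replace_cost = weights
    # previous[j] = cost of turning the processed prefix of s1 into s2[:j]
    previous = [j * insert_cost for j in range(len(s2) + 1)]
    for i, ch1 in enumerate(s1, start=1):
        current = [i * delete_cost]  # delete every character seen so far
        for j, ch2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + delete_cost,     # drop ch1
                current[j - 1] + insert_cost,  # add ch2
                previous[j - 1] + (0 if ch1 == ch2 else replace_cost),
            ))
        previous = current
    return previous[-1]


# With unequal insertion/deletion weights the result depends on direction,
# so the two strings cannot simply be swapped.
growing = weighted_levenshtein("abc", "abcde", weights=(1, 3, 1))    # two insertions
shrinking = weighted_levenshtein("abcde", "abc", weights=(1, 3, 1))  # two deletions
```

In this sketch growing and shrinking differ, which is why swapping the inputs is only safe when insertion and deletion cost the same.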

Performance

  • replaced the Wagner-Fischer algorithm used for the normal Levenshtein distance with a bit-parallel implementation
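
As a rough illustration of the bit-parallel idea, the following sketch follows the Myers/Hyyrö formulation for unit weights (1, 1, 1). It is an assumption that this resembles what the library does internally: the real implementation works on fixed-width machine words, whereas this sketch relies on Python's arbitrary-size integers. The function name bitparallel_levenshtein is made up for this example.

```python
def bitparallel_levenshtein(s1, s2):
    """Unit-cost Levenshtein distance, one column of the DP matrix per step.

    Illustrative sketch only; names are not part of any library API.
    """
    if not s1:
        return len(s2)
    m = len(s1)
    full = (1 << m) - 1   # keeps every value within m bits
    last = 1 << (m - 1)   # bit for the last row of the matrix

    # Pattern match masks: bit i of pm[ch] is set when s1[i] == ch.
    pm = {}
    for i, ch in enumerate(s1):
        pm[ch] = pm.get(ch, 0) | (1 << i)

    vp, vn = full, 0      # vertical +1 / -1 deltas of the current column
    dist = m              # distance between s1 and the empty prefix of s2
    for ch in s2:
        x = pm.get(ch, 0)
        d0 = ((((x & vp) + vp) ^ vp) | x | vn) & full
        hp = (vn | ~(d0 | vp)) & full   # horizontal +1 deltas
        hn = d0 & vp                    # horizontal -1 deltas
        if hp & last:
            dist += 1
        if hn & last:
            dist -= 1
        hp = ((hp << 1) | 1) & full
        hn = (hn << 1) & full
        vp = (hn | ~(d0 | hp)) & full
        vn = hp & d0
    return dist
```

Each loop iteration updates a whole column with a handful of integer operations, instead of the one cell at a time that Wagner-Fischer computes.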

Release 1.0.2

19 Feb 14:35

Fixed

  • The bitparallel LCS algorithm in fuzz.partial_ratio did not find the longest common substring properly in some cases.
    The old algorithm is used again until this bug is fixed.

Release 1.0.1

17 Feb 22:34

Changed

  • string_metric.normalized_levenshtein now supports the weights (1, 1, N) with N >= 1

Performance

  • The Levenshtein distance with the weights (1, 1, >2) now uses the same implementation as the weights (1, 1, 2), since
    a substitution that costs more than an insertion plus a deletion is never chosen, so raising its weight further has no effect

Fixed

  • fix uninitialized variable in bitparallel Levenshtein distance with the weight (1, 1, 1)

Release 1.0.0

12 Feb 15:57

Changed

  • all normalized string_metrics can now be used as a scorer for process.extract/extractOne
  • Implementation of the C++ Wrapper completely refactored to make it easier to add more scorers, processors and string matching algorithms in the future.
  • increased test coverage, which already helped to fix some bugs and will help to prevent regressions in the future
  • improved docstrings of functions

Performance

  • Added bit-parallel implementation of the Levenshtein distance for the weights (1,1,1) and (1,1,2).
  • Added a specialized implementation of the Levenshtein distance for cases with a small maximum edit distance that is even faster than the bit-parallel implementation.
  • Improved performance of fuzz.partial_ratio
    -> Since fuzz.ratio and fuzz.partial_ratio are used in most scorers, this improves the overall performance.
  • Improved performance of process.extract and process.extractOne

Deprecated

  • the rapidfuzz.levenshtein module is now deprecated and will be removed in v2.0.0
    These functions are now placed in rapidfuzz.string_metric. distance, normalized_distance, weighted_distance and weighted_normalized_distance are combined into levenshtein and normalized_levenshtein.

Added

  • added normalized version of the hamming distance in string_metric.normalized_hamming
  • added process.extract_iter, a generator that yields the similarity of all elements that have a similarity >= score_cutoff
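
The two additions above can be sketched in plain Python. This is a simplified illustration, not the library code: the names normalized_hamming_sketch and extract_iter_sketch, the 0 to 100 score range, the ValueError for unequal lengths, and the (choice, score) tuple layout are all assumptions made for this example.

```python
def normalized_hamming_sketch(s1, s2):
    """Similarity in [0, 100] based on the share of matching positions.

    Assumes equal-length inputs; hypothetical helper for illustration.
    """
    if len(s1) != len(s2):
        raise ValueError("Hamming distance needs strings of equal length")
    if not s1:
        return 100.0
    mismatches = sum(a != b for a, b in zip(s1, s2))
    return 100.0 * (1 - mismatches / len(s1))


def extract_iter_sketch(query, choices, scorer, score_cutoff=0):
    """Lazily yield (choice, score) for every choice scoring >= score_cutoff."""
    for choice in choices:
        score = scorer(query, choice)
        if score >= score_cutoff:
            yield choice, score


matches = list(extract_iter_sketch(
    "karolin",
    ["kathrin", "kerstin", "karolin"],
    scorer=normalized_hamming_sketch,
    score_cutoff=60,
))
```

Because the function is a generator, a caller can stop iterating as soon as a good enough match appears, without scoring the remaining choices.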

Fixed

  • multiple bugs in extractOne when used with a scorer that is not provided by RapidFuzz
  • fixed bug in token_ratio
  • fixed bug in result normalization causing zero division

Release 0.14.2

31 Dec 00:27

Fixed

  • UTF-8 characters in the copyright header caused problems with Python 2.7 on some platforms (see #70)

Release 0.14.1

13 Dec 15:59

Fixed

  • When a custom processor such as lambda s: s was used with any of the methods inside fuzz.*, the method always returned a score of 100. This release fixes the bug and adds better test coverage to prevent it from recurring.

Release 0.14.0

09 Dec 00:19

Added

  • added hamming distance metric in the levenshtein module

Performance

  • improved performance of default_process by using a lookup table
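
A lookup table for this kind of preprocessing might look like the sketch below. It assumes that the processor lowercases text, replaces non-alphanumeric characters with spaces, and strips surrounding whitespace, and it only covers ASCII; the names _ASCII_TABLE and default_process_sketch are made up for this example and do not reflect the library's internals.

```python
# Built once at import time: one precomputed replacement per ASCII code point.
_ASCII_TABLE = {
    code: (chr(code).lower() if chr(code).isalnum() else " ")
    for code in range(128)
}


def default_process_sketch(text):
    """Lowercase, blank out non-alphanumeric ASCII, strip the ends.

    Characters outside the table (non-ASCII) pass through unchanged.
    """
    return text.translate(_ASCII_TABLE).strip()


cleaned = default_process_sketch("  Hello, World! ")
```

The per-character decision is made once when the table is built, so processing a string becomes a series of table lookups rather than repeated character classification.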

Release 0.13.4

30 Nov 17:20
8f9a61e

Fixed

  • Add missing virtual destructor that caused a segmentation fault on macOS