The behaviour below is inconsistent on my Mac. I cannot reproduce the inconsistency on Ubuntu, where the results are mostly consistent; the macOS results are shown below.
Here is the Jaccard similarity of two empty strings, first as arguments to the stringsim function, and then as components of a vector.
> x <- stringdist::stringsim("","",method="jaccard")
> str(x)
num 1
> y <- stringdist::stringsim(c("y",""),c("y",""),method="jaccard")
> str(y)
num [1:2] 1 NaN
Here is another example of inconsistent behaviour:
> stringdist::stringsim( c("foo","ac"), c("foo","bc"), method = "jaccard", q = 5)
[1] 1 1
> stringdist::stringsim( c("foo","ac"), c("foo","bc"), method = "jaccard", q = 3)
[1] 1 NaN
> stringdist::stringsim( c("foo","ac"), c("foo","bc"), method = "jaccard", q = 1)
[1] 1.0000000 0.3333333
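For reference, here is a worked sketch of the arithmetic that I think lies behind the second element ("ac" vs "bc") of these three results. It assumes the q-gram definition of the Jaccard distance given in the stringdist documentation (one minus the number of shared unique q-grams over the total number of unique q-grams), and that stringsim returns one minus that distance. The notation $Q_q(\cdot)$ for the set of unique q-grams of a string is my own shorthand, not something the package defines.

```latex
% Assumed similarity: shared unique q-grams over total unique q-grams
\[
  \operatorname{sim}_q(a, b) = \frac{|Q_q(a) \cap Q_q(b)|}{|Q_q(a) \cup Q_q(b)|}
\]

% q = 1: both strings have two 1-grams, one of them shared
\[
  Q_1(\texttt{ac}) = \{\texttt{a}, \texttt{c}\}, \quad
  Q_1(\texttt{bc}) = \{\texttt{b}, \texttt{c}\}, \quad
  \operatorname{sim}_1 = \frac{|\{\texttt{c}\}|}{|\{\texttt{a}, \texttt{b}, \texttt{c}\}|} = \frac{1}{3} \approx 0.333
\]

% q = 3 and q = 5: strings of length 2 contain no q-grams at all
\[
  Q_q(\texttt{ac}) = Q_q(\texttt{bc}) = \emptyset \;\; (q > 2), \quad
  \operatorname{sim}_q = \frac{0}{0} \;\; \text{(undefined)}
\]
```

Under these assumptions the q = 1 value matches the 0.3333333 above, and NaN at q = 3 would be the 0/0 case. The q = 5 call should then hit the same 0/0 case, yet it returns 1. The two empty strings in the first example would also have no q-grams (assuming the default q = 1), which would explain why that comparison can come out as either 1 or NaN.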
I tried this with a fresh install of the stringdist package:
$ R
R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin20 (64-bit)
> packageVersion('stringdist')
[1] ‘0.9.10’