Commit 63c8426

bdeboezitros authored and committed
minor refinement to similarity logic
When an origin entity contains a single-character word, require it to appear at the start of a word in the destination entity, rather than anywhere in it. For example, "hepatitis b vaccine" previously would be a child of "hepatitis c" because of the c in "vaccine", but now no longer is. Note that there are many other refinements that may work well for some environments but cause collateral damage elsewhere. This one felt generic enough to be useful in weeding out some false positives, but not so blunt as to cause false negatives.
1 parent 401659d commit 63c8426

1 file changed

+9
-2
lines changed

src/cls/EntityBrowser/API.cls

Lines changed: 9 additions & 2 deletions
@@ -395,9 +395,16 @@ IsSimilar(tOriginIndex, tDestIndex, mode="")
 
     set similar = 1
     for posO = 1:1:pEntityTokens(tOriginIndex,0) {
-        set similar = 0
+        set similar = 0, length = $l(pEntityTokens(tOriginIndex,posO))
         for posD = 1:1:pEntityTokens(tDestIndex,0) {
-            set similar = ''$find(pEntityTokens(tDestIndex,posD),pEntityTokens(tOriginIndex,posO))
+
+            // for single-character tokens, require starting position
+            if length=1 {
+                set similar = ($e(pEntityTokens(tDestIndex,posD),1,length)=pEntityTokens(tOriginIndex,posO))
+            } else {
+                set similar = ''$find(pEntityTokens(tDestIndex,posD),pEntityTokens(tOriginIndex,posO))
+            }
+
             quit:similar
         }
         quit:'similar
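
For readers less familiar with ObjectScript, the matching rule in this hunk can be paraphrased roughly as the Python sketch below. This is an illustration only, not code from the repository: the function names `is_similar_old` and `is_similar_new` and the use of plain token lists in place of the `pEntityTokens` array are assumptions made for the example, and the rest of `IsSimilar` (including the `mode` argument) is not modelled.

```python
def is_similar_old(origin_tokens, dest_tokens):
    """Behaviour before this commit (paraphrased): every origin token
    must occur as a substring of at least one destination token."""
    for origin in origin_tokens:
        if not any(origin in dest for dest in dest_tokens):
            return False
    return True


def is_similar_new(origin_tokens, dest_tokens):
    """Behaviour after this commit (paraphrased): single-character origin
    tokens must match the first character of a destination token; longer
    tokens keep the substring rule."""
    for origin in origin_tokens:
        if len(origin) == 1:
            # single-character token: require starting position
            matched = any(dest.startswith(origin) for dest in dest_tokens)
        else:
            # longer token: substring match anywhere in the destination token
            matched = any(origin in dest for dest in dest_tokens)
        if not matched:
            return False
    return True


# The example from the commit message: origin "hepatitis c",
# destination "hepatitis b vaccine".
origin = ["hepatitis", "c"]
dest = ["hepatitis", "b", "vaccine"]
old_result = is_similar_old(origin, dest)  # matched via the "c" in "vaccine"
new_result = is_similar_new(origin, dest)  # no destination token starts with "c"
```

Under these assumptions the old rule accepts the pair and the new rule rejects it, while a destination such as "hepatitis c vaccine" is still accepted by both.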
