Draft
Changes from 7 commits
13 changes: 10 additions & 3 deletions hana/lib/cql-functions.js
@@ -122,7 +122,8 @@ const StandardFunctions = {
       // if column specific value is provided, the configuration has to be defined on column level
       if (csnElements.some(e => e.element?.['@Search.ranking'] || e.element?.['@Search.fuzzinessThreshold'])) {
         csnElements.forEach(e => {
-          let fuzzy = `FUZZY`
+          const fuzziScore = e.element?.['@Search.fuzzinessThreshold'] || fuzzyIndex
+          let fuzzy = `${ fuzziScore === 1 ? 'EXACT' : 'FUZZY'}`
           // weighted search
           const rank = e.element?.['@Search.ranking']?.['=']
           switch (rank) {
@@ -141,14 +142,20 @@ const StandardFunctions = {
                 `Invalid configuration ${rank} for @Search.ranking. HIGH, MEDIUM, LOW are supported values.`,
               )
           }
-          fuzzy += ` MINIMAL TOKEN SCORE ${e.element?.['@Search.fuzzinessThreshold'] || fuzzyIndex} SIMILARITY CALCULATION MODE 'search'`
+          if (fuzziScore === 1)
+            fuzzy += ` MINIMAL SCORE 1 search mode 'text'`
+          else
+            fuzzy += ` MINIMAL TOKEN SCORE ${fuzziScore} SIMILARITY CALCULATION MODE 'search'`
           // rewrite ref to xpr to mix in search config
           // ensure in place modification to reuse .toString method that ensures quoting
           e.xpr = [{ ref: e.ref }, fuzzy]
           delete e.ref
         })
       } else {
-        ref = `${ref} FUZZY MINIMAL TOKEN SCORE ${fuzzyIndex} SIMILARITY CALCULATION MODE 'search'`
+        if (fuzzyIndex === 1)
+          ref = `${ref} EXACT MINIMAL SCORE 1 search mode 'text'`
Contributor

Suggested change
- ref = `${ref} EXACT MINIMAL SCORE 1 search mode 'text'`
+ ref = `${ref} EXACT`

According to the Java tests, this is sufficient.

Contributor Author
@larsplessing, Jul 3, 2025

@johannes-vogel
For this search mode, Java also always wraps the search term in wildcards, like *<term>*.
With placeholders, however, this is not possible.

Contributor

Does Java use placeholders? Otherwise this opens the door to SQL injection, since the search term comes from the end user.

Contributor

My understanding of the discussions is that the $search string is always forwarded directly into the score function. If customers expect to use wildcard characters like *, they should include them in their search field, or the application developer has to include them in the request.

Contributor Author
@larsplessing, Jul 3, 2025

@johannes-vogel They generate prepared statements like SCORE(? IN ...), and the value bound to ? is *<SearchTerm>*.
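
The placeholder approach described in this comment could be sketched roughly as follows. This is a minimal illustration, not the Java runtime's or this repository's code: the helper name `buildScoreExpression`, its parameters, and the `wrapWildcards` option are hypothetical, and the search options are simply copied from the diff under review. The point it shows is that wildcards are added to the bound value, so the end user's input never becomes part of the SQL text.

```javascript
// Hypothetical helper, for illustration only.
// Builds a SCORE expression with a `?` placeholder and returns the search
// term separately as a bound value, so user input never becomes SQL text.
function buildScoreExpression(column, searchTerm, { wrapWildcards = false } = {}) {
  // Assumption: `column` comes from the trusted model, not from the end user.
  const sql = `SCORE(? IN ${column} EXACT MINIMAL SCORE 1 search mode 'text')`
  // Wildcards, if wanted, are added to the value, not to the statement.
  const value = wrapWildcards ? `*${searchTerm}*` : searchTerm
  return { sql, values: [value] }
}

const example = buildScoreExpression('descr', "autobio' OR 1=1 --", { wrapWildcards: true })
console.log(example.sql)    // the statement text contains only the placeholder
console.log(example.values) // the raw term, wrapped in *...*, passed as a parameter
```

Whether the term should be wrapped at all is exactly the open question in this thread; the sketch only shows that wrapping and placeholders are compatible when the wrapping happens on the value side.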

Contributor

Does that mean Java deviates from the agreement that search is an arbitrary string that is used as is in the score function? At least it looks that way to me.

Contributor Author
@larsplessing, Jul 31, 2025

@johannes-vogel With EXACT MINIMAL SCORE 1 search mode 'text', the search term is interpreted as a whole string. For example, for the string 'this is a test', the following search terms behave like this:

  • "this" --> found
  • "is a" --> found
  • "this test" --> not found
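
The three cases above can be reproduced with a small model of "whole string" matching. This is only an illustration under an assumption: it treats the search term as a sequence of consecutive, case-insensitive words, which matches the listed examples but is not claimed to be how HANA evaluates the clause. The function name `matchesAsWholeString` is hypothetical.

```javascript
// Illustration only: models "the search term is interpreted as a whole string"
// as a case-insensitive match of consecutive words. Not HANA's implementation.
function matchesAsWholeString(text, term) {
  const words = s => s.toLowerCase().split(/\s+/).filter(Boolean)
  const haystack = words(text)
  const needle = words(term)
  if (needle.length === 0) return false
  // Slide over the text and compare the term's words position by position.
  for (let i = 0; i + needle.length <= haystack.length; i++) {
    if (needle.every((w, j) => haystack[i + j] === w)) return true
  }
  return false
}

const text = 'this is a test'
console.log(matchesAsWholeString(text, 'this'))      // true
console.log(matchesAsWholeString(text, 'is a'))      // true
console.log(matchesAsWholeString(text, 'this test')) // false
```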

Contributor Author
@larsplessing, Aug 11, 2025

HANA colleagues wrote:

SCORE('PR00690415' IN txt FUZZY MINIMAL TOKEN SCORE 1 SIMILARITY CALCULATION MODE 'search')

MINIMAL TOKEN SCORE is only used for string search (SEARCH MODE 'text').

If SEARCH MODE 'text' is set, a full-text search is executed.

If it is not set, only a string-like search without tokenisation is done.

This means your term above is equivalent to:

SCORE('PR00690415' IN txt FUZZY MINIMAL SCORE 0.8 SIMILARITY CALCULATION MODE 'search')

The default value of MINIMAL SCORE is 0.8.

@johannes-vogel How should we proceed from here?
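
For reference while reading this thread, the two clause variants the PR chooses between can be condensed into one standalone function. This is a sketch, not code from the repository: `searchConfig` is a hypothetical name, and the clause strings are copied from the diff as-is, without any claim about how HANA interprets them.

```javascript
// Sketch of the clause selection in this PR, extracted into a standalone
// function. `searchConfig` is a hypothetical name; the strings come from the diff.
function searchConfig(threshold) {
  // A threshold of exactly 1 switches from fuzzy to exact text search.
  if (threshold === 1) return `EXACT MINIMAL SCORE 1 search mode 'text'`
  return `FUZZY MINIMAL TOKEN SCORE ${threshold} SIMILARITY CALCULATION MODE 'search'`
}

console.log(searchConfig(1))
console.log(searchConfig(0.7))
```

Note that the comparison is strict, so a threshold supplied as the string '1' would take the fuzzy branch in this sketch.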

+        else
+          ref = `${ref} FUZZY MINIMAL TOKEN SCORE ${fuzzyIndex} SIMILARITY CALCULATION MODE 'search'`
       }

       if (Array.isArray(arg.xpr)) {
8 changes: 6 additions & 2 deletions hana/test/fuzzy.cds
@@ -1,5 +1,9 @@
-using {sap.capire.bookshop.BooksAnnotated as BooksAnnotated} from '../../test/bookshop/db/schema.cds';
+using {sap.capire.bookshop.BooksAnnotated as BooksAnnotated, sap.capire.bookshop.BooksAnnotatedScore1 as BooksAnnotatedScore1} from '../../test/bookshop/db/schema.cds';

 annotate BooksAnnotated with @cds.search: {title, descr, currency.code};
 annotate BooksAnnotated:title with @(Search.ranking: HIGH, Search.fuzzinessThreshold: 0.9);
-annotate BooksAnnotated:descr with @(Search.ranking: LOW, Search.fuzzinessThreshold: 0.9);
+annotate BooksAnnotated:descr with @(Search.ranking: LOW, Search.fuzzinessThreshold: 0.9);
+
+annotate BooksAnnotatedScore1 with @cds.search: {title, descr, currency.code};
+annotate BooksAnnotatedScore1:title with @(Search.ranking: HIGH, Search.fuzzinessThreshold: 0.9);
+annotate BooksAnnotatedScore1:descr with @(Search.ranking: LOW, Search.fuzzinessThreshold: 1);
28 changes: 25 additions & 3 deletions hana/test/fuzzy.test.js
@@ -28,14 +28,24 @@ describe('search', () => {
     expect(res.length).to.be(2) // Eleonora and Jane Eyre
   })

-  test('global config', async () => {
+  test('global config partial string', async () => {
     cds.env.hana.fuzzy = 1
     const { Books } = cds.entities('sap.capire.bookshop')
     const cqn = SELECT.from(Books).search('"autobio"').columns('1')
     const {sql} = cqn.toSQL()
-    expect(sql).to.include('FUZZY MINIMAL TOKEN SCORE 1')
+    expect(sql).to.include('EXACT MINIMAL SCORE 1')
     const res = await cqn
-    expect(res.length).to.be(2) // Eleonora and Jane Eyre
+    expect(res.length).to.be(0) // must be exact match
   })
+
+  test('global config whole string', async () => {
+    cds.env.hana.fuzzy = 1
+    const { Books } = cds.entities('sap.capire.bookshop')
+    const cqn = SELECT.from(Books).search('"Jane"').columns('1')
+    const {sql} = cqn.toSQL()
+    expect(sql).to.include('EXACT MINIMAL SCORE 1')
+    const res = await cqn
+    expect(res.length).to.be(2) // Wuthering Heights and Jane Eyre
+  })

   test('annotations', async () => {
@@ -49,6 +59,18 @@ describe('search', () => {
     const res = await cqn
     expect(res.length).to.be(1) // jane eyre
   })
+
+  test('annotations with descr score 1', async () => {
+    const { BooksAnnotatedScore1 } = cds.entities('sap.capire.bookshop')
+    const cqn = SELECT.from(BooksAnnotatedScore1).search('is often').columns('1')
+    const {sql} = cqn.toSQL()
+    expect(sql).to.include('title FUZZY WEIGHT 0.8 MINIMAL TOKEN SCORE 0.9')
+    expect(sql).to.include('code FUZZY WEIGHT 0.5 MINIMAL TOKEN SCORE 0.7')
+    expect(sql).to.include('descr EXACT WEIGHT 0.3 MINIMAL SCORE 1')
+
+    const res = await cqn
+    expect(res.length).to.be(2) // Eleonora and Raven
+  })
 })

 describe('like', () => {
1 change: 1 addition & 0 deletions test/bookshop/db/schema.cds
@@ -80,3 +80,4 @@ entity Values {
 }

 entity BooksAnnotated as projection on Books;
+entity BooksAnnotatedScore1 as projection on Books;