ES|QL - kNN function initial support #127322
Changes from 63 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -2577,6 +2577,11 @@ public BlockLoader blockLoader(MappedFieldType.BlockLoaderContext blContext) {
|       return null;
|   }
|
| + if (dims == null) {
| +     // No data has been indexed yet
| +     return BlockLoader.CONSTANT_NULLS;
| + }
| +
|   if (indexed) {
|       return new BlockDocValuesReader.DenseVectorBlockLoader(name(), dims);
|   }

Review comment (on `if (dims == null) {`): I can create a separate PR for this, but that seemed unnecessary since it involves only this change.
Reply: Just wondering if a test could be added for this change?
Reply: Makes sense, added in 36c26a5.
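The guard above can be illustrated in isolation. The following is a minimal sketch, not the Elasticsearch `BlockLoader` API: `VectorLoader`, `loaderFor`, and the in-memory `storedVectors` array are hypothetical stand-ins, and the sketch assumes (as the in-code comment suggests) that `dims` stays null until a first vector has been indexed.

```java
import java.util.Arrays;

/** Hypothetical, simplified stand-ins; this is not the Elasticsearch BlockLoader API. */
class ConstantNullsSketch {

    /** Minimal loader abstraction: returns one float[] (or null) per requested doc. */
    interface VectorLoader {
        float[][] load(int[] docIds);
    }

    /** Loader for the "nothing indexed yet" case: every position is null. */
    static final VectorLoader CONSTANT_NULLS = docIds -> new float[docIds.length][];

    /** Picks a loader; a null dims is assumed to mean that no vector was indexed. */
    static VectorLoader loaderFor(Integer dims, float[][] storedVectors) {
        if (dims == null) {
            // No data has been indexed yet, so there is nothing to read.
            return CONSTANT_NULLS;
        }
        return docIds -> {
            float[][] out = new float[docIds.length][];
            for (int i = 0; i < docIds.length; i++) {
                out[i] = Arrays.copyOf(storedVectors[docIds[i]], dims);
            }
            return out;
        };
    }

    public static void main(String[] args) {
        // Unknown dims: three requested docs yield three null entries.
        float[][] empty = loaderFor(null, new float[0][]).load(new int[] {0, 1, 2});
        System.out.println(Arrays.toString(empty));

        // Known dims: the stored vector for doc 1 is returned.
        float[][] stored = {{128, 0, 0}, {255, 0, 0}};
        float[][] loaded = loaderFor(3, stored).load(new int[] {1});
        System.out.println(Arrays.toString(loaded[0]));
    }
}
```

Returning a constant-null loader rather than dereferencing `dims` lets a query over an empty index produce null values for the field instead of failing.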
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -112,7 +112,7 @@ private Vector evalSingleSegmentNonDecreasing(DocVector docs) throws IOException
|   int min = docs.docs().getInt(0);
|   int max = docs.docs().getInt(docs.getPositionCount() - 1);
|   int length = max - min + 1;
| - try (T scoreBuilder = createVectorBuilder(blockFactory, length)) {
| + try (T scoreBuilder = createVectorBuilder(blockFactory, docs.getPositionCount())) {
|       if (length == docs.getPositionCount() && length > 1) {
|           return segmentState.scoreDense(scoreBuilder, min, max);
|       }

Review comment (on the changed `try` line): Testing with random indexing and deletions showed that the builder must be sized with getPositionCount() instead of length, because length can be greater than the position count.
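A small worked example makes the difference between the two sizes concrete. This is a sketch with hypothetical helper names (`rangeLength`, `isDense`) operating on a plain sorted `int[]` rather than a `DocVector`; it only mirrors the arithmetic in the hunk above.

```java
/** Hypothetical helpers mirroring the arithmetic in the hunk; not the real DocVector API. */
class PositionCountSketch {

    /** Span of doc IDs covered by a sorted, non-empty array: max - min + 1. */
    static int rangeLength(int[] sortedDocs) {
        int min = sortedDocs[0];
        int max = sortedDocs[sortedDocs.length - 1];
        return max - min + 1;
    }

    /** Same condition as the dense fast path: the span equals the number of positions. */
    static boolean isDense(int[] sortedDocs) {
        int length = rangeLength(sortedDocs);
        return length == sortedDocs.length && length > 1;
    }

    public static void main(String[] args) {
        // Gaps (for example after deletions): the span is wider than the number of docs.
        int[] sparse = {3, 7, 10};
        System.out.println(rangeLength(sparse) + " vs " + sparse.length); // 8 vs 3
        System.out.println(isDense(sparse)); // false

        // Contiguous doc IDs: span and position count agree.
        int[] dense = {4, 5, 6, 7};
        System.out.println(rangeLength(dense) + " vs " + dense.length); // 4 vs 4
        System.out.println(isDense(dense)); // true
    }
}
```

With the sparse input, a builder sized by `length` would expect 8 values while only 3 positions are ever scored, which is why the position count is the safer size.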
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1022,7 +1022,7 @@ public void testMultipleBatchesWithLookupJoin() throws IOException {
|       var query = requestObjectBuilder().query(format(null, "from * | lookup join {} on integer {}", testIndexName(), sort));
|       Map<String, Object> result = runEsql(query);
|       var columns = as(result.get("columns"), List.class);
| -     assertEquals(21, columns.size());
| +     assertEquals(22, columns.size());
|       var values = as(result.get("values"), List.class);
|       assertEquals(10, values.size());
|   }

Review comment (on the changed assertion): We added dense_vector to mappings-all-types, so the result now has one more column.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -144,6 +144,7 @@ public class CsvTestsDataLoader {
|   private static final TestDataset LOGS = new TestDataset("logs");
|   private static final TestDataset MV_TEXT = new TestDataset("mv_text");
|   private static final TestDataset DENSE_VECTOR = new TestDataset("dense_vector");
| + private static final TestDataset COLORS = new TestDataset("colors");
|
|   public static final Map<String, TestDataset> CSV_DATASET_MAP = Map.ofEntries(
|       Map.entry(EMPLOYEES.indexName, EMPLOYEES),

Review comment (on the new COLORS dataset): I think the colors dataset is very intuitive for vector similarity tests. Looking for colors with similar RGB values reads better than looking for random vectors, IMO.
Reply: Smart!!!

| @@ -204,7 +205,8 @@ public class CsvTestsDataLoader {
|       Map.entry(SEMANTIC_TEXT.indexName, SEMANTIC_TEXT),
|       Map.entry(LOGS.indexName, LOGS),
|       Map.entry(MV_TEXT.indexName, MV_TEXT),
| -     Map.entry(DENSE_VECTOR.indexName, DENSE_VECTOR)
| +     Map.entry(DENSE_VECTOR.indexName, DENSE_VECTOR),
| +     Map.entry(COLORS.indexName, COLORS)
|   );
|
|   private static final EnrichConfig LANGUAGES_ENRICH = new EnrichConfig("languages_policy", "enrich-policy-languages.json");
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,139 @@ | ||
| color:text,hex_code:keyword,rgb_vector:dense_vector,primary:boolean | ||
| maroon, #800000, [128,0,0], false | ||
| dark red, #8B0000, [139,0,0], false | ||
| brown, #A52A2A, [165,42,42], false | ||
| firebrick, #B22222, [178,34,34], false | ||
| crimson, #DC143C, [220,20,60], false | ||
| red, #FF0000, [255,0,0], true | ||
| tomato, #FF6347, [255,99,71], false | ||
| coral, #FF7F50, [255,127,80], false | ||
| indian red, #CD5C5C, [205,92,92], false | ||
| light coral, #F08080, [240,128,128], false | ||
| dark salmon, #E9967A, [233,150,122], false | ||
| salmon, #FA8072, [250,128,114], false | ||
| light salmon, #FFA07A, [255,160,122], false | ||
| orange red, #FF4500, [255,69,0], false | ||
| dark orange, #FF8C00, [255,140,0], false | ||
| orange, #FFA500, [255,165,0], false | ||
| gold, #FFD700, [255,215,0], false | ||
| dark golden rod, #B8860B, [184,134,11], false | ||
| golden rod, #DAA520, [218,165,32], false | ||
| pale golden rod, #EEE8AA, [238,232,170], false | ||
| dark khaki, #BDB76B, [189,183,107], false | ||
| khaki, #F0E68C, [240,230,140], false | ||
| olive, #808000, [128,128,0], false | ||
| yellow, #FFFF00, [255,255,0], true | ||
| yellow green, #9ACD32, [154,205,50], false | ||
| dark olive green, #556B2F, [85,107,47], false | ||
| olive drab, #6B8E23, [107,142,35], false | ||
| lawn green, #7CFC00, [124,252,0], false | ||
| chartreuse, #7FFF00, [127,255,0], false | ||
| green yellow, #ADFF2F, [173,255,47], false | ||
| dark green, #006400, [0,100,0], false | ||
| green, #008000, [0,128,0], true | ||
| forest green, #228B22, [34,139,34], false | ||
| lime, #00FF00, [0,255,0], false | ||
| lime green, #32CD32, [50,205,50], false | ||
| light green, #90EE90, [144,238,144], false | ||
| pale green, #98FB98, [152,251,152], false | ||
| dark sea green, #8FBC8F, [143,188,143], false | ||
| medium spring green, #00FA9A, [0,250,154], false | ||
| spring green, #00FF7F, [0,255,127], false | ||
| sea green, #2E8B57, [46,139,87], false | ||
| medium aqua marine, #66CDAA, [102,205,170], false | ||
| medium sea green, #3CB371, [60,179,113], false | ||
| light sea green, #20B2AA, [32,178,170], false | ||
| dark slate gray, #2F4F4F, [47,79,79], false | ||
| teal, #008080, [0,128,128], false | ||
| dark cyan, #008B8B, [0,139,139], false | ||
| cyan, #00FFFF, [0,255,255], true | ||
| light cyan, #E0FFFF, [224,255,255], false | ||
| dark turquoise, #00CED1, [0,206,209], false | ||
| turquoise, #40E0D0, [64,224,208], false | ||
| medium turquoise, #48D1CC, [72,209,204], false | ||
| pale turquoise, #AFEEEE, [175,238,238], false | ||
| aqua marine, #7FFFD4, [127,255,212], false | ||
| powder blue, #B0E0E6, [176,224,230], false | ||
| cadet blue, #5F9EA0, [95,158,160], false | ||
| steel blue, #4682B4, [70,130,180], false | ||
| corn flower blue, #6495ED, [100,149,237], false | ||
| deep sky blue, #00BFFF, [0,191,255], false | ||
| dodger blue, #1E90FF, [30,144,255], false | ||
| light blue, #ADD8E6, [173,216,230], false | ||
| sky blue, #87CEEB, [135,206,235], false | ||
| light sky blue, #87CEFA, [135,206,250], false | ||
| midnight blue, #191970, [25,25,112], false | ||
| navy, #000080, [0,0,128], false | ||
| dark blue, #00008B, [0,0,139], false | ||
| medium blue, #0000CD, [0,0,205], false | ||
| blue, #0000FF, [0,0,255], true | ||
| royal blue, #4169E1, [65,105,225], false | ||
| blue violet, #8A2BE2, [138,43,226], false | ||
| indigo, #4B0082, [75,0,130], false | ||
| dark slate blue, #483D8B, [72,61,139], false | ||
| slate blue, #6A5ACD, [106,90,205], false | ||
| medium slate blue, #7B68EE, [123,104,238], false | ||
| medium purple, #9370DB, [147,112,219], false | ||
| dark magenta, #8B008B, [139,0,139], false | ||
| dark violet, #9400D3, [148,0,211], false | ||
| dark orchid, #9932CC, [153,50,204], false | ||
| medium orchid, #BA55D3, [186,85,211], false | ||
| purple, #800080, [128,0,128], false | ||
| thistle, #D8BFD8, [216,191,216], false | ||
| plum, #DDA0DD, [221,160,221], false | ||
| violet, #EE82EE, [238,130,238], false | ||
| magenta, #FF00FF, [255,0,255], true | ||
| orchid, #DA70D6, [218,112,214], false | ||
| medium violet red, #C71585, [199,21,133], false | ||
| pale violet red, #DB7093, [219,112,147], false | ||
| deep pink, #FF1493, [255,20,147], false | ||
| hot pink, #FF69B4, [255,105,180], false | ||
| light pink, #FFB6C1, [255,182,193], false | ||
| pink, #FFC0CB, [255,192,203], false | ||
| antique white, #FAEBD7, [250,235,215], false | ||
| beige, #F5F5DC, [245,245,220], false | ||
| bisque, #FFE4C4, [255,228,196], false | ||
| blanched almond, #FFEBCD, [255,235,205], false | ||
| wheat, #F5DEB3, [245,222,179], false | ||
| corn silk, #FFF8DC, [255,248,220], false | ||
| lemon chiffon, #FFFACD, [255,250,205], false | ||
| light golden rod yellow, #FAFAD2, [250,250,210], false | ||
| light yellow, #FFFFE0, [255,255,224], false | ||
| saddle brown, #8B4513, [139,69,19], false | ||
| sienna, #A0522D, [160,82,45], false | ||
| chocolate, #D2691E, [210,105,30], false | ||
| peru, #CD853F, [205,133,63], false | ||
| sandy brown, #F4A460, [244,164,96], false | ||
| burly wood, #DEB887, [222,184,135], false | ||
| tan, #D2B48C, [210,180,140], false | ||
| rosy brown, #BC8F8F, [188,143,143], false | ||
| moccasin, #FFE4B5, [255,228,181], false | ||
| navajo white, #FFDEAD, [255,222,173], false | ||
| peach puff, #FFDAB9, [255,218,185], false | ||
| misty rose, #FFE4E1, [255,228,225], false | ||
| lavender blush, #FFF0F5, [255,240,245], false | ||
| linen, #FAF0E6, [250,240,230], false | ||
| old lace, #FDF5E6, [253,245,230], false | ||
| papaya whip, #FFEFD5, [255,239,213], false | ||
| sea shell, #FFF5EE, [255,245,238], false | ||
| mint cream, #F5FFFA, [245,255,250], false | ||
| slate gray, #708090, [112,128,144], false | ||
| light slate gray, #778899, [119,136,153], false | ||
| light steel blue, #B0C4DE, [176,196,222], false | ||
| lavender, #E6E6FA, [230,230,250], false | ||
| floral white, #FFFAF0, [255,250,240], false | ||
| alice blue, #F0F8FF, [240,248,255], false | ||
| ghost white, #F8F8FF, [248,248,255], false | ||
| honeydew, #F0FFF0, [240,255,240], false | ||
| ivory, #FFFFF0, [255,255,240], false | ||
| azure, #F0FFFF, [240,255,255], false | ||
| snow, #FFFAFA, [255,250,250], false | ||
| black, #000000, [0,0,0], true | ||
| dim gray, #696969, [105,105,105], false | ||
| gray, #808080, [128,128,128], true | ||
| dark gray, #A9A9A9, [169,169,169], false | ||
| silver, #C0C0C0, [192,192,192], false | ||
| light gray, #D3D3D3, [211,211,211], false | ||
| gainsboro, #DCDCDC, [220,220,220], false | ||
| white smoke, #F5F5F5, [245,245,245], false | ||
| white, #FFFFFF, [255,255,255], true |
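To show why RGB vectors make similarity results easy to check by eye, here is a minimal brute-force nearest-neighbor sketch over a few rows of the dataset above. It assumes squared Euclidean distance as the metric and uses hypothetical names (`ColorKnnSketch`, `sampleColors`, `nearest`); it is not how the ES|QL knn function executes, and the similarity actually applied would depend on how the dense_vector field is configured.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Brute-force illustration only; class and method names are hypothetical. */
class ColorKnnSketch {

    /** A handful of rows copied from the colors dataset (name -> rgb_vector). */
    static Map<String, int[]> sampleColors() {
        Map<String, int[]> colors = new LinkedHashMap<>();
        colors.put("crimson", new int[] {220, 20, 60});
        colors.put("red", new int[] {255, 0, 0});
        colors.put("tomato", new int[] {255, 99, 71});
        colors.put("orange red", new int[] {255, 69, 0});
        colors.put("green", new int[] {0, 128, 0});
        colors.put("navy", new int[] {0, 0, 128});
        colors.put("blue", new int[] {0, 0, 255});
        return colors;
    }

    /** Squared Euclidean distance; an assumed metric chosen for simplicity. */
    static double squaredDistance(int[] a, int[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the names of the k colors closest to the query vector, nearest first. */
    static List<String> nearest(Map<String, int[]> colors, int[] query, int k) {
        List<String> names = new ArrayList<>(colors.keySet());
        names.sort(Comparator.comparingDouble((String n) -> squaredDistance(colors.get(n), query)));
        return new ArrayList<>(names.subList(0, Math.min(k, names.size())));
    }

    public static void main(String[] args) {
        // Query with pure red; the closest entries are all recognizably reddish.
        System.out.println(nearest(sampleColors(), new int[] {255, 0, 0}, 3));
        // prints [red, orange red, crimson]
    }
}
```

A reader can verify the ranking without any tooling: orange red differs from red only in the green channel (distance 69), while crimson differs a little in all three channels.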
Review comment: Not all docs have been generated for the knn function, as the dense_vector field type is still under construction.
We can add the missing docs once dense_vector is supported and the function is out of snapshot.