This repository was archived by the owner on Sep 30, 2024. It is now read-only.
Commit 80a8177
Search: improve keyword search prototype (#52233)
We have an experimental search type called `patterntype:keyword`. In
testing it on Cody-style queries, it had worse relevance than our
ripgrep implementation, and was sometimes quite slow.
This PR makes improvements to query analysis:
* Reduce the number of tokens we search by using a more aggressive
stopword list
* Make stemming cheaper and less noisy by using the stem if it's a
prefix of the original
* Limit the max number of tokens we'll search over
* Remove language detection because it was too noisy and made it hard
to compare to other search strategies
It also improves ranking:
* Enable Zoekt's keyword scoring to rank documents by (approximate) BM25
* Remove unused ranking logic related to "match groups"
Addresses https://github.com/sourcegraph/sourcegraph/issues/50786
1 parent 5a553d7 commit 80a8177
10 files changed, +1073 -408 lines changed:
- internal/search
- client
- keyword
- zoekt