"allowSignup": false // This is set to false by default, which means that Bitbucket Server users cannot automatically sign up to access your instance.
docs/code-search/code-navigation/inference_configuration.mdx
Lines changed: 70 additions & 13 deletions
@@ -40,6 +40,8 @@ To **add** additional behaviors, you can create and register a new **recognizer*
 A _path recognizer_ is a concrete recognizer that advertises a set of path _globs_ it is interested in, then invokes its `generate` function with matching paths from a repository. In the following, all files matching `Snek.module` (`Snek.module`, `proj/Snek.module`, `proj/sub/Snek.module`, etc.) are passed to a call to `generate` (if the list is non-empty). The `generate` function then returns a list of indexing job descriptions. The [guide for auto-indexing jobs configuration](/code-search/code-navigation/auto_indexing_configuration#keys-1) describes the fields of this object in detail.

+The ordering of paths and the limits that apply to them are described in the [Ordering guarantees and limits](#ordering-guarantees-and-limits) section.
+
 ```lua
 local path = require("path")
 local pattern = require("sg.autoindex.patterns")
@@ -105,27 +107,54 @@ This auto-indexing-specific library defines the following two functions.
 - Type:
 ```
 ({
+  -- List of patterns to match against paths in the repository
   "patterns": array[pattern],
+  -- List of patterns to match against paths in the repository
+  -- for getting contents (see contents_by_path below)
+  "patterns_for_content": array[pattern],
+  -- Callback function invoked with paths requested by patterns above
+  -- for creating index jobs
+  "generate": (
+    registration_api,
+    -- List of paths obtained from 'patterns' and
+    -- 'patterns_for_content' combined.
+    paths: array[string],
+    -- Table mapping paths to contents for paths matched by
+    -- 'patterns_for_content'
+    contents_by_path: table[string, string]
+  ) -> array[index_job],
 }) -> recognizer
 ```
 where `index_job` is an object with the following shape:
 ```
 index_job = {
-  "indexer": string, -- Docker image for the indexer
-  "root": string, -- working directory for invoking the indexer
-  "steps": array[{ -- preparatory steps to run before invoking the indexer (e.g. installing dependencies)
-    "root": string, -- working directory for this step
-    "image": string -- Docker image to use for preparatory step
-    "commands": array[string] -- List of commands to run inside the Docker image
+  -- Docker image for the indexer
+  "indexer": string,
+  -- Working directory for invoking the indexer
+  "root": string,
+  -- Preparatory steps to run before invoking the indexer
+  -- such as installing dependencies
+  "steps": array[{
+    -- Working directory for this step
+    "root": string,
+    -- Docker image to use for this step
+    "image": string,
+    -- List of commands to run inside the Docker image
+    "commands": array[string]
   }],
-  "local_steps": array[string] -- List of commands to run inside the indexer image at "root" before invoking
-  -- the indexer (e.g. to install dependencies)
-  "indexer_args": array[string], -- command-line invocation for the indexer
-  "outfile": string, -- path to the index generated by the indexer
-  "requested_envvars": array[string], -- List of environment variables needed. These are made accessible
-  -- to steps, local_steps, and the indexer_args command.
+  -- List of commands to run inside the indexer image at "root"
+  -- before invoking the indexer, such as installing dependencies.
+  "local_steps": array[string],
+  -- Command-line invocation for the indexer
+  "indexer_args": array[string],
+  -- Path to the index generated by the indexer
+  "outfile": string,
+  -- Names of necessary environment variables. These are
+  -- made accessible to steps, local_steps, and the
+  -- indexer_args command.
+  --
+  -- These are generally used for passing secrets.
+  "requested_envvars": array[string],
 }
 ```
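To make the shapes above concrete, here is a minimal sketch of a `generate` callback written in plain Lua so that it can run outside Sourcegraph. It assumes the three-argument form shown above and returns tables that use the `index_job` field names. The `dirname` helper, the image name `example/snek-indexer:latest`, the `snek` commands, the `snek-project` marker, and the `SNEK_TOKEN` variable are all hypothetical and exist only for illustration.

```lua
-- Hypothetical helper: directory part of a slash-separated path.
-- Returns "" for top-level files (assumed here to mean the repository root).
local function dirname(p)
  return p:match("^(.*)/[^/]*$") or ""
end

-- Sketch of a `generate` callback with the three-argument shape.
-- The image, commands, marker string, and env var name are made up.
local function generate(_registration_api, paths, contents_by_path)
  local jobs = {}
  for _, p in ipairs(paths) do
    local contents = contents_by_path[p]
    -- Only emit a job when contents were delivered for this path
    -- and they mention the (hypothetical) marker string.
    if contents ~= nil and contents:find("snek-project", 1, true) then
      table.insert(jobs, {
        indexer = "example/snek-indexer:latest",
        root = dirname(p),
        local_steps = { "snek install" },
        indexer_args = { "snek", "index", "--output", "index.scip" },
        outfile = "index.scip",
        requested_envvars = { "SNEK_TOKEN" },
      })
    end
  end
  return jobs
end

-- Usage with stand-in arguments; in a real run these come from inference.
local jobs = generate(nil, { "Snek.module", "proj/Snek.module" }, {
  ["proj/Snek.module"] = "name = snek-project",
})
print(#jobs, jobs[1].root)
```

Only `proj/Snek.module` has an entry in `contents_by_path` here, so it is the only path that yields a job. In an actual inference script the callback would be passed to a recognizer rather than called directly.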
 For installing dependencies, if the indexer image contains the relevant package manager(s),
@@ -186,3 +215,31 @@ This library defines the following two JSON utility functions:
 ### `fun`

 [Lua Functional](https://github.com/luafun/luafun/tree/cb6a7e25d4b55d9578fd371d1474b00e47bd29f3#lua-functional) is a high-performance functional programming library accessible via `local fun = require("fun")`. It provides functional utilities that help make recognizer code more expressive.
+
+## Ordering guarantees and limits
+
+Sourcegraph enforces several limits to avoid inference timeouts and ever-growing auto-indexing queues. These limits apply to a single round of inference for a single repository, combined across all recognizers, including any implicitly included Sourcegraph recognizers.
+
+Limit | Default value
+:-----|-------------:
+The number of auto-indexing jobs inferred | 100
+The total number of paths passed to the inference script's `generate` functions as the second argument `paths` | 500
+The total number of paths with contents passed to the inference script's `generate` functions as the third argument `contents_by_path` | 100
+Maximum size limit for file contents, in bytes | 1 MiB
+
+<Callout type="note">Please reach out to Sourcegraph support if you'd like to change these limits.</Callout>
+
+Auto-indexing jobs and paths are first ranked based on the criteria described below. If the number of jobs and/or paths exceeds the limits above, lower-ranked items are discarded.
+
+- For auto-indexing jobs, ranking is based on the following:
+
+  - Descending order of indexer frequency (total number of inferred jobs with the same `indexer` field).
+  - Ascending lexicographic ordering of `indexer`.
+  - Ascending order of the number of path components for `root`. Shallower roots are preferred over deeper ones as they are more likely to cover more code.
+  - Ascending lexicographic ordering of `root` paths.
+
+- For paths, ranking happens in the following order:
+
+  - Paths for which the contents are requested are ranked higher.
+  - Paths with fewer components are ranked higher.
+  - Otherwise, lexicographic ordering of paths is used.
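The ranking rules above can be sketched in plain Lua as sort comparators followed by truncation. This is an illustration only, not Sourcegraph's implementation. It assumes that the criteria act as tie-breakers in the listed order, that shallower roots rank first (following the stated rationale), and that the final lexicographic ordering of paths is ascending. The names `depth`, `rank_jobs`, `rank_paths`, and `wants_contents` are hypothetical, as are the small limits used in the usage lines.

```lua
-- Number of path components; "" (assumed repository root) counts as zero.
local function depth(p)
  local n = 0
  for _ in p:gmatch("[^/]+") do n = n + 1 end
  return n
end

-- Drop every entry after position `limit`.
local function truncate(list, limit)
  for i = #list, limit + 1, -1 do list[i] = nil end
  return list
end

-- Rank a copy of `jobs` and keep at most `limit` of them.
local function rank_jobs(jobs, limit)
  local freq, ranked = {}, {}
  for i, job in ipairs(jobs) do
    freq[job.indexer] = (freq[job.indexer] or 0) + 1
    ranked[i] = job
  end
  table.sort(ranked, function(a, b)
    if freq[a.indexer] ~= freq[b.indexer] then
      return freq[a.indexer] > freq[b.indexer] -- more frequent indexer first
    end
    if a.indexer ~= b.indexer then
      return a.indexer < b.indexer -- then indexer name, ascending
    end
    local da, db = depth(a.root), depth(b.root)
    if da ~= db then
      return da < db -- then shallower roots first
    end
    return a.root < b.root -- then root path, ascending
  end)
  return truncate(ranked, limit)
end

-- Rank a copy of `paths`; `wants_contents[p]` is true when contents were requested.
local function rank_paths(paths, wants_contents, limit)
  local ranked = {}
  for i, p in ipairs(paths) do ranked[i] = p end
  table.sort(ranked, function(a, b)
    local ca, cb = wants_contents[a] == true, wants_contents[b] == true
    if ca ~= cb then return ca end -- content-requested paths first
    local da, db = depth(a), depth(b)
    if da ~= db then return da < db end -- then fewer components
    return a < b -- then lexicographic, assumed ascending
  end)
  return truncate(ranked, limit)
end

-- Usage with illustrative limits of 3 (the documented defaults are much larger).
local kept_jobs = rank_jobs({
  { indexer = "b-indexer", root = "x" },
  { indexer = "a-indexer", root = "svc/api" },
  { indexer = "a-indexer", root = "" },
  { indexer = "a-indexer", root = "lib" },
}, 3)
local kept_paths = rank_paths(
  { "proj/Snek.module", "lib/Snek.module", "Snek.module", "proj/sub/Snek.module" },
  { ["proj/sub/Snek.module"] = true },
  3
)
for _, job in ipairs(kept_jobs) do print(job.indexer, job.root) end
for _, p in ipairs(kept_paths) do print(p) end
```

In this run the single `b-indexer` job is discarded because its indexer is the least frequent, and `proj/Snek.module` is discarded because it sorts last among the paths with two components.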