Commit 23077cf

Merge branch 'main' into sg-next-oct30-2024
2 parents d827e33 + c3e024b commit 23077cf

3 files changed: +76 -16 lines changed


docs/admin/auth/index.mdx

Lines changed: 1 addition & 0 deletions
@@ -334,6 +334,7 @@ Then add the following lines to your [site configuration](/admin/config/site_con
     "displayName": "Bitbucket Server",
     "clientID": "replace-with-the-oauth-client-id",
     "clientSecret": "replace-with-the-oauth-client-secret"
+    "allowSignup": false // This is set to false by default, which means that Bitbucket Server users cannot automatically sign up to access your instance.
   }
 ]
 ```

docs/code-search/code-navigation/inference_configuration.mdx

Lines changed: 70 additions & 13 deletions
@@ -40,6 +40,8 @@ To **add** additional behaviors, you can create and register a new **recognizer*

 A _path recognizer_ is a concrete recognizer that advertises a set of path _globs_ it is interested in, then invokes its `generate` function with matching paths from a repository. In the following, all files matching `Snek.module` (`Snek.module`, `proj/Snek.module`, `proj/sub/Snek.module`, etc.) are passed to a call to `generate` (if non-empty). The generate function will then return a list of indexing job descriptions. The [guide for auto-indexing jobs configuration](/code-search/code-navigation/auto_indexing_configuration#keys-1) gives detailed descriptions of the fields of this object.

+The ordering of paths and the applicable limits are defined in the [Ordering guarantees and limits](#ordering-guarantees-and-limits) section.
+
 ```lua
 local path = require("path")
 local pattern = require("sg.autoindex.patterns")
@@ -105,27 +107,54 @@ This auto-indexing-specific library defines the following two functions.
 - Type:
 ```
 ({
+  -- List of patterns to match against paths in the repository
   "patterns": array[pattern],
+  -- List of patterns to match against paths in the repository
+  -- for getting contents (see contents_by_path below)
   "patterns_for_content": array[pattern],
-  "generate": (registration_api, paths: array[string], contents_by_path: table[string, string]) -> array[index_job],
+  -- Callback function invoked with paths requested by patterns above
+  -- for creating index jobs
+  "generate": (
+    registration_api,
+    -- List of paths obtained from 'patterns' and
+    -- 'patterns_for_content' combined.
+    paths: array[string],
+    -- Table mapping paths to contents for paths matched by
+    -- 'patterns_for_content'
+    contents_by_path: table[string, string]
+  ) -> array[index_job],
 }) -> recognizer
 ```
 where `index_job` is an object with the following shape:
 ```
 index_job = {
-  "indexer": string, -- Docker image for the indexer
-  "root": string, -- working directory for invoking the indexer
-  "steps": array[{ -- preparatory steps to run before invoking the indexer (e.g. installing dependencies)
-    "root": string, -- working directory for this step
-    "image": string -- Docker image to use for preparatory step
-    "commands": array[string] -- List of commands to run inside the Docker image
+  -- Docker image for the indexer
+  "indexer": string,
+  -- Working directory for invoking the indexer
+  "root": string,
+  -- Preparatory steps to run before invoking the indexer
+  -- such as installing dependencies
+  "steps": array[{
+    -- Working directory for this step
+    "root": string,
+    -- Docker image to use for this step
+    "image": string,
+    -- List of commands to run inside the Docker image
+    "commands": array[string]
   }],
-  "local_steps": array[string] -- List of commands to run inside the indexer image at "root" before invoking
-  -- the indexer (e.g. to install dependencies)
-  "indexer_args": array[string], -- command-line invocation for the indexer
-  "outfile": string, -- path to the index generated by the indexer
-  "requested_envvars": array[string], -- List of environment variables needed. These are made accessible
-  -- to steps, local_steps, and the indexer_args command.
+  -- List of commands to run inside the indexer image at "root"
+  -- before invoking the indexer, such as installing dependencies.
+  "local_steps": array[string],
+  -- Command-line invocation for the indexer
+  "indexer_args": array[string],
+  -- Path to the index generated by the indexer
+  "outfile": string,
+  -- Names of necessary environment variables. These are
+  -- made accessible to steps, local_steps, and the
+  -- indexer_args command.
+  --
+  -- These are generally used for passing secrets.
+  "requested_envvars": array[string],
 }
 ```
 For installing dependencies, if the indexer image contains the relevant package manager(s),
@@ -186,3 +215,31 @@ This library defines the following two JSON utility functions:
 ### `fun`

 [Lua Functional](https://github.com/luafun/luafun/tree/cb6a7e25d4b55d9578fd371d1474b00e47bd29f3#lua-functional) is a high-performance functional programming library accessible via `local fun = require("fun")`. This library has a number of functional utilities to help make recognizer code a bit more expressive.
+
+## Ordering guarantees and limits
+
+Sourcegraph enforces several limits to avoid inference timeouts and ever-growing auto-indexing queues. These limits apply to a single round of inference for a single repository, combined across all recognizers, including any implicitly included Sourcegraph recognizers.
+
+Limit | Default value
+:-----|-------------:
+The number of auto-indexing jobs inferred | 100
+The total number of paths passed to the inference script's `generate` functions as the second argument `paths` | 500
+The total number of paths with contents passed to the inference script's `generate` functions as the third argument `contents_by_path` | 100
+Maximum size of file contents | 1 MiB
+
+<Callout type="note">Please reach out to Sourcegraph support if you'd like to change these limits.</Callout>
+
+Auto-indexing jobs and paths are first ranked based on the criteria described below. If the number of jobs and/or paths exceeds the limits above, lower-ranked items are discarded.
+
+- For auto-indexing jobs, ranking uses the following criteria, in order:
+
+  - Descending order of indexer frequency (total number of inferred jobs with the same `indexer` field).
+  - Ascending lexicographic ordering of `indexer`.
+  - Ascending order of the number of path components in `root`. Shallower roots are preferred over deeper ones as they are more likely to cover more code.
+  - Ascending lexicographic ordering of `root` paths.
+
+- For paths, ranking happens in the following order:
+
+  - Paths for which the contents are requested are ranked higher.
+  - Paths with fewer components are ranked higher.
+  - Otherwise, lexicographic ordering of paths is used.
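
The ranking rules added in this hunk can be read as two multi-key sorts followed by truncation. The Python sketch below is only an illustration of that reading, not Sourcegraph's implementation: the `Job` type and the `depth`, `rank_jobs`, and `rank_paths` helpers are hypothetical names, and it assumes that shallower roots rank first (following the stated rationale) and that "lexicographic" means plain string comparison.

```python
from collections import Counter
from typing import Iterable, NamedTuple

# Default limits quoted in the table of this section.
MAX_JOBS = 100
MAX_PATHS = 500


class Job(NamedTuple):
    """Hypothetical stand-in for the two index_job fields used in ranking."""
    indexer: str  # Docker image for the indexer
    root: str     # working directory; "" stands for the repository root


def depth(path: str) -> int:
    """Count the non-empty components of a slash-separated path."""
    return sum(1 for part in path.split("/") if part)


def rank_jobs(jobs: Iterable[Job], limit: int = MAX_JOBS) -> list[Job]:
    """Sort jobs by the four documented criteria, then keep the top `limit`."""
    jobs = list(jobs)
    frequency = Counter(job.indexer for job in jobs)
    ranked = sorted(
        jobs,
        key=lambda job: (
            -frequency[job.indexer],  # more frequent indexers first
            job.indexer,              # then indexer name, ascending
            depth(job.root),          # ASSUMPTION: shallower roots first
            job.root,                 # then root path, ascending
        ),
    )
    return ranked[:limit]


def rank_paths(
    paths: Iterable[str],
    content_paths: Iterable[str],
    limit: int = MAX_PATHS,
) -> list[str]:
    """Rank content-requested paths first, then shallower, then by name."""
    wanted = set(content_paths)
    # False sorts before True, so paths whose contents are requested come first.
    ranked = sorted(paths, key=lambda p: (p not in wanted, depth(p), p))
    return ranked[:limit]
```

Under these assumptions, calling `rank_jobs(jobs, limit=3)` on a repository with more than three inferred jobs keeps the jobs of the most common indexer first and drops the rest, which mirrors how lower-ranked items are discarded once a limit is exceeded. The separate limit of 100 paths with contents is not modelled here.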

docs/cody/clients/model-configuration.mdx

Lines changed: 5 additions & 3 deletions
@@ -372,10 +372,12 @@ The following examples illustrate how to use all these settings in conjunction:
 // Not "experimental" or "deprecated".
 "statusFilter": ["beta", "stable"],

-// Allow any models provided by Anthropic or OpenAI.
+// Allow any models provided by Anthropic, OpenAI, Google and Fireworks.
 "allow": [
-  "anthropic::*",
-  "openai::*"
+  "anthropic::*", // Anthropic models
+  "openai::*", // OpenAI models
+  "google::*", // Google Gemini models
+  "fireworks::*", // Autocomplete models like StarCoder and DeepSeek-V2-Coder hosted on Fireworks
 ],

 // Do not include any models with the Model ID containing "turbo",
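
To see what the widened `allow` list in this hunk would let through, the sketch below treats each entry as a shell-style glob and checks model references against it. This is an assumption made for illustration: the diff does not show how Sourcegraph evaluates these patterns, Python's `fnmatchcase` is a stand-in matcher, and the model references and the `is_allowed` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# The allow list as it reads after this commit.
ALLOW = ["anthropic::*", "openai::*", "google::*", "fireworks::*"]


def is_allowed(model_ref: str, allow: list[str] = ALLOW) -> bool:
    """Return True if model_ref matches at least one allow pattern.

    ASSUMPTION: patterns behave like case-sensitive shell globs, where
    "*" matches any run of characters, including further "::" separators.
    """
    return any(fnmatchcase(model_ref, pattern) for pattern in allow)
```

With this reading, a hypothetical reference such as `google::v1::gemini-1.5-pro` would pass the filter only after this change, while a reference from a provider outside the four listed prefixes would still be rejected.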
