Merge branch 'main' into sg-next-oct30-2024

MaedahBatool · web-flow · commit 23077cfac53d · 2024-10-22T11:26:49.000-07:00
diff --git a/docs/admin/auth/index.mdx b/docs/admin/auth/index.mdx
@@ -334,6 +334,7 @@ Then add the following lines to your [site configuration](/admin/config/site_con
         "displayName": "Bitbucket Server",
         "clientID": "replace-with-the-oauth-client-id",
         "clientSecret": "replace-with-the-oauth-client-secret"
+        "allowSignup": false // Defaults to false, which means Bitbucket Server users cannot automatically sign up to access your instance.
       }
     ]
 ```
diff --git a/docs/code-search/code-navigation/inference_configuration.mdx b/docs/code-search/code-navigation/inference_configuration.mdx
@@ -40,6 +40,8 @@ To **add** additional behaviors, you can create and register a new **recognizer*
 
 A _path recognizer_ is a concrete recognizer that advertises a set of path _globs_ it is interested in, then invokes its `generate` function with matching paths from a repository. In the following, all files matching `Snek.module` (`Snek.module`, `proj/Snek.module`, `proj/sub/Snek.module`, etc) are passed to a call to `generate` (if non-empty). The generate function will then return a list of indexing job descriptions. The [guide for auto-indexing jobs configuration](/code-search/code-navigation/auto_indexing_configuration#keys-1) gives detailed descriptions on the fields of this object.
 
+The ordering of paths and the limits that apply to them are described in the [Ordering guarantees and limits](#ordering-guarantees-and-limits) section.
+
 ```lua
 local path = require("path")
 local pattern = require("sg.autoindex.patterns")
@@ -105,27 +107,54 @@ This auto-indexing-specific library defines the following two functions.
   - Type:
     ```
     ({
+      -- List of patterns to match against paths in the repository
       "patterns": array[pattern],
+      -- List of patterns to match against paths in the repository
+      -- for getting contents (see contents_by_path below)
       "patterns_for_content": array[pattern],
-      "generate": (registration_api, paths: array[string], contents_by_path: table[string, string]) -> array[index_job],
+      -- Callback function invoked with paths requested by patterns above
+      -- for creating index jobs
+      "generate": (
+          registration_api,
+          -- List of paths obtained from 'patterns' and
+          -- 'patterns_for_content' combined.
+          paths: array[string],
+          -- Table mapping paths to contents for paths matched by
+          -- 'patterns_for_content'
+          contents_by_path: table[string, string]
+      ) -> array[index_job],
     }) -> recognizer
     ```
     where `index_job` is an object with the following shape:
     ```
     index_job = {
-      "indexer": string, -- Docker image for the indexer
-      "root": string,    -- working directory for invoking the indexer
-      "steps": array[{   -- preparatory steps to run before invoking the indexer (e.g. installing dependencies)
-        "root": string,  -- working directory for this step
-        "image": string  -- Docker image to use for preparatory step
-        "commands": array[string] -- List of commands to run inside the Docker image
+      -- Docker image for the indexer
+      "indexer": string,
+      -- Working directory for invoking the indexer
+      "root": string,
+      -- Preparatory steps to run before invoking the indexer
+      -- such as installing dependencies
+      "steps": array[{
+        -- Working directory for this step
+        "root": string,
+        -- Docker image to use for this step
+        "image": string,
+        -- List of commands to run inside the Docker image
+        "commands": array[string]
       }],
-      "local_steps": array[string] -- List of commands to run inside the indexer image at "root" before invoking
-                                   -- the indexer (e.g. to install dependencies)
-      "indexer_args": array[string], -- command-line invocation for the indexer
-      "outfile": string,             -- path to the index generated by the indexer
-      "requested_envvars": array[string], -- List of environment variables needed. These are made accessible
-                                          -- to steps, local_steps, and the indexer_args command.
+      -- List of commands to run inside the indexer image at "root"
+      -- before invoking the indexer, such as installing dependencies.
+      "local_steps": array[string],
+      -- Command-line invocation for the indexer
+      "indexer_args": array[string],
+      -- Path to the index generated by the indexer
+      "outfile": string,
+      -- Names of necessary environment variables. These are
+      -- made accessible to steps, local_steps, and the
+      -- indexer_args command.
+      --
+      -- These are generally used for passing secrets.
+      "requested_envvars": array[string],
     }
     ```
     For installing dependencies, if the indexer image contains the relevant package manager(s),
@@ -186,3 +215,31 @@ This library defines the following two JSON utility functions:
 ### `fun`
 
 [Lua Functional](https://github.com/luafun/luafun/tree/cb6a7e25d4b55d9578fd371d1474b00e47bd29f3#lua-functional) is a high-performance functional programming library accessible via `local fun = require("fun")`. This library has a number of functional utilities to help make recognizer code a bit more expressive.
+
+## Ordering guarantees and limits
+
+Sourcegraph enforces several limits to avoid inference timeouts and ever-growing auto-indexing queues. These limits apply to a single round of inference for a single repository, combined across all recognizers, including any implicitly included Sourcegraph recognizers.
+
+Limit | Default value
+:-----|-------------:
+The number of auto-indexing jobs inferred | 100
+The total number of paths passed to the inference script's `generate` functions as the second argument `paths` | 500
+The total number of paths with contents passed to the inference script's `generate` functions as the third argument `contents_by_path` | 100
+Maximum size of file contents | 1 MiB
+
+<Callout type="note">Please reach out to Sourcegraph support if you'd like to change these limits.</Callout>
+
+Auto-indexing jobs and paths are first ranked based on the criteria described below. If the number of jobs or paths exceeds the limits above, lower-ranked items are discarded.
+
+- For auto-indexing jobs, ranking is done based on the following:
+
+  - Descending order of indexer frequency (total number of inferred jobs with the same `indexer` field).
+  - Ascending lexicographic ordering of `indexer`.
+  - Ascending order of number of path components for `root`. Shallower roots are preferred over deeper ones as they are more likely to cover more code.
+  - Ascending lexicographic ordering of `root` paths.
+
+- For paths, ranking happens in the following order:
+
+  - Paths for which the contents are requested are ranked higher.
+  - Paths with fewer components are ranked higher.
+  - Otherwise, lexicographic ordering of paths is used.
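The ranking rules above can be read as multi-key sorts followed by truncation. The following is a minimal Python sketch of that reading, not Sourcegraph's actual implementation: the function names (`rank_jobs`, `rank_paths`, `path_depth`) are hypothetical, jobs are modeled as plain dictionaries with `indexer` and `root` keys, and it assumes shallower roots rank higher, as the prose states. The default limits are taken from the table.

```python
from collections import Counter


def path_depth(path: str) -> int:
    """Count non-empty path components; "" (the repository root) has depth 0."""
    return len([part for part in path.split("/") if part])


def rank_jobs(jobs: list[dict], limit: int = 100) -> list[dict]:
    """Hypothetical sketch: order jobs by the documented criteria, keep the top `limit`."""
    # Indexer frequency: number of inferred jobs sharing the same `indexer`.
    frequency = Counter(job["indexer"] for job in jobs)
    ranked = sorted(
        jobs,
        key=lambda job: (
            -frequency[job["indexer"]],  # more frequent indexers first
            job["indexer"],              # then lexicographic by indexer
            path_depth(job["root"]),     # then shallower roots first (assumption)
            job["root"],                 # then lexicographic by root
        ),
    )
    return ranked[:limit]


def rank_paths(paths: list[str], paths_with_contents: set[str], limit: int = 500) -> list[str]:
    """Hypothetical sketch: order paths by the documented criteria, keep the top `limit`."""
    ranked = sorted(
        paths,
        key=lambda p: (
            p not in paths_with_contents,  # paths with requested contents first
            path_depth(p),                 # then fewer components first
            p,                             # then lexicographic
        ),
    )
    return ranked[:limit]
```

For example, with three `b` jobs rooted at `x/y`, the repository root, and `a`, plus one `a` job, `rank_jobs(jobs, limit=3)` keeps only the `b` jobs, ordered from shallowest to deepest root. The sketch does not model the separate limit on paths with contents or the file size limit.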
diff --git a/docs/cody/clients/model-configuration.mdx b/docs/cody/clients/model-configuration.mdx
@@ -372,10 +372,12 @@ The following examples illustrate how to use all these settings in conjunction:
       // Not "experimental" or "deprecated".
       "statusFilter": ["beta", "stable"],
 
-      // Allow any models provided by Anthropic or OpenAI.
+      // Allow any models provided by Anthropic, OpenAI, Google and Fireworks.
       "allow": [
-        "anthropic::*",
-        "openai::*"
+        "anthropic::*", // Anthropic models
+        "openai::*", // OpenAI models
+        "google::*", // Google Gemini models
+        "fireworks::*", // Autocomplete models like StarCoder and DeepSeek-V2-Coder hosted on Fireworks
       ],
 
       // Do not include any models with the Model ID containing "turbo",
