Commit 6e85fac

Update search indexing exclusion criteria
https://ampcode.com/threads/T-0390a39a-9c04-441e-8982-7e2ef7b9bf76
Co-authored-by: Amp <[email protected]>
1 parent c6fbaa2 commit 6e85fac

1 file changed, +10 -1 lines changed
docs/admin/search.mdx

Lines changed: 10 additions & 1 deletion
@@ -67,7 +67,16 @@ will not return any result.
 
 ## Indexed search
 
-Sourcegraph indexes the code on the default branch of each repository. This speeds up searches that hit many repositories at once. Not all files in a repository branch are indexed, we skip files that are [larger than 1 MB](#maximum-file-size) and binary files. To view which files are skipped during indexing, visit the repository settings page and click on indexing.
+Sourcegraph indexes the code on the default branch of each repository. This speeds up searches that hit many repositories at once. Not all files in a repository branch are indexed. We skip:
+
+- Files that are [larger than 1 MB](#maximum-file-size).
+- Binary files.
+- Files exceeding 20,000 unique trigrams (sequences of three characters).
+- Files that are not valid UTF-8.
+
+To view which files are skipped during indexing, visit the repository settings page and click on **Indexing**.
+
+To force the indexer to include specific files (like `yarn.lock` or other large text files) that are otherwise skipped, add their file path or a glob pattern to the [`search.largeFiles`](/admin/config/site_config#search-largeFiles) setting in your site configuration and reindex the repository. Note that files must still be valid UTF-8 to be indexed, even if added to `search.largeFiles`.
 
 For large deployments we recommend horizontally scaling indexed search. You can do this by [adjusting the number of replicas](https://github.com/sourcegraph/deploy-sourcegraph/blob/master/docs/configure#configure-indexed-search-replica-count). Sourcegraph shards repository indexes across replicas. When the replica count changes Sourcegraph will slowly rebalance indexes to ensure availability of existing indexes.
 
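
The skip rules documented in this change can be illustrated with a rough Python sketch. This is not Sourcegraph's indexer code. The function names, the NUL-byte test for "binary", the reading of "1 MB" as 1 MiB, and the use of `fnmatch` for glob matching are all assumptions made for illustration. The sketch also assumes that a `search.largeFiles` match waives only the size and trigram limits; the documentation states only that the UTF-8 requirement still applies.

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence

MAX_FILE_SIZE_BYTES = 1 << 20  # assumption: "1 MB" interpreted as 1 MiB
MAX_UNIQUE_TRIGRAMS = 20_000   # limit stated in the documentation change


def unique_trigram_count(text: str) -> int:
    """Count distinct sequences of three consecutive characters."""
    return len({text[i:i + 3] for i in range(len(text) - 2)})


def skip_reason(
    path: str,
    content: bytes,
    large_files: Sequence[str] = (),  # hypothetical stand-in for search.largeFiles
) -> Optional[str]:
    """Return why a file would be skipped, or None if it would be indexed."""
    # Assumption: a NUL byte marks a file as binary.
    if b"\x00" in content:
        return "binary"

    # Invalid UTF-8 is always skipped, even for allow-listed paths.
    try:
        text = content.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"

    # Assumption: an allow-list match waives the size and trigram limits.
    # fnmatch is a stand-in; the real glob syntax may differ.
    if any(fnmatchcase(path, pattern) for pattern in large_files):
        return None

    if len(content) > MAX_FILE_SIZE_BYTES:
        return "larger than 1 MB"
    if unique_trigram_count(text) > MAX_UNIQUE_TRIGRAMS:
        return "too many unique trigrams"
    return None
```

For example, under these assumptions `skip_reason("yarn.lock", data)` reports a size skip for a large lockfile, while `skip_reason("yarn.lock", data, ["yarn.lock"])` returns `None` as long as `data` is valid UTF-8.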
