diff --git a/site/content/3.12/aql/high-level-operations/for.md b/site/content/3.12/aql/high-level-operations/for.md index e0d4088ec3..d2c5e22ea7 100644 --- a/site/content/3.12/aql/high-level-operations/for.md +++ b/site/content/3.12/aql/high-level-operations/for.md @@ -93,7 +93,7 @@ Also see [Combining queries with subqueries](../fundamentals/subqueries.md). ## Options For collections and Views, the `FOR` construct supports an optional `OPTIONS` -clause to modify behavior. The general syntax is: +clause to modify the behavior. The general syntax is as follows:
FOR variableName IN expression OPTIONS { option: value, ... }
diff --git a/site/content/3.12/aql/high-level-operations/search.md b/site/content/3.12/aql/high-level-operations/search.md index 030c0bbab7..018a9c9799 100644 --- a/site/content/3.12/aql/high-level-operations/search.md +++ b/site/content/3.12/aql/high-level-operations/search.md @@ -237,7 +237,7 @@ You can use the special `includeAllFields` [`arangosearch` View property](../../index-and-search/arangosearch/arangosearch-views-reference.md#link-properties) to index all (sub-)attributes of the source documents if desired. -## SEARCH with SORT +## `SEARCH` with `SORT` The documents emitted from a View can be sorted by attribute values with the standard [SORT() operation](sort.md), using one or multiple @@ -283,32 +283,19 @@ a score of `0` will be returned for all documents. ## Search Options -The `SEARCH` operation accepts an options object with the following attributes: - -- `collections` (array, _optional_): array of strings with collection names to - restrict the search to certain source collections -- `conditionOptimization` (string, _optional_): controls how search criteria - get optimized. Possible values: - - `"auto"` (default): convert conditions to disjunctive normal form (DNF) and - apply optimizations. Removes redundant or overlapping conditions, but can - take quite some time even for a low number of nested conditions. - - `"none"`: search the index without optimizing the conditions. - -- `countApproximate` (string, _optional_): controls how the total count of rows - is calculated if the `fullCount` option is enabled for a query or when - a `COLLECT WITH COUNT` clause is executed - - `"exact"` (default): rows are actually enumerated for a precise count. - - `"cost"`: a cost-based approximation is used. Does not enumerate rows and - returns an approximate result with O(1) complexity. Gives a precise result - if the `SEARCH` condition is empty or if it contains a single term query - only (e.g. 
`SEARCH doc.field == "value"`), the usual eventual consistency - of Views aside. - -**Examples** - -Given a View with three linked collections `coll1`, `coll2` and `coll3` it is -possible to return documents from the first two collections only and ignore the -third using the `collections` option: +The `SEARCH` operation supports an optional `OPTIONS` clause to modify the +behavior. The general syntax is as follows: + +
SEARCH expression OPTIONS { option: value, ... }
+ +### `collections` + +You can specify an array of strings with collection names to restrict the search +to certain source collections. + +Given a View with three linked collections `coll1`, `coll2`, and `coll3`, you +can return documents from the first two collections only and ignore the third +collection by setting the `collections` option to `["coll1", "coll2"]`: ```aql FOR doc IN viewName @@ -316,5 +303,63 @@ FOR doc IN viewName RETURN doc ``` -The search expression `true` matches all View documents. You can use any valid -expression here while limiting the scope to the chosen source collections. +The search expression `true` in the above example matches all View documents. +You can use any valid expression here while limiting the scope to the chosen +source collections. + +### `conditionOptimization` + +You can specify one of the following values for this option to control how +search criteria get optimized: + +- `"auto"` (default): convert conditions to disjunctive normal form (DNF) and + apply optimizations. Removes redundant or overlapping conditions, but can + take quite some time even for a low number of nested conditions. +- `"none"`: search the index without optimizing the conditions. + + +See [Optimizing View and inverted index query performance](../../index-and-search/arangosearch/performance.md#condition-optimization-options) +for an example. + +### `countApproximate` + +This option controls how the total count of rows is calculated if the `fullCount` +option is enabled for a query or when a `COLLECT WITH COUNT` clause is executed. +You can set it to one of the following values: + +- `"exact"` (default): rows are actually enumerated for a precise count. +- `"cost"`: a cost-based approximation is used. Does not enumerate rows and + returns an approximate result with O(1) complexity. Gives a precise result + if the `SEARCH` condition is empty or if it contains a single term query + only (e.g. 
`SEARCH doc.field == "value"`), the usual eventual consistency + of Views aside. + +See [Optimizing View and inverted index query performance](../../index-and-search/arangosearch/performance.md#count-approximation) +for an example. + +### `parallelism` + +A `SEARCH` operation can optionally process index segments in parallel using +multiple threads. This can speed up search queries but increases CPU and memory +utilization. + +If you omit the `parallelism` option, then the default parallelism as defined by +the [`--arangosearch.default-parallelism` startup option](../../components/arangodb-server/options.md#--arangosearchdefault-parallelism) +is used. If you set it to a value of `1`, the search execution is not +parallelized. If the value is greater than `1`, then up to that many worker +threads can be used for concurrently processing index segments. The maximum +number of total parallel execution threads is defined by the +[`--arangosearch.execution-threads-limit` startup option](../../components/arangodb-server/options.md#--arangosearchexecution-threads-limit) +that defaults to twice the number of CPU cores. + +The `parallelism` option should be considered a hint. Not all search queries are +eligible. Queries also don't wait for the specified number of threads to be +available. They start immediately even if only single-threaded and may acquire +more threads later. 
+ +```aql +FOR doc IN restaurantsView + SEARCH ANALYZER(GEO_INTERSECTS(rect, doc.geometry), "geojson") + OPTIONS { parallelism: 16 } + RETURN doc.geometry +``` diff --git a/site/content/3.12/index-and-search/arangosearch/performance.md b/site/content/3.12/index-and-search/arangosearch/performance.md index 7858925cdc..f5edc4120f 100644 --- a/site/content/3.12/index-and-search/arangosearch/performance.md +++ b/site/content/3.12/index-and-search/arangosearch/performance.md @@ -675,3 +675,13 @@ db._createView("articlesView", "search-alias", { indexes: [ { collection: "articles", index: "inv-idx" } ] }); ``` + +## Parallel index segment processing + +Introduced in: v3.12.0 + +You can speed up `SEARCH` queries against Views using the `parallelism` option +to process index segments using multiple threads. + +See [`SEARCH` operation in AQL](../../aql/high-level-operations/search.md#parallelism) +for details. diff --git a/site/content/3.12/release-notes/version-3.12/api-changes-in-3-12.md b/site/content/3.12/release-notes/version-3.12/api-changes-in-3-12.md index 418a0d0c1e..2ae11ed92a 100644 --- a/site/content/3.12/release-notes/version-3.12/api-changes-in-3-12.md +++ b/site/content/3.12/release-notes/version-3.12/api-changes-in-3-12.md @@ -142,13 +142,14 @@ produced no warnings. 
#### Metrics API -The metrics endpoint includes the following new metrics about AQL queries and -ongoing dumps: +The metrics endpoint includes the following new metrics about AQL queries, +ongoing dumps, and ArangoSearch execution threads: - `arangodb_aql_cursors_active` - `arangodb_dump_memory_usage` - `arangodb_dump_ongoing` - `arangodb_dump_threads_blocked_total` +- `arangodb_search_execution_threads_demand` --- diff --git a/site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md b/site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md index 1f72e5a3db..93a9a7fdad 100644 --- a/site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md +++ b/site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md @@ -31,6 +31,26 @@ for examples. This feature is only available in the Enterprise Edition. +### `SEARCH` parallelization + +In search queries against Views, you can set the new `parallelism` option for +`SEARCH` operations to optionally process index segments in parallel using +multiple threads. This can speed up search queries. + +The default value for the `parallelism` option is defined by the new +`--arangosearch.default-parallelism` startup option that defaults to `1`. + +The new `--arangosearch.execution-threads-limit` startup option controls how +many threads can be used in total for search queries. The new +`arangodb_search_execution_threads_demand` metric reports the number of threads +that queries request. If it is below the configured thread limit, it coincides +with the number of active threads. If it exceeds the limit, some queries cannot +currently get the threads as requested and may have to use a single thread until +more become available. + +See [`SEARCH` operation in AQL](../../aql/high-level-operations/search.md#parallelism) +for details. + ## Analyzers