2 changes: 1 addition & 1 deletion site/content/3.12/aql/high-level-operations/for.md
Original file line number Diff line number Diff line change
@@ -93,7 +93,7 @@ Also see [Combining queries with subqueries](../fundamentals/subqueries.md).
## Options

For collections and Views, the `FOR` construct supports an optional `OPTIONS`
clause to modify behavior. The general syntax is:
clause to modify the behavior. The general syntax is as follows:

<pre><code>FOR <em>variableName</em> IN <em>expression</em> OPTIONS { <em>option</em>: <em>value</em>, <em>...</em> }</code></pre>
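
As a minimal sketch of the syntax above, the following query passes the `indexHint` option to a `FOR` loop over a collection. The collection name `users`, the attribute `name`, and the index name `byName` are hypothetical and only serve as an illustration:

```aql
// Assumes a collection "users" with a persistent index named "byName"
// (both names are made up for this example)
FOR doc IN users OPTIONS { indexHint: "byName" }
  FILTER doc.name == "Carol"
  RETURN doc
```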

103 changes: 74 additions & 29 deletions site/content/3.12/aql/high-level-operations/search.md
@@ -237,7 +237,7 @@ You can use the special `includeAllFields`
[`arangosearch` View property](../../index-and-search/arangosearch/arangosearch-views-reference.md#link-properties)
to index all (sub-)attributes of the source documents if desired.

## SEARCH with SORT
## `SEARCH` with `SORT`

The documents emitted from a View can be sorted by attribute values with the
standard [SORT() operation](sort.md), using one or multiple
@@ -283,38 +283,83 @@ a score of `0` will be returned for all documents.

## Search Options

The `SEARCH` operation accepts an options object with the following attributes:

- `collections` (array, _optional_): array of strings with collection names to
restrict the search to certain source collections
- `conditionOptimization` (string, _optional_): controls how search criteria
get optimized. Possible values:
- `"auto"` (default): convert conditions to disjunctive normal form (DNF) and
apply optimizations. Removes redundant or overlapping conditions, but can
take quite some time even for a low number of nested conditions.
- `"none"`: search the index without optimizing the conditions.
<!-- Internal only: nodnf, noneg -->
- `countApproximate` (string, _optional_): controls how the total count of rows
is calculated if the `fullCount` option is enabled for a query or when
a `COLLECT WITH COUNT` clause is executed
- `"exact"` (default): rows are actually enumerated for a precise count.
- `"cost"`: a cost-based approximation is used. Does not enumerate rows and
returns an approximate result with O(1) complexity. Gives a precise result
if the `SEARCH` condition is empty or if it contains a single term query
only (e.g. `SEARCH doc.field == "value"`), the usual eventual consistency
of Views aside.

**Examples**

Given a View with three linked collections `coll1`, `coll2` and `coll3` it is
possible to return documents from the first two collections only and ignore the
third using the `collections` option:
The `SEARCH` operation supports an optional `OPTIONS` clause to modify the
behavior. The general syntax is as follows:

<pre><code>SEARCH <em>expression</em> OPTIONS { <em>option</em>: <em>value</em>, <em>...</em> }</code></pre>

### `collections`

You can specify an array of strings with collection names to restrict the search
to certain source collections.

Given a View with three linked collections `coll1`, `coll2`, and `coll3`, you
can return documents from the first two collections only and ignore the third
collection by setting the `collections` option to `["coll1", "coll2"]`:

```aql
FOR doc IN viewName
SEARCH true OPTIONS { collections: ["coll1", "coll2"] }
RETURN doc
```

The search expression `true` matches all View documents. You can use any valid
expression here while limiting the scope to the chosen source collections.
The search expression `true` in the above example matches all View documents.
You can use any valid expression here while limiting the scope to the chosen
source collections.

### `conditionOptimization`

You can specify one of the following values for this option to control how
search criteria get optimized:

- `"auto"` (default): convert conditions to disjunctive normal form (DNF) and
apply optimizations. Removes redundant or overlapping conditions, but can
take quite some time even for a low number of nested conditions.
- `"none"`: search the index without optimizing the conditions.
<!-- Internal only: nodnf, noneg -->

See [Optimizing View and inverted index query performance](../../index-and-search/arangosearch/performance.md#condition-optimization-options)
for an example.
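
As a rough illustration, the following sketch disables the optimization for a query with two overlapping range conditions. The View name `viewName` and the numeric attribute `value` are assumptions made for this example:

```aql
// Assumes a View "viewName" that indexes a numeric attribute "value"
// (hypothetical names). With "none", the conditions are searched as
// written, without the DNF conversion that "auto" would apply.
FOR doc IN viewName
  SEARCH doc.value > 10 AND doc.value > 5
  OPTIONS { conditionOptimization: "none" }
  RETURN doc
```

Whether skipping the optimization pays off depends on the query. With `"auto"`, the redundant `doc.value > 5` condition could be removed, at the cost of the optimization step itself.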

### `countApproximate`

This option controls how the total count of rows is calculated if the `fullCount`
option is enabled for a query or when a `COLLECT WITH COUNT` clause is executed.
You can set it to one of the following values:

- `"exact"` (default): rows are actually enumerated for a precise count.
- `"cost"`: a cost-based approximation is used. Does not enumerate rows and
returns an approximate result with O(1) complexity. Gives a precise result
if the `SEARCH` condition is empty or if it contains a single term query
only (e.g. `SEARCH doc.field == "value"`), the usual eventual consistency
of Views aside.

See [Optimizing View and inverted index query performance](../../index-and-search/arangosearch/performance.md#count-approximation)
for an example.
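
The following sketch shows how the option might be combined with a `COLLECT WITH COUNT` clause. The View name `viewName`, the attribute `field`, and the variable name `total` are placeholders chosen for this example:

```aql
// Assumes a View "viewName" that indexes the attribute "field"
// (hypothetical names). The SEARCH condition is a single term query,
// so the cost-based count is expected to be precise here.
FOR doc IN viewName
  SEARCH doc.field == "value"
  OPTIONS { countApproximate: "cost" }
  COLLECT WITH COUNT INTO total
  RETURN total
```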

### `parallelism`

A `SEARCH` operation can optionally process index segments in parallel using
multiple threads. This can speed up search queries but increases CPU and memory
utilization.

If you omit the `parallelism` option, then the default parallelism as defined by
the [`--arangosearch.default-parallelism` startup option](../../components/arangodb-server/options.md#--arangosearchdefault-parallelism)
is used. If you set it to a value of `1`, the search execution is not
parallelized. If the value is greater than `1`, then up to that many worker
threads can be used for concurrently processing index segments. The maximum
total number of parallel execution threads is defined by the
[`--arangosearch.execution-threads-limit` startup option](../../components/arangodb-server/options.md#--arangosearchexecution-threads-limit),
which defaults to twice the number of CPU cores.

The `parallelism` option should be considered a hint. Not all search queries are
eligible for parallel processing. Queries also don't wait for the specified
number of threads to become available. They start immediately, even if only a
single thread is available, and may acquire more threads later.

```aql
FOR doc IN restaurantsView
SEARCH ANALYZER(GEO_INTERSECTS(rect, doc.geometry), "geojson")
OPTIONS { parallelism: 16 }
RETURN doc.geometry
```
10 changes: 10 additions & 0 deletions site/content/3.12/index-and-search/arangosearch/performance.md
@@ -675,3 +675,13 @@ db._createView("articlesView", "search-alias", { indexes: [
{ collection: "articles", index: "inv-idx" }
] });
```

## Parallel index segment processing

<small>Introduced in: v3.12.0</small>

You can speed up `SEARCH` queries against Views using the `parallelism` option
to process index segments using multiple threads.

See [`SEARCH` operation in AQL](../../aql/high-level-operations/search.md#parallelism)
for details.
Original file line number Diff line number Diff line change
@@ -142,13 +142,14 @@ produced no warnings.

#### Metrics API

The metrics endpoint includes the following new metrics about AQL queries and
ongoing dumps:
The metrics endpoint includes the following new metrics about AQL queries,
ongoing dumps, and ArangoSearch execution threads:

- `arangodb_aql_cursors_active`
- `arangodb_dump_memory_usage`
- `arangodb_dump_ongoing`
- `arangodb_dump_threads_blocked_total`
- `arangodb_search_execution_threads_demand`

---

20 changes: 20 additions & 0 deletions site/content/3.12/release-notes/version-3.12/whats-new-in-3-12.md
@@ -31,6 +31,26 @@ for examples.

This feature is only available in the Enterprise Edition.

### `SEARCH` parallelization

In search queries against Views, you can set the new `parallelism` option for
`SEARCH` operations to optionally process index segments in parallel using
multiple threads. This can speed up search queries.

The default value for the `parallelism` option is defined by the new
`--arangosearch.default-parallelism` startup option, which defaults to `1`
(no parallelization).

The new `--arangosearch.execution-threads-limit` startup option controls how
many threads can be used in total for search queries. The new
`arangodb_search_execution_threads_demand` metric reports the number of threads
that queries request. If it is below the configured thread limit, it coincides
with the number of active threads. If it exceeds the limit, some queries cannot
currently get the threads as requested and may have to use a single thread until
more become available.

See [`SEARCH` operation in AQL](../../aql/high-level-operations/search.md#parallelism)
for details.

## Analyzers

