This repository was archived by the owner on Aug 16, 2022. It is now read-only.

Commit dc77c72

Merge pull request #20 from opendistro/master
merge
2 parents 47ddf27 + 6db6437 commit dc77c72

34 files changed, with 693 additions and 159 deletions

Gemfile

Lines changed: 8 additions & 7 deletions
@@ -8,21 +8,22 @@ source "https://rubygems.org"
 #
 # This will help ensure the proper Jekyll version is running.
 # Happy Jekylling!
-gem "jekyll", "~> 3.8.5"
+# gem "jekyll", "~> 3.9.0"
 
 # This is the default theme for new Jekyll sites. You may change this to anything you like.
 gem "just-the-docs", "~> 0.3.3"
 
 # If you want to use GitHub Pages, remove the "gem "jekyll"" above and
 # uncomment the line below. To upgrade, run `bundle update github-pages`.
-# gem "github-pages", group: :jekyll_plugins
+
+gem 'github-pages', group: :jekyll_plugins
 
 # If you have any plugins, put them here!
-group :jekyll_plugins do
-# gem "jekyll-feed", "~> 0.6"
-gem "jekyll-remote-theme"
-gem "jekyll-redirect-from"
-end
+# group :jekyll_plugins do
+# # gem "jekyll-feed", "~> 0.6"
+# gem "jekyll-remote-theme"
+# gem "jekyll-redirect-from"
+# end
 
 # Windows does not include zoneinfo files, so bundle the tzinfo-data gem
 gem "tzinfo-data", platforms: [:mingw, :mswin, :x64_mingw, :jruby]

README.md

Lines changed: 5 additions & 1 deletion
@@ -201,10 +201,14 @@ If you're making major changes to the documentation and need to see the rendered
 
 Use `curl -XGET https://localhost:9200 -u admin:admin -k` to verify the Elasticsearch version.
 
-1. Update the plugin compatibility table in `docs/install/plugin.md` and `docs/kibana/plugins.md`.
+1. Update the plugin compatibility table in `docs/install/plugin.md`.
 
 Use `curl -XGET https://localhost:9200/_cat/plugins -u admin:admin -k` to get the correct version strings.
 
+1. Update the plugin compatibility table in `docs/kibana/plugins.md`.
+
+Use `docker ps` to find the ID for the Kibana node. Then use `docker exec -it <kibana-node-id> /bin/bash` to get shell access. Finally, run `./bin/kibana-plugin list` to get the plugins and version strings.
+
 1. Run a build (`build.sh`), and look for any warnings or errors you introduced.
 1. Verify that the individual plugin download links in `docs/install/plugins.md` and `docs/kibana/plugins.md` work.
 1. Check for any other bad links (`check-links.sh`). Expect a few false positives for the `localhost` links.

_config.yml

Lines changed: 2 additions & 2 deletions
@@ -20,8 +20,8 @@ baseurl: "/for-elasticsearch-docs" # the subpath of your site, e.g. /blog
 url: "https://opendistro.github.io" # the base hostname & protocol for your site, e.g. http://example.com
 permalink: pretty
 
-odfe_version: 1.12.0
-es_version: 7.10.0
+odfe_version: 1.13.0
+es_version: 7.10.2
 
 # Build settings
 markdown: kramdown

docs/async/index.md

Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
1+
---
2+
layout: default
3+
title: Asynchronous Search
4+
nav_order: 51
5+
has_children: true
6+
---
7+
8+
# Asynchronous Search
9+
10+
Searching large volumes of data can take a long time, especially if you're searching across warm nodes or multiple remote clusters.
11+
12+
Asynchronous search lets you run search requests in the background. You can monitor the progress of these searches and get partial results as they become available. After the search finishes, you can save the results to examine later.
13+
14+
## REST API
15+
16+
To perform an asynchronous search, send requests to `_opendistro/_asynchronous_search`, with your query in the request body:
17+
18+
```json
19+
POST _opendistro/_asynchronous_search
20+
```
21+
22+
You can specify the following options.
23+
24+
Options | Description | Default value | Required
25+
:--- | :--- | :--- | :---
26+
`wait_for_completion_timeout` | Specifies how long the request waits for results before returning. Any results available within this time are returned just like in a normal search. You can poll for the remaining results using the search ID. The maximum value is 300 seconds. | 1 second | No
27+
`keep_on_completion` | Specifies whether you want to save the results in the cluster after the search is complete. You can examine the stored results at a later time. | `false` | No
28+
`keep_alive` | Specifies the amount of time that the result is saved in the cluster. For example, `2d` means that the results are stored in the cluster for 48 hours. The saved search results are deleted after this period or if the search is cancelled. Note that this includes the query execution time. If the query overruns this time, the process cancels this query automatically. | 12 hours | No
29+
30+
#### Sample request
31+
32+
```json
33+
POST _opendistro/_asynchronous_search/?pretty&size=10&wait_for_completion_timeout=1ms&keep_on_completion=true&request_cache=false
34+
{
35+
"aggs": {
36+
"city": {
37+
"terms": {
38+
"field": "city",
39+
"size": 10
40+
}
41+
}
42+
}
43+
}
44+
```
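The options above are passed as query-string parameters alongside the query body. As a minimal sketch, the following Python builds such a request without sending it. The `build_submit_request` helper is hypothetical, and the base URL assumes the local `https://localhost:9200` endpoint used in the README; adjust both to your own client and cluster.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint: the README's curl examples use https://localhost:9200.
BASE_URL = "https://localhost:9200"


def build_submit_request(query, **options):
    """Return (method, url, body) for an asynchronous search submission.

    Hypothetical helper for illustration; it only assembles the request.
    """
    # Booleans are written in lowercase ("true"/"false") in the query string.
    params = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in options.items()
    }
    url = f"{BASE_URL}/_opendistro/_asynchronous_search"
    if params:
        url += "?" + urlencode(params)
    return "POST", url, json.dumps(query)


method, url, body = build_submit_request(
    {"aggs": {"city": {"terms": {"field": "city", "size": 10}}}},
    wait_for_completion_timeout="1ms",
    keep_on_completion=True,
    keep_alive="2d",
)
print(method, url)
```

Keeping request construction separate from sending makes it easy to inspect the exact parameters before you hand them to whatever HTTP client you use.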
45+
46+
#### Sample response
47+
48+
```json
49+
{
50+
"id": "FklfVlU4eFdIUTh1Q1hyM3ZnT19fUVEUd29KLWZYUUI3TzRpdU5wMjRYOHgAAAAAAAAABg==",
51+
"state": "RUNNING",
52+
"start_time_in_millis": 1599833301297,
53+
"expiration_time_in_millis": 1600265301297,
54+
"response": {
55+
"took": 15,
56+
"timed_out": false,
57+
"terminated_early": false,
58+
"num_reduce_phases": 4,
59+
"_shards": {
60+
"total": 21,
61+
"successful": 4,
62+
"skipped": 0,
63+
"failed": 0
64+
},
65+
"hits": {
66+
"total": {
67+
"value": 807,
68+
"relation": "eq"
69+
},
70+
"max_score": null,
71+
"hits": []
72+
},
73+
"aggregations": {
74+
"city": {
75+
"doc_count_error_upper_bound": 16,
76+
"sum_other_doc_count": 403,
77+
"buckets": [
78+
{
79+
"key": "downsville",
80+
"doc_count": 1
81+
},
82+
....
83+
....
84+
....
85+
{
86+
"key": "blairstown",
87+
"doc_count": 1
88+
}
89+
]
90+
}
91+
}
92+
}
93+
}
94+
```
95+
96+
#### Response parameters
97+
98+
Field | Description
99+
:--- | :---
100+
`id` | The ID of an asynchronous search. Use this ID to monitor the progress of the search, get its partial results, and/or delete the results. If the asynchronous search finishes within the timeout period, the response doesn't include the ID because the results aren't stored in the cluster.
101+
`state` | Specifies whether the search is still running or if it has finished, and if the results persist in the cluster. The possible states are `RUNNING`, `COMPLETED`, and `PERSISTED`.
102+
`start_time_in_millis` | The start time in milliseconds.
103+
`expiration_time_in_millis` | The expiration time in milliseconds.
104+
`took` | The total time, in milliseconds, that the search has been running.
105+
`response` | The actual search response.
106+
`num_reduce_phases` | The number of times that the coordinating node aggregates results from batches of shard responses (5 by default). If this number increases compared to the last retrieved results, you can expect additional results to be included in the search response.
107+
`total` | The total number of shards that run the search.
108+
`successful` | The number of shard responses that the coordinating node received successfully.
109+
`aggregations` | The partial aggregation results that have been completed by the shards so far.
110+
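To show how these fields fit together, here is a small Python sketch that condenses a response into a progress summary. The `summarize` function is a hypothetical helper, and the share of successful shards is only a rough indicator of progress, not a guarantee about how much of the final result is available.

```python
def summarize(response):
    """Condense an asynchronous search response into a small progress dict.

    Hypothetical helper; it reads only fields described in the table above.
    """
    search = response.get("response", {})
    shards = search.get("_shards", {})
    total = shards.get("total", 0)
    successful = shards.get("successful", 0)
    return {
        "id": response.get("id"),  # may be absent, as noted for the `id` field
        "state": response.get("state"),
        "shards_done": f"{successful}/{total}",
        "percent_done": round(100 * successful / total, 1) if total else None,
        "reduce_phases": search.get("num_reduce_phases"),
    }


# Values trimmed from the sample response above.
sample = {
    "state": "RUNNING",
    "response": {
        "took": 15,
        "num_reduce_phases": 4,
        "_shards": {"total": 21, "successful": 4, "skipped": 0, "failed": 0},
    },
}
print(summarize(sample))
```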
111+
## Get partial results
112+
113+
After you submit an asynchronous search request, you can request partial responses with the ID that you see in the asynchronous search response.
114+
115+
```json
116+
GET _opendistro/_asynchronous_search/<ID>?pretty
117+
```
118+
119+
#### Sample response
120+
121+
```json
122+
{
123+
"id": "Fk9lQk5aWHJIUUltR2xGWnpVcWtFdVEURUN1SWZYUUJBVkFVMEJCTUlZUUoAAAAAAAAAAg==",
124+
"state": "STORE_RESIDENT",
125+
"start_time_in_millis": 1599833907465,
126+
"expiration_time_in_millis": 1600265907465,
127+
"response": {
128+
"took": 83,
129+
"timed_out": false,
130+
"_shards": {
131+
"total": 20,
132+
"successful": 20,
133+
"skipped": 0,
134+
"failed": 0
135+
},
136+
"hits": {
137+
"total": {
138+
"value": 1000,
139+
"relation": "eq"
140+
},
141+
"max_score": 1,
142+
"hits": [
143+
{
144+
"_index": "bank",
145+
"_type": "_doc",
146+
"_id": "1",
147+
"_score": 1,
148+
"_source": {
149+
"email": "[email protected]",
150+
"city": "Brogan",
151+
"state": "IL"
152+
}
153+
},
154+
{....}
155+
]
156+
},
157+
"aggregations": {
158+
"city": {
159+
"doc_count_error_upper_bound": 0,
160+
"sum_other_doc_count": 997,
161+
"buckets": [
162+
{
163+
"key": "belvoir",
164+
"doc_count": 2
165+
},
166+
{
167+
"key": "aberdeen",
168+
"doc_count": 1
169+
},
170+
{
171+
"key": "abiquiu",
172+
"doc_count": 1
173+
}
174+
]
175+
}
176+
}
177+
}
178+
}
179+
```
180+
181+
After the response is successfully persisted, you get back the `STORE_RESIDENT` state in the response.
182+
183+
You can poll the ID with the `wait_for_completion_timeout` parameter to wait up to the specified time for additional results before the request returns.
184+
185+
For asynchronous searches with `keep_on_completion` set to `true` and a sufficiently long `keep_alive` time, you can keep polling the IDs until the search finishes. If you don’t want to periodically poll each ID, you can retain the results in your cluster with the `keep_alive` parameter and come back to them at a later time.
186+
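The polling pattern described above can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the plugin: `poll_until_done` and its `fetch` argument are hypothetical, with `fetch` standing in for whatever HTTP client returns the parsed JSON of `GET _opendistro/_asynchronous_search/<ID>`.

```python
import time


def poll_until_done(fetch, search_id, interval_seconds=1.0, max_polls=10):
    """Poll an asynchronous search until it leaves the RUNNING state.

    `fetch` is a hypothetical callable that takes the search ID and returns
    the parsed JSON response for that search.
    """
    result = None
    for attempt in range(max_polls):
        result = fetch(search_id)
        if result.get("state") != "RUNNING":
            return result
        if attempt < max_polls - 1:
            time.sleep(interval_seconds)
    # Still running after max_polls; the caller decides what to do next.
    return result


# Stand-in for a real HTTP client: two partial responses, then a final one.
_responses = iter([
    {"state": "RUNNING"},
    {"state": "RUNNING"},
    {"state": "STORE_RESIDENT"},
])
final = poll_until_done(lambda _id: next(_responses), "example-id", interval_seconds=0)
print(final["state"])
```

Bounding the loop with `max_polls` keeps a long-running search from blocking the caller indefinitely; with a long enough `keep_alive`, you can stop polling and fetch the stored results later.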
187+
## Delete searches and results
188+
189+
You can use the DELETE API operation to delete any ongoing asynchronous search by its ID. If the search is still running, it’s canceled. If the search is complete, the saved search results are deleted.
190+
191+
```json
192+
DELETE _opendistro/_asynchronous_search/<ID>?pretty
193+
```
194+
195+
#### Sample response
196+
197+
```json
198+
{
199+
"acknowledged": "true"
200+
}
201+
```
202+
203+
## Monitor stats
204+
205+
You can use the stats API operation to monitor asynchronous searches that are running, completed, and/or persisted.
206+
207+
```json
208+
GET _opendistro/_asynchronous_search/stats
209+
```
210+
211+
#### Sample response
212+
213+
```json
214+
{
215+
"_nodes": {
216+
"total": 8,
217+
"successful": 8,
218+
"failed": 0
219+
},
220+
"cluster_name": "264071961897:asynchronous-search",
221+
"nodes": {
222+
"JKEFl6pdRC-xNkKQauy7Yg": {
223+
"asynchronous_search_stats": {
224+
"submitted": 18236,
225+
"initialized": 112,
226+
"search_failed": 56,
227+
"search_completed": 56,
228+
"rejected": 18124,
229+
"persist_failed": 0,
230+
"cancelled": 1,
231+
"running_current": 399,
232+
"persisted": 100
233+
}
234+
}
235+
}
236+
}
237+
```
238+
239+
#### Response parameters
240+
241+
Field | Description
242+
:--- | :---
243+
`submitted` | The number of asynchronous search requests that were submitted.
244+
`initialized` | The number of asynchronous search requests that were initialized.
245+
`rejected` | The number of asynchronous search requests that were rejected.
246+
`search_completed` | The number of asynchronous search requests that completed with a successful response.
247+
`search_failed` | The number of asynchronous search requests that completed with a failed response.
248+
`persisted` | The number of asynchronous search requests whose final result successfully persisted in the cluster.
249+
`persist_failed` | The number of asynchronous search requests whose final result failed to persist in the cluster.
250+
`running_current` | The number of asynchronous search requests that are running on a given coordinator node.
251+
`cancelled` | The number of asynchronous search requests that were canceled while the search was running.
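Because the stats response reports counters per node, a cluster-wide view requires summing them. The Python sketch below shows one way to do that. The `total_stats` helper and the second node entry are hypothetical and included only to make the summation visible; the first node's values come from the sample response above.

```python
from collections import Counter


def total_stats(stats_response):
    """Sum asynchronous_search_stats counters over every node in the response."""
    totals = Counter()
    for node in stats_response.get("nodes", {}).values():
        totals.update(node.get("asynchronous_search_stats", {}))
    return dict(totals)


sample = {
    "nodes": {
        "JKEFl6pdRC-xNkKQauy7Yg": {
            "asynchronous_search_stats": {
                "submitted": 18236, "rejected": 18124, "running_current": 399,
            }
        },
        # Illustrative values only, not taken from a real cluster.
        "hypothetical-second-node": {
            "asynchronous_search_stats": {
                "submitted": 10, "rejected": 0, "running_current": 1,
            }
        },
    }
}
totals = total_stats(sample)
print(totals["submitted"], totals["rejected"], totals["running_current"])
```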

docs/async/security.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
1+
---
2+
layout: default
3+
title: Asynchronous Search Security
4+
nav_order: 2
5+
parent: Asynchronous Search
6+
has_children: false
7+
---
8+
9+
# Asynchronous search security
10+
11+
You can use the security plugin with asynchronous searches to limit non-admin users to specific actions. For example, you might want some users to only be able to submit or delete asynchronous searches, while you might want others to only view the results.
12+
13+
All asynchronous search indices are protected as system indices. Only a super admin user or an admin user with a Transport Layer Security (TLS) certificate can access system indices. For more information, see [System indices](../../security/configuration/system-indices/).
14+
15+
## Basic permissions
16+
17+
As an admin user, you can use the security plugin to assign specific permissions to users based on which API operations they need access to. For a list of supported API operations, see [Asynchronous search](../).
18+
19+
The security plugin has two built-in roles that cover most asynchronous search use cases: `asynchronous_search_full_access` and `asynchronous_search_read_access`. For descriptions of each, see [Predefined roles](../../security/access-control/users-roles/#predefined-roles).
20+
21+
If these roles don’t meet your needs, mix and match individual asynchronous search permissions to suit your use case. Each action corresponds to an operation in the REST API. For example, the `cluster:admin/opendistro/asynchronous_search/delete` permission lets you delete a previously submitted asynchronous search.
22+
23+
## (Advanced) Limit access by backend role
24+
25+
Use backend roles to configure fine-grained access to asynchronous searches based on roles. For example, users of different departments in an organization can view asynchronous searches owned by their own department.
26+
27+
First, make sure that your users have the appropriate [backend roles](../../security/access-control/). Backend roles usually come from an [LDAP server](../../security/configuration/ldap/) or [SAML provider](../../security/configuration/saml/). However, if you use the internal user database, you can use the REST API to [add them manually](../../security/access-control/api/#create-user).
28+
29+
Now when users view asynchronous search resources in Kibana (or make REST API calls), they only see asynchronous searches submitted by users whose backend roles are a subset of their own.
30+
For example, consider two users: `judy` and `elon`.
31+
32+
`judy` has an IT backend role:
33+
34+
```json
35+
PUT _opendistro/_security/api/internalusers/judy
36+
{
37+
"password": "judy",
38+
"backend_roles": [
39+
"IT"
40+
],
41+
"attributes": {}
42+
}
43+
```
44+
45+
`elon` has an admin backend role:
46+
47+
```json
48+
PUT _opendistro/_security/api/internalusers/elon
49+
{
50+
"password": "elon",
51+
"backend_roles": [
52+
"admin"
53+
],
54+
"attributes": {}
55+
}
56+
```
57+
58+
Both `judy` and `elon` have full access to asynchronous search:
59+
60+
```json
61+
PUT _opendistro/_security/api/rolesmapping/async_full_access
62+
{
63+
"backend_roles": [],
64+
"hosts": [],
65+
"users": [
66+
"judy",
67+
"elon"
68+
]
69+
}
70+
```
71+
72+
Because they have different backend roles, an asynchronous search submitted by `judy` will not be visible to `elon` and vice versa.
73+
74+
To see `elon`'s asynchronous searches, `judy` must have at least every backend role that `elon` has. In other words, `judy`'s backend roles must be a superset of `elon`'s.
75+
76+
For example, if `judy` has five backend roles and `elon` has only one of these roles, then `judy` can see asynchronous searches submitted by `elon`, but `elon` can’t see the asynchronous searches submitted by `judy`. This means that `judy` can perform GET and DELETE operations on asynchronous searches that are submitted by `elon`, but not the reverse.
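The superset rule can be modeled as a set comparison. The Python sketch below only illustrates the rule as described here and is not the security plugin's implementation. The `can_access` function and the role names other than `IT` and `admin` are hypothetical, and the text doesn't cover how users with no backend roles are treated, so that case is left out.

```python
def can_access(viewer_roles, owner_roles):
    """Return True if the viewer holds every backend role that the owner holds."""
    return set(owner_roles) <= set(viewer_roles)


judy = {"IT", "HR", "finance", "ops", "dev"}  # five hypothetical backend roles
elon = {"IT"}  # only one of judy's roles

print(can_access(judy, elon))  # judy can see elon's searches
print(can_access(elon, judy))  # elon cannot see judy's searches

# The earlier setup, where the two users have disjoint backend roles:
print(can_access({"IT"}, {"admin"}), can_access({"admin"}, {"IT"}))
```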
