Commit 2f5be65

Merge branch 'main' into dont-wipe-xpack
2 parents a058acd + 3f42a02 commit 2f5be65

26 files changed: +3396 -75 lines changed

docs/images/python-example.png

107 KB

docs/reference/querying.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
1+
# Querying
2+
3+
The Python Elasticsearch client provides several ways to send queries to Elasticsearch. This document explains how to construct and execute queries with the client. It does not cover the DSL module.
4+
5+
## From API URLs to function calls
6+
7+
Elasticsearch APIs are grouped by namespaces.
8+
9+
* There's the global namespace, with APIs like the Search API (`GET /_search`) or the Index API (`PUT /<target>/_doc/<_id>` and related endpoints).
10+
* Then there are all the other namespaces, such as:
11+
  * Indices with APIs like the Create index API (`PUT /my-index`),
12+
  * ES|QL with the Run an ES|QL query API (`POST /_query`),
13+
  * and so on.
14+
15+
As a result, when you know which namespace and function you need, you can call the function. Assuming that `client` is an Elasticsearch instance, here is how you would call the examples from above:
16+
17+
* Global namespace: `client.search(...)` and `client.index(...)`
18+
* Other namespaces:
19+
  * Indices: `client.indices.create(...)`
20+
  * ES|QL: `client.esql.query(...)`
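
Because each namespace is an attribute of the client, a dotted name such as `indices.create` can be resolved with plain attribute lookups. The sketch below illustrates the idea; `resolve_api` is a hypothetical helper written for this example and is not part of the client.

```python
from functools import reduce


def resolve_api(client, dotted_name):
    """Return the client method named by a dotted path.

    Hypothetical helper for illustration only: it walks the attributes of
    `client`, so "indices.create" resolves to `client.indices.create` and
    "search" resolves to `client.search`.
    """
    return reduce(getattr, dotted_name.split("."), client)
```

Assuming `client` is an Elasticsearch instance, `resolve_api(client, "indices.create")(index="my-index")` would be the same call as `client.indices.create(index="my-index")`.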
21+
22+
How can you figure out the namespace?
23+
24+
* The [Elasticsearch API docs](https://www.elastic.co/docs/api/doc/elasticsearch/) can help, even though the tags they use do not fully map to namespaces.
25+
* You can also use the client documentation, by:
26+
  * browsing the [Elasticsearch API Reference](https://elasticsearch-py.readthedocs.io/en/stable/api.html) page, or
27+
  * searching for your endpoint using [Read the Docs](https://elasticsearch-py.readthedocs.io/) search, which is powered by Elasticsearch!
28+
* Finally, for Elasticsearch 8.x, most examples in the [Elasticsearch guide](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html) are also available in Python. (This is still a work in progress for Elasticsearch 9.x.) In the example below, `client.ingest.put_pipeline(...)` is the function that calls the "Create or update a pipeline" API.
29+
30+
31+
:::{image} ../images/python-example.png
32+
:alt: Python code example in the Elasticsearch guide
33+
:::
34+
35+
## Parameters
36+
37+
Now that you know which functions to call, the next step is parameters. To avoid ambiguity, the Python Elasticsearch client mandates keyword arguments. To give an example, let's look at the ["Create an index" API](https://elasticsearch-py.readthedocs.io/en/stable/api/indices.html#elasticsearch.client.IndicesClient.create). There's only one required parameter, `index`, so the minimal form looks like this:
38+
39+
```python
40+
from elasticsearch import Elasticsearch
41+
42+
client = Elasticsearch("http://localhost:9200", api_key="...")
43+
44+
client.indices.create(index="my-index")
45+
```
46+
47+
You can also use other parameters, including the first level of body parameters, such as:
48+
49+
```python
50+
resp = client.indices.create(
51+
    index="logs",
52+
    aliases={"logs-alias": {}},
53+
    mappings={"properties": {"name": {"type": "text"}}},
54+
)
55+
print(resp)
56+
```
57+
58+
In this case, the client sends the following request to Elasticsearch:
59+
60+
```console
61+
PUT /logs
62+
{
63+
  "aliases": {"logs-alias": {}},
64+
  "mappings": {"properties": {"name": {"type": "text"}}}
65+
}
66+
```
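
To make the mapping from keyword arguments to the request concrete, the sketch below mimics it in plain Python. `build_create_index_request` is a hypothetical stand-in written for this page, not the client's actual implementation. It only assumes that `index` goes into the URL path and that every other keyword argument becomes a top-level field of the JSON body.

```python
import json


def build_create_index_request(*, index, **body_fields):
    """Hypothetical illustration: return (method, path, JSON body).

    `index` becomes part of the path; every other keyword argument that is
    not None becomes a top-level key of the JSON body.
    """
    body = {key: value for key, value in body_fields.items() if value is not None}
    return "PUT", f"/{index}", json.dumps(body)


method, path, body = build_create_index_request(
    index="logs",
    aliases={"logs-alias": {}},
    settings={"number_of_shards": 1},  # illustrative value
)
print(method, path)
print(body)
```

Keyword-only arguments (the bare `*` in the signature) mirror the client's rule that parameters must be passed by name.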
67+
68+
## Unknown parameters or APIs
69+
70+
Like other clients, the Python Elasticsearch client is generated from the [Elasticsearch specification](https://github.com/elastic/elasticsearch-specification). While we strive to keep it up to date, it is not (yet!) perfect, and sometimes body parameters are missing. In this case, you can specify the body directly, as follows:
71+
72+
```python
73+
resp = client.indices.create(
74+
index="logs",
75+
body={
76+
"aliases": {"logs-alias": {}},
77+
"mappings": {"properties": {"name": {"type": "text"}}},
78+
"missing_parameter": "foo",
79+
}
80+
)
81+
print(resp)
82+
```
83+
84+
If an API is missing, you need to use the low-level `perform_request` function:
85+
86+
```python
87+
resp = client.perform_request(
88+
"PUT",
89+
"/logs"
90+
index="logs",
91+
headers={"content-type": "application/json", "accept": "application/json"},
92+
body={
93+
"aliases": {"logs-alias": {}},
94+
"mappings": {"properties": {"name": {"type": "text"}}},
95+
"missing_parameter": "foo",
96+
}
97+
)
98+
print(resp)
99+
```
100+
101+
One benefit of this function is that it lets you use arbitrary headers, such as the `es-security-runas-user` header used to [impersonate users](https://www.elastic.co/guide/en/elasticsearch/reference/current/run-as-privilege.html).
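
As an illustration of custom headers, the sketch below wraps a `perform_request` call in a function that adds the `es-security-runas-user` header. The function name `create_index_as` and its parameters are made up for this example, and it assumes the authenticated user is allowed to impersonate `run_as_user`.

```python
def create_index_as(client, run_as_user, index):
    """Create `index` on behalf of `run_as_user` (hypothetical helper).

    `client` is assumed to be an Elasticsearch instance whose credentials
    hold the run-as privilege for `run_as_user`.
    """
    return client.perform_request(
        "PUT",
        f"/{index}",
        headers={
            "content-type": "application/json",
            "accept": "application/json",
            # Ask Elasticsearch to run this request as another user.
            "es-security-runas-user": run_as_user,
        },
        body={"aliases": {f"{index}-alias": {}}},
    )
```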
102+
103+
104+
## Options
105+
106+
You can specify options such as request timeouts or retries using the `.options()` API. See the [Configuration](./configuration.md) page for details.
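
For instance, a per-request timeout and retry count could be set as in the sketch below. The helper name `search_with_options` and the chosen values are illustrative assumptions, and `client` is assumed to be an Elasticsearch instance.

```python
def search_with_options(client, index, query):
    """Run a search with a per-request timeout and retry count."""
    return client.options(
        request_timeout=10,  # seconds; illustrative value
        max_retries=3,  # illustrative value
    ).search(index=index, query=query)
```

`.options()` returns a new client object, so the settings apply only to calls made through it and the original `client` keeps its configuration.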

docs/reference/toc.yml

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@ toc:
44
- file: installation.md
55
- file: connecting.md
66
- file: configuration.md
7+
- file: querying.md
78
- file: async.md
89
- file: integrations.md
910
children:

elasticsearch/_async/client/__init__.py

Lines changed: 34 additions & 6 deletions
@@ -87,6 +87,8 @@
8787
_rewrite_parameters,
8888
_stability_warning,
8989
client_node_configs,
90+
is_requests_http_auth,
91+
is_requests_node_class,
9092
)
9193
from .watcher import WatcherClient
9294
from .xpack import XPackClient
@@ -178,6 +180,7 @@ def __init__(
178180
t.Callable[[t.Dict[str, t.Any], NodeConfig], t.Optional[NodeConfig]]
179181
] = None,
180182
meta_header: t.Union[DefaultType, bool] = DEFAULT,
183+
http_auth: t.Union[DefaultType, t.Any] = DEFAULT,
181184
# Internal use only
182185
_transport: t.Optional[AsyncTransport] = None,
183186
) -> None:
@@ -225,9 +228,26 @@ def __init__(
225228
sniff_callback = default_sniff_callback
226229

227230
        if _transport is None:
231+
            requests_session_auth = None
232+
            if http_auth is not None and http_auth is not DEFAULT:
233+
                if is_requests_http_auth(http_auth):
234+
                    # If we're using custom requests authentication
235+
                    # then we need to alert the user that they also
236+
                    # need to use 'node_class=requests'.
237+
                    if not is_requests_node_class(node_class):
238+
                        raise ValueError(
239+
                            "Using a custom 'requests.auth.AuthBase' class for "
240+
                            "'http_auth' must be used with node_class='requests'"
241+
                        )
242+
243+
                    # Reset 'http_auth' to DEFAULT so it's not consumed below.
244+
                    requests_session_auth = http_auth
245+
                    http_auth = DEFAULT
246+
228247
            node_configs = client_node_configs(
229248
                hosts,
230249
                cloud_id=cloud_id,
250+
                requests_session_auth=requests_session_auth,
231251
                connections_per_node=connections_per_node,
232252
                http_compress=http_compress,
233253
                verify_certs=verify_certs,
@@ -314,6 +334,7 @@ def __init__(
314334
self._headers["x-opaque-id"] = opaque_id
315335
self._headers = resolve_auth_headers(
316336
self._headers,
337+
http_auth=http_auth,
317338
api_key=api_key,
318339
basic_auth=basic_auth,
319340
bearer_auth=bearer_auth,
@@ -1468,7 +1489,7 @@ async def delete_by_query(
14681489
If the request can target data streams, this argument determines whether
14691490
wildcard expressions match hidden data streams. It supports comma-separated
14701491
values, such as `open,hidden`.
1471-
:param from_: Starting offset (default: 0)
1492+
:param from_: Skips the specified number of documents.
14721493
:param ignore_unavailable: If `false`, the request returns an error if it targets
14731494
a missing or closed index.
14741495
:param lenient: If `true`, format-based query failures (such as providing text
@@ -3307,7 +3328,8 @@ async def msearch(
33073328
computationally expensive named queries on a large number of hits may add
33083329
significant overhead.
33093330
:param max_concurrent_searches: Maximum number of concurrent searches the multi
3310-
search API can execute.
3331+
search API can execute. Defaults to `max(1, (# of data nodes * min(search
3332+
thread pool size, 10)))`.
33113333
:param max_concurrent_shard_requests: Maximum number of concurrent shard requests
33123334
that each sub-search request executes per node.
33133335
:param pre_filter_shard_size: Defines a threshold that enforces a pre-filter
@@ -3635,6 +3657,7 @@ async def open_point_in_time(
36353657
human: t.Optional[bool] = None,
36363658
ignore_unavailable: t.Optional[bool] = None,
36373659
index_filter: t.Optional[t.Mapping[str, t.Any]] = None,
3660+
max_concurrent_shard_requests: t.Optional[int] = None,
36383661
preference: t.Optional[str] = None,
36393662
pretty: t.Optional[bool] = None,
36403663
routing: t.Optional[str] = None,
@@ -3690,6 +3713,8 @@ async def open_point_in_time(
36903713
a missing or closed index.
36913714
:param index_filter: Filter indices if the provided query rewrites to `match_none`
36923715
on every shard.
3716+
:param max_concurrent_shard_requests: Maximum number of concurrent shard requests
3717+
that each sub-search request executes per node.
36933718
:param preference: The node or shard the operation should be performed on. By
36943719
default, it is random.
36953720
:param routing: A custom value that is used to route operations to a specific
@@ -3717,6 +3742,8 @@ async def open_point_in_time(
37173742
__query["human"] = human
37183743
if ignore_unavailable is not None:
37193744
__query["ignore_unavailable"] = ignore_unavailable
3745+
if max_concurrent_shard_requests is not None:
3746+
__query["max_concurrent_shard_requests"] = max_concurrent_shard_requests
37203747
if preference is not None:
37213748
__query["preference"] = preference
37223749
if pretty is not None:
@@ -4257,7 +4284,7 @@ async def render_search_template(
42574284
human: t.Optional[bool] = None,
42584285
params: t.Optional[t.Mapping[str, t.Any]] = None,
42594286
pretty: t.Optional[bool] = None,
4260-
source: t.Optional[str] = None,
4287+
source: t.Optional[t.Union[str, t.Mapping[str, t.Any]]] = None,
42614288
body: t.Optional[t.Dict[str, t.Any]] = None,
42624289
) -> ObjectApiResponse[t.Any]:
42634290
"""
@@ -4718,7 +4745,8 @@ async def search(
47184745
limit the impact of the search on the cluster in order to limit the number
47194746
of concurrent shard requests.
47204747
:param min_score: The minimum `_score` for matching documents. Documents with
4721-
a lower `_score` are not included in the search results.
4748+
a lower `_score` are not included in search results and results collected
4749+
by aggregations.
47224750
:param pit: Limit the search to a point in time (PIT). If you provide a PIT,
47234751
you cannot specify an `<index>` in the request path.
47244752
:param post_filter: Use the `post_filter` parameter to filter search results.
@@ -5661,7 +5689,7 @@ async def search_template(
56615689
search_type: t.Optional[
56625690
t.Union[str, t.Literal["dfs_query_then_fetch", "query_then_fetch"]]
56635691
] = None,
5664-
source: t.Optional[str] = None,
5692+
source: t.Optional[t.Union[str, t.Mapping[str, t.Any]]] = None,
56655693
typed_keys: t.Optional[bool] = None,
56665694
body: t.Optional[t.Dict[str, t.Any]] = None,
56675695
) -> ObjectApiResponse[t.Any]:
@@ -6399,7 +6427,7 @@ async def update_by_query(
63996427
wildcard expressions match hidden data streams. It supports comma-separated
64006428
values, such as `open,hidden`. Valid values are: `all`, `open`, `closed`,
64016429
`hidden`, `none`.
6402-
:param from_: Starting offset (default: 0)
6430+
:param from_: Skips the specified number of documents.
64036431
:param ignore_unavailable: If `false`, the request returns an error if it targets
64046432
a missing or closed index.
64056433
:param lenient: If `true`, format-based query failures (such as providing text

elasticsearch/_async/client/_base.py

Lines changed: 26 additions & 0 deletions
@@ -68,6 +68,7 @@
6868

6969
def resolve_auth_headers(
7070
    headers: Optional[Mapping[str, str]],
71+
    http_auth: Union[DefaultType, None, Tuple[str, str], str] = DEFAULT,
7172
    api_key: Union[DefaultType, None, Tuple[str, str], str] = DEFAULT,
7273
    basic_auth: Union[DefaultType, None, Tuple[str, str], str] = DEFAULT,
7374
    bearer_auth: Union[DefaultType, None, str] = DEFAULT,
@@ -77,7 +78,32 @@ def resolve_auth_headers(
7778
    elif not isinstance(headers, HttpHeaders):
7879
        headers = HttpHeaders(headers)
7980

81+
    resolved_http_auth = http_auth if http_auth is not DEFAULT else None
8082
    resolved_basic_auth = basic_auth if basic_auth is not DEFAULT else None
83+
    if resolved_http_auth is not None:
84+
        if resolved_basic_auth is not None:
85+
            raise ValueError(
86+
                "Can't specify both 'http_auth' and 'basic_auth', "
87+
                "instead only specify 'basic_auth'"
88+
            )
89+
        if isinstance(http_auth, str) or (
90+
            isinstance(resolved_http_auth, (list, tuple))
91+
            and all(isinstance(x, str) for x in resolved_http_auth)
92+
        ):
93+
            resolved_basic_auth = resolved_http_auth
94+
        else:
95+
            raise TypeError(
96+
                "The deprecated 'http_auth' parameter must be either 'Tuple[str, str]' or 'str'. "
97+
                "Use either the 'basic_auth' parameter instead"
98+
            )
99+
100+
        warnings.warn(
101+
            "The 'http_auth' parameter is deprecated. "
102+
            "Use 'basic_auth' or 'bearer_auth' parameters instead",
103+
            category=DeprecationWarning,
104+
            stacklevel=warn_stacklevel(),
105+
        )
106+
81107
    resolved_api_key = api_key if api_key is not DEFAULT else None
82108
    resolved_bearer_auth = bearer_auth if bearer_auth is not DEFAULT else None
83109
    if resolved_api_key or resolved_basic_auth or resolved_bearer_auth:

elasticsearch/_async/client/async_search.py

Lines changed: 1 addition & 1 deletion
@@ -401,7 +401,7 @@ async def submit(
401401
limit the impact of the search on the cluster in order to limit the number
402402
of concurrent shard requests
403403
:param min_score: Minimum _score for matching documents. Documents with a lower
404-
_score are not included in the search results.
404+
_score are not included in search results and results collected by aggregations.
405405
:param pit: Limits the search to a point in time (PIT). If you provide a PIT,
406406
you cannot specify an <index> in the request path.
407407
:param post_filter:

elasticsearch/_async/client/fleet.py

Lines changed: 1 addition & 1 deletion
@@ -430,7 +430,7 @@ async def search(
430430
:param lenient:
431431
:param max_concurrent_shard_requests:
432432
:param min_score: Minimum _score for matching documents. Documents with a lower
433-
_score are not included in the search results.
433+
_score are not included in search results and results collected by aggregations.
434434
:param pit: Limits the search to a point in time (PIT). If you provide a PIT,
435435
you cannot specify an <index> in the request path.
436436
:param post_filter:
