
Commit ac02644

add migrated files
1 parent 48ec7be commit ac02644

35 files changed: +2818 −0 lines changed

docs/docset.yml

Lines changed: 490 additions & 0 deletions
Large diffs are not rendered by default.

docs/images/api_key_name.png

32 KB

docs/images/cloud_api_key.png

117 KB

docs/images/cloud_id.png

163 KB

docs/images/create_api_key.png

78.7 KB

docs/images/es_endpoint.jpg

361 KB

docs/reference/Helpers.md

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
---
mapped_pages:
  - https://www.elastic.co/guide/en/elasticsearch/client/ruby-api/current/Helpers.html
---


# Bulk and Scroll helpers [Helpers]

The {{es}} Ruby client includes Bulk and Scroll helpers for working with results more efficiently.


## Bulk helper [_bulk_helper]

The Bulk API in Elasticsearch lets you perform multiple indexing or deletion operations through a single API call, reducing overhead and improving indexing speed. The actions are specified in the request body using a newline delimited JSON (NDJSON) structure. In the Elasticsearch Ruby client, the `bulk` method accepts several data structures as a parameter, so you can use the Bulk API in an idiomatic way without worrying about payload formatting. Refer to [Bulk requests](/reference/examples.md#ex-bulk) for more information.
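For comparison, here is a minimal sketch of calling the client's `bulk` method directly with an array of action hashes; the `books` index and the documents shown are illustrative assumptions, not values required by the API:

```ruby
require 'elasticsearch'

client = Elasticsearch::Client.new

# Each hash names the operation (:index, :create, :update or :delete) and, for
# operations that carry a document, puts the source under the :data key.
client.bulk(
  body: [
    { index:  { _index: 'books', _id: '1', data: { title: 'Snow Crash', year: 1992 } } },
    { index:  { _index: 'books', _id: '2', data: { title: 'Leviathan Wakes', year: 2011 } } },
    { delete: { _index: 'books', _id: '42' } }
  ],
  refresh: true # illustrative only: makes the changes searchable immediately
)
```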
The BulkHelper provides a better developer experience when using the Bulk API. At its simplest, you can send it an array of hashes and it will bulk ingest them into {{es}}.

To use the BulkHelper, require it in your code:

```ruby
require 'elasticsearch/helpers/bulk_helper'
```

Instantiate a BulkHelper with a client and an index:

```ruby
client = Elasticsearch::Client.new
bulk_helper = Elasticsearch::Helpers::BulkHelper.new(client, index)
```
This helper works on the index you pass in during initialization, but you can change the index at any time in your code:

```ruby
bulk_helper.index = 'new_index'
```

If you want to index a collection of documents, use the `ingest` method:

```ruby
documents = [
  { name: 'document1', date: '2024-05-16' },
  { name: 'document2', date: '2023-12-19' },
  { name: 'document3', date: '2024-07-07' }
]
bulk_helper.ingest(documents)
```

If you're ingesting a large set of data and want to separate the documents into smaller pieces before sending them to {{es}}, use the `slice` parameter:

```ruby
bulk_helper.ingest(documents, { slice: 2 })
```

This way, the data is sent in two separate bulk requests.
You can also include any parameters you would send to the Bulk API, either as query parameters or in the request body. The method signature is `ingest(docs, params = {}, body = {}, &block)`. Additionally, the method can be called with a block, which gives you access to the response object returned by the Bulk API and the documents sent in the request:

```ruby
helper.ingest(documents) { |_, docs| puts "Ingested #{docs.count} documents" }
```
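For example, assuming the helper forwards these options to the Bulk API, a call that splits the documents into slices of two and waits for a refresh before returning might look like the following sketch; the parameter values are illustrative:

```ruby
# `slice` is handled by the helper itself, while query parameters such as
# `refresh` are assumed to be passed along with the bulk request.
bulk_helper.ingest(documents, { slice: 2, refresh: 'wait_for' }) do |response, docs|
  puts "Bulk request had errors: #{response['errors']}"
end
```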
You can also update and delete documents with the BulkHelper. To delete a set of documents, send an array of document IDs:

```ruby
ids = ['shm0I4gB6LpJd9ljO9mY', 'sxm0I4gB6LpJd9ljO9mY', 'tBm0I4gB6LpJd9ljO9mY', 'tRm0I4gB6LpJd9ljO9mY', 'thm0I4gB6LpJd9ljO9mY', 'txm0I4gB6LpJd9ljO9mY', 'uBm0I4gB6LpJd9ljO9mY', 'uRm0I4gB6LpJd9ljO9mY', 'uhm0I4gB6LpJd9ljO9mY', 'uxm0I4gB6LpJd9ljO9mY']
helper.delete(ids)
```

To update documents, send an array of documents that includes their respective IDs:

```ruby
documents = [
  { name: 'updated name 1', id: 'AxkFJYgB6LpJd9ljOtr7' },
  { name: 'updated name 2', id: 'BBkFJYgB6LpJd9ljOtr7' }
]
helper.update(documents)
```

### Ingest a JSON file [_ingest_a_json_file]

`BulkHelper` also provides a helper to ingest data straight from a JSON file. Given a file path, the helper parses the file and ingests the documents it contains:

```ruby
file_path = './data.json'
helper.ingest_json(file_path)
```

If the array of data you want to ingest is not at the root of the JSON file, you can provide the keys needed to reach it. For example, given the following JSON file:

```json
{
  "field": "value",
  "status": 200,
  "data": {
    "items": [
      {
        "name": "Item 1",
        (...)
      },
      {
        (...)
      }
    ]
  }
}
```

The following Ruby code ingests the documents nested under `data` and `items` in the JSON above:

```ruby
bulk_helper.ingest_json(file_path, { keys: ['data', 'items'] })
```

## Scroll helper [_scroll_helper]

This helper provides an easy way to get results from a scrolling search.

To use the ScrollHelper, require it in your code:

```ruby
require 'elasticsearch/helpers/scroll_helper'
```

Instantiate a ScrollHelper with a client, an index, and a body (containing the scroll API parameters), which is used in every subsequent scroll request:

```ruby
client = Elasticsearch::Client.new
scroll_helper = Elasticsearch::Helpers::ScrollHelper.new(client, index, body)
```
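For illustration, a concrete setup might look like the following sketch; the index name, page size, and query are assumptions for the example rather than required values:

```ruby
require 'elasticsearch'
require 'elasticsearch/helpers/scroll_helper'

client = Elasticsearch::Client.new

# The body is a regular search request body; `size` controls how many
# documents are returned with each page of the scroll.
scroll_helper = Elasticsearch::Helpers::ScrollHelper.new(
  client,
  'books',                                # index to scroll over (assumed name)
  { size: 100, query: { match_all: {} } } # scroll API parameters
)
```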
There are two ways to get the results from a scroll using the helper.

1. You can iterate over a scroll using the `Enumerable` methods, such as `each` and `map`:

    ```ruby
    scroll_helper.each do |item|
      puts item
    end
    ```

2. You can fetch results page by page with the `results` method:

    ```ruby
    my_documents = []
    # Each call to `results` returns the next page of hits until the scroll is exhausted.
    while !(documents = scroll_helper.results).empty?
      my_documents << documents
    end
    scroll_helper.clear
    ```
