This repository was archived by the owner on Aug 16, 2022. It is now read-only.

Commit 4e38847

Merge pull request #181 from ashwinkumar12345/crud
added read, update, and delete operations
2 parents 8827693 + fddfc16 commit 4e38847

File tree

1 file changed

+197
-3
lines changed

docs/elasticsearch/index-data.md

Lines changed: 197 additions & 3 deletions
@@ -44,31 +44,40 @@ Optional document\n
4444
4545
```
4646

47-
The document is optional, because `delete` actions do not require a document. The other actions (`index`, `create`, and `update`) all require a document.
47+
The document is optional, because `delete` actions do not require a document. The other actions (`index`, `create`, and `update`) all require a document. If you want the action to fail when the document already exists, use the `create` action instead of the `index` action.
4848
{: .note }
4949

50+
To index bulk data using the `curl` command, navigate to the folder where you have your file saved and run the following command:
51+
52+
```bash
53+
curl -H "Content-Type: application/x-ndjson" -XPOST https://localhost:9200/data/_bulk -u admin:admin --insecure --data-binary "@data.json"
54+
```
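As a rough sketch of how a file such as `data.json` might be produced, the following Python snippet serializes documents into the newline-delimited format that `_bulk` expects. The helper name `to_bulk_ndjson`, the `movies` index, and the sample documents are assumptions for illustration, not part of the original instructions.

```python
import json


def to_bulk_ndjson(index, docs):
    """Serialize (doc_id, document) pairs as `index` actions in NDJSON form."""
    lines = []
    for doc_id, document in docs:
        # Action-and-metadata line, followed by the document line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(document))
    # Every line, including the last one, must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical sample data.
body = to_bulk_ndjson(
    "movies",
    [("1", {"title": "Spirited Away"}), ("2", {"title": "Only Yesterday"})],
)
print(body, end="")
```

Writing `body` to a file gives input suitable for `--data-binary`, which sends the file unchanged and so keeps the newlines intact.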
55+
56+
If any one of the actions in the `_bulk` API fails, Elasticsearch continues to execute the other actions. Examine the `items` array in the response to figure out what went wrong. The entries in the `items` array are in the same order as the actions specified in the request.
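One way to examine the `items` array is sketched below in Python. The `failed_actions` helper and the sample response are hypothetical. The sample is hand-written to resemble a `_bulk` response in which the second action failed.

```python
def failed_actions(bulk_response):
    """Return (position, action, status, error) for each action that failed."""
    failures = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item has a single key naming the action:
        # index, create, update, or delete.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append((position, action, result["status"], result["error"]))
    return failures


# Hand-written sample shaped like a _bulk response; values are illustrative.
sample = {
    "took": 5,
    "errors": True,
    "items": [
        {"index": {"_index": "movies", "_id": "1", "status": 201}},
        {"create": {"_index": "movies", "_id": "1", "status": 409,
                    "error": {"type": "version_conflict_engine_exception",
                              "reason": "document already exists"}}},
    ],
}

for position, action, status, error in failed_actions(sample):
    print(position, action, status, error["type"])
```

Because the entries keep the request order, `position` points back to the action in the request that failed.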
57+
5058
Elasticsearch features automatic index creation when you add a document to an index that doesn't already exist. It also features automatic ID generation if you don't specify an ID in the request. This simple example automatically creates the movies index, indexes the document, and assigns it a unique ID:
5159

5260
```json
5361
POST movies/_doc
5462
{ "title": "Spirited Away" }
5563
```
5664

57-
Automatic ID generation has a clear downside: because the indexing request didn't specify a document ID, you can't easily update the document at a later time. To specify an ID of 1, use the following request, and note the use of PUT instead of POST:
65+
Automatic ID generation has a clear downside: because the indexing request didn't specify a document ID, you can't easily update the document at a later time. Also, if you run this request 10 times, Elasticsearch indexes this document as 10 different documents with unique IDs. To specify an ID of 1, use the following request, and note the use of PUT instead of POST:
5866

5967
```json
6068
PUT movies/_doc/1
6169
{ "title": "Spirited Away" }
6270
```
6371

72+
Because you must specify an ID, if you run this command 10 times, you still have just one document indexed with the `_version` field incremented to 10.
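The difference between the two requests can be illustrated with a toy in-memory model. The `ToyIndex` class is an assumption made for this sketch; it mimics only the ID and `_version` behavior described above and says nothing about how Elasticsearch stores data.

```python
import uuid


class ToyIndex:
    """In-memory stand-in for an index (illustration only)."""

    def __init__(self):
        self.docs = {}  # _id -> {"_version": int, "_source": dict}

    def post(self, source):
        # Like POST <index>/_doc: a fresh ID is generated on every call.
        return self.put(uuid.uuid4().hex, source)

    def put(self, doc_id, source):
        # Like PUT <index>/_doc/<id>: the same ID replaces the document
        # and increments its version.
        version = self.docs.get(doc_id, {"_version": 0})["_version"] + 1
        self.docs[doc_id] = {"_version": version, "_source": source}
        return doc_id


auto_ids = ToyIndex()
explicit_id = ToyIndex()
for _ in range(10):
    auto_ids.post({"title": "Spirited Away"})
    explicit_id.put("1", {"title": "Spirited Away"})

print(len(auto_ids.docs))                 # 10 separate documents
print(explicit_id.docs["1"]["_version"])  # one document at version 10
```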
73+
6474
Indices default to one primary shard and one replica. If you want to specify non-default settings, create the index before adding documents:
6575

6676
```json
6777
PUT more-movies
6878
{ "settings": { "number_of_shards": 6, "number_of_replicas": 2 } }
6979
```
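To see what these settings mean for the cluster, it can help to count shard copies: each primary shard gets `number_of_replicas` additional copies. A small Python sketch of that arithmetic follows; the function name is hypothetical.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Count primaries plus the replica copies of every primary."""
    return number_of_shards * (1 + number_of_replicas)


print(total_shard_copies(6, 2))  # more-movies: 6 primaries + 12 replicas = 18
print(total_shard_copies(1, 1))  # defaults: 1 primary + 1 replica = 2
```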
7080

71-
7281
## Naming restrictions for indices
7382

7483
Elasticsearch indices have the following naming restrictions:
@@ -78,3 +87,188 @@ Elasticsearch indices have the following naming restrictions:
7887
- Index names can't contain spaces, commas, or the following characters:
7988

8089
`:`, `"`, `*`, `+`, `/`, `\`, `|`, `?`, `#`, `>`, or `<`
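The character rule above could be checked before an index is created. The following Python sketch covers only that one rule, since the other naming restrictions fall outside this hunk. The `forbidden_characters` helper is hypothetical.

```python
# Space, comma, and the characters listed in the rule above.
FORBIDDEN = set(' ,:"*+/\\|?#><')


def forbidden_characters(index_name):
    """Return the disallowed characters found in the name, sorted."""
    return sorted(set(index_name) & FORBIDDEN)


print(forbidden_characters("more-movies"))  # []
print(forbidden_characters("my index?"))    # [' ', '?']
```

An empty list means the name passes this particular check, not that it satisfies every restriction.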
90+
91+
## Read data
92+
93+
After you index a document, you can retrieve it by sending a GET request to the same endpoint that you used for indexing:
94+
95+
```json
96+
GET movies/_doc/1
97+
98+
{
99+
"_index" : "movies",
100+
"_type" : "_doc",
101+
"_id" : "1",
102+
"_version" : 1,
103+
"_seq_no" : 0,
104+
"_primary_term" : 1,
105+
"found" : true,
106+
"_source" : {
107+
"title" : "Spirited Away"
108+
}
109+
}
110+
```
111+
112+
You can see the document in the `_source` object. If the document is not found, the `found` key is `false` and the `_source` object is not part of the response.
113+
114+
To retrieve multiple documents with a single command, use the `_mget` operation.
115+
The format for retrieving multiple documents is similar to the `_bulk` operation, where you must specify the index and ID in the request body:
116+
117+
```json
118+
GET _mget
119+
{
120+
"docs": [
121+
{
122+
"_index": "<index>",
123+
"_id": "<id>"
124+
},
125+
{
126+
"_index": "<index>",
127+
"_id": "<id>"
128+
}
129+
]
130+
}
131+
```
132+
133+
To return only specific fields from each document, add a `_source` filter to its entry:
134+
135+
```json
136+
GET _mget
137+
{
138+
"docs": [
139+
{
140+
"_index": "<index>",
141+
"_id": "<id>",
142+
"_source": "field1"
143+
},
144+
{
145+
"_index": "<index>",
146+
"_id": "<id>",
147+
"_source": "field2"
148+
}
149+
]
150+
}
151+
```
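When the list of documents comes from application code, the `_mget` body can be assembled programmatically. The following Python sketch assumes a hypothetical `mget_body` helper and example index, ID, and field names. It only builds the request body and does not send it.

```python
import json


def mget_body(requests):
    """Build a body for `GET _mget` from (index, doc_id, source_filter) tuples."""
    docs = []
    for index, doc_id, source_filter in requests:
        entry = {"_index": index, "_id": doc_id}
        if source_filter is not None:
            # Restrict the returned fields for this document only.
            entry["_source"] = source_filter
        docs.append(entry)
    return {"docs": docs}


# Hypothetical request: the full first document, only `title` from the second.
body = mget_body([("movies", "1", None), ("movies", "2", "title")])
print(json.dumps(body, indent=2))
```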
152+
153+
To check if a document exists:
154+
155+
```json
156+
HEAD movies/_doc/<doc-id>
157+
```
158+
159+
If the document exists, you get back a `200 OK` response, and if it doesn't, you get back a `404 - Not Found` error.
160+
161+
## Update data
162+
163+
To update existing fields or to add new fields, send a POST request to the `_update` operation with your changes in a `doc` object:
164+
165+
```json
166+
POST movies/_update/1
167+
{
168+
"doc": {
169+
"title": "Castle in the Sky",
170+
"genre": ["Animation", "Fantasy"]
171+
}
172+
}
173+
```
174+
175+
Note the updated `title` field and new `genre` field:
176+
177+
```json
178+
GET movies/_doc/1
179+
180+
{
181+
"_index" : "movies",
182+
"_type" : "_doc",
183+
"_id" : "1",
184+
"_version" : 2,
185+
"_seq_no" : 1,
186+
"_primary_term" : 1,
187+
"found" : true,
188+
"_source" : {
189+
"title" : "Castle in the Sky",
190+
"genre" : [
191+
"Animation",
192+
"Fantasy"
193+
]
194+
}
195+
}
196+
```
197+
198+
The document also has an incremented `_version` field. Use this field to keep track of how many times a document is updated.
199+
200+
POST requests make partial updates to documents. To altogether replace a document, use a PUT request:
201+
202+
```json
203+
PUT movies/_doc/1
204+
{
205+
"title": "Spirited Away"
206+
}
207+
```
208+
209+
The document with ID of 1 will contain only the `title` field, because the entire document will be replaced with the document indexed in this PUT request.
210+
211+
Use the `upsert` object to conditionally update documents based on whether they already exist. Here, if the document exists, its `title` field changes to `Castle in the Sky`. If it doesn't, Elasticsearch indexes the document in the `upsert` object.
212+
213+
```json
214+
POST movies/_update/2
215+
{
216+
"doc": {
217+
"title": "Castle in the Sky"
218+
},
219+
"upsert": {
220+
"title": "Only Yesterday",
221+
"genre": ["Animation", "Fantasy"],
222+
"date": 1993
223+
}
224+
}
225+
```
226+
227+
#### Sample response
228+
229+
```json
230+
{
231+
"_index" : "movies",
232+
"_type" : "_doc",
233+
"_id" : "2",
234+
"_version" : 2,
235+
"result" : "updated",
236+
"_shards" : {
237+
"total" : 2,
238+
"successful" : 1,
239+
"failed" : 0
240+
},
241+
"_seq_no" : 3,
242+
"_primary_term" : 1
243+
}
244+
```
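The `doc` and `upsert` branches can be mimicked on a plain dictionary to make the behavior concrete. This Python sketch is a simplification: the `update` function is hypothetical, and it merges only top-level fields.

```python
def update(store, doc_id, doc, upsert=None):
    """Mimic `_update` with `doc` and `upsert` on a dict of documents."""
    if doc_id in store:
        # Existing document: merge the partial `doc` into it (shallow merge).
        store[doc_id] = {**store[doc_id], **doc}
        return "updated"
    if upsert is None:
        raise KeyError(f"document {doc_id} is missing and no upsert was given")
    # Missing document: index the `upsert` body; `doc` is not applied.
    store[doc_id] = dict(upsert)
    return "created"


movies = {}
doc = {"title": "Castle in the Sky"}
upsert = {"title": "Only Yesterday", "genre": ["Animation", "Fantasy"], "date": 1993}

first = update(movies, "2", doc, upsert)
print(first, movies["2"]["title"])
second = update(movies, "2", doc, upsert)
print(second, movies["2"]["title"])
```

The first call takes the `upsert` branch because document 2 does not exist yet. The second call finds the document and applies `doc`, which matches the `updated` result in the sample response above.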
245+
246+
Each update operation for a document has a unique combination of the `_seq_no` and `_primary_term` values.
247+
248+
Elasticsearch first writes your updates to the primary shard and then sends this change to all the replica shards. An uncommon issue can occur if multiple users of your Elasticsearch-based application make updates to existing documents in the same index. In this situation, another user can read a document from a replica before that replica receives your update from the primary shard, and then update the document based on what they read. Their update operation then ends up being based on an older version of the document. In the best case, you and the other user make the same changes, and the document remains accurate. In the worst case, the document now contains out-of-date information.
249+
250+
To prevent this situation, pass the `_seq_no` and `_primary_term` values that you last read as the `if_seq_no` and `if_primary_term` query parameters in the request URL:
251+
252+
```json
253+
POST movies/_update/2?if_seq_no=3&if_primary_term=1
254+
{
255+
"doc": {
256+
"title": "Castle in the Sky",
257+
"genre": ["Animation", "Fantasy"]
258+
}
259+
}
260+
```
261+
262+
If the document is updated after you retrieved it, the `_seq_no` and `_primary_term` values are different and your update operation fails with a `409 - Conflict` error.
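The check behaves like a compare-and-set. The toy Python model below is a sketch under simplifying assumptions: `ToyDocument` and `Conflict` are hypothetical names, and the sequence number is tracked per document here only to keep the example short.

```python
class Conflict(Exception):
    """Stands in for the 409 response."""


class ToyDocument:
    def __init__(self, source):
        self.source = dict(source)
        self.seq_no = 0
        self.primary_term = 1

    def update(self, doc, if_seq_no, if_primary_term):
        # Reject the write unless the caller saw the latest version.
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise Conflict(f"expected seq_no {if_seq_no}, found {self.seq_no}")
        self.source.update(doc)
        self.seq_no += 1


movie = ToyDocument({"title": "Only Yesterday"})
# Two users read the document at the same point.
seen_seq_no, seen_term = movie.seq_no, movie.primary_term

# The first writer succeeds and advances the sequence number.
movie.update({"title": "Castle in the Sky"}, seen_seq_no, seen_term)

try:
    # The second writer still holds the stale values and is rejected.
    movie.update({"title": "Spirited Away"}, seen_seq_no, seen_term)
    outcome = "updated"
except Conflict:
    outcome = "conflict"

print(outcome, movie.source["title"])
```

After a conflict, a client would typically read the document again, reapply its change, and retry with the new values.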
263+
264+
When using the `_bulk` API, specify the `_seq_no` and `_primary_term` values as `if_seq_no` and `if_primary_term` within the action metadata.
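As an illustration, the action metadata line for a conditional `index` action might be generated as follows. The helper name and the sample values are assumptions. The sketch relies on the `if_seq_no` and `if_primary_term` keys being accepted in the metadata of an `index` action.

```python
import json


def conditional_index_lines(index, doc_id, document, seq_no, primary_term):
    """Return the two NDJSON lines for an `index` action guarded by a version check."""
    action = {
        "index": {
            "_index": index,
            "_id": doc_id,
            "if_seq_no": seq_no,
            "if_primary_term": primary_term,
        }
    }
    return json.dumps(action) + "\n" + json.dumps(document) + "\n"


# Hypothetical values, reusing the sequence number and term from the example above.
lines = conditional_index_lines("movies", "2", {"title": "Castle in the Sky"}, 3, 1)
print(lines, end="")
```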
265+
266+
## Delete data
267+
268+
To delete a document from an index, use a DELETE request:
269+
270+
```json
271+
DELETE movies/_doc/1
272+
```
273+
274+
The DELETE operation increments the `_version` field. If you add the document back to the same ID, the `_version` field increments again. This behavior occurs because Elasticsearch deletes the document `_source`, but retains its metadata.
