Commit 68d4724

Merge #224
224: Added methods add_documents_in_batches, update_documents_in_batches and their tests r=curquiza a=FlamesRunner

This pull request adds the following methods:

- add_documents_in_batches(documents, batch_size = 1000, primary_key = nil)
- update_documents_in_batches(documents, batch_size = 1000, primary_key = nil)

The tests for these methods are:

- adds documents in a batch (as a array of documents)
- adds document batches synchronously (as an array of documents)
- updates documents in index in batches
- updates documents synchronously in index in batches (as an array of documents)

They are based on their non-batch methods.

There are two linter errors:

```
lib/meilisearch/index.rb:7:3: C: Metrics/ClassLength: Class has too many lines. [252/226]
  class Index < HTTPRequest ...
  ^^^^^^^^^^^^^^^^^^^^^^^^^
spec/meilisearch/index/documents_spec.rb:3:1: C: Metrics/BlockLength: Block has too many lines. [500/497]
RSpec.describe 'MeiliSearch::Index - Documents' do ...
```

The linter indicates that there are too many lines for `documents_spec.rb` (which I was thinking of correcting by adding a new documents_batch_spec.rb, or the less clean way of changing the linter, etc), and that `index.rb` has too many lines (this one I am less sure of, as there aren't really any other classes that separate methods and since this is my first PR here, I don't want to introduce anything that might break something).

I'm unsure, so I'd like to get some feedback on these.

References issue #218.

Co-authored-by: Andrew Hong <[email protected]>
Co-authored-by: Andrew H <[email protected]>
2 parents 62765a9 + 96ee18c commit 68d4724
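
As a rough illustration of what the new asynchronous method returns, here is a minimal sketch that runs without a MeiliSearch server. `FakeIndex` and its recording `add_documents` are hypothetical stand-ins, not part of the gem; only the body of `add_documents_in_batches` is copied from this commit.

```ruby
# Hypothetical stand-in for MeiliSearch::Index so the sketch runs offline.
class FakeIndex
  attr_reader :batches

  def initialize
    @batches = []
  end

  # Assumption: mimics the asynchronous call by recording the batch and
  # returning a hash with an 'updateId', which is what the specs check for.
  def add_documents(documents, _primary_key = nil)
    @batches << documents
    { 'updateId' => @batches.size - 1 }
  end

  # Same body as the method added in this commit.
  def add_documents_in_batches(documents, batch_size = 1000, primary_key = nil)
    update_ids = []
    documents.each_slice(batch_size) do |batch|
      update_ids.append(add_documents(batch, primary_key))
    end
    update_ids
  end
end

# Seven made-up documents split with a batch size of 5.
documents = (1..7).map { |i| { objectId: i, title: "Book #{i}" } }
index = FakeIndex.new
updates = index.add_documents_in_batches(documents, 5)
```

With seven documents and a batch size of 5, the stub receives one batch of five and one of two, and `updates` holds one `'updateId'` hash per batch.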

3 files changed: +112 -4 lines changed

.rubocop_todo.yml

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
 # This configuration was generated by
 # `rubocop --auto-gen-config`
-# on 2021-09-01 13:50:39 UTC using RuboCop version 1.19.1.
+# on 2021-10-12 19:14:29 UTC using RuboCop version 1.20.0.
 # The point is for the user to remove these configuration records
 # one by one as the offenses are removed from the code base.
 # Note that changes in the inspected code, or installation of new
@@ -14,16 +14,16 @@ Gemspec/DateAssignment:
   Exclude:
     - 'meilisearch.gemspec'

-# Offense count: 33
+# Offense count: 35
 # Configuration parameters: CountComments, CountAsOne, ExcludedMethods, IgnoredMethods.
 # IgnoredMethods: refine
 Metrics/BlockLength:
-  Max: 497
+  Max: 500

 # Offense count: 1
 # Configuration parameters: CountComments, CountAsOne.
 Metrics/ClassLength:
-  Max: 226
+  Max: 256

 # Offense count: 1
 Naming/AccessorMethodName:

lib/meilisearch/index.rb

Lines changed: 34 additions & 0 deletions
@@ -76,6 +76,40 @@ def update_documents!(documents, primary_key = nil)
     end
     alias add_or_update_documents! update_documents!

+    def add_documents_in_batches(documents, batch_size = 1000, primary_key = nil)
+      update_ids = []
+      documents.each_slice(batch_size) do |batch|
+        update_ids.append(add_documents(batch, primary_key))
+      end
+      update_ids
+    end
+
+    def add_documents_in_batches!(documents, batch_size = 1000, primary_key = nil)
+      update_ids = add_documents_in_batches(documents, batch_size, primary_key)
+      responses = []
+      update_ids.each do |update_object|
+        responses.append(wait_for_pending_update(update_object['updateId']))
+      end
+      responses
+    end
+
+    def update_documents_in_batches(documents, batch_size = 1000, primary_key = nil)
+      update_ids = []
+      documents.each_slice(batch_size) do |batch|
+        update_ids.append(update_documents(batch, primary_key))
+      end
+      update_ids
+    end
+
+    def update_documents_in_batches!(documents, batch_size = 1000, primary_key = nil)
+      update_ids = update_documents_in_batches(documents, batch_size, primary_key)
+      responses = []
+      update_ids.each do |update_object|
+        responses.append(wait_for_pending_update(update_object['updateId']))
+      end
+      responses
+    end
+
     def delete_documents(documents_ids)
       if documents_ids.is_a?(Array)
         http_post "/indexes/#{@uid}/documents/delete-batch", documents_ids
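
The `!` variants above first enqueue every batch and only then wait on each returned `updateId`. The sketch below reproduces that ordering against a hypothetical in-memory `FakeAsyncIndex`; the class, its status bookkeeping, and the chained `map` form are assumptions made for illustration and are not the gem's implementation.

```ruby
# Hypothetical in-memory index that tracks a status per enqueued update.
class FakeAsyncIndex
  attr_reader :log

  def initialize
    @updates = []
    @log = [] # records the order of enqueue and wait calls
  end

  # Assumption: an asynchronous call returns only the id of the queued update.
  def update_documents(documents, _primary_key = nil)
    id = @updates.size
    @updates << { 'updateId' => id, 'status' => 'enqueued', 'size' => documents.size }
    @log << [:enqueue, id]
    { 'updateId' => id }
  end

  # Assumption: waiting always succeeds and flips the status to 'processed'.
  def wait_for_pending_update(update_id)
    @log << [:wait, update_id]
    update = @updates.fetch(update_id)
    update['status'] = 'processed'
    update
  end

  # Equivalent restructuring of the committed method: the first map enqueues
  # all batches, and only afterwards does the second map wait on each one.
  def update_documents_in_batches!(documents, batch_size = 1000, primary_key = nil)
    documents.each_slice(batch_size)
             .map { |batch| update_documents(batch, primary_key) }
             .map { |update| wait_for_pending_update(update['updateId']) }
  end
end

docs = [{ objectId: 123, title: 'Sense and Sensibility' },
        { objectId: 456, title: 'The Little Prince' }]
async_index = FakeAsyncIndex.new
statuses = async_index.update_documents_in_batches!(docs, 1).map { |r| r['status'] }
```

Because all batches are queued before the first wait, the caller blocks only once the whole payload has been handed off, at the cost of holding every `updateId` in memory until the end.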

spec/meilisearch/index/documents_spec.rb

Lines changed: 74 additions & 0 deletions
@@ -35,6 +35,17 @@
     expect(index.documents.count).to eq(documents.count)
   end

+  it 'adds documents in a batch (as a array of documents)' do
+    response = index.add_documents_in_batches(documents, 5)
+    expect(response).to be_a(Array)
+    expect(response.count).to eq(2) # 2 batches, since we start with 5 < documents.count <= 10 documents
+    expect(response[0]).to have_key('updateId')
+    response.each do |response_object|
+      index.wait_for_pending_update(response_object['updateId'])
+    end
+    expect(index.documents.count).to eq(documents.count)
+  end
+
   it 'adds documents synchronously (as an array of documents)' do
     response = index.add_documents!(documents)
     expect(response).to be_a(Hash)
@@ -45,6 +56,19 @@
     expect(index.documents.count).to eq(documents.count)
   end

+  it 'adds document batches synchronously (as an array of documents)' do
+    response = index.add_documents_in_batches!(documents, 5)
+    expect(response).to be_a(Array)
+    expect(response.count).to eq(2) # 2 batches, since we start with 5 < documents.count <= 10 documents
+    response.each do |response_object|
+      expect(response_object).to have_key('updateId')
+      expect(response_object).to have_key('status')
+      expect(response_object['status']).not_to eql('enqueued')
+      expect(response_object['status']).to eql('processed')
+    end
+    expect(index.documents.count).to eq(documents.count)
+  end
+
   it 'infers order of fields' do
     response = index.document(1)
     expect(response.keys).to eq(['objectId', 'title', 'comment'])
@@ -109,6 +133,31 @@
     expect(doc2['comment']).to eq(documents.detect { |doc| doc[:objectId] == id2 }[:comment])
   end

+  it 'updates documents in index in batches (as an array of documents)' do
+    id1 = 123
+    id2 = 456
+    updated_documents = [
+      { objectId: id1, title: 'Sense and Sensibility' },
+      { objectId: id2, title: 'The Little Prince' }
+    ]
+    response = index.update_documents_in_batches(updated_documents, 1)
+    expect(response).to be_a(Array)
+    expect(response.count).to eq(2)
+    response.each do |response_object|
+      expect(response_object).to have_key('updateId')
+      index.wait_for_pending_update(response_object['updateId'])
+    end
+
+    doc1 = index.document(id1)
+    doc2 = index.document(id2)
+
+    expect(index.documents.count).to eq(documents.count)
+    expect(doc1['title']).to eq(updated_documents.detect { |doc| doc[:objectId] == id1 }[:title])
+    expect(doc1['comment']).to eq(documents.detect { |doc| doc[:objectId] == id1 }[:comment])
+    expect(doc2['title']).to eq(updated_documents.detect { |doc| doc[:objectId] == id2 }[:title])
+    expect(doc2['comment']).to eq(documents.detect { |doc| doc[:objectId] == id2 }[:comment])
+  end
+
   it 'updates documents synchronously in index (as an array of documents)' do
     id1 = 123
     id2 = 456
@@ -131,6 +180,31 @@
     expect(doc2['comment']).to eq(documents.detect { |doc| doc[:objectId] == id2 }[:comment])
   end

+  it 'updates documents synchronously in index in batches (as an array of documents)' do
+    id1 = 123
+    id2 = 456
+    updated_documents = [
+      { objectId: id1, title: 'Sense and Sensibility' },
+      { objectId: id2, title: 'The Little Prince' }
+    ]
+    response = index.update_documents_in_batches!(updated_documents, 1)
+    expect(response).to be_a(Array)
+    expect(response.count).to eq(2) # 2 batches, since we have two items with batch size 1
+    response.each do |response_object|
+      expect(response_object).to have_key('updateId')
+      expect(response_object).to have_key('status')
+      expect(response_object['status']).not_to eql('enqueued')
+      expect(response_object['status']).to eql('processed')
+    end
+    doc1 = index.document(id1)
+    doc2 = index.document(id2)
+    expect(index.documents.count).to eq(documents.count)
+    expect(doc1['title']).to eq(updated_documents.detect { |doc| doc[:objectId] == id1 }[:title])
+    expect(doc1['comment']).to eq(documents.detect { |doc| doc[:objectId] == id1 }[:comment])
+    expect(doc2['title']).to eq(updated_documents.detect { |doc| doc[:objectId] == id2 }[:title])
+    expect(doc2['comment']).to eq(documents.detect { |doc| doc[:objectId] == id2 }[:comment])
+  end
+
   it 'updates one document in index (as an hash of one document)' do
     id = 123
     updated_document = { objectId: id, title: 'Emma' }
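
The specs above expect two responses because `each_slice` yields one slice per started batch, which is a ceiling division of the document count by the batch size. A small sketch of that arithmetic follows; `expected_batch_count` and the eight-element `sample` array are hypothetical helpers introduced here, not part of the spec file.

```ruby
# Ceiling division: the number of slices each_slice yields for a batch size.
# Hypothetical helper, not part of the gem or its specs.
def expected_batch_count(document_count, batch_size)
  # Enumerable#each_slice itself raises ArgumentError for a non-positive size.
  raise ArgumentError, 'batch_size must be positive' unless batch_size.positive?

  (document_count + batch_size - 1) / batch_size
end

# Assumed fixture of 8 documents, i.e. more than 5 and at most 10.
sample = Array.new(8) { |i| { objectId: i } }
slices = sample.each_slice(5).to_a
```

For any fixture with more than 5 and at most 10 documents, a batch size of 5 gives two slices, which matches the comment in the specs; two documents with a batch size of 1 likewise give two.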
