Update sharding documentation following Meilisearch's automatic splitting of documents (#3345)
* Update network and remote objects
* Update code samples [skip ci]
* Add information to the tasks object
* Update guide to inform that automatic sharding is an EE feature + document the automatic sharding
* Fix NoticeTag syntax
The message about sharding was not appearing because I had used the NoticeTag as a wrapping tag, which it isn't.
* Advise to use a single instance for writes
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
learn/multi_search/implement_sharding.mdx
38 additions & 12 deletions
Original file line number
Diff line number
Diff line change
@@ -11,6 +11,9 @@ Sharding is the process of splitting an index containing many documents into mul
11
11
12
12
This guide walks you through activating the `/network` route, configuring the network object, and performing remote federated searches.
13
13
14
+
<NoticeTag label="Enterprise Edition" />
15
+
Sharding is an Enterprise Edition feature. You are free to use it for evaluation purposes. Please [reach out to us](mailto:[email protected]) before using it in production.
16
+
14
17
<Tip>
15
18
## Configuring multiple instances
16
19
@@ -19,7 +22,7 @@ To minimize issues and limit unexpected behavior, instance, network, and index c
@@ -48,6 +51,7 @@ Next, you must configure the network object. It consists of the following fields
48
51
49
52
- `remotes`: defines a list with the required information to access each remote instance
50
53
- `self`: specifies which of the configured `remotes` corresponds to the current instance
54
+
- `sharding`: specifies whether Meilisearch automatically shards documents across instances
51
55
52
56
### Setting up the list of remotes
53
57
@@ -93,32 +97,54 @@ curl \
93
97
94
98
Meilisearch processes searches on the remote that corresponds to `self` locally instead of making a remote request.
95
99
100
+
### Enabling sharding
101
+
102
+
Finally, enable automatic sharding of documents on all instances:
103
+
104
+
```sh
105
+
curl \
106
+
-X PATCH 'MEILISEARCH_URL/network' \
107
+
-H 'Content-Type: application/json' \
108
+
--data-binary '{
109
+
"sharding": true
110
+
}'
111
+
```
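
Because sharding must be enabled on every instance, the same request can be repeated in a loop. The following is a minimal sketch, not part of the original guide: the instance URLs, the `MEILISEARCH_KEY` variable, and the function name are all hypothetical placeholders.

```sh
# Hypothetical instance URLs: replace with the addresses of your own remotes.
INSTANCES="http://ms-00.example.com:7700 http://ms-01.example.com:7700"
# Same payload as the single-instance request.
SHARDING_PAYLOAD='{"sharding": true}'

# Send the PATCH request to each instance in turn.
# Assumes MEILISEARCH_KEY holds an API key allowed to update /network.
enable_sharding_everywhere() {
  for url in $INSTANCES; do
    curl \
      -X PATCH "$url/network" \
      -H 'Content-Type: application/json' \
      -H "Authorization: Bearer $MEILISEARCH_KEY" \
      --data-binary "$SHARDING_PAYLOAD"
  done
}
```

Call `enable_sharding_everywhere` once the `remotes` and `self` fields are configured on each instance.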
112
+
96
113
### Adding or removing an instance
97
114
98
115
Changing the topology of the network involves moving some documents from an instance to another, depending on your hashing scheme.
99
116
100
117
As Meilisearch does not provide atomicity across multiple instances, you will need to either:
101
118
102
-
1. accept search downtime while migrating documents
103
-
2. accept some documents will not appear in search results during the migration
119
+
1. accept search downtime while migrating documents
120
+
2. accept some documents will not appear in search results during the migration
104
121
3. accept some duplicate documents may appear in search results during the migration
105
122
106
123
#### Reducing downtime
107
124
108
125
If your disk space allows, you can reduce the downtime by applying the following algorithm:
109
126
110
-
1. Create a new temporary index in each remote instance
111
-
2. Compute the new instance for each document
112
-
3. Send the documents to the temporary index of their new instance
113
-
4. Once Meilisearch has copied all documents to their instance of destination, swap the new index with the previously used index
114
-
5. Delete the temporary index after the swap
115
-
6. Update network configuration and search queries across all instances
127
+
1. Create a new temporary index in each remote instance
128
+
2. Compute the new instance for each document
129
+
3. Send the documents to the temporary index of their new instance
130
+
4. Once Meilisearch has copied all documents to their instance of destination, swap the new index with the previously used index
131
+
5. Delete the temporary index after the swap
132
+
6. Update network configuration and search queries across all instances
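
Steps 1, 4, and 5 above can be sketched for a single instance with the index creation, swap, and deletion routes. This is an illustrative sketch under stated assumptions: the index names, the `id` primary key, the URL, and `MEILISEARCH_KEY` are hypothetical, and computing each document's new instance (steps 2 and 3) is left out.

```sh
# Hypothetical values: adjust to your own instance and indexes.
MEILISEARCH_URL='http://localhost:7700'
LIVE_INDEX='products'
TMP_INDEX='products_tmp'

# Small curl wrapper: api METHOD PATH [JSON_BODY]
api() {
  if [ -n "${3:-}" ]; then
    curl -s -X "$1" "$MEILISEARCH_URL$2" \
      -H 'Content-Type: application/json' \
      -H "Authorization: Bearer $MEILISEARCH_KEY" \
      --data-binary "$3"
  else
    curl -s -X "$1" "$MEILISEARCH_URL$2" \
      -H "Authorization: Bearer $MEILISEARCH_KEY"
  fi
}

# Step 1: create the temporary index (primary key "id" is an assumption).
create_tmp_index() {
  api POST /indexes "{\"uid\": \"$TMP_INDEX\", \"primaryKey\": \"id\"}"
}

# Step 4: swap the temporary index with the live one.
swap_indexes() {
  api POST /swap-indexes "[{\"indexes\": [\"$LIVE_INDEX\", \"$TMP_INDEX\"]}]"
}

# Step 5: after the swap, the temporary index holds the old documents.
delete_tmp_index() {
  api DELETE "/indexes/$TMP_INDEX"
}
```

Each call only enqueues an asynchronous task, so wait for the returned task to succeed before running the next function.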
133
+
134
+
## Create indexes
135
+
136
+
Create the same empty indexes with the same settings on all instances.
137
+
Keeping the settings and indexes in sync is important to avoid errors and unexpected behavior, though not strictly required.
138
+
139
+
## Add documents
140
+
141
+
Pick a single instance to send all your documents to. Documents will be replicated to the other instances.
116
142
117
-
## Create indexes and add documents
143
+
Each instance indexes the documents it is responsible for and ignores the others.
118
144
119
-
Create the same empty indexes with the same settings on all instances. Keeping the settings and indexes in sync is important to avoid errors and unexpected behavior, though not strictly required.
145
+
You *may* send the same document to multiple instances: the task is replicated to all instances, and only the instance responsible for the document indexes it.
120
146
121
-
Distribute your documents across all instances. Do not send the same document to multiple instances as this may lead to duplicate search results. Similarly, you should ensure all future versions of a document are sent to the same instance. Meilisearch recommends you hash their primary key using [rendezvous hashing](https://en.wikipedia.org/wiki/Rendezvous_hashing).
147
+
Similarly, you may send any future versions of any document to the instance you picked, and only the correct instance will process that document.
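
As an illustration of sending every write to one instance, the sketch below posts a small batch to the documents route of a single instance. The instance URL, index name, sample documents, and `MEILISEARCH_KEY` are hypothetical placeholders, not values from this guide.

```sh
# Hypothetical write instance and index: replace with your own.
WRITE_INSTANCE='http://ms-00.example.com:7700'
INDEX_UID='products'
# Sample documents for illustration only.
DOCUMENTS='[
  {"id": 1, "title": "First document"},
  {"id": 2, "title": "Second document"}
]'

# Send all documents to the single instance picked for writes.
add_documents() {
  curl \
    -X POST "$WRITE_INSTANCE/indexes/$INDEX_UID/documents" \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $MEILISEARCH_KEY" \
    --data-binary "$DOCUMENTS"
}
```

Later versions of the same documents can be sent through the same `add_documents` call against the same instance.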
@@ -122,13 +141,14 @@ Update the `self` and `remotes` fields of the network object.
122
141
123
142
Updates to the network object are **partial**. Only provide the fields you intend to update. Fields not present in the payload will remain unchanged.
124
143
125
-
To reset `self` and `remotes` to their original value, set them to `null`. To remove a single `remote` from your network, set the value of its name to `null`.
144
+
To reset `self`, `sharding` and `remotes` to their original value, set them to `null`. To remove a single `remote` from your network, set the value of its name to `null`.
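
The two kinds of `null` update could look like the payloads below. This is a sketch under assumptions: the remote name `ms-01`, the instance URL, `MEILISEARCH_KEY`, and the `patch_network` helper are hypothetical.

```sh
# Hypothetical instance URL and remote name.
MEILISEARCH_URL='http://localhost:7700'
REMOTE_NAME='ms-01'

# Remove a single remote by setting the value of its name to null.
REMOVE_REMOTE_PAYLOAD="{\"remotes\": {\"$REMOTE_NAME\": null}}"
# Reset self, sharding, and remotes to their original values.
RESET_PAYLOAD='{"self": null, "sharding": null, "remotes": null}'

# Partial update of the network object: patch_network JSON_BODY
patch_network() {
  curl \
    -X PATCH "$MEILISEARCH_URL/network" \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $MEILISEARCH_KEY" \
    --data-binary "$1"
}
```

For example, `patch_network "$REMOVE_REMOTE_PAYLOAD"` removes only `ms-01` and leaves every other field unchanged, since updates are partial.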
**Description**: If the task was replicated from another remote or to other remotes, `network` contains information about the remote task uids corresponding to this task. Otherwise, the field is missing from the task object.
196
+
197
+
`network` has a single key, which is either `origin` or `remotes`.
|**`remoteName`**| The name of the [remote](/reference/api/network)|
209
+
|**`taskUid`**| The uid of the task of origin |
210
+
211
+
`remotes` is itself an object whose keys are the [remotes](/reference/api/network) and whose values are objects with a single key, either `task_uid` or `error`: