Commit 3f909f6

Update sharding documentation following Meilisearch's automatic splitting of documents (#3345)
* Update network and remote objects
* Update code samples [skip ci]
* Add information to the tasks object
* Update guide to inform that automatic sharding is a EE feature + document the automatic sharding
* Fix NoticeTag syntax

  The message about sharding was not appearing because I had used the NoticeTag as a wrapping tag, which it isn't.
* Advise to use a single instance for writes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
1 parent 7d6ed21 commit 3f909f6


6 files changed

+141
-20
lines changed


learn/multi_search/implement_sharding.mdx

Lines changed: 38 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,9 @@ Sharding is the process of splitting an index containing many documents into mul
1111

1212
This guide walks you through activating the `/network` route, configuring the network object, and performing remote federated searches.
1313

14+
<NoticeTag label="Enterprise Edition" />
15+
Sharding is an Enterprise Edition feature. You are free to use it for evaluation purposes. Please [reach out to us](mailto:[email protected]) before using it in production.
16+
1417
<Tip>
1518
## Configuring multiple instances
1619

@@ -19,7 +22,7 @@ To minimize issues and limit unexpected behavior, instance, network, and index c
1922

2023
## Prerequisites
2124

22-
- Multiple Meilisearch projects (instances) running Meilisearch >=v1.13
25+
- Multiple Meilisearch projects (instances) running Meilisearch >=v1.19
2326

2427
## Activate the `/network` endpoint
2528

@@ -48,6 +51,7 @@ Next, you must configure the network object. It consists of the following fields
4851

4952
- `remotes`: defines a list with the required information to access each remote instance
5053
- `self`: specifies which of the configured `remotes` corresponds to the current instance
54+
- `sharding`: a boolean indicating whether sharding is enabled on the network
5155

5256
### Setting up the list of remotes
5357

@@ -93,32 +97,54 @@ curl \
9397

9498
Meilisearch processes searches on the remote that corresponds to `self` locally instead of making a remote request.
9599

100+
### Enabling sharding
101+
102+
Finally, enable automatic sharding of documents on all instances:
103+
104+
```sh
105+
curl \
106+
-X PATCH 'MEILISEARCH_URL/network' \
107+
-H 'Content-Type: application/json' \
108+
--data-binary '{
109+
"sharding": true
110+
}'
111+
```
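
The same request can be built programmatically when several instances need the setting. The following is a minimal Go sketch using only the standard library. The helper names (`shardingBody`, `newShardingRequest`), the instance URLs, and the `MEILISEARCH_KEY` placeholder are assumptions for illustration, and the `Authorization` header is only needed if your instances are protected by an API key.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// shardingBody encodes the partial network payload {"sharding": <enabled>}.
func shardingBody(enabled bool) ([]byte, error) {
	return json.Marshal(struct {
		Sharding bool `json:"sharding"`
	}{Sharding: enabled})
}

// newShardingRequest builds (but does not send) a PATCH /network request
// for one instance. baseURL and apiKey are supplied by the caller.
func newShardingRequest(baseURL, apiKey string, enabled bool) (*http.Request, error) {
	body, err := shardingBody(enabled)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPatch, baseURL+"/network", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	// Hypothetical instance URLs; replace them with your own.
	instances := []string{"http://localhost:7700", "http://localhost:7701"}
	for _, u := range instances {
		req, err := newShardingRequest(u, "MEILISEARCH_KEY", true)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(req.Method, req.URL.String())
		// To actually send the request: http.DefaultClient.Do(req)
	}
}
```

The sketch only prints the requests it would send, so it can be run without a live instance.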
112+
96113
### Adding or removing an instance
97114

98115
Changing the topology of the network involves moving some documents from an instance to another, depending on your hashing scheme.
99116

100117
As Meilisearch does not provide atomicity across multiple instances, you will need to either:
101118

102-
1. accept search downtime while migrating documents
103-
2. accept some documents will not appear in search results during the migration
119+
1. accept search downtime while migrating documents
120+
2. accept some documents will not appear in search results during the migration
104121
3. accept some duplicate documents may appear in search results during the migration
105122

106123
#### Reducing downtime
107124

108125
If your disk space allows, you can reduce the downtime by applying the following algorithm:
109126

110-
1. Create a new temporary index in each remote instance
111-
2. Compute the new instance for each document
112-
3. Send the documents to the temporary index of their new instance
113-
4. Once Meilisearch has copied all documents to their instance of destination, swap the new index with the previously used index
114-
5. Delete the temporary index after the swap
115-
6. Update network configuration and search queries across all instances
127+
1. Create a new temporary index in each remote instance
128+
2. Compute the new instance for each document
129+
3. Send the documents to the temporary index of their new instance
130+
4. Once Meilisearch has copied all documents to their instance of destination, swap the new index with the previously used index
131+
5. Delete the temporary index after the swap
132+
6. Update network configuration and search queries across all instances
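
Step 2 above depends on a deterministic mapping from a document to an instance. As an illustration only, here is a minimal Go sketch of rendezvous hashing over the remote names. It is not necessarily the scheme Meilisearch uses internally when automatic sharding is enabled, and the function names, the FNV-1a hash, and the sample document ids are all assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// score hashes the (remote, key) pair. The remote with the highest score owns the key.
func score(remote, key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(remote))
	h.Write([]byte{0}) // separator so ("ab", "c") and ("a", "bc") hash differently
	h.Write([]byte(key))
	return h.Sum64()
}

// pickRemote returns the remote responsible for primaryKey,
// or "" when the list of remotes is empty.
func pickRemote(remotes []string, primaryKey string) string {
	best, bestScore := "", uint64(0)
	for i, r := range remotes {
		s := score(r, primaryKey)
		// Ties are broken by name so the result does not depend on list order.
		if i == 0 || s > bestScore || (s == bestScore && r < best) {
			best, bestScore = r, s
		}
	}
	return best
}

func main() {
	oldRemotes := []string{"ms-00", "ms-01"}
	newRemotes := []string{"ms-00", "ms-01", "ms-02"}
	for _, id := range []string{"movie-1", "movie-2", "movie-3", "movie-4"} {
		from, to := pickRemote(oldRemotes, id), pickRemote(newRemotes, id)
		fmt.Printf("%s: %s -> %s (move: %v)\n", id, from, to, from != to)
	}
}
```

With this kind of scheme, adding an instance only moves documents onto the new instance. Documents never move between the instances that were already in the network, which keeps the migration in steps 3 and 4 small.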
133+
134+
## Create indexes
135+
136+
Create the same empty indexes with the same settings on all instances.
137+
Keeping the settings and indexes in sync is important to avoid errors and unexpected behavior, though not strictly required.
138+
139+
## Add documents
140+
141+
Pick a single instance to send all your documents to. Documents will be replicated to the other instances.
116142

117-
## Create indexes and add documents
143+
Each instance indexes the documents it is responsible for and ignores the others.
118144

119-
Create the same empty indexes with the same settings on all instances. Keeping the settings and indexes in sync is important to avoid errors and unexpected behavior, though not strictly required.
145+
You *may* send the same document to multiple instances: the task will be replicated to all instances, and only the instance responsible for the document will index it.
120146

121-
Distribute your documents across all instances. Do not send the same document to multiple instances as this may lead to duplicate search results. Similarly, you should ensure all future versions of a document are sent to the same instance. Meilisearch recommends you hash their primary key using [rendezvous hashing](https://en.wikipedia.org/wiki/Rendezvous_hashing).
147+
Similarly, you may send any future versions of any document to the instance you picked, and only the correct instance will process that document.
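
A minimal Go sketch of the single-writer approach follows. It builds a document addition request (`POST /indexes/{index_uid}/documents`) against one chosen instance. The writer URL, the API key placeholder, the `movies` index, and the helper name `newAddDocumentsRequest` are assumptions for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// document is a loosely typed JSON document.
type document map[string]any

// newAddDocumentsRequest builds (but does not send) a request that adds docs
// to indexUID on the single instance chosen for writes.
func newAddDocumentsRequest(writerURL, apiKey, indexUID string, docs []document) (*http.Request, error) {
	body, err := json.Marshal(docs)
	if err != nil {
		return nil, err
	}
	endpoint := writerURL + "/indexes/" + url.PathEscape(indexUID) + "/documents"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, nil
}

func main() {
	// Hypothetical writer instance: every write goes through this one URL.
	const writer = "http://localhost:7700"
	docs := []document{
		{"id": 1, "title": "Carol"},
		{"id": 2, "title": "Wonder Woman"},
	}
	req, err := newAddDocumentsRequest(writer, "MEILISEARCH_KEY", "movies", docs)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To actually send the request: http.DefaultClient.Do(req)
}
```

Routing every write through one URL keeps the client free of any hashing logic.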
122148

123149
### Updating index settings
124150

reference/api/network.mdx

Lines changed: 31 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -40,14 +40,17 @@ Do not enable the `network` feature if you rely on the value of attributes not p
4040
```json
4141
{
4242
"self": "ms-00",
43+
"sharding": false,
4344
"remotes": {
4445
"ms-00": {
4546
"url": "http://ms-1235.example.meilisearch.io",
46-
"searchApiKey": "Ecd1SDDi4pqdJD6qYLxD3y7VZAEb4d9j6LJgt4d6xas"
47+
"searchApiKey": "Ecd1SDDi4pqdJD6qYLxD3y7VZAEb4d9j6LJgt4d6xas",
48+
"writeApiKey": "O2OaIHgwGuHNx9duH6kSe1YJ55Bh0dXvLhbr8FQVvr3vRVViBO"
4749
},
4850
"ms-01": {
4951
"url": "http://ms-4242.example.meilisearch.io",
50-
"searchApiKey": "hrVu-OMcjPGElK7692K7bwriBoGyHXTMvB5NmZkMKqQ"
52+
"searchApiKey": "hrVu-OMcjPGElK7692K7bwriBoGyHXTMvB5NmZkMKqQ",
53+
"writeApiKey": "bd1ldDoFlfyeoFDe8f3GVNiE8AHX86chmFuzOW7nWYUbPa7ww3"
5154
}
5255
}
5356
}
@@ -59,6 +62,12 @@ Do not enable the `network` feature if you rely on the value of attributes not p
5962
**Default value**: `null`<br />
6063
**Description**: A string indicating the name of the current instance
6164

65+
### `sharding`
66+
67+
**Type**: Boolean<br />
68+
**Default value**: `false`<br />
69+
**Description**: A boolean indicating whether sharding should be enabled on the network
70+
6271
### `remotes`
6372

6473
**Type**: Object<br />
@@ -70,7 +79,8 @@ Do not enable the `network` feature if you rely on the value of attributes not p
7079
```json
7180
"ms-00": {
7281
"url": "http://ms-1235.example.meilisearch.io",
73-
"searchApiKey": "Ecd1SDDi4pqdJD6qYLxD3y7VZAEb4d9j6LJgt4d6xas"
82+
"searchApiKey": "Ecd1SDDi4pqdJD6qYLxD3y7VZAEb4d9j6LJgt4d6xas",
83+
"writeApiKey": "O2OaIHgwGuHNx9duH6kSe1YJ55Bh0dXvLhbr8FQVvr3vRVViBO"
7484
}
7585
```
7686

@@ -86,6 +96,12 @@ Do not enable the `network` feature if you rely on the value of attributes not p
8696
**Default value**: `null`<br />
8797
**Description**: An API key with search permissions
8898

99+
##### `writeApiKey`
100+
101+
**Type**: String<br />
102+
**Default value**: `null`<br />
103+
**Description**: An API key with `documents.*` permissions
104+
89105
## Get the network object
90106

91107
<RouteHighlighter method="GET" path="/network"/>
@@ -101,14 +117,17 @@ Returns the current value of the instance's network object.
101117
```json
102118
{
103119
"self": "ms-00",
120+
"sharding": false,
104121
"remotes": {
105122
"ms-00": {
106123
"url": "http://ms-1235.example.meilisearch.io",
107-
"searchApiKey": "Ecd1SDDi4pqdJD6qYLxD3y7VZAEb4d9j6LJgt4d6xas"
124+
"searchApiKey": "Ecd1SDDi4pqdJD6qYLxD3y7VZAEb4d9j6LJgt4d6xas",
125+
"writeApiKey": "O2OaIHgwGuHNx9duH6kSe1YJ55Bh0dXvLhbr8FQVvr3vRVViBO"
108126
},
109127
"ms-01": {
110128
"url": "http://ms-4242.example.meilisearch.io",
111-
"searchApiKey": "hrVu-OMcjPGElK7692K7bwriBoGyHXTMvB5NmZkMKqQ"
129+
"searchApiKey": "hrVu-OMcjPGElK7692K7bwriBoGyHXTMvB5NmZkMKqQ",
130+
"writeApiKey": "bd1ldDoFlfyeoFDe8f3GVNiE8AHX86chmFuzOW7nWYUbPa7ww3"
112131
}
113132
}
114133
}
@@ -122,13 +141,14 @@ Update the `self` and `remotes` fields of the network object.
122141

123142
Updates to the network object are **partial**. Only provide the fields you intend to update. Fields not present in the payload will remain unchanged.
124143

125-
To reset `self` and `remotes` to their original value, set them to `null`. To remove a single `remote` from your network, set the value of its name to `null`.
144+
To reset `self`, `sharding` and `remotes` to their original value, set them to `null`. To remove a single `remote` from your network, set the value of its name to `null`.
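
As a sketch of how these partial payloads can be produced, the Go snippet below encodes a `null` for a single remote and for a set of top-level fields. The helper names `removeRemotePatch` and `resetPatch` are hypothetical, and the snippet only builds the JSON body; sending it with `PATCH /network` is left to the caller.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// removeRemotePatch builds a partial network payload that removes one remote
// by setting its name to null. Fields left out of the payload stay unchanged.
func removeRemotePatch(name string) ([]byte, error) {
	return json.Marshal(map[string]any{
		"remotes": map[string]any{
			name: nil, // a nil value encodes as JSON null
		},
	})
}

// resetPatch builds a payload that sets each given top-level field to null.
func resetPatch(fields ...string) ([]byte, error) {
	patch := make(map[string]any, len(fields))
	for _, f := range fields {
		patch[f] = nil
	}
	return json.Marshal(patch)
}

func main() {
	remove, err := removeRemotePatch("ms-01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(remove))

	reset, err := resetPatch("self", "sharding")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(reset))
}
```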
126145

127146
### Body
128147

129148
| Name | Type | Default value | Description |
130149
| :-------------------------------- | :----- | :------------ | :---------------------------------- |
131150
| **[`self`](#self)** | String | `null` | The name of the current instance |
151+
| **[`sharding`](#sharding)** | Boolean| `false` | Whether sharding should be enabled on the network |
132152
| **[`remotes`](#remotes)** | Object | `null` | A list of remote objects describing accessible Meilisearch instances |
133153

134154
### Example
@@ -140,14 +160,17 @@ To reset `self` and `remotes` to their original value, set them to `null`. To re
140160
```json
141161
{
142162
"self": "ms-00",
163+
"sharding": true,
143164
"remotes": {
144165
"ms-00": {
145166
"url": "http://INSTANCE_URL",
146-
"searchApiKey": "INSTANCE_API_KEY"
167+
"searchApiKey": "INSTANCE_API_KEY",
168+
"writeApiKey": "INSTANCE_WRITE_API_KEY"
147169
},
148170
"ms-01": {
149171
"url": "http://ANOTHER_INSTANCE_URL",
150-
"searchApiKey": "ANOTHER_INSTANCE_API_KEY"
172+
"searchApiKey": "ANOTHER_INSTANCE_API_KEY",
173+
"writeApiKey": "ANOTHER_INSTANCE_WRITE_API_KEY"
151174
}
152175
}
153176
}

reference/api/tasks.mdx

Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -187,6 +187,49 @@ The `details` object is set to `null` for `snapshotCreation` tasks.
187187
| **`type`** | The [error type](/reference/errors/overview#errors) |
188188
| **`link`** | A link to the relevant section of the documentation |
189189

190+
### `network`
191+
192+
<NoticeTag type="experimental" label="experimental" />
193+
194+
**Type**: Object<br />
195+
**Description**: If the task was replicated from another remote or to other remotes, `network` contains information about the remote task uids corresponding to this task. Otherwise, this field is absent from the task object.
196+
197+
`network` has a single key, which is either `origin` or `remotes`.
198+
199+
| Name | Description |
200+
| :-------------| :----------------------------------------------------------------------|
201+
| **`origin`** | The task and remote this task was replicated **from** |
202+
| **`remotes`** | The tasks and remotes this task was replicated **to** |
203+
204+
`origin` is itself an object with keys:
205+
206+
| Name | Description |
207+
| :--------------- | :--------------------------------------------|
208+
| **`remoteName`** | The name of the [remote](/reference/api/network) |
209+
| **`taskUid`** | The uid of the task of origin |
210+
211+
`remotes` is itself an object whose keys are the names of the [remotes](/reference/api/network) and whose values are objects with a single key, either `task_uid` or `error`:
212+
213+
| Name | Description |
214+
| :------------- | :---------------------------------------------------------------------------------------------|
215+
| **`task_uid`** | The uid of the replicated task |
216+
| **`error`** | A human-readable error message indicating why the task could not be replicated to that remote |
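
To make the shape of this object concrete, here is a minimal Go sketch that decodes the `network` field of a task and reports how the task was replicated. The struct and function names are hypothetical, the key names follow the tables above, and the sample payloads are invented for illustration, with task uids assumed to be integers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// origin describes where a replicated task came from.
type origin struct {
	RemoteName string `json:"remoteName"`
	TaskUID    int    `json:"taskUid"`
}

// remoteResult holds either the uid of the replicated task or an error message.
type remoteResult struct {
	TaskUID *int    `json:"task_uid,omitempty"`
	Error   *string `json:"error,omitempty"`
}

// taskNetwork models the `network` field: only one of the two keys is set.
type taskNetwork struct {
	Origin  *origin                 `json:"origin,omitempty"`
	Remotes map[string]remoteResult `json:"remotes,omitempty"`
}

// describe summarizes the replication status of a raw task object.
func describe(raw []byte) (string, error) {
	var task struct {
		Network *taskNetwork `json:"network"`
	}
	if err := json.Unmarshal(raw, &task); err != nil {
		return "", err
	}
	switch {
	case task.Network == nil:
		return "not replicated", nil
	case task.Network.Origin != nil:
		o := task.Network.Origin
		return fmt.Sprintf("replicated from %s (task %d)", o.RemoteName, o.TaskUID), nil
	default:
		return fmt.Sprintf("replicated to %d remote(s)", len(task.Network.Remotes)), nil
	}
}

func main() {
	// Invented sample payloads shaped after the tables above.
	samples := []string{
		`{"uid": 1}`,
		`{"uid": 2, "network": {"origin": {"remoteName": "ms-00", "taskUid": 12}}}`,
		`{"uid": 3, "network": {"remotes": {"ms-01": {"task_uid": 7}, "ms-02": {"error": "unreachable"}}}}`,
	}
	for _, s := range samples {
		summary, err := describe([]byte(s))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(summary)
	}
}
```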
217+
218+
<Note>
219+
This is an experimental feature. Use the Meilisearch Cloud UI or the experimental features endpoint to activate it:
220+
221+
```sh
222+
curl \
223+
-X PATCH 'MEILISEARCH_URL/experimental-features/' \
224+
-H 'Content-Type: application/json' \
225+
--data-binary '{
226+
"network": true
227+
}'
228+
```
229+
</Note>
230+
231+
232+
190233
### `duration`
191234

192235
**Type**: String<br />
Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
<CodeGroup>
2+
3+
```go Go
4+
client.GetNetwork();
5+
```
6+
</CodeGroup>
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
<CodeGroup>
2+
3+
```go Go
4+
client.UpdateNetwork(&meilisearch.Network{
5+
Self: meilisearch.String("TEST"),
6+
Remotes: meilisearch.NewOpt(map[string]meilisearch.Opt[meilisearch.Remote]{
7+
"ms-00": meilisearch.NewOpt(meilisearch.Remote{
8+
URL: meilisearch.String("https://example.com"),
9+
SearchAPIKey: meilisearch.String("TEST"),
10+
}),
11+
}),
12+
});
13+
```
14+
</CodeGroup>
Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
<CodeGroup>
2+
3+
```go Go
4+
client.UpdateNetwork(&meilisearch.Network{
5+
Self: meilisearch.String("TEST"),
6+
Remotes: meilisearch.Null[map[string]meilisearch.Opt[meilisearch.Remote]](),
7+
});
8+
```
9+
</CodeGroup>
