6 changes: 6 additions & 0 deletions docs/changelog/112348.yaml
@@ -0,0 +1,6 @@
pr: 112348
summary: Introduce repository integrity verification API
area: Snapshot/Restore
type: enhancement
issues:
- 52622
@@ -28,6 +28,7 @@ For more information, see <<snapshot-restore>>.
include::put-repo-api.asciidoc[]
include::verify-repo-api.asciidoc[]
include::repo-analysis-api.asciidoc[]
include::verify-repo-integrity-api.asciidoc[]
include::get-repo-api.asciidoc[]
include::delete-repo-api.asciidoc[]
include::clean-up-repo-api.asciidoc[]
@@ -0,0 +1,185 @@
[role="xpack"]
[[verify-repo-integrity-api]]
=== Verify repository integrity API
++++
<titleabbrev>Verify repository integrity</titleabbrev>
++++

Verifies the integrity of the contents of a snapshot repository.

////
[source,console]
----
PUT /_snapshot/my_repository
{
"type": "fs",
"settings": {
"location": "my_backup_location"
}
}
----
// TESTSETUP
////

[source,console]
----
POST /_snapshot/my_repository/_verify_integrity
----

[[verify-repo-integrity-api-request]]
==== {api-request-title}

`POST /_snapshot/<repository>/_verify_integrity`

[[verify-repo-integrity-api-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have the `manage`
<<privileges-list-cluster,cluster privilege>> to use this API. For more
information, see <<security-privileges>>.

[[verify-repo-integrity-api-desc]]
==== {api-description-title}

This API allows you to perform a comprehensive check of the contents of a
repository, looking for any anomalies in its data or metadata which might
prevent you from restoring snapshots from the repository or which might cause
future snapshot create or delete operations to fail.

The default values for the parameters of this API are designed to limit the
impact of the integrity verification on other activities in your cluster. For
instance, by default it uses at most half of the `snapshot_meta` threads to
verify the integrity of each snapshot, allowing other snapshot operations to
use the other half of this thread pool.
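
As a rough illustration, and assuming that a slower verification is acceptable
for your use case, the following request sketches how you might reduce the
impact further by verifying only one snapshot at a time. The value `1` is an
arbitrary choice for this example, not a recommendation.

[source,console]
----
POST /_snapshot/my_repository/_verify_integrity?snapshot_verification_concurrency=1
----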

You should avoid any operations which write to the repository while this API is
running. If something changes the repository contents while an integrity
verification is running then {es} may incorrectly report having detected some
anomalies in its contents due to the concurrent writes. It may also incorrectly
fail to report some anomalies that the concurrent writes prevented it from
detecting.

NOTE: This API is intended for exploratory use by humans. You should expect the
request parameters and the response format to vary in future versions.

NOTE: This API may not work correctly in a mixed-version cluster.

[[verify-repo-integrity-api-path-params]]
==== {api-path-parms-title}

`<repository>`::
(Required, string)
Name of the snapshot repository whose integrity you want to verify.

[[verify-repo-integrity-api-query-params]]
==== {api-query-parms-title}

`snapshot_verification_concurrency`::
(Optional, integer) Specifies the number of snapshots to verify concurrently.
Defaults to `0`, which means to use at most half of the `snapshot_meta` thread
pool at once.

`index_verification_concurrency`::
(Optional, integer) Specifies the number of indices to verify concurrently.
Defaults to `0`, which means to use the entire `snapshot_meta` thread pool.

`meta_thread_pool_concurrency`::
(Optional, integer) Specifies the maximum number of snapshot metadata
operations to execute concurrently. Defaults to `0`, which means to use at most
half of the `snapshot_meta` thread pool at once.

`index_snapshot_verification_concurrency`::
(Optional, integer) Specifies the maximum number of index snapshots to verify
concurrently within each index verification. Defaults to `1`.

`max_failed_shard_snapshots`::
(Optional, integer) Limits the number of shard snapshot failures to track
during integrity verification, in order to avoid excessive resource usage. If
your repository contains more than this number of shard snapshot failures then
the verification will fail. Defaults to `10000`.

`verify_blob_contents`::
(Optional, boolean) Specifies whether to verify the checksum of every data blob
in the repository. Defaults to `false`. If this feature is enabled, {es} will
read the entire repository contents, which may be extremely slow and expensive.

`blob_thread_pool_concurrency`::
(Optional, integer) If `?verify_blob_contents` is `true`, this parameter
specifies how many blobs to verify at once. Defaults to `1`.

`max_bytes_per_sec`::
(Optional, <<size-units, size units>>)
If `?verify_blob_contents` is `true`, this parameter specifies the maximum
amount of data that {es} will read from the repository every second. Defaults
to `10mb`.
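
For example, the following request is a sketch of how you might enable blob
contents verification while throttling reads from the repository. The specific
values (`2` and `5mb`) are arbitrary and chosen purely for illustration.

[source,console]
----
POST /_snapshot/my_repository/_verify_integrity?verify_blob_contents=true&blob_thread_pool_concurrency=2&max_bytes_per_sec=5mb
----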

[role="child_attributes"]
[[verify-repo-integrity-api-response-body]]
==== {api-response-body-title}

The response exposes implementation details of the verification which may
change from version to version. The response body format is therefore not
considered stable and may be different in newer versions.

`log`::
(array) A sequence of objects that report the progress of the verification.
+
.Properties of `log`
[%collapsible%open]
====
`snapshot`::
(object) If the log entry pertains to a particular snapshot then the snapshot
will be described in this object.

`index`::
(object) If the log entry pertains to a particular index then the index will be
described in this object.

`snapshot_restorability`::
(object) If the log entry pertains to the restorability of an index then the
details will be described in this object.

`anomaly`::
(string) If the log entry pertains to an anomaly in the repository contents then
this string will describe the anomaly.

`exception`::
(object) If the log entry pertains to an exception that {es} encountered during
the verification then the details will be included in this object.

====

`results`::
(object) An object which describes the final results of the verification.
+
.Properties of `results`
[%collapsible%open]
====
`status`::
(object) The final status of the verification task.

`final_repository_generation`::
(integer) The repository generation at the end of the verification. If there
were any writes to the repository during the verification then this value will
be different from the `generation` reported in the task status, and the
verification may have detected spurious anomalies due to the concurrent writes,
or may even have failed to detect some anomalies in the repository contents.

`total_anomalies`::
(integer) The total number of anomalies detected during the verification.

`result`::
(string) The final result of the verification. If the repository contents
appear to be intact then this will be the string `pass`. If this field is
missing, or contains some other value, then the repository contents were not
fully verified.

====

`exception`::
(object) If the verification encountered an exception which prevented it from
completing successfully then this exception will be reported here.
4 changes: 3 additions & 1 deletion docs/reference/snapshot-restore/register-repository.asciidoc
@@ -272,7 +272,9 @@ filesystem snapshot of this repository.
When restoring a repository from a backup, you must not register the repository
with {es} until the repository contents are fully restored. If you alter the
contents of a repository while it is registered with {es} then the repository
-may become unreadable or may silently lose some of its contents.
+may become unreadable or may silently lose some of its contents. After
+restoring a repository from a backup, use the <<verify-repo-integrity-api>> API
+to verify its integrity before you start to use the repository.

include::repository-azure.asciidoc[]
include::repository-gcs.asciidoc[]
@@ -281,6 +281,13 @@ public Collection<SnapshotId> getSnapshotIds() {
        return snapshotIds.values();
    }

    /**
     * @return the number of index snapshots (i.e. the sum of the index count of each snapshot)
     */
    public long getIndexSnapshotCount() {
        return indexSnapshots.values().stream().mapToLong(List::size).sum();
    }

    /**
     * @return whether some of the {@link SnapshotDetails} of the given snapshot are missing, due to BwC, so that they must be loaded from
     * the {@link SnapshotInfo} blob instead.
@@ -98,6 +98,7 @@ public class Constants {
        "cluster:admin/snapshot/restore",
        "cluster:admin/snapshot/status",
        "cluster:admin/snapshot/status[nodes]",
        "cluster:admin/repository/verify_integrity",
        "cluster:admin/features/get",
        "cluster:admin/features/reset",
        "cluster:admin/tasks/cancel",