It is important to first understand the [Basic write model] of documents:
documents are written to Lucene in-memory buffers, then "refreshed" to searchable segments which may not be persisted on disk, and finally "flushed" to a durable Lucene commit on disk.
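The three stages above can be illustrated with a toy model. This is only a sketch: `ToyShard` and its fields are hypothetical names, plain lists stand in for Lucene buffers, segments, and commits, and the flush is assumed to refresh first.

```python
class ToyShard:
    """Toy model of the write path: buffer -> refresh -> flush."""

    def __init__(self):
        self.buffer = []      # written, but not yet searchable
        self.segments = []    # searchable, not necessarily durable
        self.committed = []   # stands in for a durable Lucene commit

    def index(self, doc):
        # A write only reaches the in-memory buffer.
        self.buffer.append(doc)

    def refresh(self):
        # Buffered documents become searchable.
        self.segments.extend(self.buffer)
        self.buffer.clear()

    def flush(self):
        # Assumption of this sketch: a flush refreshes, then commits.
        self.refresh()
        self.committed = list(self.segments)

    def search(self):
        return list(self.segments)
```

A document is invisible to `search()` until `refresh()`, and absent from `committed` until `flush()`; the translog exists to cover that last gap.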
@@ -158,7 +157,6 @@ The translog ultimately truncates operations once they have been flushed to disk
Main usages of the translog are:

* During recovery, an index shard can be recovered up to at least the last acknowledged operation by replaying the translog onto the last flushed commit of the shard.
-* During a replica recovery, it may recover some lost operations from the primary's translog if needed before falling back to a complete recovery of Lucene files from the primary.
* Facilitate real-time (m)GETs of documents without refreshing.
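
The recovery usage in the list above can be sketched as a replay loop. This is a simplified illustration, not the Elasticsearch implementation: `recover`, the tuple layout of an operation, and the `committed_up_to` field are all hypothetical, and the commit is assumed to contain every operation up to that sequence number.

```python
def recover(last_commit, translog_ops):
    """Rebuild shard state by replaying translog ops onto the last commit.

    last_commit: {"docs": {doc_id: source}, "committed_up_to": seqno}
    translog_ops: iterable of (seqno, kind, doc_id, source) tuples
    """
    docs = dict(last_commit["docs"])
    committed_up_to = last_commit["committed_up_to"]
    for seqno, kind, doc_id, source in sorted(translog_ops, key=lambda op: op[0]):
        if seqno <= committed_up_to:
            continue  # already part of the commit, nothing to replay
        if kind == "index":
            docs[doc_id] = source
        elif kind == "delete":
            docs.pop(doc_id, None)
        # a "noop" operation changes no document
    return docs
```

Replaying in sequence-number order is what lets the shard reach at least the last acknowledged operation, provided that operation was fsync'ed to the translog before it was acknowledged.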
#### Translog Truncation
@@ -168,18 +166,16 @@ Main usages of the translog are:
Translog files are automatically truncated when they are no longer needed, specifically after all their operations have been persisted by Lucene commits on disk.
Lucene commits are initiated by flushes (e.g., with the index [Flush API]).

-Flushes may also be automatically initiated by Elasticsearch, e.g., if the translog exceeds a configurable size ([`INDEX_TRANSLOG_FLUSH_THRESHOLD_SIZE_SETTING`](https://github.com/elastic/elasticsearch/blob/dd1db5031ee7fdac284753c0c3b096b0e981d71a/server/src/main/java/org/elasticsearch/index/IndexSettings.java#L352)) or age ([`INDEX_TRANSLOG_FLUSH_THRESHOLD_AGE_SETTING`](https://github.com/elastic/elasticsearch/blob/dd1db5031ee7fdac284753c0c3b096b0e981d71a/server/src/main/java/org/elasticsearch/index/IndexSettings.java#L370)),which ultimately truncates the translog as well.
+Flushes may also be automatically initiated by Elasticsearch, e.g., if the translog exceeds a configurable size ([`INDEX_TRANSLOG_FLUSH_THRESHOLD_SIZE_SETTING`](https://github.com/elastic/elasticsearch/blob/dd1db5031ee7fdac284753c0c3b096b0e981d71a/server/src/main/java/org/elasticsearch/index/IndexSettings.java#L352)) or age ([`INDEX_TRANSLOG_FLUSH_THRESHOLD_AGE_SETTING`](https://github.com/elastic/elasticsearch/blob/dd1db5031ee7fdac284753c0c3b096b0e981d71a/server/src/main/java/org/elasticsearch/index/IndexSettings.java#L370)),which ultimately truncates the translog as well.
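
The size-triggered flush and the truncation that follows can be sketched with a toy translog. The class, its methods, and the `flush_threshold_bytes` parameter are hypothetical stand-ins; the sketch assumes nothing else (such as a retention policy) keeps old generations alive.

```python
class ToyTranslog:
    """Toy translog that asks for a flush when it grows past a threshold."""

    def __init__(self, flush_threshold_bytes):
        self.flush_threshold_bytes = flush_threshold_bytes
        self.current = 1
        self.generations = {1: []}  # generation id -> list of encoded ops

    def size(self):
        return sum(len(op) for ops in self.generations.values() for op in ops)

    def add(self, op):
        # Append to the generation that is open for writes and report
        # whether the caller should now trigger a flush.
        self.generations[self.current].append(op)
        return self.size() > self.flush_threshold_bytes

    def on_flush(self):
        # After a commit, every op in the existing generations is durable
        # elsewhere: roll to a new generation and drop the old ones.
        old = list(self.generations)
        self.current += 1
        self.generations[self.current] = []
        for generation in old:
            del self.generations[generation]
```

In this model `add()` only signals that the threshold was crossed; the caller decides when to commit and then calls `on_flush()`.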
#### Acknowledging writes

A bulk request will ultimately, and repeatedly, call Engine methods such as [`index()` or `delete()`](https://github.com/elastic/elasticsearch/blob/591fa87e43a509d3eadfdbbb296cdf08453ea91a/server/src/main/java/org/elasticsearch/index/engine/Engine.java#L546-L564), which add operations to the Translog.
-Finally, the AfterWrite action of the `[TransportWriteAction](https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java)` will call `[indexShard.syncAfterWrite()](https://github.com/elastic/elasticsearch/blob/387eef070c25ed57e4139158e7e7e0ed097c8c98/server/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java#L548)` which will put the last written transloc Location of the bulk request into a `[AsyncIOProcessor](https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/common/util/concurrent/AsyncIOProcessor.java)` that is responsible for gradually fsync'ing the Translog and notifying any waiters.
+Finally, the AfterWrite action of the [`TransportWriteAction`](https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java) will call [`indexShard.syncAfterWrite()`](https://github.com/elastic/elasticsearch/blob/387eef070c25ed57e4139158e7e7e0ed097c8c98/server/src/main/java/org/elasticsearch/action/support/replication/TransportWriteAction.java#L548) which will put the last written translog [`Location`](https://github.com/elastic/elasticsearch/blob/693f3bfe30271d77a6b3147e4519b4915cbb395d/server/src/main/java/org/elasticsearch/index/translog/Translog.java#L977) of the bulk request into a [`AsyncIOProcessor`](https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/common/util/concurrent/AsyncIOProcessor.java) that is responsible for gradually fsync'ing the Translog and notifying any waiters.
Ultimately the bulk request is notified that the translog has fsync'ed past the requested location, and can continue to acknowledge the bulk request.
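
The idea of batching fsyncs and notifying waiters can be sketched as follows. This is a single-threaded toy loosely inspired by the description above, not the real `AsyncIOProcessor`, which is concurrent; `ToySyncProcessor`, its methods, and the use of a plain byte offset as a location are all assumptions of the sketch.

```python
class ToySyncProcessor:
    """Toy batching of fsyncs: one sync can acknowledge many waiters."""

    def __init__(self):
        self.written = 0      # bytes appended to the translog so far
        self.synced = 0       # bytes known to be fsync'ed
        self.waiters = []     # (location, callback) pairs awaiting a sync
        self.fsync_calls = 0

    def write(self, nbytes):
        # Append an operation and return the location just past it.
        self.written += nbytes
        return self.written

    def sync_after_write(self, location, on_synced):
        # Register interest in the translog being synced past `location`.
        self.waiters.append((location, on_synced))

    def drain(self):
        # One simulated fsync covers everything written so far.
        if not self.waiters:
            return
        self.synced = self.written
        self.fsync_calls += 1
        ready = [w for w in self.waiters if w[0] <= self.synced]
        self.waiters = [w for w in self.waiters if w[0] > self.synced]
        for location, on_synced in ready:
            on_synced(location)
```

Two bulk requests that register before `drain()` runs are both acknowledged by the same simulated fsync, which is the benefit of queueing locations rather than syncing once per request.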
Each translog is a sequence of files, each identified by a translog generation ID, each containing a sequence of operations, with the last file open for writes.
The last file has a part which has been fsync'ed to disk, and a part which has been written but not necessarily fsync'ed yet to disk.
Each operation is identified by a sequence number (`seqno`), which is monotonically increased by the engine's ingestion functionality.
@@ -191,7 +187,6 @@ A few more words on terminology and classes used around the translog Java packag
A [`Location`](https://github.com/elastic/elasticsearch/blob/693f3bfe30271d77a6b3147e4519b4915cbb395d/server/src/main/java/org/elasticsearch/index/translog/Translog.java#L977) of an operation is defined by the translog generation file it is contained in, the offset of the operation in that file, and the number of bytes that encode that operation.
An [`Operation`](https://github.com/elastic/elasticsearch/blob/693f3bfe30271d77a6b3147e4519b4915cbb395d/server/src/main/java/org/elasticsearch/index/translog/Translog.java#L1087) can be a document index, a document deletion, or a no-op.
A [`Snapshot`](https://github.com/elastic/elasticsearch/blob/693f3bfe30271d77a6b3147e4519b4915cbb395d/server/src/main/java/org/elasticsearch/index/translog/Translog.java#L711) iterator can be created to iterate over a range of requested operation sequence numbers read from the translog files.
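
The terminology above maps onto small data structures. The following is a minimal sketch with hypothetical Python names; the field layout mirrors the prose (generation, offset, size; sequence number per operation), not the actual Java classes.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True, order=True)
class Location:
    generation: int  # which translog generation file holds the operation
    offset: int      # byte offset of the operation inside that file
    size: int        # number of bytes that encode the operation


@dataclass(frozen=True)
class Operation:
    seqno: int
    kind: str        # "index", "delete" or "noop"
    doc_id: str = ""


def snapshot(generations: Dict[int, List[Operation]],
             from_seqno: int, to_seqno: int) -> Iterator[Operation]:
    """Iterate over operations whose seqno lies in [from_seqno, to_seqno]."""
    for generation in sorted(generations):
        for op in generations[generation]:
            if from_seqno <= op.seqno <= to_seqno:
                yield op
```

Because `Location` is ordered by generation first and offset second, "the translog has fsync'ed past a location" reduces to a simple comparison in this model.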
-A retention lock can be acquired for [History retention] purposes, e.g., for potentially facilitating a replica shard's recovery, which prohibits truncating the translog files.
The [`sync()`](https://github.com/elastic/elasticsearch/blob/693f3bfe30271d77a6b3147e4519b4915cbb395d/server/src/main/java/org/elasticsearch/index/translog/Translog.java#L813) method is the one that fsync's the current translog generation file to disk, and updates the checkpoint file with the last fsync'ed operation and location.
The [`rollGeneration()`](https://github.com/elastic/elasticsearch/blob/693f3bfe30271d77a6b3147e4519b4915cbb395d/server/src/main/java/org/elasticsearch/index/translog/Translog.java#L1656) method is the one that rolls the translog, creating a new translog generation, e.g., called during an index flush.
The [`createEmptyTranslog()`](https://github.com/elastic/elasticsearch/blob/693f3bfe30271d77a6b3147e4519b4915cbb395d/server/src/main/java/org/elasticsearch/index/translog/Translog.java#L1929) method creates a new translog, e.g., for a new empty index shard.
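
How syncing and rolling interact with the checkpoint can be sketched with a toy writer. The names below are hypothetical, and the sketch assumes that rolling syncs the current generation before opening the next one.

```python
class ToyTranslogWriter:
    """Toy model of sync and generation rolling with a checkpoint."""

    def __init__(self):
        self.generation = 1
        self.written_offset = 0
        # The checkpoint records the last fsync'ed position.
        self.checkpoint = {"generation": 1, "offset": 0}

    def add(self, nbytes):
        # Written, but not necessarily fsync'ed yet.
        self.written_offset += nbytes

    def sync(self):
        # Simulated fsync of the current generation file, followed by a
        # checkpoint update that records how far the sync reached.
        self.checkpoint = {"generation": self.generation,
                           "offset": self.written_offset}

    def roll_generation(self):
        # Assumption: the current file is synced before a new one is opened.
        self.sync()
        self.generation += 1
        self.written_offset = 0
        self.checkpoint = {"generation": self.generation, "offset": 0}
```

The gap between `written_offset` and the checkpoint's `offset` corresponds to the part of the last file that has been written but not necessarily fsync'ed.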