Commit 1daf063

lankas authored and evergreen committed
SERVER-43779 Add section on Two Phase Drops to the Replication Architecture Guide
1 parent ec0bf80 commit 1daf063

1 file changed: +33 -0 lines changed

src/mongo/db/repl/README.md

Lines changed: 33 additions & 0 deletions
@@ -748,3 +748,36 @@ where we cannot assume a drop will come and fix them, we abort and retry initial
The oplog application phase concludes when the node applies `minValid`. The node checks its sync
source's Rollback ID to see if a rollback occurred and if so, restarts initial sync. Otherwise, the
`DataReplicator` shuts down and the `ReplicationCoordinator` starts steady state replication.

# Dropping Collections and Databases

In 3.6, the Two Phase Drop Algorithm was added to the replication layer to support collection
and database drops. It made it easy to support rollbacks of drop operations. In 4.2, the
implementation for collection drops was moved to the storage engine. This section covers the
behavior of the implementation in the replication layer, which currently runs on nodes where
<!-- TODO SERVER-43788: Link to the section describing enableMajorityReadConcern=false -->
`enableMajorityReadConcern` is set to false.

## Dropping Collections

Dropping an unreplicated collection happens immediately. However, dropping a replicated
collection requires two phases.

In the first phase, if the node is the primary, it writes a "dropCollection" oplog entry. The
collection is flagged as dropped by being added to a list in the `DropPendingCollectionReaper`
(along with its OpTime), but the storage engine does not delete the collection data yet. Every
time the `ReplicationCoordinator` advances the commit point, the node checks whether any drop's
OpTime is before or at the majority commit point. Any such drops then move to phase 2, and
the `DropPendingCollectionReaper` tells the storage engine to drop the collection.

Because the node waits until the "dropCollection" oplog entry is majority committed before
dropping the collection, only drops in phase 1 can be rolled back. The storage engine therefore
still has the collection's data for any such drop, so in the case of a rollback it can easily
restore the collection.
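
The two phases and the rollback case can be illustrated with a small toy model. This is only a
sketch and not the server's code: `ToyNode`, its methods, and the use of plain integers as
OpTimes are hypothetical simplifications, and "rolling back" here just means undoing every
pending drop whose OpTime is after an assumed rollback point.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy model of a single node (hypothetical; not the server's real classes)."""
    storage: dict = field(default_factory=dict)       # data held by the storage engine
    visible: set = field(default_factory=set)         # collections that are not dropped
    drop_pending: list = field(default_factory=list)  # (drop OpTime, name) for phase 1 drops

    def create(self, name, docs):
        self.storage[name] = list(docs)
        self.visible.add(name)

    def drop_collection(self, name, drop_optime):
        # Phase 1: flag the collection as dropped, but keep its data.
        self.visible.remove(name)
        self.drop_pending.append((drop_optime, name))

    def on_commit_point_advanced(self, commit_point):
        # Phase 2: physically drop every collection whose drop OpTime is
        # before or at the majority commit point.
        still_pending = []
        for optime, name in self.drop_pending:
            if optime <= commit_point:
                del self.storage[name]
            else:
                still_pending.append((optime, name))
        self.drop_pending = still_pending

    def roll_back_after(self, rollback_point):
        # Undo phase 1 drops that happened after the rollback point.
        # The data is still in storage, so restoring only clears the flag.
        still_pending = []
        for optime, name in self.drop_pending:
            if optime > rollback_point:
                self.visible.add(name)
            else:
                still_pending.append((optime, name))
        self.drop_pending = still_pending


node = ToyNode()
node.create("a", [1])
node.create("b", [2])
node.drop_collection("a", drop_optime=5)
node.drop_collection("b", drop_optime=8)
node.on_commit_point_advanced(6)  # "a" moves to phase 2; "b" stays in phase 1
node.roll_back_after(6)           # "b" is restored with its data intact
```

In this model a drop that has reached phase 2 can never be rolled back, because its OpTime is
already at or before the commit point.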

## Dropping Databases

When a node receives a `dropDatabase` command, it initiates a Two Phase Drop, as described above,
for each collection in the relevant database. Once all collection drops are replicated to a
majority of nodes, the node drops the now-empty database and writes a `dropDatabase` command
entry to the oplog.
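
The ordering described here can be sketched as follows. The function `drop_database`, its
parameters, and the integer OpTimes are hypothetical and exist only for illustration; the
sketch records one event per collection drop and emits the `dropDatabase` event only after a
commit point covers every collection drop.

```python
def drop_database(collections, next_optime, commit_points):
    """Toy model of dropDatabase ordering; returns the list of events in order."""
    events = []
    drop_optimes = []
    # Start a two phase drop for each collection; each drop gets its own OpTime.
    for name in collections:
        events.append(("dropCollection", name, next_optime))
        drop_optimes.append(next_optime)
        next_optime += 1
    # Wait until the commit point is at or past every collection drop.
    for commit_point in commit_points:
        if all(optime <= commit_point for optime in drop_optimes):
            events.append(("dropDatabase", next_optime))
            return events
    # The collection drops are not yet majority committed.
    return events


events = drop_database(["a", "b"], next_optime=10, commit_points=[9, 10, 11])
```

With these inputs the `dropDatabase` event is emitted only when the commit point reaches 11,
the OpTime of the last collection drop.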
