Commit 50aab7d

Merge pull request ceph#50212 from rishabh-d-dave/fs-swap-subcmd
cephfs: add command "ceph fs swap"

Reviewed-by: Venky Shankar <[email protected]>
Reviewed-by: Patrick Donnelly <[email protected]>
2 parents 8858839 + 9c547ad commit 50aab7d

12 files changed, +1155 -24 lines changed

PendingReleaseNotes

Lines changed: 7 additions & 0 deletions
@@ -50,6 +50,13 @@
 recommend that users with versioned buckets, especially those that existed
 on prior releases, use these new tools to check whether their buckets are
 affected and to clean them up accordingly.
+* CephFS: Two FS names can now be swapped, optionally along with their IDs,
+using the "ceph fs swap" command. The function of this API is to facilitate
+file system swaps for disaster recovery. In particular, it avoids situations
+where a named file system is temporarily missing, which would prompt a higher
+level storage operator (like Rook) to recreate the missing file system.
+See https://docs.ceph.com/en/latest/cephfs/administration/#file-systems
+docs for more information.

 >=18.0.0

doc/cephfs/administration.rst

Lines changed: 40 additions & 0 deletions
@@ -92,6 +92,46 @@ The CephX IDs authorized to the old file system name need to be reauthorized
 to the new name. Any on-going operations of the clients using these IDs may be
 disrupted. Mirroring is expected to be disabled on the file system.

+::
+
+    fs swap <fs1-name> <fs1_id> <fs2-name> <fs2_id> [--swap-fscids=yes|no] [--yes-i-really-mean-it]
+
+Swaps names of two Ceph file systems and updates the application tags on all
+pools of both FSs accordingly. Certain tools that track FSCIDs of the file
+systems, besides the FS names, might get confused due to this operation. For
+this reason, the mandatory option ``--swap-fscids`` has been provided; it must be
+used to indicate whether or not FSCIDs must be swapped.
+
+.. note:: FSCID stands for "File System Cluster ID".
+
+Before the swap, mirroring should be disabled on both CephFSs
+(because the cephfs-mirror daemon uses the fscid internally and changing it
+while the daemon is running could result in undefined behaviour), both
+CephFSs should be offline and the file system flag ``refuse_client_sessions``
+must be set for both CephFSs.
+
+The function of this API is to facilitate disaster recovery where a new file
+system reconstructed from the previous one is ready to take over for the
+possibly damaged file system. Instead of two ``fs rename`` operations, the
+operator can use a swap so there is no FSMap epoch where the primary (or
+production) named file system does not exist. This is important when Ceph is
+monitored by automatic storage operators (like Rook) which try to reconcile
+the storage system continuously. That operator may attempt to recreate the
+file system as soon as it is seen to not exist.
+
+After the swap, CephX credentials may need to be reauthorized if the existing
+mounts should "follow" the old file system to its new name. Generally, for
+disaster recovery, it's desirable for the existing mounts to continue using
+the same file system name. Any active file system mounts for either CephFS
+must remount. Existing unflushed operations will be lost. When it is judged
+that one of the swapped file systems is ready for clients, run::
+
+    ceph fs set <fs> joinable true
+    ceph fs set <fs> refuse_client_sessions false
+
+Keep in mind that one of the swapped file systems may be left offline for
+future analysis if doing a disaster recovery swap.
+

 Settings
 --------
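
The swap procedure described in the doc/cephfs/administration.rst hunk above can be laid out as an ordered command list. What follows is a minimal sketch and not part of this commit. The helper names `build_swap_plan` and `build_reopen_plan` are hypothetical. Using `ceph fs fail` to take each file system offline is an assumption, since the documentation only says both must be offline. Disabling mirroring is left to the operator, and whether `--yes-i-really-mean-it` is required is not stated in the text, so it is exposed as a parameter. The functions only build argument lists; nothing is executed.

```python
from typing import List


def build_swap_plan(fs1: str, fs1_id: int, fs2: str, fs2_id: int,
                    swap_fscids: bool, confirm: bool = True) -> List[List[str]]:
    """Return the ceph CLI invocations for a swap, in order (nothing is run)."""
    plan: List[List[str]] = []
    for fs in (fs1, fs2):
        # Assumption: "ceph fs fail" is how the operator takes the FS offline.
        plan.append(["ceph", "fs", "fail", fs])
        # Documented prerequisite: the flag must be set on both file systems.
        plan.append(["ceph", "fs", "set", fs, "refuse_client_sessions", "true"])
    # --swap-fscids is described as mandatory, so it is always passed explicitly.
    swap_cmd = ["ceph", "fs", "swap", fs1, str(fs1_id), fs2, str(fs2_id),
                "--swap-fscids=yes" if swap_fscids else "--swap-fscids=no"]
    if confirm:
        swap_cmd.append("--yes-i-really-mean-it")
    plan.append(swap_cmd)
    return plan


def build_reopen_plan(fs: str) -> List[List[str]]:
    """Commands the documentation lists for reopening one swapped FS to clients."""
    return [
        ["ceph", "fs", "set", fs, "joinable", "true"],
        ["ceph", "fs", "set", fs, "refuse_client_sessions", "false"],
    ]


if __name__ == "__main__":
    # "prod" and "recovered" and their IDs are placeholder values.
    for cmd in build_swap_plan("prod", 1, "recovered", 2, swap_fscids=False):
        print(" ".join(cmd))
    for cmd in build_reopen_plan("prod"):
        print(" ".join(cmd))
```

Keeping the plan as data makes it easy to review before handing each entry to a runner such as `subprocess.run`. Only the file system meant to serve clients needs the reopen step, because the other one may stay offline for analysis.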

doc/man/8/ceph.rst

Lines changed: 10 additions & 1 deletion
@@ -23,7 +23,7 @@ Synopsis

 | **ceph** **df** *{detail}*

-| **ceph** **fs** [ *add_data_pool* \| *authorize* \| *dump* \| *feature ls* \| *flag set* \| *get* \| *ls* \| *lsflags* \| *new* \| *rename* \| *reset* \| *required_client_features add* \| *required_client_features rm* \| *rm* \| *rm_data_pool* \| *set*] ...
+| **ceph** **fs** [ *add_data_pool* \| *authorize* \| *dump* \| *feature ls* \| *flag set* \| *get* \| *ls* \| *lsflags* \| *new* \| *rename* \| *reset* \| *required_client_features add* \| *required_client_features rm* \| *rm* \| *rm_data_pool* \| *set* \| *swap* ] ...

 | **ceph** **fsid**

@@ -474,6 +474,15 @@ Usage::

     ceph fs set <fs-name> <fs-setting> <value>

+Subcommand ``swap`` swaps the names of two Ceph file systems and updates
+application tags on the pools of the file systems accordingly. Optionally,
+FSCIDs of the file systems can also be swapped along with names by passing
+``--swap-fscids``.
+
+Usage::
+
+    ceph fs swap <fs1-name> <fs1-id> <fs2-name> <fs2-id> [--swap-fscids] {--yes-i-really-mean-it}
+
 fsid
 ----

qa/suites/fs/functional/tasks/admin.yaml

Lines changed: 1 addition & 0 deletions
@@ -10,3 +10,4 @@ tasks:
 fail_on_skip: false
 modules:
 - tasks.cephfs.test_admin
+- tasks.cephfs.admin.test_fs_swap

qa/tasks/ceph_test_case.py

Lines changed: 4 additions & 2 deletions
@@ -76,8 +76,8 @@ def _verify(self, proc, exp_retval=None, exp_errmsgs=None):

         proc_stderr = proc.stderr.getvalue().lower()
         msg = ('didn\'t find any of the expected string in stderr.\n'
-               f'expected string: {exp_errmsgs}\n'
-               f'received error message: {proc_stderr}\n'
+               f'expected string -\n{exp_errmsgs}\n'
+               f'received error message -\n{proc_stderr}\n'
                'note: received error message is converted to lowercase')
         for e in exp_errmsgs:
             if e in proc_stderr:
@@ -105,6 +105,8 @@ def negtest_ceph_cmd(self, args, retval=None, errmsgs=None, **kwargs):
         # execution is needed to not halt on command failure because we are
         # conducting negative testing
         kwargs['check_status'] = False
+        # log stdout since it may contain something useful when command fails
+        kwargs['stdout'] = StringIO()
         # stderr is needed to check for expected error messages.
         kwargs['stderr'] = StringIO()
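
The `_verify` hunk above changes only how the failure message is laid out: the expected strings and the received stderr now each start on their own line. As a rough, self-contained sketch of the matching logic those lines belong to, the function below checks a failed command's output. The name `verify_failure` and its signature are hypothetical and simplified from what the hunk shows, and the surrounding test-case class is not reproduced.

```python
from typing import Iterable, Optional, Union


def verify_failure(exitstatus: int, stderr: str,
                   exp_retval: Optional[int] = None,
                   exp_errmsgs: Union[str, Iterable[str]] = ()) -> None:
    """Assert that a failed command matches the expected status and/or message."""
    # Accept a single string as shorthand for a one-element collection.
    if isinstance(exp_errmsgs, str):
        exp_errmsgs = (exp_errmsgs,)
    expected = tuple(e.lower() for e in exp_errmsgs)
    if exp_retval is None and not expected:
        raise ValueError('pass exp_retval, exp_errmsgs, or both')

    if exp_retval is not None:
        assert exitstatus == exp_retval, (
            f'expected return value {exp_retval}, got {exitstatus}')
    if not expected:
        return

    # stderr is lowercased so the comparison is case-insensitive.
    stderr_lower = stderr.lower()
    msg = ('didn\'t find any of the expected string in stderr.\n'
           f'expected string -\n{expected}\n'
           f'received error message -\n{stderr_lower}\n'
           'note: received error message is converted to lowercase')
    # One matching expected string is enough for the check to pass.
    assert any(e in stderr_lower for e in expected), msg
```

Putting the expected tuple and the received stderr on separate lines keeps long, multi-line error output readable in test logs, which appears to be the motivation for the message change in the hunk.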
