|
8 | 8 | methods treat "naive" `datetime` objects as local times.
9 | 9 | * RBD: `rbd group info` and `rbd group snap info` commands are introduced to |
10 | 10 | show information about a group and a group snapshot respectively. |
11 | | -* RBD: `rbd group snap ls` output now includes the group snap IDs. The header |
| 11 | +* RBD: `rbd group snap ls` output now includes the group snapshot IDs. The header |
12 | 12 | of the column showing the state of a group snapshot in the unformatted CLI |
13 | 13 | output is changed from 'STATUS' to 'STATE'. The state of a group snapshot |
14 | 14 | that was shown as 'ok' is now shown as 'complete', which is more descriptive. |
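  A quick sketch of the new and updated commands (pool, group, and snapshot names
  below are placeholders, and the group/snapshot spec is assumed to follow the usual
  ``pool/group`` and ``pool/group@snap`` conventions)::

      rbd group info mypool/mygroup
      rbd group snap ls mypool/mygroup
      rbd group snap info mypool/mygroup@mysnap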
15 | | -* Based on tests performed at scale on a HDD based Ceph cluster, it was found |
| 15 | +* Based on tests performed at scale on an HDD-based Ceph cluster, it was found
16 | 16 | that scheduling with mClock was not optimal with multiple OSD shards. For |
17 | 17 | example, in the test cluster with multiple OSD node failures, the client |
18 | 18 | throughput was found to be inconsistent across test runs coupled with multiple |
|
21 | 21 | consistency of client and recovery throughput across multiple test runs. |
22 | 22 | Therefore, as an interim measure until the issue with multiple OSD shards |
23 | 23 | (or multiple mClock queues per OSD) is investigated and fixed, the following |
24 | | - change to the default HDD OSD shard configuration is made: |
| 24 | + changes to the default option values have been made: |
25 | 25 | - osd_op_num_shards_hdd = 1 (was 5) |
26 | 26 | - osd_op_num_threads_per_shard_hdd = 5 (was 1) |
27 | 27 | For more details see https://tracker.ceph.com/issues/66289. |
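  The values in effect after an upgrade can be checked with the standard config
  commands (the OSD id below is a placeholder)::

      ceph config get osd osd_op_num_shards_hdd
      ceph config get osd osd_op_num_threads_per_shard_hdd
      ceph config show osd.0 osd_op_num_shards_hdd    # value in effect on a running OSD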
28 | | -* MGR: MGR's always-on modulues/plugins can now be force-disabled. This can be |
29 | | - necessary in cases where MGR(s) needs to be prevented from being flooded by |
30 | | - the module commands when coresponding Ceph service is down/degraded. |
| 28 | +* MGR: The Ceph Manager's always-on modules/plugins can now be force-disabled.
| 29 | + This can be necessary in cases where we wish to prevent the manager from being |
| 30 | + flooded by module commands when Ceph services are down or degraded. |
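  The command form below is only an assumption for illustration; check
  ``ceph mgr module -h`` on your cluster for the exact syntax (the module name is
  a placeholder)::

      # ASSUMED syntax for force-disabling an always-on module
      ceph mgr module force disable telemetry --yes-i-really-mean-it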
31 | 31 |
|
32 | | -* CephFS: Modifying the FS setting variable "max_mds" when a cluster is |
| 32 | +* CephFS: Modifying the setting "max_mds" when a cluster is |
33 | 33 | unhealthy now requires users to pass the confirmation flag |
34 | 34 | (--yes-i-really-mean-it). This has been added as a precaution to tell the |
35 | 35 | users that modifying "max_mds" may not help with troubleshooting or recovery |
|
52 | 52 |
|
53 | 53 | * cephx: key rotation is now possible using `ceph auth rotate`. Previously, |
54 | 54 | this was only possible by deleting and then recreating the key. |
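  For example, to rotate a key in place and then fetch the new secret for
  redistribution (the entity name is a placeholder; the argument form is assumed
  from the command name)::

      ceph auth rotate client.backup
      ceph auth get client.backup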
55 | | -* ceph: a new --daemon-output-file switch is available for `ceph tell` commands |
| 55 | +* Ceph: a new --daemon-output-file switch is available for `ceph tell` commands |
56 | 56 | to dump output to a file local to the daemon. For commands which produce |
57 | 57 | large amounts of output, this avoids a potential spike in memory usage on the |
58 | 58 | daemon, allows for faster streaming writes to a file local to the daemon, and |
59 | 59 | reduces time holding any locks required to execute the command. For analysis, |
60 | 60 | it is necessary to retrieve the file from the host running the daemon |
61 | 61 | manually. Currently, only --format=json|json-pretty are supported. |
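  A sketch of the intended use (daemon id, command, and path are illustrative, and
  the exact switch semantics should be confirmed with ``ceph tell <daemon> help``)::

      ceph tell osd.0 dump_historic_ops --format=json \
          --daemon-output-file=/var/log/ceph/osd.0-ops.json
      # the resulting file must be collected manually from the host running osd.0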
62 | | -* RGW: GetObject and HeadObject requests now return a x-rgw-replicated-at |
| 62 | +* RGW: GetObject and HeadObject requests now return an x-rgw-replicated-at |
63 | 63 | header for replicated objects. This timestamp can be compared against the |
64 | 64 | Last-Modified header to determine how long the object took to replicate. |
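  One way to inspect the new header, assuming an awscli profile pointed at the RGW
  endpoint (bucket, key, and endpoint are placeholders; ``--debug`` is used only to
  surface the raw response headers)::

      aws s3api head-object --bucket mybucket --key mykey \
          --endpoint-url http://rgw.example.com:8080 --debug 2>&1 \
          | grep -iE 'x-rgw-replicated-at|last-modified'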
65 | | -* The cephfs-shell utility is now packaged for RHEL 9 / CentOS 9 as required |
66 | | - python dependencies are now available in EPEL9. |
| 65 | +* The cephfs-shell utility is now packaged for RHEL / CentOS / Rocky 9 because the
| 66 | + required Python dependencies are now available in EPEL9.
67 | 67 | * RGW: S3 multipart uploads using Server-Side Encryption now replicate correctly in |
68 | | - multi-site. Previously, the replicas of such objects were corrupted on decryption. |
| 68 | + multi-site deployments. Previously, replicas of such objects were corrupted on decryption.
69 | 69 | A new tool, ``radosgw-admin bucket resync encrypted multipart``, can be used to |
70 | 70 | identify these original multipart uploads. The ``LastModified`` timestamp of any |
71 | | - identified object is incremented by 1ns to cause peer zones to replicate it again. |
72 | | - For multi-site deployments that make any use of Server-Side Encryption, we |
| 71 | + identified object is incremented by one ns to cause peer zones to replicate it again. |
| 72 | + For multi-site deployments that make use of Server-Side Encryption, we |
73 | 73 | recommend running this command against every bucket in every zone after all
74 | 74 | zones have upgraded. |
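  A sketch of running the new tool across all buckets in a zone (the loop and the
  ``--bucket`` selector are illustrative)::

      for b in $(radosgw-admin bucket list | jq -r '.[]'); do
          radosgw-admin bucket resync encrypted multipart --bucket="$b"
      done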
75 | 75 | * Tracing: The blkin tracing feature (see https://docs.ceph.com/en/reef/dev/blkin/) |
|
85 | 85 | be enabled to migrate to the new format. See |
86 | 86 | https://docs.ceph.com/en/squid/radosgw/zone-features for details. The "v1" |
87 | 87 | format is now considered deprecated and may be removed after 2 major releases. |
88 | | -* CEPHFS: MDS evicts clients which are not advancing their request tids which causes |
89 | | - a large buildup of session metadata resulting in the MDS going read-only due to |
90 | | - the RADOS operation exceeding the size threshold. `mds_session_metadata_threshold` |
91 | | - config controls the maximum size that a (encoded) session metadata can grow. |
| 88 | +* CephFS: The MDS now evicts clients which are not advancing their request tids,
| 89 | + since this causes a large buildup of session metadata and can result in the MDS going
| 90 | + read-only when a RADOS operation exceeds the size threshold. The `mds_session_metadata_threshold`
| 91 | + config option controls the maximum size to which (encoded) session metadata can grow.
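  The option can be inspected or adjusted with the usual config commands (the value
  shown is purely illustrative, not a recommendation)::

      ceph config get mds mds_session_metadata_threshold
      ceph config set mds mds_session_metadata_threshold 16777216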
92 | 92 | * CephFS: A new "mds last-seen" command is available for querying the last time |
93 | 93 | an MDS was in the FSMap, subject to a pruning threshold. |
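  For example (the MDS name is a placeholder and the argument form is assumed from
  the command name)::

      ceph mds last-seen a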
94 | | -* CephFS: For clusters with multiple CephFS file systems, all the snap-schedule |
| 94 | +* CephFS: For clusters with multiple CephFS file systems, all snap-schedule |
95 | 95 | commands now expect the '--fs' argument. |
96 | 96 | * CephFS: The period specifier ``m`` now implies minutes and the period specifier |
97 | | - ``M`` now implies months. This has been made consistent with the rest |
98 | | - of the system. |
| 97 | + ``M`` now implies months. This is consistent with the rest of the system. |
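  A sketch combining both changes (file system name, path, schedule, and retention
  values are placeholders)::

      ceph fs snap-schedule add /volumes/grp/vol 30m --fs cephfs_b           # every 30 minutes
      ceph fs snap-schedule retention add /volumes/grp/vol 6M --fs cephfs_b  # keep 6 monthly snapshots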
99 | 98 | * RGW: New tools have been added to radosgw-admin for identifying and |
100 | 99 | correcting issues with versioned bucket indexes. Historical bugs with the |
101 | 100 | versioned bucket index transaction workflow made it possible for the index |
102 | 101 | to accumulate extraneous "book-keeping" olh entries and plain placeholder |
103 | 102 | entries. In some specific scenarios where clients made concurrent requests |
104 | | - referencing the same object key, it was likely that a lot of extra index |
| 103 | + referencing the same object key, it was likely that extra index |
105 | 104 | entries would accumulate. When a significant number of these entries are |
106 | 105 | present in a single bucket index shard, they can cause high bucket listing |
107 | | - latencies and lifecycle processing failures. To check whether a versioned |
| 106 | + latency and lifecycle processing failures. To check whether a versioned |
108 | 107 | bucket has unnecessary olh entries, users can now run ``radosgw-admin |
109 | 108 | bucket check olh``. If the ``--fix`` flag is used, the extra entries will |
110 | | - be safely removed. A distinct issue from the one described thus far, it is |
111 | | - also possible that some versioned buckets are maintaining extra unlinked |
112 | | - objects that are not listable from the S3/ Swift APIs. These extra objects |
113 | | - are typically a result of PUT requests that exited abnormally, in the middle |
114 | | - of a bucket index transaction - so the client would not have received a |
115 | | - successful response. Bugs in prior releases made these unlinked objects easy |
116 | | - to reproduce with any PUT request that was made on a bucket that was actively |
117 | | - resharding. Besides the extra space that these hidden, unlinked objects |
118 | | - consume, there can be another side effect in certain scenarios, caused by |
119 | | - the nature of the failure mode that produced them, where a client of a bucket |
120 | | - that was a victim of this bug may find the object associated with the key to |
121 | | - be in an inconsistent state. To check whether a versioned bucket has unlinked |
122 | | - entries, users can now run ``radosgw-admin bucket check unlinked``. If the |
123 | | - ``--fix`` flag is used, the unlinked objects will be safely removed. Finally, |
124 | | - a third issue made it possible for versioned bucket index stats to be |
125 | | - accounted inaccurately. The tooling for recalculating versioned bucket stats |
126 | | - also had a bug, and was not previously capable of fixing these inaccuracies. |
127 | | - This release resolves those issues and users can now expect that the existing |
128 | | - ``radosgw-admin bucket check`` command will produce correct results. We |
129 | | - recommend that users with versioned buckets, especially those that existed |
130 | | - on prior releases, use these new tools to check whether their buckets are |
131 | | - affected and to clean them up accordingly. |
132 | | -* rgw: The User Accounts feature unlocks several new AWS-compatible IAM APIs |
133 | | - for the self-service management of users, keys, groups, roles, policy and |
| 109 | + be safely removed. An additional issue is that some versioned buckets |
| 110 | + may maintain extra unlinked objects that are not listable via the S3/Swift |
| 111 | + APIs. These extra objects are typically a result of PUT requests that |
| 112 | + exited abnormally in the middle of a bucket index transaction, and thus |
| 113 | + the client would not have received a successful response. Bugs in prior |
| 114 | + releases made these unlinked objects easy to reproduce with any PUT |
| 115 | + request made on a bucket that was actively resharding. In certain |
| 116 | + scenarios, a client of a bucket that was a victim of this bug may find |
| 117 | + the object associated with the key to be in an inconsistent state. To check |
| 118 | + whether a versioned bucket has unlinked entries, users can now run |
| 119 | + ``radosgw-admin bucket check unlinked``. If the ``--fix`` flag is used, |
| 120 | + the unlinked objects will be safely removed. Finally, a third issue made |
| 121 | + it possible for versioned bucket index stats to be accounted inaccurately. |
| 122 | + The tooling for recalculating versioned bucket stats also had a bug, and |
| 123 | + was not previously capable of fixing these inaccuracies. This release |
| 124 | + resolves those issues and users can now expect that the existing |
| 125 | + ``radosgw-admin bucket check`` command will produce correct results. |
| 126 | + We recommend that users with versioned buckets, especially those that |
| 127 | + existed on prior releases, use these new tools to check whether their |
| 128 | + buckets are affected and to clean them up accordingly. |
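  A sketch of the check-and-repair sequence for a single bucket (the bucket name is
  a placeholder; running each check without ``--fix`` first shows what would be
  removed)::

      radosgw-admin bucket check olh --bucket=mybucket
      radosgw-admin bucket check olh --bucket=mybucket --fix
      radosgw-admin bucket check unlinked --bucket=mybucket --fix
      radosgw-admin bucket check --bucket=mybucket --fix    # recalculate bucket index stats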
| 129 | +* RGW: The "user accounts" feature unlocks several new AWS-compatible IAM APIs |
| 130 | + for self-service management of users, keys, groups, roles, policy and |
134 | 131 | more. Existing users can be adopted into new accounts. This process is optional |
135 | 132 | but irreversible. See https://docs.ceph.com/en/squid/radosgw/account and |
136 | 133 | https://docs.ceph.com/en/squid/radosgw/iam for details. |
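  A sketch of adopting an existing user into a new account, assuming the
  ``account create`` and ``user modify --account-id`` forms described in the account
  documentation linked above (names and the account id are placeholders)::

      radosgw-admin account create --account-name=acme
      radosgw-admin user modify --uid=johndoe --account-id=<id-returned-by-account-create>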
137 | | -* rgw: On startup, radosgw and radosgw-admin now validate the ``rgw_realm`` |
| 134 | +* RGW: On startup, radosgw and radosgw-admin now validate the ``rgw_realm`` |
138 | 135 | config option. Previously, they would ignore invalid or missing realms and |
139 | 136 | go on to load a zone/zonegroup in a different realm. If startup fails with |
140 | 137 | a "failed to load realm" error, fix or remove the ``rgw_realm`` option. |
141 | | -* rgw: The radosgw-admin commands ``realm create`` and ``realm pull`` no |
| 138 | +* RGW: The radosgw-admin commands ``realm create`` and ``realm pull`` no |
142 | 139 | longer set the default realm without ``--default``. |
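  For example, to keep the previous behavior of making a newly created realm the
  default (the realm name is a placeholder)::

      radosgw-admin realm create --rgw-realm=gold --default
      # or mark an existing realm as the default later:
      radosgw-admin realm default --rgw-realm=gold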
143 | 140 | * CephFS: Running the command "ceph fs authorize" for an existing entity now |
144 | 141 | upgrades the entity's capabilities instead of printing an error. It can now |
@@ -183,8 +180,9 @@ CephFS: Disallow delegating preallocated inode ranges to clients. Config |
183 | 180 | * RADOS: `get_pool_is_selfmanaged_snaps_mode` C++ API has been deprecated |
184 | 181 | due to being prone to false negative results. Its safer replacement is
185 | 182 | `pool_is_in_selfmanaged_snaps_mode`. |
186 | | -* RADOS: For bug 62338 (https://tracker.ceph.com/issues/62338), we did not choose |
187 | | - to condition the fix on a server flag in order to simplify backporting. As |
| 183 | +* RADOS: For bug 62338 (https://tracker.ceph.com/issues/62338), in order to simplify
| 184 | + backporting, we chose not to
| 185 | + condition the fix on a server flag. As
188 | 186 | a result, in rare cases it may be possible for a PG to flip between two acting |
189 | 187 | sets while an upgrade to a version with the fix is in progress. If you observe |
190 | 188 | this behavior, you should be able to work around it by completing the upgrade or |
|