|
8 | 8 | methods treat "naive" `datetime` objects as local times. |
9 | 9 | * RBD: `rbd group info` and `rbd group snap info` commands are introduced to |
10 | 10 | show information about a group and a group snapshot respectively. |
11 | | -* RBD: `rbd group snap ls` output now includes the group snap IDs. The header |
| 11 | +* RBD: `rbd group snap ls` output now includes the group snapshot IDs. The header |
12 | 12 | of the column showing the state of a group snapshot in the unformatted CLI |
13 | 13 | output is changed from 'STATUS' to 'STATE'. The state of a group snapshot |
14 | 14 | that was shown as 'ok' is now shown as 'complete', which is more descriptive. |
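  For example (a minimal sketch; pool, group, and snapshot names are illustrative, and
  the group-snap spec form [pool/]group@snap is an assumption):
  - rbd group snap ls mypool/mygroup
  - rbd group snap info mypool/mygroup@mysnap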
15 | | -* Based on tests performed at scale on a HDD based Ceph cluster, it was found |
| 15 | +* Based on tests performed at scale on an HDD-based Ceph cluster, it was found
16 | 16 | that scheduling with mClock was not optimal with multiple OSD shards. For |
17 | 17 | example, in the test cluster with multiple OSD node failures, the client |
18 | 18 | throughput was found to be inconsistent across test runs coupled with multiple |
|
21 | 21 | consistency of client and recovery throughput across multiple test runs. |
22 | 22 | Therefore, as an interim measure until the issue with multiple OSD shards |
23 | 23 | (or multiple mClock queues per OSD) is investigated and fixed, the following |
24 | | - change to the default HDD OSD shard configuration is made: |
| 24 | + changes to the default option values have been made: |
25 | 25 | - osd_op_num_shards_hdd = 1 (was 5) |
26 | 26 | - osd_op_num_threads_per_shard_hdd = 5 (was 1) |
27 | 27 | For more details see https://tracker.ceph.com/issues/66289. |
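  On an upgraded cluster, the effective values can be checked (and, if needed,
  overridden) with `ceph config`, for example:
  - ceph config get osd osd_op_num_shards_hdd
  - ceph config get osd osd_op_num_threads_per_shard_hdd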
28 | | -* MGR: MGR's always-on modulues/plugins can now be force-disabled. This can be |
29 | | - necessary in cases where MGR(s) needs to be prevented from being flooded by |
30 | | - the module commands when coresponding Ceph service is down/degraded. |
| 28 | +* MGR: The Ceph Manager's always-on modules/plugins can now be force-disabled.
| 29 | + This can be necessary in cases where we wish to prevent the manager from being |
| 30 | + flooded by module commands when Ceph services are down or degraded. |
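  A hypothetical invocation (the exact subcommand, module name, and confirmation flag
  are assumptions and may differ in the released CLI):
  - ceph mgr module force disable rbd_support --yes-i-really-mean-it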
31 | 31 |
|
32 | | -* CephFS: Modifying the FS setting variable "max_mds" when a cluster is |
| 32 | +* CephFS: Modifying the setting "max_mds" when a cluster is |
33 | 33 | unhealthy now requires users to pass the confirmation flag |
34 | 34 | (--yes-i-really-mean-it). This has been added as a precaution to tell the |
35 | 35 | users that modifying "max_mds" may not help with troubleshooting or recovery |
|
41 | 41 |
|
42 | 42 | * cephx: key rotation is now possible using `ceph auth rotate`. Previously, |
43 | 43 | this was only possible by deleting and then recreating the key. |
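  For example (the entity name is illustrative; passing the entity as an argument, as
  with other `ceph auth` commands, is an assumption):
  - ceph auth rotate client.rgw.zone-a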
44 | | -* ceph: a new --daemon-output-file switch is available for `ceph tell` commands |
| 44 | +* Ceph: a new --daemon-output-file switch is available for `ceph tell` commands |
45 | 45 | to dump output to a file local to the daemon. For commands which produce |
46 | 46 | large amounts of output, this avoids a potential spike in memory usage on the |
47 | 47 | daemon, allows for faster streaming writes to a file local to the daemon, and |
48 | 48 | reduces time holding any locks required to execute the command. For analysis, |
49 | 49 | it is necessary to retrieve the file from the host running the daemon |
50 | 50 | manually. Currently, only --format=json|json-pretty are supported. |
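  A sketch of such an invocation (the target daemon, command, and path handling are
  illustrative assumptions); the output file is written on the daemon's host and must
  be fetched from there:
  - ceph tell osd.0 dump_historic_ops --format=json --daemon-output-file=/var/log/ceph/osd.0-ops.json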
51 | | -* RGW: GetObject and HeadObject requests now return a x-rgw-replicated-at |
| 51 | +* RGW: GetObject and HeadObject requests now return an x-rgw-replicated-at |
52 | 52 | header for replicated objects. This timestamp can be compared against the |
53 | 53 | Last-Modified header to determine how long the object took to replicate. |
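  For example, the two timestamps can be compared from a HeadObject response (the
  endpoint, bucket, and object are illustrative; request signing is omitted for
  brevity):
  - curl -sI https://rgw.example.com/mybucket/myobject | grep -iE 'last-modified|x-rgw-replicated-at'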
54 | | -* The cephfs-shell utility is now packaged for RHEL 9 / CentOS 9 as required |
55 | | - python dependencies are now available in EPEL9. |
| 54 | +* The cephfs-shell utility is now packaged for RHEL / CentOS / Rocky 9, as the
| 55 | + required Python dependencies are now available in EPEL9.
56 | 56 | * RGW: S3 multipart uploads using Server-Side Encryption now replicate correctly in |
57 | | - multi-site. Previously, the replicas of such objects were corrupted on decryption. |
| 57 | + multi-site deployments. Previously, replicas of such objects were corrupted on decryption.
58 | 58 | A new tool, ``radosgw-admin bucket resync encrypted multipart``, can be used to |
59 | 59 | identify these original multipart uploads. The ``LastModified`` timestamp of any |
60 | | - identified object is incremented by 1ns to cause peer zones to replicate it again. |
61 | | - For multi-site deployments that make any use of Server-Side Encryption, we |
| 60 | + identified object is incremented by one ns to cause peer zones to replicate it again. |
| 61 | + For multi-site deployments that make use of Server-Side Encryption, we |
62 | 62 | recommend running this command against every bucket in every zone after all
63 | 63 | zones have upgraded. |
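  For example, run against each bucket in each zone after the upgrade (the bucket
  name is illustrative):
  - radosgw-admin bucket resync encrypted multipart --bucket=mybucket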
64 | 64 | * Tracing: The blkin tracing feature (see https://docs.ceph.com/en/reef/dev/blkin/) |
|
74 | 74 | be enabled to migrate to the new format. See |
75 | 75 | https://docs.ceph.com/en/squid/radosgw/zone-features for details. The "v1" |
76 | 76 | format is now considered deprecated and may be removed after 2 major releases. |
77 | | -* CEPHFS: MDS evicts clients which are not advancing their request tids which causes |
78 | | - a large buildup of session metadata resulting in the MDS going read-only due to |
79 | | - the RADOS operation exceeding the size threshold. `mds_session_metadata_threshold` |
80 | | - config controls the maximum size that a (encoded) session metadata can grow. |
| 77 | +* CephFS: The MDS now evicts clients that are not advancing their request tids,
| 78 | + because such clients cause a large buildup of session metadata that can result in
| 79 | + the MDS going read-only when a RADOS operation exceeds the size threshold. The
| 80 | + `mds_session_metadata_threshold` option caps the size of the (encoded) session metadata.
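  The threshold can be inspected or adjusted with `ceph config` (the value shown is
  purely illustrative, not a recommendation):
  - ceph config get mds mds_session_metadata_threshold
  - ceph config set mds mds_session_metadata_threshold 16777216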
81 | 81 | * CephFS: A new "mds last-seen" command is available for querying the last time |
82 | 82 | an MDS was in the FSMap, subject to a pruning threshold. |
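  A hypothetical query (the MDS name argument and output format are assumptions):
  - ceph mds last-seen a --format=json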
83 | | -* CephFS: For clusters with multiple CephFS file systems, all the snap-schedule |
| 83 | +* CephFS: For clusters with multiple CephFS file systems, all snap-schedule |
84 | 84 | commands now expect the '--fs' argument. |
85 | 85 | * CephFS: The period specifier ``m`` now implies minutes and the period specifier |
86 | | - ``M`` now implies months. This has been made consistent with the rest |
87 | | - of the system. |
| 86 | + ``M`` now implies months. This is consistent with the rest of the system. |
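  For example, with multiple file systems a schedule now needs '--fs', and 'M' in a
  retention spec now means months (the file system name, path, and retention-spec
  form are illustrative assumptions):
  - ceph fs snap-schedule add / 1h --fs cephfs2
  - ceph fs snap-schedule retention add / 6M --fs cephfs2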
88 | 87 | * RGW: New tools have been added to radosgw-admin for identifying and |
89 | 88 | correcting issues with versioned bucket indexes. Historical bugs with the |
90 | 89 | versioned bucket index transaction workflow made it possible for the index |
91 | 90 | to accumulate extraneous "book-keeping" olh entries and plain placeholder |
92 | 91 | entries. In some specific scenarios where clients made concurrent requests |
93 | | - referencing the same object key, it was likely that a lot of extra index |
| 92 | + referencing the same object key, it was likely that extra index |
94 | 93 | entries would accumulate. When a significant number of these entries are |
95 | 94 | present in a single bucket index shard, they can cause high bucket listing |
96 | | - latencies and lifecycle processing failures. To check whether a versioned |
| 95 | + latency and lifecycle processing failures. To check whether a versioned |
97 | 96 | bucket has unnecessary olh entries, users can now run ``radosgw-admin |
98 | 97 | bucket check olh``. If the ``--fix`` flag is used, the extra entries will |
99 | | - be safely removed. A distinct issue from the one described thus far, it is |
100 | | - also possible that some versioned buckets are maintaining extra unlinked |
101 | | - objects that are not listable from the S3/ Swift APIs. These extra objects |
102 | | - are typically a result of PUT requests that exited abnormally, in the middle |
103 | | - of a bucket index transaction - so the client would not have received a |
104 | | - successful response. Bugs in prior releases made these unlinked objects easy |
105 | | - to reproduce with any PUT request that was made on a bucket that was actively |
106 | | - resharding. Besides the extra space that these hidden, unlinked objects |
107 | | - consume, there can be another side effect in certain scenarios, caused by |
108 | | - the nature of the failure mode that produced them, where a client of a bucket |
109 | | - that was a victim of this bug may find the object associated with the key to |
110 | | - be in an inconsistent state. To check whether a versioned bucket has unlinked |
111 | | - entries, users can now run ``radosgw-admin bucket check unlinked``. If the |
112 | | - ``--fix`` flag is used, the unlinked objects will be safely removed. Finally, |
113 | | - a third issue made it possible for versioned bucket index stats to be |
114 | | - accounted inaccurately. The tooling for recalculating versioned bucket stats |
115 | | - also had a bug, and was not previously capable of fixing these inaccuracies. |
116 | | - This release resolves those issues and users can now expect that the existing |
117 | | - ``radosgw-admin bucket check`` command will produce correct results. We |
118 | | - recommend that users with versioned buckets, especially those that existed |
119 | | - on prior releases, use these new tools to check whether their buckets are |
120 | | - affected and to clean them up accordingly. |
121 | | -* rgw: The User Accounts feature unlocks several new AWS-compatible IAM APIs |
122 | | - for the self-service management of users, keys, groups, roles, policy and |
| 98 | + be safely removed. An additional issue is that some versioned buckets |
| 99 | + may maintain extra unlinked objects that are not listable via the S3/Swift |
| 100 | + APIs. These extra objects are typically a result of PUT requests that |
| 101 | + exited abnormally in the middle of a bucket index transaction, and thus |
| 102 | + the client would not have received a successful response. Bugs in prior |
| 103 | + releases made these unlinked objects easy to reproduce with any PUT |
| 104 | + request made on a bucket that was actively resharding. In certain |
| 105 | + scenarios, a client of a bucket that was a victim of this bug may find |
| 106 | + the object associated with the key to be in an inconsistent state. To check |
| 107 | + whether a versioned bucket has unlinked entries, users can now run |
| 108 | + ``radosgw-admin bucket check unlinked``. If the ``--fix`` flag is used, |
| 109 | + the unlinked objects will be safely removed. Finally, a third issue made |
| 110 | + it possible for versioned bucket index stats to be accounted inaccurately. |
| 111 | + The tooling for recalculating versioned bucket stats also had a bug, and |
| 112 | + was not previously capable of fixing these inaccuracies. This release |
| 113 | + resolves those issues and users can now expect that the existing |
| 114 | + ``radosgw-admin bucket check`` command will produce correct results. |
| 115 | + We recommend that users with versioned buckets, especially those that |
| 116 | + existed on prior releases, use these new tools to check whether their |
| 117 | + buckets are affected and to clean them up accordingly. |
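  For example (the bucket name is illustrative; omit ``--fix`` to only report the
  problems found):
  - radosgw-admin bucket check olh --bucket=mybucket --fix
  - radosgw-admin bucket check unlinked --bucket=mybucket --fix
  - radosgw-admin bucket check --bucket=mybucket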
| 118 | +* RGW: The "user accounts" feature unlocks several new AWS-compatible IAM APIs |
| 119 | + for self-service management of users, keys, groups, roles, policy and |
123 | 120 | more. Existing users can be adopted into new accounts. This process is optional |
124 | 121 | but irreversible. See https://docs.ceph.com/en/squid/radosgw/account and |
125 | 122 | https://docs.ceph.com/en/squid/radosgw/iam for details. |
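  A rough sketch of creating an account and adopting an existing user into it (the
  subcommand and option names are assumptions; consult the linked documentation for
  the authoritative syntax):
  - radosgw-admin account create --account-name=myaccount
  - radosgw-admin user modify --uid=myuser --account-id=<account-id>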
126 | | -* rgw: On startup, radosgw and radosgw-admin now validate the ``rgw_realm`` |
| 123 | +* RGW: On startup, radosgw and radosgw-admin now validate the ``rgw_realm`` |
127 | 124 | config option. Previously, they would ignore invalid or missing realms and |
128 | 125 | go on to load a zone/zonegroup in a different realm. If startup fails with |
129 | 126 | a "failed to load realm" error, fix or remove the ``rgw_realm`` option. |
130 | | -* rgw: The radosgw-admin commands ``realm create`` and ``realm pull`` no |
| 127 | +* RGW: The radosgw-admin commands ``realm create`` and ``realm pull`` no |
131 | 128 | longer set the default realm without ``--default``. |
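  For example, to retain the old behaviour of making the new realm the default, pass
  the flag explicitly (the realm name is illustrative):
  - radosgw-admin realm create --rgw-realm=myrealm --default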
132 | 129 | * CephFS: Running the command "ceph fs authorize" for an existing entity now |
133 | 130 | upgrades the entity's capabilities instead of printing an error. It can now |
@@ -172,8 +169,9 @@ CephFS: Disallow delegating preallocated inode ranges to clients. Config |
172 | 169 | * RADOS: `get_pool_is_selfmanaged_snaps_mode` C++ API has been deprecated |
173 | 170 | due to being prone to false negative results. Its safer replacement is
174 | 171 | `pool_is_in_selfmanaged_snaps_mode`. |
175 | | -* RADOS: For bug 62338 (https://tracker.ceph.com/issues/62338), we did not choose |
176 | | - to condition the fix on a server flag in order to simplify backporting. As |
| 172 | +* RADOS: For bug 62338 (https://tracker.ceph.com/issues/62338), we chose not to
| 173 | + condition the fix on a server flag, in order to simplify
| 174 | + backporting. As
177 | 175 | a result, in rare cases it may be possible for a PG to flip between two acting |
178 | 176 | sets while an upgrade to a version with the fix is in progress. If you observe |
179 | 177 | this behavior, you should be able to work around it by completing the upgrade or |
|