doc: update documentation for normalization and case sensitivity

xhernandez · xhernandez · commit d9704c00736a · 2025-03-25T14:52:22.000+01:00
Signed-off-by: Xavi Hernandez <xhernandez@gmail.com>
diff --git a/doc/cephfs/fs-volumes.rst b/doc/cephfs/fs-volumes.rst
@@ -287,7 +287,7 @@ Use a command of the following form to create a subvolume:
 
 .. prompt:: bash #
 
-   ceph fs subvolume create <vol_name> <subvol_name> [--size <size_in_bytes>] [--group_name <subvol_group_name>] [--pool_layout <data_pool_name>] [--uid <uid>] [--gid <gid>] [--mode <octal_mode>] [--namespace-isolated] [--earmark <earmark>]
+   ceph fs subvolume create <vol_name> <subvol_name> [--size <size_in_bytes>] [--group_name <subvol_group_name>] [--pool_layout <data_pool_name>] [--uid <uid>] [--gid <gid>] [--mode <octal_mode>] [--namespace-isolated] [--earmark <earmark>] [--normalization <form>] [--case-insensitive]
 
 
 The command succeeds even if the subvolume already exists.
@@ -327,6 +327,29 @@ Valid Earmarks
    be aware that user permissions and ACLs associated with the previous scope might still apply. Ensure that
    any necessary permissions are updated as needed to maintain proper access control.
 
+When creating a subvolume, you can also specify a Unicode normalization form
+with the ``--normalization`` option. The form is used internally to normalize
+file names, so that names whose characters can be encoded as different Unicode
+code point sequences all map to the same representation and therefore refer
+to the same file. Users still see the name exactly as it was given when the
+file was created.
+
+The valid values for the Unicode normalization form are:
+
+    - nfd: canonical decomposition (default)
+    - nfc: canonical decomposition, followed by canonical composition
+    - nfkd: compatibility decomposition
+    - nfkc: compatibility decomposition, followed by canonical composition
+
+To learn more about Unicode normalization forms, see https://unicode.org/reports/tr15
+
+A subvolume can also be configured for case-insensitive access by using the
+``--case-insensitive`` option. When this option is set, file names that differ
+only in the case of their characters map to the same file. The case of the
+name used when the file was created is preserved.
+
+.. note:: Setting the ``--case-insensitive`` option implicitly enables
+   Unicode normalization on the subvolume.
 
 Removing a subvolume
 ~~~~~~~~~~~~~~~~~~~~