
Commit b8df0c0

caroav authored and zdover23 committed
doc/rbd: Update nvme documentation
Signed-off-by: Aviv Caro <[email protected]>

Update nvmeof-requirements.rst

Signed-off-by: Aviv Caro <[email protected]>

Update nvmeof-initiator-linux.rst

Signed-off-by: Aviv Caro <[email protected]>

Update nvmeof-initiator-esx.rst

Signed-off-by: Aviv Caro <[email protected]>

Update nvmeof-target-configure.rst

Signed-off-by: Aviv Caro <[email protected]>

doc/rbd: fix broken .rst

Fix .rst errors introduced in ceph#61477. This commit will be squashed.

Signed-off-by: Zac Dover <[email protected]>
1 parent 3914f86 commit b8df0c0

5 files changed: +83 -44 lines changed

doc/rbd/nvmeof-initiator-esx.rst

Lines changed: 5 additions & 3 deletions
@@ -39,7 +39,7 @@ The following instructions will use the default vSphere web client and esxcli.
      esxcli nvme adapter list

-#. Discover NVMe-oF subsystems:
+#. Optional: Discover NVMe-oF subsystems:

   .. prompt:: bash #
@@ -48,8 +48,10 @@ The following instructions will use the default vSphere web client and esxcli.
#. Connect to NVME-oF gateway subsystem:

   .. prompt:: bash #

-      esxcli nvme connect -a NVME_TCP_ADAPTER -i GATEWAY_IP -p 4420 -s SUBSYSTEM_NQN
+      esxcli nvme fabrics discover -a NVME_TCP_ADAPTER -i GATEWAY_IP -p 8009 -c
+
+   - This command discovers the NVMe-oF gateways in the gateway group and then connects to them, providing multipath access.

#. List the NVMe/TCP controllers:
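A minimal verification sketch (an editorial assumption, not part of the diff above): after the discover-and-connect step, each gateway in the group should show up as an NVMe/TCP controller on the ESXi host, which can be checked from the ESXi shell:

    # one controller per connected gateway path is expected
    esxcli nvme controller list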

doc/rbd/nvmeof-initiator-linux.rst

Lines changed: 31 additions & 10 deletions
@@ -29,32 +29,53 @@ Installation
   .. prompt:: bash #

-      nvme discover -t tcp -a GATEWAY_IP -s 4420
+      nvme discover -t tcp -a GATEWAY_IP -s 8009

-4. Connect to the NVMe/TCP target:
+4. Connect to the NVMe/TCP target. For high availability, use the connect-all command:

   .. prompt:: bash #

-      nvme connect -t tcp -a GATEWAY_IP -n SUBSYSTEM_NQN
+      nvme connect-all --traddr GATEWAY_IP --transport tcp -l 1800 -s 8009
+
+   - ``-l 1800`` is recommended so that the initiator keeps trying to connect to the gateways for 1800 seconds. This is helpful in cases where a gateway is temporarily unavailable for any reason.
+
+   - ``-s 8009`` is the port of the Discovery Controller. The connect-all command connects to the Discovery Controller first and then uses the information it returns to connect to the gateways.

Next steps
==========

Verify that the initiator is set up correctly:

-1. List the NVMe block devices:
+1. Verify that the initiator is connected to all NVMe-oF gateways and subsystems in the gateway group:
+
+   .. prompt:: bash #
+
+      nvme list-subsys
+
+   Example output::
+
+      nvme-subsys<X> - NQN=<NQN>
+      \
+       +- nvmeX tcp traddr=<GW IP>,trsvcid=4420 live
+       +- nvmeY tcp traddr=<GW IP>,trsvcid=4420 live
+       +- nvmeZ tcp traddr=<GW IP>,trsvcid=4420 live
+       +- nvmeW tcp traddr=<GW IP>,trsvcid=4420 live
+
+2. List the NVMe block devices:

   .. prompt:: bash #

      nvme list

-2. Create a filesystem on the desired device:
+3. Create a filesystem on the desired device:

   .. prompt:: bash #

      mkfs.ext4 NVME_NODE_PATH

-3. Mount the filesystem:
+4. Mount the filesystem:

   .. prompt:: bash #
@@ -64,19 +85,19 @@ Verify that the initiator is set up correctly:
      mount NVME_NODE_PATH /mnt/nvmeof

-4. List the NVME-oF files:
+5. List the NVME-oF files:

   .. prompt:: bash #

      ls /mnt/nvmeof

-5. Create a text file in the ``/mnt/nvmeof`` directory:
+6. Create a text file in the ``/mnt/nvmeof`` directory:

   .. prompt:: bash #

      echo "Hello NVME-oF" > /mnt/nvmeof/hello.text

-6. Verify that the file can be accessed:
+7. Verify that the file can be accessed:

   .. prompt:: bash #
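A short sketch of the high-availability connection flow described on this page, with concrete values (the gateway IP address is a placeholder assumed for illustration); it also assumes the kernel's native NVMe multipath is enabled:

    # native NVMe multipath should report 'Y' so that connect-all aggregates the paths
    cat /sys/module/nvme_core/parameters/multipath

    # connect through the Discovery Controller of any gateway in the group
    nvme connect-all --transport tcp --traddr 10.1.1.21 -s 8009 -l 1800

    # every gateway in the group should then appear as a 'live' path
    nvme list-subsys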

doc/rbd/nvmeof-overview.rst

Lines changed: 37 additions & 9 deletions
@@ -4,16 +4,9 @@
Ceph NVMe-oF Gateway
======================

-The NVMe-oF Gateway presents an NVMe-oF target that exports
-RADOS Block Device (RBD) images as NVMe namespaces. The NVMe-oF protocol allows
-clients (initiators) to send NVMe commands to storage devices (targets) over a
-TCP/IP network, enabling clients without native Ceph client support to access
-Ceph block storage.
+Storage administrators can install and configure NVMe over Fabrics (NVMe-oF) gateways for a Ceph storage cluster. With Ceph NVMe-oF gateways, you can effectively run a fully integrated block-storage infrastructure with all the features and benefits of a conventional Storage Area Network (SAN).

-Each NVMe-oF gateway consists of an `SPDK <https://spdk.io/>`_ NVMe-oF target
-with ``bdev_rbd`` and a control daemon. Ceph's NVMe-oF gateway can be used to
-provision a fully integrated block-storage infrastructure with all the features
-and benefits of a conventional Storage Area Network (SAN).
+The NVMe-oF gateway integrates Ceph with the NVMe over TCP (NVMe/TCP) protocol to provide an NVMe/TCP target that exports RADOS Block Device (RBD) images. The NVMe/TCP protocol allows clients, which are known as initiators, to send NVMe-oF commands to storage devices, which are known as targets, over an Internet Protocol network. Initiators can be Linux clients, VMware clients, or both. For VMware clients, the NVMe/TCP volumes are shown as VMFS datastores; for Linux clients, the NVMe/TCP volumes are shown as block devices.

.. ditaa::
            Cluster Network (optional)
@@ -40,6 +33,41 @@ and benefits of a conventional Storage Area Network (SAN).
   | +-----------+ |
   +--------------------+

+============================================
+High Availability with NVMe-oF gateway group
+============================================
+
+High Availability (HA) provides I/O and control-path redundancy for the host initiators. High Availability is also sometimes referred to as failover and failback support. The redundancy that HA creates is critical to protect against one or more gateway failures. With HA, the host can continue I/O, with only the possibility of added latency, until the failed gateways are back and functioning correctly.
+
+NVMe-oF gateways are virtually grouped into gateway groups, and the HA domain sits within the gateway group. An NVMe-oF gateway group currently supports up to eight gateways. Each NVMe-oF gateway in the gateway group can be used as a path to any of the subsystems or namespaces that are defined in that gateway group. HA is effective with two or more gateways in a gateway group.
+
+High Availability is enabled by default. To use High Availability, a minimum of two gateways is required, and listeners must be defined for the subsystems on every gateway.
+
+It is important to create redundancy between the host and the gateways. To create fully redundant network connectivity, be sure that the host has two Ethernet ports that are connected to the gateways over a network with redundancy (for example, two network switches).
+
+The HA feature uses the Active/Standby approach for each namespace. Using Active/Standby means that at any point in time, only one of the NVMe-oF gateways in the group serves I/O from the host to a specific namespace. To make proper use of all NVMe-oF gateways, each namespace is assigned to a different load-balancing group. The number of load-balancing groups is equal to the number of NVMe-oF gateways in the gateway group.
+
+If one or more gateways in the group die or cannot be seen by the Ceph NVMe-oF monitor, an automatic failover is triggered, and another gateway in the group assumes responsibility for the load-balancing group of the failed gateway(s). This means that there is no disruption to I/O, because another gateway continues serving these namespaces. When the failed gateway(s) come back up, a failback is automatically triggered.
+
+The NVMe-oF initiator on the host will also keep trying to connect to a failed gateway for the amount of time that was specified in the connect-all command. The recommended value is 1800 seconds.
+
+================================
+Scaling-out with NVMe-oF gateway
+================================
+
+The NVMe-oF gateway supports scale-out. NVMe-oF gateway scale-out supports:
+
+- Up to four NVMe-oF gateway groups.
+- Up to eight NVMe-oF gateways in a gateway group.
+- Up to 128 NVMe-oF subsystems within a gateway group.
+- Up to 32 hosts per NVMe-oF subsystem.
+- 1024 namespaces per gateway group.
+
+=========================
+NVMe-oF gateway Discovery
+=========================
+
+Ceph NVMe-oF gateways support Discovery. Each NVMe-oF gateway that runs in the Ceph cluster also runs a Discovery Controller. The Discovery Controller reports all of the subsystems in the gateway group.
+
.. toctree::
   :maxdepth: 1
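Because every gateway in the group runs a Discovery Controller (see the Discovery section added above), an initiator can query any single gateway to learn about all of the subsystems in the group. A minimal sketch, reusing the discovery command shown on the Linux initiator page of this same commit (GATEWAY_IP is a placeholder):

    # port 8009 is the Discovery Controller; any gateway in the group can answer
    nvme discover -t tcp -a GATEWAY_IP -s 8009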

doc/rbd/nvmeof-requirements.rst

Lines changed: 5 additions & 9 deletions
@@ -2,13 +2,9 @@
NVME-oF Gateway Requirements
============================

-We recommend that you provision at least two NVMe/TCP gateways on different
-nodes to implement a highly-available Ceph NVMe/TCP solution.
+- At least 8 GB of RAM dedicated to the gateway (on each node running an NVMe-oF gateway).
+- It is highly recommended to dedicate at least 4 CPU cores to the gateway (1 core can work, but performance will be reduced accordingly).
+- For high availability, provision at least 2 gateways in a gateway group.
+- A minimum of a single 10 Gb Ethernet link in the Ceph public network for the gateway. For higher performance, use 25 Gb or 100 Gb links in the public network.
+- Provision at least two NVMe/TCP gateways on different Ceph nodes for a highly available Ceph NVMe/TCP solution.

-We recommend at a minimum a single 10Gb Ethernet link in the Ceph public
-network for the gateway. For hardware recommendations, see
-:ref:`hardware-recommendations` .
-
-.. note:: On the NVMe-oF gateway, the memory footprint is a function of the
-          number of mapped RBD images and can grow to be large. Plan memory
-          requirements accordingly based on the number of RBD images to be mapped.

doc/rbd/nvmeof-target-configure.rst

Lines changed: 5 additions & 13 deletions
@@ -2,18 +2,10 @@
Installing and Configuring NVMe-oF Targets
==========================================

-Traditionally, block-level access to a Ceph storage cluster has been limited to
-(1) QEMU and ``librbd`` (which is a key enabler for adoption within OpenStack
-environments), and (2) the Linux kernel client. Starting with the Ceph Reef
-release, block-level access has been expanded to offer standard NVMe/TCP
-support, allowing wider platform usage and potentially opening new use cases.
-
Prerequisites
=============

-- Red Hat Enterprise Linux/CentOS 8.0 (or newer); Linux kernel v4.16 (or newer)
-
-- A working Ceph Reef or later storage cluster, deployed with ``cephadm``
+- A working Ceph Tentacle or later storage cluster, deployed with ``cephadm``

- NVMe-oF gateways, which can either be colocated with OSD nodes or on dedicated nodes
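As a hedged sketch of how the prerequisites above are typically satisfied (the pool name, placement hosts, and exact ``ceph orch`` syntax are assumptions to verify against the deployed release):

    # create and initialize the pool that will back the RBD images
    ceph osd pool create nvmeof_pool
    rbd pool init nvmeof_pool

    # deploy NVMe-oF gateway daemons on two nodes with cephadm
    ceph orch apply nvmeof nvmeof_pool --placement="node-gw1,node-gw2"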

@@ -68,7 +60,7 @@ To download it use the following command:
   .. prompt:: bash #

-      podman run -it quay.io/ceph/nvmeof-cli:latest --server-address GATEWAY_IP --server-port GATEWAY_PORT 5500 subsystem add --subsystem SUSYSTEM_NQN
+      podman run -it --rm quay.io/ceph/nvmeof-cli:latest --server-address GATEWAY_IP --server-port GATEWAY_PORT 5500 subsystem add --subsystem SUBSYSTEM_NQN

The subsystem NQN is a user-defined string, for example ``nqn.2016-06.io.spdk:cnode1``.
@@ -84,7 +76,7 @@ To download it use the following command:
   .. prompt:: bash #

-      podman run -it quay.io/ceph/nvmeof-cli:latest --server-address GATEWAY_IP --server-port GATEWAY_PORT 5500 listener add --subsystem SUBSYSTEM_NQN --gateway-name GATEWAY_NAME --traddr GATEWAY_IP --trsvcid 4420
+      podman run -it --rm quay.io/ceph/nvmeof-cli:latest --server-address GATEWAY_IP --server-port GATEWAY_PORT 5500 listener add --subsystem SUBSYSTEM_NQN --host-name HOST_NAME --traddr GATEWAY_IP --trsvcid 4420

#. Get the host NQN (NVME Qualified Name) for each host:
@@ -100,13 +92,13 @@ To download it use the following command:
   .. prompt:: bash #

-      podman run -it quay.io/ceph/nvmeof-cli:latest --server-address GATEWAY_IP --server-port GATEWAY_PORT 5500 host add --subsystem SUBSYSTEM_NQN --host "HOST_NQN1, HOST_NQN2"
+      podman run -it --rm quay.io/ceph/nvmeof-cli:latest --server-address GATEWAY_IP --server-port GATEWAY_PORT 5500 host add --subsystem SUBSYSTEM_NQN --host "HOST_NQN1 HOST_NQN2"

#. List all subsystems configured in the gateway:

   .. prompt:: bash #

-      podman run -it quay.io/ceph/nvmeof-cli:latest --server-address GATEWAY_IP --server-port GATEWAY_PORT 5500 subsystem list
+      podman run -it --rm quay.io/ceph/nvmeof-cli:latest --server-address GATEWAY_IP --server-port GATEWAY_PORT 5500 subsystem list

#. Create a new NVMe namespace:
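Following the pattern of the ``nvmeof-cli`` commands above, the namespace-creation step that this hunk leads into might look like the sketch below (the ``--rbd-pool`` and ``--rbd-image`` flags and the placeholder values are editorial assumptions; verify them against the CLI help of the deployed gateway):

    # expose an existing RBD image as an NVMe namespace in the subsystem
    podman run -it --rm quay.io/ceph/nvmeof-cli:latest --server-address GATEWAY_IP --server-port 5500 namespace add --subsystem SUBSYSTEM_NQN --rbd-pool POOL_NAME --rbd-image IMAGE_NAME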
