- ``-l 1800`` is recommended so that the initiator continues trying to connect to the GWs for 1800 seconds. This is helpful in cases where a GW is temporarily unavailable for any reason.
- ``-s 8009`` is the port of the Discovery Controller. The ``connect-all`` command connects to the Discovery Controller (DC) first, and then uses the information that the DC returns to connect to the GWs (a combined example is shown below).
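
For reference, a ``connect-all`` invocation that combines both of these options might look like the following sketch, where the address is a placeholder for the IP address of any NVMe-oF gateway in the group:

.. prompt:: bash #

   nvme connect-all -t tcp -a <GW IP> -s 8009 -l 1800
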
Next steps
==========
Verify that the initiator is set up correctly:

1. Verify that the initiator is connected to all NVMe-oF gateways and subsystems in the gateway group:

.. prompt:: bash #

   nvme list-subsys

Example output::

   nvme-subsys<X> - NQN=<NQN>
   \
    +- nvmeX tcp traddr=<GW IP>,trsvcid=4420 live
    +- nvmeY tcp traddr=<GW IP>,trsvcid=4420 live
    +- nvmeZ tcp traddr=<GW IP>,trsvcid=4420 live
    +- nvmeW tcp traddr=<GW IP>,trsvcid=4420 live

2. List the NVMe block devices:

.. prompt:: bash #

   nvme list

3. Create a filesystem on the desired device:
.. prompt:: bash #

   mkfs.ext4 NVME_NODE_PATH

4. Mount the filesystem:
.. prompt:: bash #

   mount NVME_NODE_PATH /mnt/nvmeof

5. List the NVMe-oF files:

.. prompt:: bash #

   ls /mnt/nvmeof

6. Create a text file in the ``/mnt/nvmeof`` directory:
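
For example (this exact command is only an illustration; the file name and contents are arbitrary):

.. prompt:: bash #

   echo "hello NVMe-oF" > /mnt/nvmeof/hello.txt
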
doc/rbd/nvmeof-overview.rst

Ceph NVMe-oF Gateway
======================

Storage administrators can install and configure NVMe over Fabrics (NVMe-oF) gateways for a Ceph cluster. With Ceph NVMe-oF gateways, you can run a fully integrated block-storage infrastructure with all the features and benefits of a conventional Storage Area Network (SAN).

The NVMe-oF gateway integrates Ceph with the NVMe over TCP (NVMe/TCP) protocol to provide an NVMe/TCP target that exports RADOS Block Device (RBD) images. The NVMe/TCP protocol allows clients, which are known as initiators, to send NVMe-oF commands to storage devices, which are known as targets, over an Internet Protocol network. Initiators can be Linux clients, VMware clients, or both. For VMware clients, the NVMe/TCP volumes appear as VMFS datastores; for Linux clients, the NVMe/TCP volumes appear as block devices.

.. ditaa::

   Cluster Network (optional)
   |+-----------+|
   +--------------------+

============================================
High Availability with NVMe-oF gateway group
============================================

High Availability (HA) provides I/O and control-path redundancy for the host initiators. HA is sometimes also referred to as failover and failback support. The redundancy that HA creates is critical for protection against one or more gateway failures. With HA, the host can continue I/O, possibly with increased latency, until the failed gateways are back and functioning correctly.

NVMe-oF gateways are virtually grouped into gateway groups, and the HA domain sits within the gateway group. An NVMe-oF gateway group currently supports up to eight gateways. Each NVMe-oF gateway in the gateway group can be used as a path to any of the subsystems or namespaces that are defined in that gateway group. HA is effective with two or more gateways in a gateway group.

High Availability is enabled by default. To use High Availability, a minimum of two gateways must be defined in the gateway group, and listeners must be defined for the subsystems on every GW.
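
As an illustrative sketch only: defining a listener for a subsystem on one gateway with the gateway's ``nvmeof-cli`` utility might look like the following, repeated on every GW in the group. The NQN, hostname, and addresses are placeholders, the gRPC port 5500 is a commonly used default, and option names (for example, ``--host-name``) vary between ceph-nvmeof releases, so check ``nvmeof-cli listener add --help`` for your version:

.. prompt:: bash #

   nvmeof-cli --server-address <GW IP> --server-port 5500 listener add --subsystem <NQN> --host-name <GW hostname> --traddr <GW IP> --trsvcid 4420
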
It is important to create redundancy between the host and the gateways. For fully redundant network connectivity, make sure that the host has two Ethernet ports that are connected to the gateways over a network with redundancy (for example, two network switches).

The HA feature uses an Active/Standby approach for each namespace. This means that at any point in time, only one of the NVMe-oF gateways in the group serves I/O from the host to a specific namespace. To make proper use of all NVMe-oF gateways, each namespace is assigned to a different load-balancing group. The number of load-balancing groups is equal to the number of NVMe-oF gateways in the gateway group.
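
For illustration only: a namespace can typically be placed in a specific load-balancing group when it is added. The sketch below assumes a recent ceph-nvmeof CLI in which ``namespace add`` accepts a ``--load-balancing-group`` option; the pool, image, NQN, and group number are placeholders, so verify the exact options with ``nvmeof-cli namespace add --help``:

.. prompt:: bash #

   nvmeof-cli --server-address <GW IP> --server-port 5500 namespace add --subsystem <NQN> --rbd-pool <pool> --rbd-image <image> --load-balancing-group 1
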
If one or more GWs in the group fail or cannot be seen by the Ceph NVMe-oF monitor, an automatic failover is triggered and another GW in the group assumes responsibility for the load-balancing group of the failed GW(s). This means that there is no disruption to the I/O, because another GW continues serving these namespaces. When the failed GW(s) come back up, a failback is automatically triggered.
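
From the host, a failover can be observed by listing the fabric paths again. The output below is only an illustration; the exact controller state strings (for example, ``connecting`` instead of ``live``) depend on the kernel and nvme-cli versions and on the failure mode:

.. prompt:: bash #

   nvme list-subsys

Illustrative output while one gateway is unreachable::

   nvme-subsys<X> - NQN=<NQN>
   \
    +- nvmeX tcp traddr=<GW IP>,trsvcid=4420 live
    +- nvmeY tcp traddr=<GW IP>,trsvcid=4420 connecting
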
The NVMe-oF initiator on the host also continues trying to reconnect to a failed GW for the amount of time that was specified in the ``connect-all`` command. The recommended value is 1800 seconds.

================================
Scaling-out with NVMe-oF gateway
================================
The NVMe-oF gateway supports scale-out. NVMe-oF gateway scale-out supports:

- Up to four NVMe-oF gateway groups.
- Up to eight NVMe-oF gateways in a gateway group.
- Up to 128 NVMe-oF subsystems within a gateway group.
- Up to 32 hosts per NVMe-oF subsystem.
- 1024 namespaces per gateway group.

=========================
NVMe-oF gateway Discovery
=========================

Ceph NVMe-oF gateways support Discovery. Each NVMe-oF gateway that runs in the Ceph cluster also runs a Discovery Controller. The Discovery Controller reports all of the subsystems in the GW group.
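
For example, an initiator can query the Discovery Controller of any gateway in the group with the standard ``nvme-cli`` discover command (the address is a placeholder; 8009 is the standard NVMe/TCP discovery service port):

.. prompt:: bash #

   nvme discover -t tcp -a <GW IP> -s 8009
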
doc/rbd/nvmeof-requirements.rst

NVMe-oF Gateway Requirements
============================

- At least 8 GB of RAM dedicated to the GW (on each node that runs an NVMe-oF GW).
- It is highly recommended to dedicate at least 4 CPU cores to the GW (1 core can work, but performance will be reduced accordingly).
- For high availability, provision at least 2 GWs in a GW group.
- At a minimum, a single 10 Gb Ethernet link in the Ceph public network for the gateway. For higher performance, use 25 Gb or 100 Gb links in the public network.
- Provision at least two NVMe/TCP gateways on different Ceph nodes for a highly available Ceph NVMe/TCP solution.