Network Configuration Reference
===============================

Careful network infrastructure and configuration is critical for building a
resilient and high performance :term:`Ceph Storage Cluster`. The Ceph Storage
Cluster does not perform request routing or dispatching on behalf of the
:term:`Ceph Client`. Instead, Ceph clients make requests directly to Ceph OSD
Daemons. Ceph OSDs perform data replication on behalf of Ceph clients, which
imposes additional load on Ceph networks.

Our Quick Start configurations provide a minimal Ceph configuration file that
includes Monitor IP addresses and daemon host names. Unless you specify a
cluster network, Ceph assumes a single *public* network. Ceph functions just
fine with only a public network in many deployments, especially with 25GE or
faster network links. Clusters with high client traffic may experience
significant resilience and performance improvement by provisioning a second,
private network.

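For example, the sort of minimal configuration file described above might look
like the following sketch, where the Monitor addresses and subnet are
placeholders to be replaced with your own values:

.. code-block:: ini

   [global]
   # Placeholder values: substitute your Monitors' addresses and subnet
   mon_host = 10.0.0.1, 10.0.0.2, 10.0.0.3
   public_network = 10.0.0.0/24
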
It is possible to run a Ceph Storage Cluster with two networks: a *public*
(*client*, *front-side*) network and a *cluster* (*private*, *replication*,
*back-side*) network. However, this approach complicates network
configuration and management, increases cost, and often does not
significantly improve overall performance. If the network technology in use
is slow by modern standards, say 1GE, or 10GE for dense or SSD nodes, you may
wish to bond more than two links for sufficient throughput and/or implement a
dedicated replication network.

We recommend that for resilience and capacity, network interfaces be bonded
and connected to redundant switches. Bonding should be active/active, or you
may implement a layer 3 multipath strategy with FRR or a similar technology.
When using LACP bonding, it is important to consult your organization's
network team to determine the proper transmit hash policy, usually
``layer2+3`` or ``layer3+4``. The wrong choice can result in imbalanced
network link utilization, with only a fraction of the available throughput
realized. Network observability tools including ``bmon``, ``iftop``, and
``netstat`` are invaluable for ensuring that bond member links are
well-utilized.

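As an illustrative sketch, the transmit hash policy of an existing Linux bond
can be inspected via ``/proc``, and a new 802.3ad bond created with
``iproute2``. The interface names here are hypothetical, and most
distributions manage bonds through their own network configuration tooling:

.. code-block:: console

   # Inspect the hash policy of an existing bond (name is a placeholder)
   grep "Transmit Hash Policy" /proc/net/bonding/bond0

   # Create an LACP (802.3ad) bond with layer3+4 hashing and enslave two NICs
   ip link add bond0 type bond mode 802.3ad xmit_hash_policy layer3+4 miimon 100
   ip link set eth0 down && ip link set eth0 master bond0
   ip link set eth1 down && ip link set eth1 master bond0
   ip link set bond0 up
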
If, despite the complexity, one still wishes to provision a dedicated
replication network for a Ceph cluster, each :term:`Ceph Node` will need to
have more than one network interface or VLAN. See
`Hardware Recommendations - Networks`_ for additional details.
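
A two-network deployment is expressed in ``ceph.conf`` with the
``public_network`` and ``cluster_network`` options. A minimal sketch, with
placeholder subnets:

.. code-block:: ini

   [global]
   # Clients and all daemons communicate over the public network
   public_network = 10.0.0.0/24
   # OSD replication and heartbeat traffic move to the cluster network
   cluster_network = 10.0.1.0/24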

.. ditaa::

                                     +-------------+
                                     | Ceph Client |
                                     +----*--*-----+
                                          |  ^
                                  Request |  : Response
                                          v  |
       /----------------------------------*--*----------------------------------------\
       |                                Public Network                                 |
       \-----*--*--------------*--*------------*--*------------*--*------------*--*---/
             ^  ^              ^  ^            ^  ^            ^  ^            ^  ^
             |  |              |  |            |  |            |  |            |  |
             |  :              |  :            |  :            |  :            |  :
             v  v              v  v            v  v            v  v            v  v
       +-----*--*-----+    +---*--*---+    +---*--*---+    +---*--*---+    +---*--*---+
       | Ceph Monitor |    | Ceph MDS |    | Ceph OSD |    | Ceph OSD |    | Ceph OSD |
       +--------------+    +----------+    +---*--*---+    +---*--*---+    +---*--*---+
                                               ^  ^            ^  ^            ^  ^
       The cluster network offloads            |  |            |  |            |  |
       OSD replication and heartbeat           |  :            |  :            |  :
       traffic from the public network         v  v            v  v            v  v
       /---------------------------------------*--*------------*--*------------*--*---\
       |                            cCCC Cluster Network                               |
       \-------------------------------------------------------------------------------/


IP Tables
---------

Check the default ``iptables`` configuration before you begin: some Linux
distributions include rules that reject all inbound requests except SSH. You
will need to delete these rules on both your public and cluster networks
initially, and replace them with appropriate rules when you are ready to
harden the ports on your Ceph Nodes.

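A sketch of a replacement rule, assuming a hypothetical public interface
``eth0`` and a placeholder subnet. Monitors listen on ports 3300 and 6789,
and other Ceph daemons bind within the 6800-7300 range by default:

.. code-block:: console

   # Allow Ceph traffic from hosts on the (placeholder) public network
   sudo iptables -A INPUT -i eth0 -p tcp -s 10.0.0.0/24 \
        -m multiport --dports 3300,6789,6800:7300 -j ACCEPT
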
.. note:: Docker and Podman containers may experience disruption when rules
          are adjusted or reloaded. You may find it best to update rules on
          cluster nodes by serially setting maintenance mode, stopping
          container services, applying rule changes, then starting container
          services and exiting maintenance mode.

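For cephadm-managed clusters, that sequence might look like the following
sketch, where ``host1`` is a placeholder host name:

.. code-block:: console

   # Stop Ceph containers on the host and mark it for maintenance
   ceph orch host maintenance enter host1

   # ... adjust and reload firewall rules on host1 ...

   # Restart Ceph containers and return the host to service
   ceph orch host maintenance exit host1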

Monitor IP Tables
-----------------