Open
Labels: bug (Something isn't working)
Description
Issue report
What version of MicroCeph are you using ?
(squid/stable) 19.2.1+snap74c0060321
What are the steps to reproduce this issue ?
- Set up 3 machines with a full mesh network (basic idea here: https://pve.proxmox.com/wiki/Full_Mesh_Network_for_Ceph_Server), i.e. node 1 has a direct link to nodes 2 and 3, node 2 has a direct link to nodes 1 and 3, and node 3 has a direct link to nodes 1 and 2
- Install FRR
- Set up fabricd within FRR:
enable the fabricd daemon and configure each node similarly to the example below
frr.conf:
frr defaults traditional
hostname node1
log syslog warning
ip forwarding
no ipv6 forwarding
service integrated-vtysh-config
!
interface lo
ip address 10.15.15.51/32
ip router openfabric 1
openfabric passive
!
interface enp2s0f0np0
ip router openfabric 1
openfabric csnp-interval 2
openfabric hello-interval 1
openfabric hello-multiplier 2
!
interface enp2s0f1np1
ip router openfabric 1
openfabric csnp-interval 2
openfabric hello-interval 1
openfabric hello-multiplier 2
!
line vty
!
router openfabric 1
net 49.0001.1111.1111.1111.00
lsp-gen-interval 1
max-lsp-lifetime 600
lsp-refresh-interval 180
netplan:
network:
  version: 2
  ethernets:
    enp88s0:
      addresses:
        - "192.168.8.2/24"
      nameservers:
        addresses:
          - 192.168.8.1
        search: []
      routes:
        - to: "default"
          via: "192.168.8.1"
    enp2s0f0np0:
      mtu: 9000
    enp2s0f1np1:
      mtu: 9000
- You'll now be able to ping each node from every other node, with redundancy against any single link failure. However (and I think this is related to the problem), the IP address used by openfabric for each node isn't listed when running ifconfig, e.g. (the following commands and output are from node 2):
$ ifconfig
enp2s0f0np0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
inet6 fe80::5a47:caff:fe7a:c1da prefixlen 64 scopeid 0x20<link>
ether 58:47:ca:7a:c1:da txqueuelen 1000 (Ethernet)
RX packets 3960444 bytes 10763025277 (10.7 GB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4257559 bytes 1789028060 (1.7 GB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp2s0f1np1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 9000
inet6 fe80::5a47:caff:fe7a:c1db prefixlen 64 scopeid 0x20<link>
ether 58:47:ca:7a:c1:db txqueuelen 1000 (Ethernet)
RX packets 776652 bytes 1995475489 (1.9 GB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 486683 bytes 1360628932 (1.3 GB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
enp88s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.8.3 netmask 255.255.255.0 broadcast 192.168.8.255
inet6 fe80::5a47:caff:fe7a:c1dd prefixlen 64 scopeid 0x20<link>
ether 58:47:ca:7a:c1:dd txqueuelen 1000 (Ethernet)
RX packets 5171181 bytes 5922053009 (5.9 GB)
RX errors 0 dropped 5518 overruns 0 frame 0
TX packets 1402604 bytes 161658482 (161.6 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device memory 0x6c500000-6c5fffff
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 1647425 bytes 486132167 (486.1 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1647425 bytes 486132167 (486.1 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
$ ping 10.15.15.51
PING 10.15.15.51 (10.15.15.51) 56(84) bytes of data.
64 bytes from 10.15.15.51: icmp_seq=1 ttl=64 time=0.443 ms
64 bytes from 10.15.15.51: icmp_seq=2 ttl=64 time=0.462 ms
^C
--- 10.15.15.51 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1046ms
rtt min/avg/max/mdev = 0.443/0.452/0.462/0.009 ms
- On node 1, get a join key:
sudo microceph cluster add node2
- On node 2, attempt to join:
sudo microceph cluster join <key>
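As a cross-check on the ifconfig output above: ifconfig (net-tools) may not list every address configured on an interface, whereas the iproute2 `ip` tool can print all of them as JSON with `ip -j -4 addr show`. Below is a minimal Python sketch for summarising that output per interface. The helper names `ipv4_addresses` and `read_ip_json` are made up for illustration and are not part of MicroCeph or FRR.

```python
import json
import subprocess


def ipv4_addresses(ip_json):
    """Map interface name -> list of 'address/prefixlen' strings.

    `ip_json` is the text printed by `ip -j -4 addr show`.
    """
    result = {}
    for iface in json.loads(ip_json):
        result[iface["ifname"]] = [
            f"{info['local']}/{info['prefixlen']}"
            for info in iface.get("addr_info", [])
            if info.get("family") == "inet"
        ]
    return result


def read_ip_json():
    """Run `ip -j -4 addr show` on the local machine and return its output."""
    proc = subprocess.run(
        ["ip", "-j", "-4", "addr", "show"],
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout
```

Calling `ipv4_addresses(read_ip_json())` on each node would show whether the openfabric address is actually assigned to `lo` even though ifconfig does not display it.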
What happens (observed behaviour) ?
Error: failed to generate the configuration: failed to locate IP on public network 10.15.15.51/32: no IP belongs to provided subnet 10.15.15.51/32
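The wording of the error suggests a membership check: the joining node looks for one of its own addresses inside the public network 10.15.15.51/32. The Python sketch below illustrates that kind of check; it is an assumption about the logic, not MicroCeph's actual code. The function name `find_ip_on_network`, the node 2 address 10.15.15.52 and the wider prefix 10.15.15.0/24 are all hypothetical.

```python
import ipaddress


def find_ip_on_network(local_addrs, subnet):
    """Return the first local address that lies inside `subnet`, or None.

    `local_addrs` is a list of strings such as "192.168.8.3/24" or "127.0.0.1".
    """
    net = ipaddress.ip_network(subnet, strict=False)
    for addr in local_addrs:
        ip = ipaddress.ip_interface(addr).ip
        if ip in net:  # False for IPv4/IPv6 mismatches as well
            return str(ip)
    return None


# Addresses shown by ifconfig on node 2: nothing falls inside the /32.
print(find_ip_on_network(["127.0.0.1/8", "192.168.8.3/24"], "10.15.15.51/32"))

# Hypothetical: even with its own /32 on lo, node 2 is still outside node 1's /32.
print(find_ip_on_network(["10.15.15.52/32"], "10.15.15.51/32"))

# Hypothetical: a wider prefix would contain both loopback addresses.
print(find_ip_on_network(["10.15.15.52/32"], "10.15.15.0/24"))
```

A /32 contains exactly one address, so under this reading only node 1 itself can ever satisfy the check, regardless of whether the other nodes are reachable by routing.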
What were you expecting to happen ?
Since the machines can communicate, as demonstrated by ping (and in my case the nodes are all part of a MicroK8s cluster that has no issues communicating on the same IPs I'm trying to use for MicroCeph), the node should join the cluster.
Relevant logs, error output, etc.
Node 1:
microceph (squid/stable) 19.2.1+snap74c0060321 from Canonical✓ installed
richard@node1:~$ sudo microceph cluster bootstrap
richard@node1:~$ sudo microceph disk add /dev/nvme0n1p4
+----------------+---------+
| PATH | STATUS |
+----------------+---------+
| /dev/nvme0n1p4 | Success |
+----------------+---------+
richard@node1:~$ sudo snap refresh --hold microceph
General refreshes of "microceph" held indefinitely
richard@node1:~$ sudo microceph cluster add node2
eyJzZWNyZXQiOiJkMmJkYTk0NTkxYmZjN2ZlNzNkZWQzNGQzNDZjOTc0MThjODI0YTZjZDc0Y2VjNzA3YTJiYmU2OTRkY2Q1NGY1IiwiZmluZ2VycHJpbnQiOiIwNWRkZjZjOTEyMjdhOTA5YmVkOTU4Njg1Y2Q1YzgxNjBjM2M2NDUxZTYxNjMxZGJmYzk4NGM3MjU3ODJiYmVmIiwiam9pbl9hZGRyZXNzZXMiOlsiMTAuMTUuMTUuNTE6NzQ0MyJdfQ==
richard@node1:~$ sudo microceph cluster add node3
eyJzZWNyZXQiOiJlYWQ4ZDM3N2JmMDRiYzVkMzMwYzc2NTA5Mjk3YTFmZGQ3MjY0YTllNTc0MmExMzM0NGE2NmViY2MwY2Y0MGVjIiwiZmluZ2VycHJpbnQiOiIwNWRkZjZjOTEyMjdhOTA5YmVkOTU4Njg1Y2Q1YzgxNjBjM2M2NDUxZTYxNjMxZGJmYzk4NGM3MjU3ODJiYmVmIiwiam9pbl9hZGRyZXNzZXMiOlsiMTAuMTUuMTUuNTE6NzQ0MyJdfQ==
Node 2:
richard@node2:~$ sudo snap install microceph --channel=squid/stable
[sudo] password for richard:
microceph (squid/stable) 19.2.1+snap74c0060321 from Canonical✓ installed
richard@node2:~$ sudo snap refresh --hold microceph
General refreshes of "microceph" held indefinitely
richard@node2:~$ sudo microceph cluster join eyJzZWNyZXQiOiJkMmJkYTk0NTkxYmZjN2ZlNzNkZWQzNGQzNDZjOTc0MThjODI0YTZjZDc0Y2VjNzA3YTJiYmU2OTRkY2Q1NGY1IiwiZmluZ2VycHJpbnQiOiIwNWRkZjZjOTEyMjdhOTA5YmVkOTU4Njg1Y2Q1YzgxNjBjM2M2NDUxZTYxNjMxZGJmYzk4NGM3MjU3ODJiYmVmIiwiam9pbl9hZGRyZXNzZXMiOlsiMTAuMTUuMTUuNTE6NzQ0MyJdfQ==
Error: failed to generate the configuration: failed to locate IP on public network 10.15.15.51/32: no IP belongs to provided subnet 10.15.15.51/32
richard@node2:~$ ping 10.15.15.51
PING 10.15.15.51 (10.15.15.51) 56(84) bytes of data.
64 bytes from 10.15.15.51: icmp_seq=1 ttl=64 time=0.443 ms
64 bytes from 10.15.15.51: icmp_seq=2 ttl=64 time=0.462 ms
^C
--- 10.15.15.51 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1046ms
rtt min/avg/max/mdev = 0.443/0.452/0.462/0.009 ms
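The join tokens in the logs above appear to be base64-encoded JSON; the tail of each token decodes to a `join_addresses` list containing 10.15.15.51:7443. Below is a minimal Python sketch for inspecting a token. The helper name `decode_join_token` is hypothetical, and the field names are simply what the tokens in this report decode to, not a documented format.

```python
import base64
import json


def decode_join_token(token):
    """Decode a base64-encoded JSON join token into a dict."""
    return json.loads(base64.b64decode(token))


def join_addresses(token):
    """Return the list of addresses a joining node would contact."""
    return decode_join_token(token).get("join_addresses", [])
```

Passing the token printed by `microceph cluster add node2` to `join_addresses` shows which address node 2 is told to contact. Note that a token also carries a secret, so decoded output should be handled with the same care as the token itself.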
Additional comments.
…