After node reboot, RabbitMQ rejoins the cluster but messages are being discarded. #2949

RabbitMQ version: 3.8.11
Erlang: 23.2.7, 22.0.2

Team,

We are seeing the following "Discarding message" errors after a node restarts and tries to rejoin the existing HA member.
```
2021-03-08 16:53:39.052 [error] emulator Discarding message {'$gen_call',{<0.1341.0>,#Ref<0.3121386697.1169686531.88317>},stat} from <0.1341.0> to <0.4171.0> in an old incarnation (1615219349) of this node (1615222369)

2021-03-08 16:53:39.054 [error] <0.1338.0> Discarding message {'$gen_call',{<0.1338.0>,#Ref<0.3121386697.1169686532.73676>},stat} from <0.1338.0> to <0.4181.0> in an old incarnation (1615219349) of this node (1615222369)

2021-03-08 16:53:39.054 [error] emulator Discarding message {'$gen_call',{<0.1338.0>,#Ref<0.3121386697.1169686532.73676>},stat} from <0.1338.0> to <0.4181.0> in an old incarnation (1615219349) of this node (1615222369)

2021-03-08 16:53:39.054 [error] <0.1343.0> Discarding message {'$gen_call',{<0.1343.0>,#Ref<0.3121386697.1169686531.88323>},stat} from <0.1343.0> to <0.4157.0> in an old incarnation (1615219349) of this node (1615222369)

2021-03-08 16:53:39.054 [error] <0.1332.0> Discarding message {'$gen_call',{<0.1332.0>,#Ref<0.3121386697.1169686529.99253>},stat} from <0.1332.0> to <0.4160.0> in an old incarnation (1615219349) of this node (1615222369)

2021-03-08 16:53:39.054 [error] emulator Discarding message {'$gen_call',{<0.1343.0>,#Ref<0.3121386697.1169686531.88323>},stat} from <0.1343.0> to <0.4157.0> in an old incarnation (1615219349) of this node (1615222369)

2021-03-08 16:53:39.054 [error] <0.1337.0> Discarding message {'$gen_call',{<0.1337.0>,#Ref<0.3121386697.1169686531.88324>},stat} from <0.1337.0> to <0.4181.0> in an old incarnation (1615219349) of this node (1615222369)

```

This is from a Kubernetes (K8s) environment where EPMD and RabbitMQ are co-located in one container, so when we restart RabbitMQ, both EPMD and RabbitMQ are restarted. We talked to the Erlang developers and they suggested we reach out to the RabbitMQ developers about this. The ask here is that when RabbitMQ is restarted together with EPMD, it should handle the restart without creating issues for the cluster.


The reply below is from an Erlang developer:

"The process identifiers used when sending these messages identifies a node with the same nodename as the receiving node, but it is an old instance of the node that is identified.

A node is identified by its name and an integer value called "creation" which is assigned when the Erlang distribution is started.
Both nodename and creation is stored in all process identifiers. If the nodename match but creation doesn't match when sending a message using a process identifier, the receiving node will print messages like below and drop the message (since the receiving process doesn't exist on the node). In your case below, the creation of the old instance is 1615219349 and the new instance is 1615222369.

Either the receiving node has been restarted with the same name, or the Erlang distribution on the receiving node has been restarted under the same name. In both cases a new creation will be assigned to the node and it will reject messages directed to the old instance of the node.

In OTP 23 we began using 32-bit creation values. In OTP 22 these values were only 2-bits. That is, in OTP 22 creation values were reused *very* quickly. This is probably the reason to why you don't see this issue as often with OTP 22.

I think you have to turn to the RabbitMQ team for support regarding this and/or the person(s) that have configured your system."
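To make the explanation above concrete, here is a minimal Python sketch that pulls the two creation values out of one of the log lines quoted earlier and applies the "name matches, creation differs, so drop" rule described in the reply. The regular expression and the names `parse_discard`, `would_deliver`, and `max_distinct_creations` are hypothetical helpers written for this illustration; this is a toy model of the rule, not how the Erlang runtime or RabbitMQ implements it.

```python
import re

# Hypothetical pattern for the tail of the "Discarding message" log line.
INCARNATION_RE = re.compile(
    r"from (?P<sender><[\d.]+>) to (?P<target><[\d.]+>) "
    r"in an old incarnation \((?P<old>\d+)\) of this node \((?P<new>\d+)\)"
)


def parse_discard(line):
    """Return sender, target and both creation values, or None if no match."""
    m = INCARNATION_RE.search(line)
    if m is None:
        return None
    return {
        "sender": m["sender"],
        "target": m["target"],
        "old_creation": int(m["old"]),
        "new_creation": int(m["new"]),
    }


def would_deliver(pid_creation, node_creation):
    """Toy model of the rule in the reply: the node name already matches,
    so the message is only deliverable if the creation matches as well."""
    return pid_creation == node_creation


def max_distinct_creations(bits):
    """Upper bound on distinct creation values for a given field width.
    Assumption: every bit pattern is usable; the real runtime may reserve some."""
    return 2 ** bits


# One of the log lines from this report, copied verbatim.
LINE = (
    "2021-03-08 16:53:39.052 [error] emulator Discarding message "
    "{'$gen_call',{<0.1341.0>,#Ref<0.3121386697.1169686531.88317>},stat} "
    "from <0.1341.0> to <0.4171.0> in an old incarnation (1615219349) "
    "of this node (1615222369)"
)

info = parse_discard(LINE)
print(info["old_creation"], info["new_creation"])
print(would_deliver(info["old_creation"], info["new_creation"]))  # False
print(max_distinct_creations(2), max_distinct_creations(32))
```

With a 2-bit field there are at most 4 possible values, so a restarted node quickly lands on a creation that stale pids still carry, which fits the reply's explanation of why the errors are rarer on OTP 22. With 32 bits a stale pid effectively never matches the new incarnation.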






