A question about a quorum queue behavior on 3.13.0 #12404
Replies: 2 comments
-
RabbitMQ 3.13.x is out of community support. Even before 4.0.x shipped, we would not help you with anything on that series. I'm afraid a screenshot with a few metrics that are all zeroes does not prove or demonstrate anything. The logs demonstrate that two nodes lost network connectivity and then a few quorum queues went through a leader election. Finally, "could not infer the number of file handles" has nothing to do with quorum queues per se; it is a message logged by an external/infrastructure node stats collector.
-
@GroovyRice GitHub does not send out notifications when an issue is moved to a discussion, so here is your notification. See my response above.
-
Describe the bug
This might not be the right place for this; please change or reassign it if necessary. First, some contextual understanding of the environment. It's a cluster of 4 nodes:
(2) in an air-gapped environment: rabbit@5BFVPMC01 and rabbit@5BFVPMC02
(2) outside the air-gapped environment: rabbit@EPCVMSG01 and rabbit@EPCVMSG02
They can communicate with each other as all the ports required are open.
On the 27th we received this issue in the RabbitMQ log:
At 18:22:53.886000 it appears that the connection from the air-gapped nodes to the others was lost. Upon reconnection, the quorum queues elected new leaders. The interesting part is that certain queues, like 5BF.PAMTransfers or 5BF.BurdenRecipe.Service, started experiencing a peculiar issue: they were not delivering any messages for particular routing keys, because the quorum queue would automatically acknowledge the messages and prevent the consumer from ever seeing them. To show that this was the issue, I dropped the consumer, and the queue kept acknowledging every message that came into it (I wasn't sure of the best way to demonstrate this).
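Rather than a screenshot, a more concrete way to capture this would probably be to poll the management HTTP API and watch the queue's cumulative ack counter while no consumer is attached. A rough sketch, assuming the management plugin is enabled on port 15672, the default vhost, and placeholder host/credentials:

```python
import time
import requests

# Assumptions: management plugin enabled, default vhost "/",
# placeholder host and credentials.
BASE = "http://localhost:15672/api"
AUTH = ("guest", "guest")
QUEUE = "5BF.PAMTransfers"  # one of the affected queues

def snapshot():
    # "%2F" is the URL-encoded default vhost "/"
    q = requests.get(f"{BASE}/queues/%2F/{QUEUE}", auth=AUTH).json()
    stats = q.get("message_stats", {})
    return {
        "ready": q.get("messages_ready"),
        "unacked": q.get("messages_unacknowledged"),
        "consumers": q.get("consumers"),
        "acked_total": stats.get("ack"),  # cumulative acks against this queue
    }

# With no consumer attached, "acked_total" should stay flat.
# If it keeps climbing while "consumers" is 0, something is removing
# messages without a consumer explicitly acknowledging them.
before = snapshot()
time.sleep(60)
after = snapshot()
print("before:", before)
print("after: ", after)
```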

I resolved the issue by restarting all of the nodes. I'm not sure why this occurred, or why it affected only certain queues and not all of them.
A hypothesis, though again I'm not sure: these queues have a routing key that is used frequently. I wonder if, midway through consumption, the air-gapped environment dropped, and on reconnection it kept telling all nodes that the message had been acknowledged, causing an endless cycle. I'm not sure what to do to check this theory.
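One thing that might help check this (a sketch only, with placeholder host/credentials and assuming the management plugin is enabled) is to look at how the consumers on the affected queues are registered. A consumer subscribed with auto-ack (ack_required false) causes the broker to treat messages as acknowledged on delivery, which would look exactly like the queue "acknowledging everything by itself":

```python
import requests

# Placeholder host and credentials; management plugin assumed enabled.
BASE = "http://localhost:15672/api"
AUTH = ("guest", "guest")
AFFECTED = {"5BF.PAMTransfers", "5BF.BurdenRecipe.Service"}

# /api/consumers lists every consumer in the cluster together with
# its queue, its channel, and whether it requires explicit acks.
for consumer in requests.get(f"{BASE}/consumers", auth=AUTH).json():
    queue = consumer.get("queue", {}).get("name")
    if queue in AFFECTED:
        print(
            queue,
            "tag:", consumer.get("consumer_tag"),
            "ack_required:", consumer.get("ack_required"),
            "channel:", consumer.get("channel_details", {}).get("name"),
        )
# ack_required == False means the consumer subscribed with auto-ack,
# so the broker considers messages acknowledged as soon as they are delivered.
```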
Reproduction steps
Unsure how to reproduce it, as it's more of an edge-case issue, or possibly a configuration issue on my end rather than a bug. Please move this to the correct label/branch if that's the case.
Expected behavior
Messages should not be automatically acknowledged and removed by quorum queues; they should remain until the consumers themselves acknowledge them.
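For reference, this is the consumer behaviour I expect, sketched with the Python pika client (host, credentials and prefetch are placeholders; the queue name is one of the affected queues). With a subscription like this, deliveries should sit in the "unacked" state until basic_ack is sent, and a dropped connection should make the broker requeue them rather than discard them:

```python
import pika

# Placeholder connection details.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.basic_qos(prefetch_count=10)  # limit in-flight deliveries

def on_message(ch, method, properties, body):
    # ... process the message ...
    # Only now is the message removed from the queue.
    ch.basic_ack(delivery_tag=method.delivery_tag)

# auto_ack=False: the broker keeps each delivery as "unacked" until basic_ack.
channel.basic_consume(
    queue="5BF.PAMTransfers",
    on_message_callback=on_message,
    auto_ack=False,
)
channel.start_consuming()
```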
Additional context
Please let me know if there is anything else I can provide to narrow down the issue or describe it more accurately.