@dcorbacho
Contributor

These shovels are stuck in a restart loop and need to be listed in shovel status, which also allows them to be deleted.

See #14623

@mkuratczyk
Contributor

One of two scenarios seems to be resolved.

Scenario 1 (seems solved):
I declared many shovels with an invalid src-uri:

for i in (seq 100); rabbitmqctl set_parameter shovel myshovel-$i '{"src-protocol": "amqp091", "src-uri": "amqp://foo", "src-queue": "q1", "dest-protocol": "amqp10", "dest-uri": "amqp://localhost", "dest-address": "/queues/q2"}'; end;

and deleted them shortly after:

for i in (seq 100); rabbitmqctl delete_shovel myshovel-$i; end

This was not deterministic, but afterwards I'd usually still have one shovel listed on the Shovel Status page, even though it was "deleted" (the delete_shovel command succeeded). I can no longer reproduce this.
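The two loops above use fish syntax (`for i in (seq 100); ...; end`) and will not parse in bash or zsh. A minimal bash sketch of the same reproduction follows. The function names `declare_shovels` and `delete_shovels` and the `SHOVEL_DEF` variable are made up for illustration; the `rabbitmqctl` invocations and the shovel definition are copied from the scenario.

```shell
#!/usr/bin/env bash
# Bash port of the fish loops from Scenario 1.
# Assumes rabbitmqctl is on PATH and a node is running.
# Function and variable names here are illustrative, not part of RabbitMQ.

# Shovel definition from the scenario; "amqp://foo" is the invalid src-uri.
SHOVEL_DEF='{"src-protocol": "amqp091", "src-uri": "amqp://foo", "src-queue": "q1", "dest-protocol": "amqp10", "dest-uri": "amqp://localhost", "dest-address": "/queues/q2"}'

# Declare myshovel-1 .. myshovel-N (N defaults to 100, as in the scenario).
declare_shovels() {
  local count="${1:-100}" i
  for i in $(seq 1 "$count"); do
    rabbitmqctl set_parameter shovel "myshovel-$i" "$SHOVEL_DEF"
  done
}

# Delete the same shovels again.
delete_shovels() {
  local count="${1:-100}" i
  for i in $(seq 1 "$count"); do
    rabbitmqctl delete_shovel "myshovel-$i"
  done
}
```

After sourcing the file, running `declare_shovels; delete_shovels` and then checking the Shovel Status page should match the steps described above. Since the original behaviour was not deterministic, several runs may be needed.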

Scenario 2 (still failing):
Declare a shovel with a non-existent dest-address:

rabbitmqctl set_parameter shovel myshovel '{"src-protocol": "amqp091", "src-uri": "amqp://localhost", "src-queue": "q1", "dest-protocol": "amqp10", "dest-uri": "amqp://localhost", "dest-address": "/queues/q2"}'

Such a shovel fails with:

{outbound_link_detached,{'v1_0.error',{symbol,<<"amqp:not-found">>},
                                      {utf8,<<"no queue 'q2' in vhost '/'">>},
                                      undefined}}

Now try to delete it:

$ rabbitmqctl delete_shovel myshovel
Deleting shovel myshovel in vhost /
Stack trace: 

** (FunctionClauseError) no function clause matching in :proplists.get_value/3
    (stdlib 7.1) proplists.erl:222: :proplists.get_value(:node, {:outbound_link_detached, {:"v1_0.error", {:symbol, "amqp:not-found"}, {:utf8, "no queue 'q2' in vhost '/'"}, :undefined}}, :rabbit@K6L59PF0JR)
    (rabbitmq_shovel 4.2.0+beta.4.10.g9f39f60.dirty) Elixir.RabbitMQ.CLI.Ctl.Commands.DeleteShovelCommand.erl:84: RabbitMQ.CLI.Ctl.Commands.DeleteShovelCommand.run/2
    (rabbitmqctl 4.2.0+beta.4.9.ga09383d.dirty) lib/rabbitmqctl.ex:185: RabbitMQCtl.maybe_run_command/3
    (rabbitmqctl 4.2.0+beta.4.9.ga09383d.dirty) lib/rabbitmqctl.ex:153: anonymous fn/5 in RabbitMQCtl.do_exec_parsed_command/5
    (rabbitmqctl 4.2.0+beta.4.9.ga09383d.dirty) lib/rabbitmqctl.ex:653: RabbitMQCtl.maybe_with_distribution/3
    (rabbitmqctl 4.2.0+beta.4.9.ga09383d.dirty) lib/rabbitmqctl.ex:118: RabbitMQCtl.exec_command/2
    (rabbitmqctl 4.2.0+beta.4.9.ga09383d.dirty) lib/rabbitmqctl.ex:52: RabbitMQCtl.main1/1
    (elixir 1.18.4) lib/kernel/cli.ex:137: anonymous fn/3 in Kernel.CLI.exec_fun/2

Error:
:function_clause

@mkuratczyk
Contributor

One more scenario I just tried: a happily running shovel can be deleted from any node (rabbitmqctl -n value). However, a shovel with an invalid URL, even with this branch, still has to be deleted by targeting the specific node it runs on:

make start-cluster

# declare a shovel running on rabbit-1, the URL is not reachable
rabbitmqctl -n rabbit-1 set_parameter shovel myshovel-1 '{"src-protocol": "amqp091", "src-uri": "amqp://localhost", "src-queue": "q1", "dest-protocol": "amqp10", "dest-uri": "amqp://foo", "dest-address": "/queues/q2"}'

# declare a shovel running on rabbit-2, the URL is not reachable
rabbitmqctl -n rabbit-2 set_parameter shovel myshovel-2 '{"src-protocol": "amqp091", "src-uri": "amqp://localhost", "src-queue": "q1", "dest-protocol": "amqp10", "dest-uri": "amqp://foo", "dest-address": "/queues/q2"}'

# delete shovel on rabbit-1 from rabbit-1 (works)
rabbitmqctl -n rabbit-1 delete_shovel myshovel-1

# delete shovel on rabbit-2 from rabbit-1 (doesn't work)
rabbitmqctl -n rabbit-1 delete_shovel myshovel-2
Deleting shovel myshovel-2 in vhost /
Error:
Shovel with the given name was not found on the target node 'rabbit-1@K6L59PF0JR' and/or virtual host '/'. It may be failing to connect and report its state, will delete its runtime parameter...

@michaelklishin
Collaborator

Just FTR, distributed shovels in Tanzu RabbitMQ should deal with that scenario well @mkuratczyk :)

 %% terminated
 ({Name, Type, {terminated, Reason}, Metrics, Timestamp}) ->
-    {Name, Type, {terminated, Reason}, Metrics, Timestamp};
+    {Name, Type, {terminated, [{node, Node}], Reason}, Metrics, Timestamp};
Contributor Author


@michaelklishin Is this a breaking change? It seems only the delete/restart CLI commands use cluster_status_with_nodes.

@dcorbacho dcorbacho marked this pull request as ready for review September 30, 2025 08:22
@michaelklishin michaelklishin added this to the 4.3.0 milestone Sep 30, 2025
@michaelklishin michaelklishin merged commit 270c43f into main Sep 30, 2025
285 checks passed
@michaelklishin michaelklishin deleted the issue-14623 branch September 30, 2025 16:25
michaelklishin added a commit that referenced this pull request Sep 30, 2025
Shovels: fix shovel status and deletion of failed shovels (backport #14637)