Workers randomly die for some reason #3612

@MoralCode

This issue mostly exists to document some odd behavior I've seen, so that I can reference it in a PR.

Main symptoms: all workers show "offline" in Flower, and the RabbitMQ logs fill with unhelpful (but scary-looking) red text. Much of it looks like the trace below: heavily indented, almost as if it is dumping its internal state. (The odd characters at the beginning and end of each line are ANSI color escape codes.)

Stack Trace
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>     reason: {noproc,{gen_server,call,[mnesia_sync,sync,infinity]}}�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>     offender: [{pid,<0.533.0>},�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                {id,rabbit_amqqueue},�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                {mfargs,�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                    {rabbit_prequeue,start_link,�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                        [{amqqueue,�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                             {resource,<<"augur_vhost">>,queue,<<"celery">>},�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                             true,false,none,[],<0.533.0>,[],[],[],undefined,�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                             undefined,[],[],stopped,0,[],<<"augur_vhost">>,�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                             #{user => <<"augur">>},�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                             rabbit_classic_queue,#{}},�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                         recovery,<0.531.0>]}},�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                {restart_type,transient},�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                {significant,true},�[0m
rabbitmq-1      | �[38;5;160m2026-01-16 03:30:43.238430+00:00 [error] <0.532.0>                {shutdown,600000},�[0m
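
The color codes make a trace like this harder to read and search. Below is a minimal sketch of a filter that strips them, not something Augur ships. The name `strip_ansi` is hypothetical. It assumes the codes are plain SGR color sequences (`ESC[...m`) and that a copy-pasted log may show the ESC byte as the Unicode replacement character, as it appears in this issue.

```python
import re

# Matches SGR color sequences such as ESC[38;5;160m and ESC[0m.
# U+FFFD is accepted as well, on the assumption that a copy-pasted log
# may render the ESC byte as the replacement character.
ANSI_SGR = re.compile(r"[\x1b\ufffd]\[[0-9;]*m")


def strip_ansi(line: str) -> str:
    """Return the log line with color escape sequences removed."""
    return ANSI_SGR.sub("", line)


# Example: clean one line before printing or searching it.
cleaned = strip_ansi("\x1b[38;5;160m[error] <0.532.0> {restart_type,transient},\x1b[0m")
print(cleaned)
```

Bracketed log fields such as `[error]` are left alone, because the pattern only matches a bracket that directly follows the escape character.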

After seeing this a couple of times, I suspect it is either a startup race condition or an issue tied to running out of space on the root disk. I recall @sgoggins mentioning recently that some part of Augur handles a full disk particularly poorly.
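
To test the disk-space theory the next time this happens, a quick check of free space on the root disk would help rule it in or out. This is only a sketch: the 10% threshold and the function names are hypothetical, and it says nothing about which component actually fails when the disk fills up.

```python
import shutil

# Hypothetical threshold: treat the disk as "low" below 10% free space.
MIN_FREE_FRACTION = 0.10


def free_fraction(path: str = "/") -> float:
    """Return the fraction of the filesystem containing `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def disk_is_low(path: str = "/", min_free: float = MIN_FREE_FRACTION) -> bool:
    """Return True when free space on `path` is below the threshold."""
    return free_fraction(path) < min_free


print(f"root disk free: {free_fraction('/'):.1%}, low: {disk_is_low('/')}")
```

Run inside a container, this reports the filesystem the container sees, so it should be run on the host (or against the mounted volume path) to check the root disk itself.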
