No Management Proxy Node: Coordinator randomly goes down #606

@DiscordJim

Affected Stackable version

24.3

Affected Apache Druid version

28.0.1

Current and expected behavior

After roughly 3-4 days, the router displays "No Management Proxy Node". Our testing suggests that the router can no longer connect to the coordinator. However, the logs of all services look healthy, and the panel shows neither clear errors nor error codes.

This is difficult to debug precisely because no errors are reported anywhere.
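Since nothing is logged, one way to narrow this down might be to compare the coordinator's own health endpoint with the same request sent through the router's management proxy. The following is only a rough sketch: the `ROUTER` and `COORDINATOR` addresses are placeholders, and the helper names `fetch`, `classify`, and `check` are made up for illustration. It uses the standard Druid endpoints `/status/health`, `/druid/coordinator/v1/leader`, and the router's `/proxy/coordinator/...` route, and it does not handle TLS or the basic-security credentials this cluster uses, so those would need to be added.

```python
import json
import urllib.error
import urllib.request

# Placeholder addresses (assumptions): replace with your cluster's services and ports.
ROUTER = "http://druid-router:8888"
COORDINATOR = "http://druid-coordinator:8081"


def fetch(url, timeout=5.0):
    """Return (HTTP status, body) for a GET, or (None, error text) if no response arrives."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 401 or 503).
        return exc.code, str(exc.reason)
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar.
        return None, str(exc)


def classify(direct_status, proxied_status):
    """Turn the two status codes into a rough verdict."""
    if direct_status != 200:
        return "coordinator unreachable or unhealthy"
    if proxied_status != 200:
        return "coordinator healthy, but router cannot proxy to it"
    return "router can reach coordinator"


def check():
    """Probe the coordinator directly and through the router's management proxy."""
    direct_status, _ = fetch(COORDINATOR + "/status/health")
    _, leader = fetch(COORDINATOR + "/druid/coordinator/v1/leader")
    proxied_status, _ = fetch(ROUTER + "/proxy/coordinator/status")
    return {
        "direct_status": direct_status,
        "proxied_status": proxied_status,
        "leader": leader,
        "verdict": classify(direct_status, proxied_status),
    }
```

Running `print(json.dumps(check(), indent=2))` from a pod inside the cluster while the router is in the broken state would show whether the coordinator itself is still healthy and which leader address it reports.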

Possible solution

We have no fix yet. The only way we have found to recover from this state is to restart all services.

Additional context

  • Extensions: '["druid-kafka-indexing-service", "druid-datasketches", "prometheus-emitter", "druid-basic-security", "druid-opa-authorizer", "postgresql-metadata-storage", "druid-hdfs-storage", "druid-stats"]'
  • Deep Storage: HDFS
  • Metadata Store: Postgres

Environment

AKS

Would you like to work on fixing this bug?

None
