This repository was archived by the owner on Jan 26, 2023. It is now read-only.

During a jobspec change to the redis-server task, the cluster did not automatically rebuild #34

@jcjones

I renamed the redis-server task (from server to redis-server) and redeployed the jobspec. After each allocation restarted under the new task name, it failed to rejoin the cluster until it was restarted a second time (via the Nomad GUI).

Attache-control logs:

time="2022-08-13T00:20:54Z" level=info msg="starting /usr/local/bin/attache-control"
time="2022-08-13T00:20:54Z" level=info msg="initializing a new redis client"
time="2022-08-13T00:20:54Z" level=info msg="initializing a new consul client"
time="2022-08-13T00:20:54Z" level=info msg="fetching scaling options from consul path 'service/redis-cluster/scaling'"
time="2022-08-13T00:20:57Z" level=info msg="this node is already part of an existing cluster"
time="2022-08-13T00:20:57Z" level=info msg="running until killed..."

Redis, however, is not part of a cluster:

10.0.32.81:20001> cluster nodes
0bd16fb965741d36e64304458b4f0264c248d25e 10.0.32.81:20001@30001 myself,master - 0 0 0 connected
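
To make the mismatch concrete, here is a minimal sketch of how the `cluster nodes` reply above could be checked for a node that only knows about itself. This is not attache-control's actual logic; the function names `parse_cluster_nodes` and `is_standalone` are hypothetical, and the only thing assumed is the standard space-separated field layout of the Redis `CLUSTER NODES` reply.

```python
from __future__ import annotations


def parse_cluster_nodes(output: str) -> list[dict]:
    """Parse the text reply of Redis `CLUSTER NODES` into one dict per node.

    Hypothetical helper for illustration. Field order per line:
    id, address, flags, master id, ping-sent, pong-recv, config-epoch,
    link-state, then zero or more slot ranges.
    """
    nodes = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        nodes.append({
            "id": fields[0],
            "addr": fields[1],
            "flags": fields[2].split(","),
            "link_state": fields[7],
            "slots": fields[8:],
        })
    return nodes


def is_standalone(output: str) -> bool:
    """True when the only known node is `myself` and it owns no slots."""
    nodes = parse_cluster_nodes(output)
    return (
        len(nodes) == 1
        and "myself" in nodes[0]["flags"]
        and not nodes[0]["slots"]
    )


# The exact line from the report above.
sample = (
    "0bd16fb965741d36e64304458b4f0264c248d25e 10.0.32.81:20001@30001 "
    "myself,master - 0 0 0 connected"
)
print(is_standalone(sample))  # prints True
```

Run against the output above, the check reports a standalone node, which contradicts the "already part of an existing cluster" message in the attache-control log.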

The Redis log is nearly empty, with no mention of being told to join:

1:C 13 Aug 2022 00:20:46.789 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 13 Aug 2022 00:20:46.789 # Redis version=6.2.7, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 13 Aug 2022 00:20:46.789 # Configuration loaded
1:M 13 Aug 2022 00:20:46.796 # A key '__redis__compare_helper' was added to Lua globals which is not on the globals allow list nor listed on the deny list.
1:M 13 Aug 2022 00:20:46.796 # Server initialized
1:M 13 Aug 2022 00:20:46.796 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 13 Aug 2022 00:20:46.847 # IP address for this node updated to 10.0.32.81

This is trivially fixable with operator intervention: just restart the alloc again.
