Fix Shape.Supervisor not restarting #3630

@robacourt

Current behaviour

If a child of the Shape.Supervisor crashes more than 3 times in 5 seconds, the Shape.Supervisor shuts down and is not restarted because it is transient: exceeding the restart intensity makes it exit with reason :shutdown, which a transient child spec treats as a normal exit. The WAL will then start to build up. If the connection restarts, the Connection.Manager will then report that the Shape.Supervisor is :already_present.

This happened in production for AutoArc when the ShapeLogCollector crashed 4 times in under a second due to trying to call a missing Materializer.
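The failure mode can be reproduced outside Electric with plain OTP supervisors. The script below is only a minimal sketch: `Demo.Worker` and `Demo.ShapeSupervisor` are hypothetical stand-ins, not modules from the codebase, and the restart limits are written out explicitly on the assumption that they match the 3-in-5-seconds behaviour described above.

```elixir
# Hypothetical stand-ins; none of these modules come from the Electric codebase.
defmodule Demo.Worker do
  use GenServer

  def start_link(_arg), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(state), do: {:ok, state}
end

defmodule Demo.ShapeSupervisor do
  # restart: :transient means the parent restarts this supervisor only
  # after an abnormal exit. An exit with reason :shutdown counts as normal.
  use Supervisor, restart: :transient

  def start_link(_arg), do: Supervisor.start_link(__MODULE__, nil, name: __MODULE__)

  @impl true
  def init(_arg) do
    Supervisor.init([Demo.Worker], strategy: :one_for_one, max_restarts: 3, max_seconds: 5)
  end
end

{:ok, parent} = Supervisor.start_link([Demo.ShapeSupervisor], strategy: :one_for_one)

# Kill the worker 4 times in quick succession. The 4th crash exceeds
# max_restarts, so Demo.ShapeSupervisor gives up and exits with :shutdown.
for _ <- 1..4 do
  pid = Process.whereis(Demo.Worker)
  ref = Process.monitor(pid)
  Process.exit(pid, :kill)

  receive do
    {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
  end

  # Give the supervisors time to react before the next kill.
  Process.sleep(100)
end

# The transient supervisor is gone and the parent has not restarted it.
supervisor_pid = Process.whereis(Demo.ShapeSupervisor)

# Its child spec is still registered with the parent, so starting it again
# under the same id is rejected.
start_result = Supervisor.start_child(parent, Demo.ShapeSupervisor)

# Restarting the existing spec is what actually brings it back.
restart_result = Supervisor.restart_child(parent, Demo.ShapeSupervisor)

IO.inspect({supervisor_pid, start_result, restart_result})
```

In this sketch `start_result` is `{:error, :already_present}`, the same value the Connection.Manager sees, and it means the supervisor is down rather than running. Changing `restart: :transient` to `restart: :permanent` makes the parent restart the supervisor on its own.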

Suggested behaviour

If a child of the Shape.Supervisor crashes more than 3 times in 5 seconds, the Shape.Supervisor should be restarted, but with all shape data wiped. It is important that the shape data is wiped in this situation, as otherwise the ShapeLogCollector will go on trying to process the txn/shape combinations it crashes on, going into an infinite crash loop. Ideally the Shape.Supervisor should be permanent, not transient.

Please note that the Connection.Manager should not match on :already_present. :already_present indicates that a serious issue has already happened (the shape subsystem has gone down and has not been restarted), and there is no point in handling this gracefully.
