Merged
Changes from 3 commits
19 changes: 12 additions & 7 deletions src/Control/Distributed/Process/Platform/Supervisor.hs
@@ -1169,18 +1169,23 @@ doRestartChild _ spec _ state = do -- TODO: use ProcessId and DiedReason to log
       case start' of
         Right (ref, st') -> do
           return $ markActive st' ref spec
-        Left _ -> do -- TODO: handle this by policy
+        Left err -> do
           -- All child failures are handled via monitor signals, apart from
-          -- BadClosure, which comes back from doStartChild as (Left err).
-          -- Since we cannot recover from that, there's no point in trying
-          -- to start this child again (as the closure will never resolve),
-          -- so we remove the child forthwith. We should provide a policy
-          -- for handling this situation though...
-          return $ ( (active ^: Map.filter (/= chKey))
+          -- BadClosure and UnresolvableAddress from the StarterProcess
+          -- variants of ChildStart, which both come back from
+          -- doStartChild as (Left err).
+          sup <- getSelfPid
+          logEntry Log.error $
+            mkReport "Unrecoverable error in child" sup (childKey spec) (show err)
+          if isTemporary (childRestart spec)
+            then return $ ( (active ^: Map.filter (/= chKey))
                      . (bumpStats Active chType decrement)
                      . (bumpStats Specified chType decrement)
                      $ removeChild spec st
                      )
+            else die $ "Unrecoverable error in child " ++ (childKey spec)
Member

I'm not sure if die is the right way to crap out here. It might be better to refactor the type of this function so that we can stop with a non-normal exit reason. The question is, do we want to propagate any information to the children at this point? If there is a shutdown strategy in place (order, timeouts, etc) do we want to observe the configured policies when terminating the other children? Just crashing the supervisor process is a pretty extreme step to take.

Let me poke around in the code and see how things fit (since I've forgotten!) and we can discuss this point a bit more.

Contributor Author

Right. What I intended was to shut down the other children and exit abnormally. I was forgetting that die is too brutal for that.
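
To make the alternative discussed above concrete, here is a minimal, pure sketch of returning an outcome instead of calling die. Everything in it is a hypothetical stand-in (Outcome, SupState, handleStartFailure, shutdownOrder, step, and the simplified ChildSpec and RestartPolicy); none of it is the library's real API, and reverse start order for shutdown is an assumption, since a real supervisor would consult its configured policy.

```haskell
module Main where

-- All names below are hypothetical stand-ins, not the library's real types.

data RestartPolicy = Temporary | Transient | Permanent
  deriving (Eq, Show)

data ChildSpec = ChildSpec
  { childKey     :: String
  , childRestart :: RestartPolicy
  } deriving (Eq, Show)

-- Keys of running children in start order; stands in for the real state.
newtype SupState = SupState { activeChildren :: [String] }
  deriving (Eq, Show)

-- What the supervisor loop should do after a child fails to start.
data Outcome
  = Continue SupState         -- keep running with the updated state
  | StopWith String SupState  -- stop abnormally, carrying a reason
  deriving (Eq, Show)

-- Decide how to react when a child cannot be started at all.
-- The failed child is dropped from the state in both branches.
handleStartFailure :: ChildSpec -> String -> SupState -> Outcome
handleStartFailure spec err st
  | childRestart spec == Temporary = Continue st'
  | otherwise = StopWith reason st'
  where
    st'    = SupState (filter (/= childKey spec) (activeChildren st))
    reason = "Unrecoverable error in child " ++ childKey spec ++ ": " ++ err

-- Assumed shutdown order: reverse of start order.
shutdownOrder :: SupState -> [String]
shutdownOrder = reverse . activeChildren

-- Interpret an outcome. Right means carry on with the new state; Left
-- lists the children to stop (in order) together with the exit reason.
step :: Outcome -> Either ([String], String) SupState
step (Continue st)        = Right st
step (StopWith reason st) = Left (shutdownOrder st, reason)

main :: IO ()
main = do
  let st = SupState ["a", "b", "c"]
  print (step (handleStartFailure (ChildSpec "b" Temporary) "BadClosure" st))
  print (step (handleStartFailure (ChildSpec "b" Permanent) "BadClosure" st))
```

Returning a value like this leaves the decision about how to terminate the remaining children with the caller, which is where the shutdown strategy would be applied before exiting with the non-normal reason.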

+          -- TODO: convert this to a meaningful exception type
+
   where
     chKey = childKey spec
     chType = childType spec