Registered processes

Paul Louth edited this page Jan 6, 2021 · 2 revisions

Source

Registering a Process does one of two things:

  • If the Process is local-only, then it gets registered in an in-memory map of names to ProcessIds
  • If the Process is visible to the cluster, then it gets registered in a Redis map of names to ProcessIds

Registered names are name-spaced with the role that the node doing the registering is in. The result of calling Process.register(name) is also a ProcessId that looks like this: /disp/reg/<role>-<name>. So there is a dispatcher for registered Processes called reg.

The default behaviour of this dispatcher is to get the full list of processes that have been registered with a specific name, and dispatch to all of them (broadcast). This is the least surprising behaviour, because it passes no judgement on who registered what, or when. It simply recognises that several processes are registered under the same name and that you are addressing a Process by name, so the message goes to all of them.
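As a rough mental model only (this is not the library's implementation), the naming and broadcast rules above can be sketched with an in-memory map. The class `ToyRegistry` and its members are hypothetical names invented for this illustration, and ProcessIds are modelled as plain strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model of the 'reg' dispatcher. Hypothetical code, not the library's.
public class ToyRegistry
{
    // Key: "<role>-<name>", value: every ProcessId registered under that key
    private readonly Dictionary<string, List<string>> map =
        new Dictionary<string, List<string>>();

    // Returns a path of the shape described above: /disp/reg/<role>-<name>
    public string Register(string role, string name, string pid)
    {
        var key = role + "-" + name;
        if (!map.TryGetValue(key, out var pids))
        {
            pids = new List<string>();
            map[key] = pids;
        }
        if (!pids.Contains(pid)) pids.Add(pid);
        return "/disp/reg/" + key;
    }

    // Default behaviour: a registered path resolves to every registration,
    // so a message sent to it would be broadcast to all of them.
    public IReadOnlyList<string> Resolve(string regPath)
    {
        const string prefix = "/disp/reg/";
        if (!regPath.StartsWith(prefix, StringComparison.Ordinal))
            return Array.Empty<string>();

        var key = regPath.Substring(prefix.Length);
        return map.TryGetValue(key, out var pids)
            ? pids.ToList()
            : new List<string>();
    }
}
```

In this model, registering two different ProcessIds as "smtp" from the "mail" role yields the same path both times, and resolving that path returns both ProcessIds.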

find

If you call Process.find(name) the system merely returns /disp/reg/<current_role>-<name> - so as processes register or de-register the number of possible destinations for a message increases and decreases dynamically.

The keen-eyed amongst you may realise that if n processes can register themselves as 'a named thing', then you can implement high-availability strategies. To that end, you can combine a registered ProcessId with other dispatcher behaviour. For example:

    ProcessId pid = Dispatch.leastBusy(find("mail-server"));
    tell(pid, msg);

The pid variable above would look like this:

    /disp/least-busy/disp/reg/<role>-mail-server

Being combinable is a general feature of dispatchers. You can imagine that the reg dispatcher returns a list of registered mail-server ProcessIds and then the least-busy dispatcher finds out which of those mail-server processes has the smallest queue before dispatching the message.

Use @name instead of find("name")

In any reasonably sized system you'll want to register all reachable endpoints so that you don't litter your code with path construction or hard-coded ProcessId strings. That means you'll be calling find("name") a lot, so there is a short-cut: "@name". For example:

    // This
    tell(find("name"), msg);

    // Becomes this
    tell("@name", msg);

Finding processes in other roles

Note that find("name") and "@name" will only find registered processes in your current role. To find registered processes in other roles, use: find("role","name") or "@role:name". This name-spacing of registered processes should remove the inevitable clash of common names like 'service' or 'app' whilst still allowing cross-role communication.
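To make the two shorthand forms concrete, here is a small sketch of how they could map onto the /disp/reg/<role>-<name> shape described earlier. It assumes that "@role:name" resolves to the same path shape as find; `ToyShorthand` and `Expand` are hypothetical names used only for this illustration and are not part of the library.

```csharp
using System;

// Hypothetical helper that mimics the "@name" / "@role:name" shorthand.
public static class ToyShorthand
{
    // Expands "@name" (current role) or "@role:name" (explicit role)
    // to the /disp/reg/<role>-<name> form.
    public static string Expand(string address, string currentRole)
    {
        if (string.IsNullOrEmpty(address) || address[0] != '@')
            throw new ArgumentException(
                "Expected an address starting with '@'", nameof(address));

        var body = address.Substring(1);
        var colon = body.IndexOf(':');

        // No colon: fall back to the caller's own role
        var role = colon < 0 ? currentRole : body.Substring(0, colon);
        var name = colon < 0 ? body : body.Substring(colon + 1);

        return "/disp/reg/" + role + "-" + name;
    }
}
```

Under these assumptions, a node in a role called "web" expanding "@smtp" stays inside its own role, while "@mail:smtp" reaches into the "mail" role.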

Registering dispatchers

You can take registered processes even further and instead of just registering a ProcessId that refers to a single Process, you can register a 'dispatcher' ProcessId. Remember, when you call register(name, pid) the pid is a ProcessId and so are the special dispatcher ProcessIds. So you could do this:

    var pid1 = spawn("proc1", ... );
    var pid2 = spawn("proc2", ... );
    var pid3 = spawn("proc3", ... );

    ProcessId parcelPassing = Dispatch.roundRobin(pid1,pid2,pid3);   //  /disp/round-robin/[pid1,pid2,pid3]

    var reg = register("pass-parcel", parcelPassing);

The value of reg would be:

    /disp/reg/<role>-pass-parcel

If you then did a series of tell calls against reg then the messages would be sent round-robin to pid1, pid2 and pid3. This has very similar functionality to routers without the need for a router Process.

Think the implications of that through a little further. Let's say you had two data-centres and wanted an 'eventually consistent' system in which the same message is sent to both data-centres, but only the least-busy of 3 nodes in each centre receives it. A node in each centre could register a least-busy dispatcher ProcessId under the same registered name, and because the default behaviour of the registered dispatcher is to broadcast, you'd get exactly the behaviour you wanted.
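The two data-centre scenario can be modelled in a few lines to show why the combination works. This is a toy simulation, not library code: `ToyDispatch`, `LeastBusy` and `Broadcast` are hypothetical names, each node is modelled as a plain list acting as its inbox, and each dispatcher is modelled as a delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model of composing dispatchers. Hypothetical code, not the library's.
public static class ToyDispatch
{
    // Least-busy: deliver to the inbox with the fewest queued messages.
    // OrderBy is a stable sort, so ties go to the earliest inbox listed.
    public static Action<string> LeastBusy(params List<string>[] inboxes) =>
        msg => inboxes.OrderBy(q => q.Count).First().Add(msg);

    // Broadcast: deliver to every target registered under the same name.
    public static Action<string> Broadcast(IEnumerable<Action<string>> targets) =>
        msg => { foreach (var target in targets.ToList()) target(msg); };
}
```

If each centre wraps its three inboxes with `LeastBusy` and both results are passed to `Broadcast`, a single send adds the message to exactly one inbox per centre, namely the shortest one at that moment.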

Note that this isn't an aliveness system (see the roles section for that). A registered Process stays registered until:

  • You call deregisterById(pid)
  • You call kill(pid)

Killing a process wipes its state, inbox and registrations. If you want to kill a process but maintain its cluster state, call shutdown(pid) instead. So if a registered Process is offline, its inbox will keep filling up until it comes back online, which facilitates eventually consistent behaviour.

In that sense its behaviour is more like a DNS system for processes.

De-registering

To de-register a process you should call:

    deregisterById(pid);

You must provide the exact same ProcessId as you did when you registered, otherwise nothing will happen. This matters most when you de-register a dispatcher or role, where it is easy to pass a different ProcessId from the one that was registered.

You can also wipe all registrations for a name:

    deregisterByName(name);

That will clear all registrations for the name specified. This is pretty brutal behaviour, because you don't know who else in the cluster has registered a Process and you're basically wiping their decision. You could use it as a type of leader election system (by deregistering everyone else and registering yourself), but note that the wipe-then-register sequence isn't atomic, and is therefore not particularly bulletproof.
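To see why the lack of atomicity matters, consider two nodes that each run "wipe the name, then register myself" at about the same time. The sketch below replays one possible interleaving against a toy name table. `ToyNameTable` and its members are hypothetical names for this illustration and do not reflect how the library stores registrations.

```csharp
using System.Collections.Generic;

// Toy name table used to replay a racy election. Hypothetical code.
public class ToyNameTable
{
    private readonly Dictionary<string, List<string>> map =
        new Dictionary<string, List<string>>();

    // Wipes every registration under the name
    public void DeregisterByName(string name) => map.Remove(name);

    public void Register(string name, string pid)
    {
        if (!map.TryGetValue(name, out var pids))
            map[name] = pids = new List<string>();
        pids.Add(pid);
    }

    public IReadOnlyList<string> Registered(string name) =>
        map.TryGetValue(name, out var pids) ? pids.ToArray() : new string[0];

    // Nodes A and B both try to become "leader"; their steps interleave.
    public static IReadOnlyList<string> RacyElection()
    {
        var table = new ToyNameTable();
        table.Register("leader", "/node-old/user/app");

        table.DeregisterByName("leader");               // A wipes
        table.DeregisterByName("leader");               // B wipes
        table.Register("leader", "/node-a/user/app");   // A registers
        table.Register("leader", "/node-b/user/app");   // B registers

        // Both A and B are now registered as "leader"
        return table.Registered("leader");
    }
}
```

With this interleaving the name ends up with two registrations, so a message sent to it would be broadcast to both would-be leaders.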
