feat(backend): init horizontal clustering #1201

Open
lusergit wants to merge 5 commits into edgehog-device-manager:main from lusergit:push-nrvnswvyxzqv
lusergit commented Jan 28, 2026

Horizontally clustering edgehog

Based on #1199

These changes aim to scale edgehog automatically when multiple nodes can see each other.

libcluster

libcluster is added so that different nodes can discover each other. Depending on the deployment environment, different strategies can be selected to query services and discover other edgehog instances in the same cluster.

horde

Horde is added so that registries and the processes they track can share the load across multiple replicas. This allows better management of active processes and handles netsplits automatically.

Checklist

  • I have read the CONTRIBUTING.md
  • I have added tests that prove my fix is effective or that my feature works
  • I have added or updated documentation (if appropriate)

Further Comments

Communication between the different edgehog services happens through the pubsub module (Edgehog.PubSub), which internally uses Phoenix.PubSub. This mechanism already shares messages between replicas, so messages pass freely from one replica to another. For example, a campaign is started on one node, so its process is active on that node, while a deployment is updated on another node. In this case the pubsub mechanism correctly notifies all listening services, even those on different nodes, and the services should keep working as usual.
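The cross-node behaviour described above can be sketched with plain Phoenix.PubSub calls. This is only an illustration: `Example.PubSub`, the `"deployments"` topic, and the message shape are hypothetical names, not taken from the edgehog codebase, and the sketch assumes a `{Phoenix.PubSub, name: Example.PubSub}` child is already running on every node.

```elixir
defmodule PubSubSketch do
  # Hypothetical topic name, used only for this illustration.
  @topic "deployments"

  # Called by a process (for example a campaign executor) that wants updates.
  # The subscription is local to the calling process on its own node.
  def subscribe do
    Phoenix.PubSub.subscribe(Example.PubSub, @topic)
  end

  # Called on whichever node handles the update. With the default adapter,
  # the message is delivered to subscribers on all connected nodes.
  def notify_updated(deployment_id) do
    Phoenix.PubSub.broadcast(Example.PubSub, @topic, {:deployment_updated, deployment_id})
  end
end
```

A subscriber receives `{:deployment_updated, id}` as a regular process message, regardless of which connected node called `notify_updated/1`.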

Optional: testing

Testing this new feature can be done by adding a `deploy` section to the `edgehog-backend` service

  edgehog-backend:
    image: edgehogdevicemanager/edgehog-backend:0.10.0
    build:
      context: backend
    # This section here
    deploy:
      replicas: 6
    # ...

in the docker-compose.yaml file.

@lusergit lusergit force-pushed the push-nrvnswvyxzqv branch 3 times, most recently from 6b06d59 to 13cf0b7 Compare January 29, 2026 08:52

coveralls commented Jan 29, 2026

Pull Request Test Coverage Report for Build 0f8a90b75d89dc74710f86829858fc0614fd290a-PR-1201

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 24 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.3%) to 81.35%

Files with Coverage Reduction                  New Missed Lines   %
lib/edgehog/application.ex                     1                  88.89%
lib/edgehog/devices/reconciler/reconciler.ex   7                  31.82%
lib/edgehog/config.ex                          16                 41.18%

Totals
Change from base Build 898ad56e5c2bac7413e1fdf031b5c9c6243cad6a: -0.3%
Covered Lines: 2735
Relevant Lines: 3362

💛 - Coveralls

@lusergit lusergit force-pushed the push-nrvnswvyxzqv branch 12 times, most recently from 7c1c7c6 to bc88347 Compare February 3, 2026 13:38
@lusergit lusergit force-pushed the push-nrvnswvyxzqv branch 6 times, most recently from ab5cb83 to 5d89de1 Compare February 12, 2026 10:09
@lusergit lusergit force-pushed the push-nrvnswvyxzqv branch 6 times, most recently from 9a51c35 to ab965df Compare February 19, 2026 16:28
@lusergit lusergit force-pushed the push-nrvnswvyxzqv branch 2 times, most recently from 0f57caf to 4948a33 Compare February 23, 2026 11:56
@lusergit lusergit force-pushed the push-nrvnswvyxzqv branch 4 times, most recently from 2977a5e to a475ed8 Compare March 2, 2026 16:24
@lusergit lusergit force-pushed the push-nrvnswvyxzqv branch 4 times, most recently from 518c0a1 to 159c707 Compare March 9, 2026 16:20
lusergit added 5 commits March 9, 2026 17:40
Signed-off-by: Luca Zaninotto <luca.zaninotto@secomind.com>
Adds libcluster configs to scale edgehog horizontally.

This allows edgehog to manage the workload among different replicas for
- campaign execution
- notifications
- reconciliation tasks

New environment variables have been introduced to allow different edgehog nodes
to see each other based on the deployment strategy.

- `EDGEHOG_CLUSTERING_STRATEGY`: one of `none`, `docker-compose` or
  `kubernetes`. This chooses the strategy edgehog will use to look up other
  nodes in the cluster.

- `EDGEHOG_CLUSTERING_KUBERNETES_SELECTOR`: the endpoint label used to find
  other edgehog instances. This defaults to `app=edgehog`.

- `EDGEHOG_CLUSTERING_KUBERNETES_NAMESPACE`: the kubernetes namespace in which
  to find other edgehog instances. This defaults to `edgehog`.
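As a rough sketch of how the variables above could map onto libcluster topologies (not necessarily how this PR implements it): the module name `ClusteringSketch`, the `edgehog` node basename, the `:ip` mode, the `edgehog-backend` DNS query, the use of `Cluster.Strategy.DNSPoll` for `docker-compose`, and falling back to `none` when the strategy is unset are all assumptions made for illustration.

```elixir
defmodule ClusteringSketch do
  # Builds a libcluster topology list from a map of environment variables.
  # Defaults to the real process environment when no map is given.
  # An unknown strategy value raises a CaseClauseError.
  def topologies(env \\ System.get_env()) do
    case Map.get(env, "EDGEHOG_CLUSTERING_STRATEGY", "none") do
      "kubernetes" ->
        [
          edgehog: [
            strategy: Cluster.Strategy.Kubernetes,
            config: [
              # Assumed values: node names of the form edgehog@<pod ip>.
              mode: :ip,
              kubernetes_node_basename: "edgehog",
              kubernetes_selector:
                Map.get(env, "EDGEHOG_CLUSTERING_KUBERNETES_SELECTOR", "app=edgehog"),
              kubernetes_namespace:
                Map.get(env, "EDGEHOG_CLUSTERING_KUBERNETES_NAMESPACE", "edgehog")
            ]
          ]
        ]

      "docker-compose" ->
        [
          edgehog: [
            # Assumption: replicas are discovered by polling the compose
            # service name through DNS.
            strategy: Cluster.Strategy.DNSPoll,
            config: [
              polling_interval: 5_000,
              query: "edgehog-backend",
              node_basename: "edgehog"
            ]
          ]
        ]

      "none" ->
        # No topologies: the node runs standalone.
        []
    end
  end
end
```

The resulting list would then be handed to libcluster in the supervision tree, for example `{Cluster.Supervisor, [ClusteringSketch.topologies(), [name: Example.ClusterSupervisor]]}`, where `Example.ClusterSupervisor` is again a hypothetical name.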

Signed-off-by: Luca Zaninotto <luca.zaninotto@secomind.com>
When deploying on kubernetes, it is possible to deploy multiple replicas of the
backend service. To do so, a couple of environment variables have been added to
instruct edgehog on how to find and connect to other nodes.

Signed-off-by: Luca Zaninotto <luca.zaninotto@secomind.com>
Signed-off-by: Luca Zaninotto <luca.zaninotto@secomind.com>
Moves relevant registries and supervisors in the application tree to allow
edgehog to scale horizontally. This is done only with registries and supervisors
where it makes sense to share the load:

- `Containers.Reconciler`, where the process spawns per-tenant and talks with
  the DB.
- `Tenant.Reconciler`, where tasks and processes are again spawned per-tenant.
- `Devices.Reconciler` is moved to a single process managed through a horde
  process to make it single per-cluster.
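The "single per-cluster" pattern in the last item can be sketched with Horde roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the `Example.*` module names are hypothetical, and the worker is an empty GenServer standing in for the reconciler.

```elixir
defmodule Example.ClusterSingleton do
  # Hypothetical worker standing in for a cluster-wide reconciler.
  use GenServer

  def start_link(opts) do
    # Register under a unique key in the distributed registry, so that
    # only one instance can be registered across the cluster.
    name = {:via, Horde.Registry, {Example.HordeRegistry, __MODULE__}}

    case GenServer.start_link(__MODULE__, opts, name: name) do
      {:ok, pid} -> {:ok, pid}
      # Another node already runs the singleton: do not start a second one.
      {:error, {:already_started, _pid}} -> :ignore
    end
  end

  @impl true
  def init(opts), do: {:ok, opts}
end

defmodule Example.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # members: :auto makes Horde track the currently connected nodes.
      {Horde.Registry, [name: Example.HordeRegistry, keys: :unique, members: :auto]},
      {Horde.DynamicSupervisor,
       [name: Example.HordeSupervisor, strategy: :one_for_one, members: :auto]}
    ]

    with {:ok, sup} <-
           Supervisor.start_link(children, strategy: :one_for_one, name: Example.Supervisor) do
      # Every node asks for the singleton; the registry keeps a single winner.
      Horde.DynamicSupervisor.start_child(Example.HordeSupervisor, %{
        id: Example.ClusterSingleton,
        start: {Example.ClusterSingleton, :start_link, [[]]}
      })

      {:ok, sup}
    end
  end
end
```

Per-tenant processes, as in the first two items, could use the same registry with a key that includes the tenant, so each tenant's process runs on exactly one replica.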

Signed-off-by: Luca Zaninotto <luca.zaninotto@secomind.com>
@lusergit lusergit force-pushed the push-nrvnswvyxzqv branch from 159c707 to 66497cd Compare March 9, 2026 16:41