Commit 0058923

Update Blog “announcing-hpe-swarm-learning-2-0-0”
1 parent 37b79b1 commit 0058923

1 file changed: +2 -7 lines changed

content/blog/announcing-hpe-swarm-learning-2-0-0.md

Lines changed: 2 additions & 7 deletions
@@ -1,16 +1,13 @@
 ---
 title: Announcing HPE Swarm Learning 2.0.0
-date: 2023-05-11T04:47:46.996Z
+date: 2023-06-12T16:48:53.923Z
 featuredBlog: true
 author: HPE Swarm Learning Team
 authorimage: /img/Avatar1.svg
 disable: false
 ---
-<!--StartFragment-->
-
 We’re excited to announce HPE Swarm Learning 2.0.0 community release!!
 
-
 In the previous Swarm version, if the sentinel SN goes down during Swarm training, the training process would stop, and there was no way to resume it. However, with this release, we have addressed the issue by implementing a mesh topology(connectivity) between SNs, replacing the previous star topology where only the sentinel SN was connected to other SNs. Also, we now support multiple blockchain miners instead of just one miner in the sentinel SN. Now, even if the initial sentinel SN goes down, since other SNs also function as miners, it allows the training to continue uninterrupted. Additionally, when the initial sentinel SN is down and if a new SN wants to join the network, it can seamlessly integrate and join the Swarm network with the help of any other SN node. This **high availability configuration** ensures improved resilience and robustness of Swarm Learning.
 
 In Swarm Learning at the sync stage (defined by Sync Frequency), when it is time to share the learning from the individual model, one of the SL nodes is designated as “leader”. This leader node collects the individual models from each peer node and merges them into a single model by combining parameters of all the individuals. **Leader Failure Detection and Recovery (LFDR)** feature enables SL nodes to continue Swarm training during merging process when an SL leader node fails. A new SL leader node is selected to continue the merging process. If the failed SL leader node comes back after the new SL leader node is in action, the failed SL leader node is treated as a normal SL node and contributes its learning to the swarm global model.
@@ -42,6 +39,4 @@ With HPE Swarm Learning v2.0.0 release, user can now extend Swarm client to supp
 * #### [HPE Swarm Learning home page](https://github.com/HewlettPackard/swarm-learning)
 * [HPE Swarm Learning client readme](https://github.com/HewlettPackard/swarm-learning/blob/master/lib/src/README.md)
 
-#### For any questions, start a discussion in our [\#hpe-swarm-learning](https://hpedev.slack.com/archives/C04A5DK9TUK) slack channel on [HPE Developer Slack Workspace](https://slack.hpedev.io/)
-
-<!--EndFragment-->
+#### For any questions, start a discussion in our [\#hpe-swarm-learning](https://hpedev.slack.com/archives/C04A5DK9TUK) slack channel on [HPE Developer Slack Workspace](https://slack.hpedev.io/)
