
Commit 7723116

Merge pull request #257 from NayantaraK/examples/huggingface-integration
HuggingFace examples, Python version change, Updated wheel file
2 parents 935ed6e + 0a1dd08 commit 7723116


50 files changed: +600,435 additions, -2,813 deletions

.gitignore

Lines changed: 39 additions & 1 deletion
```diff
@@ -1,4 +1,42 @@
 .*swo
 .*swp
 **/__pycache__
-workspace/
+workspace/
+
+# Large data files and models
+*.bin
+*.json
+*.txt
+
+# Model files and checkpoints
+pytorch_model.bin
+model.safetensors
+config.json
+tokenizer.json
+vocab.txt
+
+# Data directories
+results/
+saved_models/
+experiments*/
+logs/
+
+# Python virtual environments
+venv/
+env/
+swarm_env/
+
+# Wheel files
+*.whl
+
+# Jupyter notebooks checkpoints
+.ipynb_checkpoints/
+
+# OS generated files
+.DS_Store
+.DS_Store?
+._*
+.Spotlight-V100
+.Trashes
+ehthumbs.db
+Thumbs.db
```
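A quick way to sanity-check the simple glob entries added above is Python's `fnmatch` — a hedged sketch only, since real gitignore matching (directory suffixes like `logs/`, `**` rules, negations) is richer than plain globbing:

```python
import fnmatch

# A few of the newly ignored glob patterns (plain-file patterns only).
patterns = ["*.bin", "*.whl", ".DS_Store", "pytorch_model.bin", "Thumbs.db"]

def ignored(name: str) -> bool:
    """True if the file name matches any of the glob patterns above."""
    return any(fnmatch.fnmatch(name, p) for p in patterns)

print(ignored("swarmlearning-client.whl"))  # True: matches *.whl
print(ignored("train.py"))                  # False: no pattern covers it
```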

README.md

Lines changed: 11 additions & 11 deletions
```diff
@@ -1,6 +1,6 @@
 # <d></d> <img style="float: right;" src="docs/images/GettyImages-1148109728_EAA-graphic-A_112_0_72_RGB.jpg?raw=true"/> SWARM LEARNING
 
-#### Product version: 2.2.0
+#### Product version: 2.3.0
 Swarm Learning is a decentralized, privacy-preserving Machine Learning framework. This framework utilizes the computing power at, or near, the distributed data sources to run the Machine Learning algorithms that train the models. It uses the security of a blockchain platform to share learnings with peers in a safe and secure manner. In Swarm Learning, training of the model occurs at the edge, where data is most recent, and where prompt, data-driven decisions are mostly necessary. In this completely decentralized architecture, only the insights learned are shared with the collaborating ML peers, not the raw data. This tremendously enhances data security and privacy.
 
 Swarm Learning nodes works in collaboration with other Swarm Learning nodes in the network. It regularly shares its learnings with the other nodes and incorporates their insights. This process continues until the Swarm Learning nodes train the model to desired state. User can monitor the progress of the current training as shown in the below image. It shows all running Swarm nodes, loss, model metric (for example, accuracy) and overall training progress for each User ML node. On hovering over the "progress bar", one can see the number of completed epochs and the total number of epochs.
@@ -38,7 +38,7 @@ NOTE: The participating nodes must be able to access each other's ports.
 
 
 ## User ML component
-User can transform/modify any Keras or PyTorch based ML program that is written using Python3 into a Swarm Learning ML program by [making a few simple changes](./docs/User/How_to_Swarm_enable_an_ML_algorithm.md) to the model training code by including the `SwarmCallback` API. For more information, see any of the [examples](/examples/README.md) included with the Swarm Learning package.
+User can transform/modify any Keras or PyTorch or HuggingFace Trainer class based ML program that is written using Python3 into a Swarm Learning ML program by [making a few simple changes](./docs/User/How_to_Swarm_enable_an_ML_algorithm.md) to the model training code by including the `SwarmCallback` API. For more information, see any of the [examples](/examples/README.md) included with the Swarm Learning package.
 
 The transformed user Machine Learning \(user ML node\) program can be built as a Docker container or can be run on the host.
 
@@ -50,19 +50,20 @@ NOTE: HPE recommends users to build an ML Docker container for easier and automa
 The ML node is responsible to train and iteratively update the model. For each ML node, there is a corresponding SL node in the Swarm Learning framework, which performs the Swarm training. Each pair of ML and SL nodes must run on the same host. This process continues until the SL nodes train the model to the desired state.
 
 <blockquote>
-NOTE: All the ML nodes must use the same ML platform either Keras (based on TensorFlow 2 backend) or PyTorch. Using Keras for some and PyTorch for the other nodes is not supported.
+NOTE: All the ML nodes must use the same ML platform either Keras (based on TensorFlow 2 backend), PyTorch, or HuggingFace Trainer class. Using Keras for some and PyTorch for the other nodes is not supported.
 </blockquote>
 
 ## Quick Start
 1. [Prerequisites](/docs/Install/Prerequisites.md) for Swarm Learning
 2. [Upgrading from earlier versions](/docs/Install/Versioning_and_upgrade.md)
 3. [Download and setup Swarm Learning](/docs/Install/HPE_Swarm_Learning_installation.md) using the SLM-UI installer
-4. Execute a simple predefined example - [MNIST example](/examples/mnist/README.md)
-5. [Running MNIST example using SLM-UI](/docs/User/Running_MNIST_example_using_SLM-UI.md)
-6. [Monitoring & Tracking Swarm Learning training using SLM-UI](/docs/User/Monitoring_Swarm_Learning_training_using_SLM-UI.md)
-7. [Frequently Asked Questions](/docs/User/Frequently_asked_questions.md)
-8. [Troubleshooting](/docs/User/Troubleshooting.md)
-9. [Release Notes](/docs/HPE_Swarm_learning_2.2.0_Release_Notes.pdf)
+4. Execute a simple example - [MNIST example](/examples/mnist/README.md)
+5. Execute a mini LLM fine-tuning example - [HuggingFace Trainer LoRA](/examples/huggingface-peft/README.md)
+6. [Running MNIST example using SLM-UI](/docs/User/Running_MNIST_example_using_SLM-UI.md)
+7. [Monitoring & Tracking Swarm Learning training using SLM-UI](/docs/User/Monitoring_Swarm_Learning_training_using_SLM-UI.md)
+8. [Frequently Asked Questions](/docs/User/Frequently_asked_questions.md)
+9. [Troubleshooting](/docs/User/Troubleshooting.md)
+10. [Release Notes](/docs/HPE_Swarm_learning_2.2.0_Release_Notes.pdf)
 
 <blockquote>
 
@@ -104,8 +105,7 @@ NOTE: The examples and scripts that are bundled with the Swarm UI installer **ma
 Refer to [Acronyms and Abbreviations](docs/Generic/acronyms.md) for more information.
 
 ## Getting in touch
-Feedback and questions are appreciated. You can use the issue tracker to report bugs on GitHub. (Or)
-Join the [HPE Developer Slack Workspace](https://slack.hpedev.io/) and start a discussion in our [#hpe-swarm-learning](https://hpedev.slack.com/archives/C04A5DK9TUK) channel.
+Feedback and questions are appreciated. You can use the issue tracker to report bugs on GitHub.
 
 ## Contributing
 Refer to [Contributing](docs/Generic/CONTRIBUTING.md) for more information.
```

docs/Install/Environment_variables.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -24,6 +24,7 @@ The following environment variables are available to set and modify:
 |`SL_LEADER_FAILURE_BASE_TIMEOUT`|Sets the minimum timeout value \(in seconds\). If Swarm merging does not happen within this timeout, a new SL leader node is selected. The swarm training continues to run, regardless of SL leader node failures. This timeout will kickin after `min_peers` nodes have completed their local training. <br> Default value: 600 seconds. <br>This variable may need tunning depending on the ML application complexity.|
 |`SL_WAIT_FOR_FULL_QUORUM_SECONDS`|Sets the maximum time for an SL leader node to wait for full quorum after minPeers are ready for merge. This parameter lets you to maximize the number of peers participating in the merge process.<br>Default value: 30 secs|
 |`SL_RAM_INTENSIVE`|Optimizes the usage of RAM in the SL leader node for coordinate and geometric median merge methods. Unlike mean merge method, coordinate and geometric median merge methods involve memory intensive operations. If SL Leader node has limited hardware \(RAM\) configuration, then merging the intermediate model parameters using the median methods can result in memory issues. For such scenarios, user can set up the SL\_RAM\_INTENSIVE flag to 'False' for merging the model parameters layer by layer. This 'False' option is based on I/O operations and is time consuming, hence the default option is set to 'True'.<br> User can pass this parameter in slenvvars option within SWOP profile. This option can be different for each SL node depending on its hardware capacity. Example: 'slenvvars : \[SL\_RAM\_INTENSIVE : False\]' <br> Default value: True|
+|`SL_MODEL_PARAMS_COMPRESSION_THRESHOLD_MB`|Adaptive compression threshold for model parameters (in MB). Model parameter files smaller than this threshold will be compressed to reduce network transfer time. Larger files skip compression to avoid disk I/O contention and CPU blocking. <br> Default value: 250 (MB).|
 |`SWCI_RUN_TASK_MAX_WAIT_TIME`|Specifies a maximum timeout value for the completion of a Run task (RUN_SWARM).<br>This value must be set in minutes, and the default is 120 mins (2 hours).|
 |`SWCI_GENERIC_TASK_MAX_WAIT_TIME`|Specifies a maximum timeout value for the completion of tasks other than RUN_SWARM type task.<br>This value must be set in minutes, and the default is 120 mins (2 hours).|
 |`SWCI_MODE`| Enables SWCIs web interface instead of command line interface. Allowed values are CLI and WEB.<br> Default value: CLI<br> |
```
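The threshold behaviour described for the new `SL_MODEL_PARAMS_COMPRESSION_THRESHOLD_MB` variable can be sketched in a few lines — a hedged illustration of the decision rule only; the function and names below are invented for the example, not Swarm Learning's internals:

```python
import gzip

DEFAULT_THRESHOLD_MB = 250  # matches the documented default

def prepare_params(payload: bytes, threshold_mb: int = DEFAULT_THRESHOLD_MB):
    """Return (blob, compressed?) for a model-parameter payload."""
    size_mb = len(payload) / (1024 * 1024)
    if size_mb < threshold_mb:
        # Small payload: gzip it to cut network transfer time.
        return gzip.compress(payload), True
    # Large payload: skip compression to avoid disk I/O contention and CPU blocking.
    return payload, False

blob, was_compressed = prepare_params(b"\x00" * 4096)
print(was_compressed)  # True: 4 KiB is far below the 250 MB default
```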

docs/Install/Install_the_License_Server.md

Lines changed: 6 additions & 5 deletions
````diff
@@ -1,12 +1,13 @@
 # <a name="GUID-CCE936EF-FB0D-4BF1-B002-3CB9125C55B9"/> Installing the License Server
 
-1. After purchasing Swarm Learning from HPE, you will receive an email with a download link **Access Your Products**.
+1. After purchasing Swarm Learning from HPE, you will receive an email with a download link **Access Your Products**. If you are using the free community version,
+then you can skip this and directly click the MY HPE SOFTWARE CENTER (MSC) link given below.
 
 2. From the email, click **Access Your Products**. You are redirected to [MY HPE SOFTWARE CENTER](https://myenterpriselicense.hpe.com/cwp-ui/auth/login).
 
 3. If you have the HPE Passport account, enter the credentials and **Sign In**. If you do not have it, create the HPE Passport Account and **Sign In**.
 
-    After signing in, you should see the Software Notification Message Receipt page listing the products.
+    After signing in, you should see the Software Notification Message Receipt page listing the products. If you are using the free community version, then in the MSC page, click Software -> Search -> Product Info -> "Swarm Learning" (as search term). In the search results, choose "HPE Swarm Learning Community edition" ver 2.2.0 -> Action (drop down) -> Product Details -> Installation -> Pre-install APLS and download the APLS software & documentation ZIP file. For quick reference, APLS container-based steps are mentioned below.
 
 4. Download APLS container and run it using the following procedures.
 
@@ -25,7 +26,7 @@
 3. Pull the image with a tag.
 
 ```
-docker pull hub.myenterpriselicense.hpe.com/hpe_eval/autopass/apls:9.14
+docker pull hub.myenterpriselicense.hpe.com/hpe_eval/autopass/apls:9.15
 ```
 
 4. Configure Data persistence.
@@ -76,9 +77,9 @@
 
 ![Lock code](GUID-A37C5798-B8B7-4B93-B786-A2682797AB37-high.png)
 
-7. Go to the Software Notification Message Receipt page and click **Access Your Products**.
+7. Go to the Software Notification Message Receipt page and click **Access Your Products**.
 
-    You will be navigated to the [MY HPE SOFTWARE CENTER](https://myenterpriselicense.hpe.com/cwp-ui/auth/login) home page. After signing in with your HPE Passport credentials, you will see the **Activate** page.
+    You will be navigated to the [MY HPE SOFTWARE CENTER](https://myenterpriselicense.hpe.com/cwp-ui/auth/login) home page. After signing in with your HPE Passport credentials, you will see the **Activate** page. If you are using the free community version, then in the MSC page, click Software -> Search -> Product Info -> "Swarm Learning" (as search term). In the search results, choose "HPE Swarm Learning Community edition" ver 2.2.0 -> Action (drop down) -> Get License.
 
 8. Activate the license:
 
````
Lines changed: 34 additions & 8 deletions
```diff
@@ -1,24 +1,50 @@
 # Installing HPE Swarm Learning Management UI \(SLM-UI\)
 
-Installing Swarm Learning is a two-step process.
-
-1. Using SLM-UI Installer, you can install the SLM-UI on one host.
+### Pre-requisite:
+The APLS license server is installed and Swarm licenses are installed as detailed in the [License server installation steps](Install_the_License_Server.md).
+
+## Manual installation for 2.3.0 version:
+We support **only manual** installation for the 2.3.0 version. You need to:
+1. Either clone or download this git repo on **each host machine** where you want to install Swarm Learning.
+
+2. If you are downloading, then navigate to the main page of the repository. To the right of the list of files, click Releases and select the 2.3.0 version. Scroll down to the "Assets" section of the release, click Source code (tar.gz). Copy and extract the tar.gz **on each host machine**.
+
+3. It is preferable to extract it under /opt/hpe/swarm-learning.
+
+4. Do a Docker login from your host:
+
+       docker login hub.myenterpriselicense.hpe.com -u <YOUR-HPE-PASSPORT-EMAIL> -p hpe
+5. Pull the signed Swarm Learning images from HPE's Docker Trust Registry (DTR):
+
+       docker pull hub.myenterpriselicense.hpe.com/hpe/swarm-learning/sn:2.3.0
+       docker pull hub.myenterpriselicense.hpe.com/hpe/swarm-learning/sl:2.3.0
+       docker pull hub.myenterpriselicense.hpe.com/hpe/swarm-learning/swci:2.3.0
+       docker pull hub.myenterpriselicense.hpe.com/hpe/swarm-learning/swop:2.3.0
+       docker pull hub.myenterpriselicense.hpe.com/hpe/swarm-learning/slm-ui:2.2.0
+       docker pull hub.myenterpriselicense.hpe.com/hpe/swarm-learning/slm-ui-postgres:2.2.0
+       docker pull hello-world
+   You can skip the rest of the installation steps mentioned below.
+
+## Automatic installation for 2.2.0 version:
+Installing Swarm Learning is a two-step process using the GUI.
+
+1. Using the SLM-UI Installer GUI, you can install the SLM-UI on one Linux host.
 2. Using SLM-UI, you can install SL in multiple hosts and run the examples.
 
 1. Navigate to the [MY HPE SOFTWARE CENTER](https://myenterpriselicense.hpe.com/cwp-ui/auth/login) home page.
 
 2. Perform the following actions after signing in with your HPE Passport credentials:
 
-    1. Go to **My Activations** and select your ordered product.
+    1. Go to **My Activations** and select your ordered product. If you are using the free community version, then in the MSC page, click Software -> Search -> Product Info -> "Swarm Learning" (as search term). In the search results, choose "HPE Swarm Learning Community edition" ver 2.2.0 -> Action (drop down).
 
     2. Go to **Action** pull down and then select **Download/Re-download** page.
 
     3. Select and download listed software files.
 
-        - The tar file containing docs and scripts.
-
-        - The signature file for the above tar file.
-
         - The docker digest hash file \(JSON\).
 
         - Download the Swarm Learning SLM-UI installer for your platform, Mac, Windows, or Linux.
+
+        - The tar file containing docs and scripts.
+
+        - The signature file for the above tar file.
```
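The per-image pull commands in step 5 of the manual installation above can also be generated in a loop rather than typed one by one — a small sketch that only builds and prints the commands (run them with your shell or `subprocess` after `docker login`); image names and tags are copied from the list in the steps:

```python
# Build one `docker pull` command per 2.3.0-tagged Swarm Learning component.
REG = "hub.myenterpriselicense.hpe.com/hpe/swarm-learning"
cmds = [f"docker pull {REG}/{img}:2.3.0" for img in ("sn", "sl", "swci", "swop")]
for cmd in cmds:
    print(cmd)
```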

docs/Install/Prerequisites.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -51,7 +51,7 @@ Qualified with Keras 2.9.0 \(TensorFlow 2 backend\) and PyTorch 1.5 based Machin
 
 <blockquote>
 
-NOTE: Python version must be between 3.6 to 3.9.
+NOTE: Python version must be between 3.8 and 3.9.
 
 </blockquote>
 
```
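The version constraint in the note above can be checked at startup — a minimal sketch, assuming only that the supported range is 3.8 through 3.9 as stated:

```python
import sys

def supported(version=(sys.version_info[0], sys.version_info[1])) -> bool:
    """True when the interpreter's major.minor is in the documented 3.8-3.9 range."""
    return (3, 8) <= version <= (3, 9)

print(supported((3, 8)))   # True
print(supported((3, 10)))  # False: 3.10 is outside the qualified range
```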

examples/README.md

Lines changed: 2 additions & 0 deletions
```diff
@@ -4,6 +4,8 @@ Several examples of using Swarm Learning are provided. These examples use differ
 
 For details of running each example, see the below:
 
+- [LLM fine-tuning](examples/huggingface/README.md)
+- [LLM fine-tuning with LoRA](examples/huggingface-peft/README.md)
 - [MNIST](/examples/mnist/README.md)
 - [MNIST-PYT](/examples/mnist-pyt/README.md)
 - [CIFAR-10](/examples/cifar10/README.md)
```

examples/fraud-detection/swci/taskdefs/user_env_tf_build_task.yaml

Lines changed: 2 additions & 0 deletions
```diff
@@ -15,7 +15,9 @@ Body:
   - RUN pip3 install --upgrade pip && pip3 install \
   - ' keras matplotlib opencv-python pandas protobuf==3.15.6 '
   - ' '
+  - RUN pip3 install pip==23.3.2
   - RUN mkdir -p /tmp/hpe-swarmcli-pkg
   - COPY swarmlearning-client-py3-none-manylinux_2_24_x86_64.whl /tmp/hpe-swarmcli-pkg/swarmlearning-client-py3-none-manylinux_2_24_x86_64.whl
   - RUN pip3 install /tmp/hpe-swarmcli-pkg/swarmlearning-client-py3-none-manylinux_2_24_x86_64.whl
+  - RUN pip3 install --upgrade pip
```
