# Overview

With the rapid development of artificial intelligence (AI), large-scale,
high-quality data plays an increasingly important role in achieving
optimal model performance and user experience. However, further
development of AI is restricted by a data utilization bottleneck:
data cannot be shared among devices due to privacy, regulatory, and
engineering concerns, resulting in data silos. To resolve this data
silo problem, the concept of federated learning was proposed in 2016.
It aims to effectively utilize multi-party data for machine learning
modeling while also meeting the requirements of user privacy
protection, data security, and government regulation.

## Definition

Centralizing data from multiple parties cannot guarantee user privacy
protection, and such an approach would also fail to comply with
relevant laws and regulations. The core idea behind federated learning
is that models move whereas data stays put. Models travel among data
parties so that data can be used for modeling without ever leaving the
devices that hold it. In federated learning, each party's data is
retained locally, and machine learning models are established by
exchanging encrypted parameters or other intermediate information
through central servers.
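The parameter-exchange loop described above can be sketched in a few
lines. The following is a minimal, framework-free illustration of one
federated averaging round; the linear model, learning rate, and
synthetic client data are assumptions made for the example, and a real
deployment would additionally encrypt or secure the exchanged
parameters.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: a few full-batch gradient steps.

    The least-squares model here is an illustrative stand-in; real
    systems train arbitrary models the same way.
    """
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_averaging_round(global_w, clients):
    """Aggregate client models, weighting each by its local sample count."""
    updates = [(local_update(global_w, X, y), len(y)) for X, y in clients]
    total = sum(n for _, n in updates)
    return sum(w * (n / total) for w, n in updates)

# Three clients hold private (X, y) shards that never leave them;
# only model parameters travel to and from the server.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n in (50, 80, 120):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=n)))

w = np.zeros(2)
for _ in range(30):
    w = federated_averaging_round(w, clients)
print(np.round(w, 2))  # converges near the true weights [2, -1]
```

Each round, only the parameter vector crosses the network; the
weighted average gives clients with more data proportionally more
influence on the shared model.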

## Application Scenarios

Federated learning can be classified into three categories based on
whether samples and features overlap: horizontal federated learning
(different samples, overlapping features), vertical federated learning
(different features, overlapping samples), and federated transfer
learning (neither samples nor features overlap).

**Horizontal federated learning** applies to scenarios where different
participants hold the same features for different individuals. For
example, in an advertisement recommendation scenario, algorithm
developers use data for a specific set of features (e.g., number of
clicks, time on page, or frequency of use) from many different mobile
phone users to establish a model. Because such feature data cannot be
transferred out of devices, horizontal federated learning is used to
establish models by combining the feature data of multiple users.

**Vertical federated learning** applies to scenarios with many
overlapping samples but few overlapping features. Take two institutions
as an example: one is an insurance company and the other is a hospital.
The user groups of the two institutions are likely to include many of
the same local residents, meaning that the two institutions may have a
large intersection of users. The insurance company holds data on users'
income, expense statements, and credit ratings, whereas the hospital
holds data on users' health and medical purchase records, resulting in
a small intersection of user features. Vertical federated learning
enhances model capabilities by aggregating these different features in
an encrypted state.
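The insurance-and-hospital example can be made concrete with a toy
linear model: each party applies its own slice of the model to the
feature columns only it holds, and the parties combine partial scores
rather than raw data. The feature names and weights below are
hypothetical, and the plaintext addition stands in for the encrypted
aggregation a real system would use.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users = 6

# The same six users, aligned by a shared ID; each party sees only
# its own feature columns (the column meanings are hypothetical).
insurer_features = rng.normal(size=(n_users, 3))   # income, expenses, credit
hospital_features = rng.normal(size=(n_users, 2))  # health indicators

# Each party holds the slice of the model matching its own features.
w_insurer = np.array([0.5, -0.2, 0.1])
w_hospital = np.array([0.3, 0.7])

# Each party computes a partial score locally ...
partial_insurer = insurer_features @ w_insurer
partial_hospital = hospital_features @ w_hospital

# ... and only the partial scores are combined (in practice, under
# homomorphic encryption), never the raw feature columns.
joint_score = partial_insurer + partial_hospital

# The result matches what a single party holding all features would get.
full_X = np.hstack([insurer_features, hospital_features])
full_w = np.concatenate([w_insurer, w_hospital])
assert np.allclose(joint_score, full_X @ full_w)
```

By linearity, splitting the model along the feature dimension loses
nothing; the engineering challenge vertical federated learning solves
is doing this aggregation without revealing either party's columns.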

**Federated transfer learning** aims to find similarities between the
source and target domains. Take another two institutions as an example:
one is a bank in country A and the other is an e-commerce company in
country B. The user groups of the two institutions have a small
intersection due to geographical restrictions. In addition, because
the two institutions run dissimilar businesses, only a small part of
their data features overlap. In this case, federated transfer learning
is one of the few ways to implement federated learning effectively and
improve model quality, because it can overcome the limitations of
scarce single-party data and few labeled samples.

## Deployment Scenarios

The architecture of federated learning is similar to that of a
parameter server (i.e., distributed learning in data centers). Both
architectures build a machine learning model using centralized servers
and distributed clients: multiple clients communicate with one server,
and there is no communication between clients. Based on the deployment
scenario, federated learning can be classified into cross-silo
federated learning and cross-device federated learning. Generally,
users of cross-silo federated learning are enterprises and
institutions, whereas cross-device federated learning is oriented to
portable electronic devices, mobile devices, and the like. Table
[ch10-federated-learning-different-connection] describes the
differences and relationships among distributed learning in data
centers, cross-silo federated learning, and cross-device federated
learning.

[]{#ch10-federated-learning-different-connection
label="ch10-federated-learning-different-connection"}

## Common Frameworks

As users and developers continue to place higher demands on federated
learning technologies, more and more federated learning tools and
frameworks are emerging. The following are some of the mainstream
federated learning frameworks:

1. **TensorFlow Federated (TFF):** an open-source federated learning
    framework developed by Google to promote open research and
    experimentation in federated learning. It is used to implement
    machine learning and other computations on decentralized data. In
    this framework, a shared global model is trained across many
    participating clients that keep their training data locally. For
    example, federated learning has been successfully used to train
    prediction models for mobile keyboards without uploading sensitive
    typed data to the server.

2. **PaddleFL:** an open-source federated learning framework proposed
    by Baidu based on PaddlePaddle. It enables researchers to easily
    replicate and compare different federated learning algorithms, and
    allows developers to readily deploy PaddleFL-based federated
    learning systems in large-scale distributed clusters. The framework
    provides multiple federated learning strategies (e.g., horizontal
    and vertical federated learning) and corresponding applications in
    fields such as computer vision, natural language processing, and
    recommendation. It also supports traditional machine learning
    training strategies, for example, applying transfer learning in
    multitask learning and federated learning settings. PaddleFL can be
    easily deployed on full-stack open-source software, leveraging
    PaddlePaddle's large-scale distributed training capability and
    Kubernetes's elastic scheduling of training tasks.

3. **Federated AI Technology Enabler (FATE):** the world's first
    industrial-grade open-source federated learning framework, proposed
    by WeBank. It enables enterprises and institutions to collaborate
    on data while ensuring data security and preventing privacy
    leakage. By using secure multi-party computation (MPC) and
    homomorphic encryption to build low-level secure computation
    protocols, FATE supports secure computation for different types of
    machine learning, including logistic regression, tree-based
    algorithms, deep learning, and transfer learning. The framework was
    first opened to the public in February 2019 along with the launch
    of the FATE community, whose members include major cloud computing
    and financial service enterprises in China.

4. **FedML:** an open-source research and benchmarking library proposed
    by the University of Southern California (USC) for federated
    learning. It facilitates the development of new federated learning
    algorithms and fair performance comparison. FedML supports three
    computing paradigms (distributed training, training on mobile
    devices, and standalone simulation), allowing users to conduct
    experiments in different system environments. It also implements
    and promotes diversified algorithm research through flexible,
    general-purpose API design and reference baselines. To enable fair
    comparison of federated learning algorithms, FedML provides
    comprehensive benchmark datasets, including non-independent and
    identically distributed (non-IID) datasets.

5. **PySyft:** a Python library for secure and private deep learning,
    released by OpenMined with contributions from researchers at
    institutions such as University College London (UCL) and DeepMind.
    It covers federated learning, differential privacy, and multi-party
    learning. Differential privacy is a perturbation-based privacy
    protection method that bounds the influence of any single record on
    the released output, so that a third party cannot infer from
    changes in the output whether a single record was modified or
    deleted; it is considered to offer the strongest guarantee among
    current perturbation-based privacy protection methods. PySyft uses
    differential privacy and encrypted computation (MPC and homomorphic
    encryption) to decouple private data from model training.

6. **Fedlearner:** a vertical federated learning framework proposed by
    ByteDance for joint modeling on data distributed across
    institutions. It comes with peripheral infrastructure for cluster
    management, job management, job monitoring, and network proxying.
    Fedlearner uses a cloud-native deployment solution: data is stored
    in the Hadoop Distributed File System (HDFS), and tasks are managed
    and started through Kubernetes. The two parties involved in a
    Fedlearner training task need to start the task simultaneously via
    Kubernetes. All training tasks are managed by the master node in a
    unified manner, and communication is carried out by the worker
    nodes.

7. **OpenFL:** a Python framework proposed by Intel for federated
    learning. OpenFL is designed to be a flexible, scalable, and
    easy-to-learn tool for data scientists.

8. **Flower:** an open-source federated learning system released by the
    University of Cambridge, designed for scenarios where federated
    learning algorithms are deployed and optimized on large-scale
    heterogeneous devices.

9. **MindSpore Federated:** an open-source federated learning framework
    proposed by Huawei. It supports the commercial deployment of tens
    of millions of stateless devices and enables all-scenario
    intelligent applications while user data stays on the device.
    MindSpore Federated focuses on horizontal federated learning with a
    large number of participants, enabling them to jointly build AI
    models without sharing local data. It mainly addresses the
    difficulties of deploying federated learning in industrial
    scenarios, including privacy and security, large-scale federated
    aggregation, semi-supervised federated learning, communication
    compression, and cross-platform deployment.
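Several of these frameworks rely on secure aggregation built from MPC
primitives, so that the server only ever learns the sum of client
updates, never any individual one. The sketch below is a toy
illustration of additive secret sharing, one building block of such
protocols; real systems use finite-field arithmetic and pairwise
masking agreements rather than the floating-point noise assumed here.

```python
import numpy as np

def share(update, n_parties, rng):
    """Split a vector into n additive shares that sum back to it.

    Each share alone looks like random noise; only the sum of all
    shares reveals the original update.
    """
    shares = [rng.normal(size=update.shape) for _ in range(n_parties - 1)]
    shares.append(update - sum(shares))
    return shares

# Three clients' private model updates (toy values).
rng = np.random.default_rng(7)
updates = [np.array([1.0, 2.0]), np.array([-0.5, 0.5]), np.array([0.25, 0.25])]

# Each client splits its update into three shares, keeping one and
# sending one to each peer; each party publishes only the sum of the
# shares it received, and the server adds up those published sums.
all_shares = [share(u, 3, rng) for u in updates]
published = [sum(all_shares[c][s] for c in range(3)) for s in range(3)]
aggregate = sum(published)

assert np.allclose(aggregate, sum(updates))  # only the sum is recovered
```

No single share, and no single published partial sum, reveals any
client's update; the server nevertheless obtains exactly the aggregate
it needs for federated averaging.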