This is the source code for our paper: EdgeLD: Locally Distributed Deep Learning Inference on Edge Device Clusters. A brief introduction to this work is as follows:
Deep Neural Networks (DNNs) have been widely used in a large number of application scenarios. However, DNN models are generally both computation-intensive and memory-intensive, and thus difficult to deploy on resource-constrained edge devices. Most previous studies focus on local model compression or remote cloud offloading, but overlook the potential benefits of distributed DNN execution across multiple edge devices. In this paper, we propose EdgeLD, a new framework for locally distributed execution of DNN-based inference tasks on a cluster of edge devices. EdgeLD first profiles the time cost of a DNN model with respect to each device's computing capability and the available network bandwidth. Guided by this profiling, an efficient model partition scheme balances the assigned workload and the inference runtime across the edge devices. We also employ layer fusion to reduce the communication overhead of exchanging intermediate data among devices. Experimental results show that our partition scheme saves up to 15.82% of inference time compared with the conventional solution, and layer fusion speeds up DNN inference by 1.11-1.13X. Combined, EdgeLD accelerates the original inference by 1.77-3.57X on a cluster of 2-4 edge devices.
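To make the partition idea concrete, below is a minimal PyTorch sketch of capability-proportional partitioning along the width dimension of a convolutional layer. The capability scores, the `split_widths` helper, and the layer sizes are illustrative assumptions, not EdgeLD's actual API; the halo (boundary) exchange between neighboring slices and the layer-fusion step are omitted for brevity.

```python
import torch
import torch.nn as nn

# Assumed, illustrative per-device capability scores (e.g., from profiling).
capabilities = [1.0, 0.8, 0.6]   # relative compute capability of 3 edge devices

def split_widths(total_width, caps):
    """Split an input width proportionally to the profiled device capabilities."""
    shares = [int(total_width * c / sum(caps)) for c in caps]
    shares[-1] += total_width - sum(shares)   # assign the rounding remainder to the last device
    return shares

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
x = torch.randn(1, 3, 224, 224)

# Partition the input along the width dimension; in a real deployment each slice
# would be sent to one device, executed locally, and the partial outputs merged.
# Note: the overlapping boundary columns (halo regions) that a correct partition
# must also transfer are not handled in this sketch.
widths = split_widths(x.shape[-1], capabilities)
slices = torch.split(x, widths, dim=-1)
outputs = [conv(s) for s in slices]   # would run on device i in the real system
y = torch.cat(outputs, dim=-1)

print(widths)    # e.g. [93, 74, 57]
print(y.shape)   # torch.Size([1, 16, 224, 224])
```

The faster a device is profiled to be, the wider the slice it receives, which is the intuition behind balancing per-device inference runtime rather than splitting the input evenly.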
This work has been published at IEEE HPCC 2020 (link). The technical report can be downloaded from here.
PyTorch
@inproceedings{xue2020edgeld,
  title={{EdgeLD}: Locally Distributed Deep Learning Inference on Edge Device Clusters},
  author={Xue, Feng and Fang, Weiwei and Xu, Wenyuan and Wang, Qi and Ma, Xiaodong and Ding, Yi},
  booktitle={2020 IEEE 22nd International Conference on High Performance Computing and Communications; IEEE 18th International Conference on Smart City; IEEE 6th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)},
  pages={613--619},
  year={2020},
  organization={IEEE}
}
Feng Xue ([email protected])
Please note that the open-source code in this repository was mainly written by the graduate student author during his master's studies. Since the author did not continue research work after graduation, it is difficult for us to keep maintaining and updating the code. We sincerely apologize that the code is provided for reference only.