Commit 83c352c

Pubs: Fed-ensemble, LLM survey, Pyxis (#266)
1 parent 9c6f992 commit 83c352c

File tree

2 files changed (+70, -6 lines)


source/_data/SymbioticLab.bib

Lines changed: 45 additions & 2 deletions
@@ -920,7 +920,7 @@ @article{ioft-survey:ieee-access
 pages={156071--156113},
 year={2021},
 publisher={IEEE},
-publist_confkey = {IEEEAccess:9},
+publist_confkey = {IEEE Access:9},
 publist_link = {paper || https://doi.org/10.1109/ACCESS.2021.3127448},
 publist_badge = {Featured Article},
 publist_topic = {Wide-Area Computing},
@@ -1809,4 +1809,47 @@ @InProceedings{memstrata:osdi24
 
 To address these challenges, we introduce Memstrata, a lightweight multi-tenant memory allocator. Memstrata employs page coloring to eliminate inter-VM contention. It improves performance for VMs with access patterns that are sensitive to hardware tiering by allocating them more local DRAM using an online slowdown estimator. In multi-VM experiments on prototype hardware, Memstrata is able to identify performance outliers and reduce their degradation from above 30% to below 6%, providing consistent performance across a wide range of workloads.
 }
-}
+}
+
+@article{fed-ensemble:ieee-tase,
+author = {Naichen Shi and Fan Lai and Raed Al Kontar and Mosharaf Chowdhury},
+title = {Fed-ensemble: Ensemble Models in Federated Learning for Improved Generalization and Uncertainty Quantification},
+journal = {IEEE Transactions on Automation Science and Engineering},
+year = {2023},
+month = {May},
+publist_confkey = {IEEE TASE},
+publist_link = {paper || https://doi.org/10.1109/TASE.2023.3269639},
+publist_topic = {Wide-Area Computing},
+publist_abstract = {
+The increase in the computational power of edge devices has opened up the possibility of processing some of the data at the edge and distributing model learning. This paradigm is often called federated learning (FL), where edge devices exploit their local computational resources to train models collaboratively. Though FL has seen recent success, it is unclear how to characterize uncertainties in FL predictions. In this paper, we propose Fed-ensemble: a simple approach that brings model ensembling to FL. Instead of aggregating local models to update a single global model, Fed-ensemble uses random permutations to update a group of $K$ models and then obtains predictions through model averaging. Fed-ensemble can be readily utilized within established FL methods and does not impose a computational overhead compared with single-model methods. Empirical results show that our model has superior performance over several FL algorithms on a wide range of data sets and excels in heterogeneous settings often encountered in FL applications. Also, by carefully choosing client-dependent weights in the inference stage, Fed-ensemble becomes personalized and yields even better performance. Theoretically, we show that predictions on new data from all $K$ models belong to the same predictive posterior distribution under a neural tangent kernel regime. This result, in turn, sheds light on the generalization advantages of model averaging and justifies the uncertainty quantification capability. We also illustrate that Fed-ensemble has an elegant Bayesian interpretation. Note to Practitioners: This article provides an algorithm that extracts a set of $K$ solutions without imposing any additional communication overhead in FL. Given multiple solutions, Fed-ensemble can be exploited to personalize inference as well as quantify uncertainty. Such capabilities may be beneficial within multiple practical systems that require uncertainty-aware decision-making. Further, Fed-ensemble may be useful for model validation and hypothesis testing.
+}
+}
+
+@article{llm-survey:tmlr,
+author = {Zhongwei Wan and Xin Wang and Che Liu and Samiul Alam and Yu Zheng and Jiachen Liu and Zhongnan Qu and Shen Yan and Yi Zhu and Quanlu Zhang and Mosharaf Chowdhury and Mi Zhang},
+title = {Efficient Large Language Models: A Survey},
+journal = {Transactions on Machine Learning Research},
+year = {2024},
+month = {May},
+publist_confkey = {TMLR},
+publist_link = {paper || https://openreview.net/forum?id=bsCCJHbO8A},
+publist_topic = {Systems + AI},
+publist_abstract = {
+Large Language Models (LLMs) have demonstrated remarkable capabilities in important tasks such as natural language understanding and language generation, and thus have the potential to make a substantial impact on our society. Such capabilities, however, come with the considerable resources they demand, highlighting the strong need to develop effective techniques for addressing their efficiency challenges. In this survey, we provide a systematic and comprehensive review of efficient LLMs research. We organize the literature in a taxonomy consisting of three main categories, covering distinct yet interconnected efficient LLMs topics from model-centric, data-centric, and framework-centric perspectives, respectively. We have also created a GitHub repository where we organize the papers featured in this survey at https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey. We will actively maintain the repository and incorporate new research as it emerges. We hope our survey can serve as a valuable resource to help researchers and practitioners gain a systematic understanding of efficient LLMs research and inspire them to contribute to this important and exciting field.}
+}
+
+@article{pyxis:tpds,
+author = {Sheng Qi and Chao Jin and Mosharaf Chowdhury and Zhenming Liu and Xuanzhe Liu and Xin Jin},
+title = {Pyxis: Scheduling Mixed Tasks in Disaggregated Datacenters},
+journal = {IEEE Transactions on Parallel and Distributed Systems},
+year = {2024},
+month = {June},
+volume = {35},
+number = {9},
+pages = {1536--1550},
+publist_confkey = {IEEE TPDS:35(9)},
+publist_link = {paper || https://doi.org/10.1109/TPDS.2024.3418620},
+publist_topic = {Disaggregation},
+publist_abstract = {
+Disaggregating compute from storage is an emerging trend in cloud computing. Effectively utilizing resources in both the compute and storage pools is the key to high performance. The state-of-the-art scheduler provides optimal scheduling decisions for workloads with homogeneous tasks. However, cloud applications often generate a mix of tasks with diverse compute and IO characteristics, resulting in sub-optimal performance for existing solutions. We present Pyxis, a system that provides optimal scheduling decisions for mixed workloads in disaggregated datacenters with theoretical guarantees. Pyxis is capable of maximizing overall throughput while meeting latency SLOs. Pyxis decouples the scheduling of different tasks. Our insight is that the optimal solution has an “all-or-nothing” structure that can be captured by a single turning point in the spectrum of tasks. Based on task characteristics, the turning point partitions the tasks either all to storage nodes or all to compute nodes (none to storage nodes). We theoretically prove that the optimal solution has such a structure, and design an online algorithm with sub-second convergence. We implement a prototype of Pyxis. Experiments on CloudLab with various synthetic and application workloads show that Pyxis improves the throughput by 3–21× over the state-of-the-art solution.}
+}

source/publications/index.md

Lines changed: 25 additions & 4 deletions
@@ -338,13 +338,34 @@ venues:
       name: Sensors 2020, 20(21), 6100
       date: 2020-10-27
       url: https://www.mdpi.com/1424-8220/20/21
-  IEEEAccess:
+  'IEEE Access':
     category: Journals
     occurrences:
-    - key: 'IEEEAccess:9'
+    - key: 'IEEE Access:9'
       name: IEEE Access 2021, 9, 156071-156113
-      date: 2021-12-1
-      url: https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=9312710&punumber=6287639&sortType=vol-only-newest&ranges=20211101_20211130_Search%20Latest%20Date
+      date: 2021-12-01
+      url: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=6287639
+  'IEEE TASE':
+    category: Journals
+    occurrences:
+    - key: 'IEEE TASE'
+      name: IEEE TASE 2023
+      date: 2023-05-01
+      url: https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8856
+  TMLR:
+    category: Journals
+    occurrences:
+    - key: TMLR
+      name: TMLR 2024
+      date: 2024-05-01
+      url: https://jmlr.org/tmlr/
+  'IEEE TPDS':
+    category: Journals
+    occurrences:
+    - key: 'IEEE TPDS:35(9)'
+      name: IEEE TPDS 2024, 35(9), 1536-1550
+      date: 2024-06-24
+      url: https://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=10601540&punumber=71
   JMIRMH:
     category: Journals
     occurrences:

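The two halves of this commit must stay in sync: each `publist_confkey` in `SymbioticLab.bib` is looked up against a venue occurrence `key` in `source/publications/index.md` (which is why `IEEEAccess:9` is renamed to `IEEE Access:9` in both files). A minimal sketch of a consistency check for that invariant; the function name and the sample data are hypothetical, not part of the repo's actual tooling:

```python
import re

# Sample .bib fragment and venue keys, mirroring the entries in this commit.
bib_text = """
publist_confkey = {IEEE Access:9},
publist_confkey = {IEEE TASE},
publist_confkey = {TMLR},
"""
venue_keys = {"IEEE Access:9", "IEEE TASE", "TMLR", "IEEE TPDS:35(9)"}

def find_unmatched(bib: str, keys: set) -> list:
    """Return every publist_confkey in `bib` with no matching venue key."""
    confkeys = re.findall(r"publist_confkey\s*=\s*\{([^}]*)\}", bib)
    return [k for k in confkeys if k not in keys]

print(find_unmatched(bib_text, venue_keys))  # prints []
```

Before the rename in this commit, the old value would have been flagged: `find_unmatched("publist_confkey = {IEEEAccess:9},", venue_keys)` returns `['IEEEAccess:9']`.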