Multi-Adaptor Support for Edge Devices #7755
zhipenghan started this conversation in Ideas
Idea proposal:

Support multiple fine-tuned adaptors simultaneously on memory-constrained edge devices, enabling users to leverage the diverse capabilities of Small Language Models (SLMs) while optimizing resource utilization.
Problem Statement:
On edge devices, general-purpose SLMs often struggle with specific tasks, and hosting a separate SLM for each downstream task is costly. Engineers typically fine-tune models to enhance their capabilities, but this approach has limitations.
Proposal:
Allow users to host a base model together with a series of adaptors that augment the base model's capabilities. This approach would unlock the potential for customizing SLM capabilities per task while sharing a single set of base weights.
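A minimal sketch of the idea, assuming LoRA-style low-rank adaptors (the names, shapes, and rank below are illustrative assumptions, not part of this proposal): one base weight matrix is loaded once and shared, and each task contributes only two small matrices that are applied on top of the shared base at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)
d, rank = 512, 8  # hidden size and adaptor rank (assumed values)

# Shared base weights: loaded into memory exactly once.
base_W = rng.standard_normal((d, d)).astype(np.float32)

# Each task keeps only two small matrices (2 * d * rank floats)
# instead of a full d * d copy of the model weights.
adaptors = {
    task: (rng.standard_normal((d, rank)).astype(np.float32),
           rng.standard_normal((rank, d)).astype(np.float32))
    for task in ("summarize", "translate")
}

def forward(x: np.ndarray, task: str) -> np.ndarray:
    """y = x @ (W + A @ B): base projection plus the task's low-rank delta."""
    A, B = adaptors[task]
    # Apply the delta factored, so W + A @ B is never materialized.
    return x @ base_W + (x @ A) @ B

x = rng.standard_normal((1, d)).astype(np.float32)
y = forward(x, "summarize")

full, per_adaptor = d * d, 2 * d * rank
print(f"base params: {full}, per-adaptor params: {per_adaptor} "
      f"({100 * per_adaptor / full:.1f}% of base)")
```

With these assumed sizes each extra task costs roughly 3% of the base model's parameters, which is the memory argument for hosting many adaptors over one base model instead of many full SLMs.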
In my research I found that ONNX Runtime has a similar design, but its performance is not good enough:
[Performance] A way to share weights between sessions · Issue #15301 · microsoft/onnxruntime (github.com)
[Performance] Share weights between sessions to accelerate inference · Issue #20172 · microsoft/onnxruntime (github.com)