Releases: kubeedge/ianvs
v0.3.0
What's New in v0.3.0
Ianvs v0.3.0 brings powerful new LLM-related features: (1) comprehensive LLM testing and benchmarking tools, (2) advanced cloud-edge collaborative inference paradigms, and (3) innovative algorithms tailored for large model optimization.
1. Support for LLM Testing and Benchmarks
Ianvs now supports robust testing for both locally deployed LLMs and public LLM APIs (e.g., OpenAI). This release introduces four specialized benchmarks for evaluating LLM capabilities in diverse scenarios:
- Government-Specific Large Model Benchmark: Designed to assess LLM accuracy and reasoning in government-specific scenarios, using objective (multiple-choice) and subjective (Q&A) tests. Explore the benchmark dataset and try the example.
- Robot-Specific Large Model Benchmark: A joint learning algorithm for multimodal robotics understanding with the pretrained RFNet model. Try the example here and learn more in the documentation.
- Smart Coding Benchmark: This benchmark evaluates the debugging capabilities of LLMs using real-world coding issues from GitHub repositories. Learn more through the example and read the background documentation.
- Large Language Model Edge Benchmark: Focused on testing LLM performance in edge environments, this benchmark evaluates resource efficiency and deployment performance. Access datasets and examples here and check out the detailed documentation.
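The objective (multiple-choice) track in benchmarks like the government-specific one above ultimately reduces to an accuracy computation over the answer key. A minimal sketch, assuming a hypothetical single-letter answer format (this is illustrative, not the exact Ianvs metric implementation):

```python
# Hedged sketch: scoring a multiple-choice LLM benchmark as plain accuracy.
# The single-letter answer format (A-D) is an assumption for illustration.
def mc_accuracy(predictions, gold):
    """Fraction of multiple-choice answers that exactly match the key,
    ignoring surrounding whitespace and letter case."""
    assert len(predictions) == len(gold), "prediction/key length mismatch"
    correct = sum(p.strip().upper() == g.strip().upper()
                  for p, g in zip(predictions, gold))
    return correct / len(gold)

# Toy run: three of four answers match the key.
print(mc_accuracy(["A", "b", "C", "D"], ["A", "B", "C", "A"]))  # 0.75
```

Subjective (Q&A) tracks typically need a judge model or human scoring instead of exact matching, so they are not captured by this sketch.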
2. Enhanced Cloud-Edge Collaborative Inference
This release introduces new paradigms and algorithms for collaborative inference to optimize cloud-edge cooperation and improve performance:
- Cloud-Edge Collaborative Inference Paradigm: A new architecture enables efficient cloud-edge collaboration for LLM inference, featuring a baseline algorithm that delivers up to 50% token cost savings without compromising accuracy. Try the example.
- Speculative Decoding Algorithm (EAGLE, ICML'24): Integrated within the collaborative inference framework, this algorithm accelerates inference speeds by 20% or more. Try the example and explore detailed documentation.
- Joint Inference Paradigm for Pedestrian Tracking: A multi-edge inference paradigm for pedestrian tracking utilizing the pretrained ByteTrack model (ECCV'22). See the pedestrian tracking example or refer to the background documentation.
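A common baseline for the cloud-edge collaborative inference paradigm above is confidence-based routing: answer on the edge when the small model is confident, and escalate only hard queries to the cloud, which is where the token cost savings come from. A minimal sketch under that assumption (function names and the threshold are illustrative, not Ianvs' API):

```python
# Hedged sketch of confidence-based edge-cloud routing for LLM inference.
# Both models are stand-in callables returning (answer, confidence); the
# 0.8 threshold is an arbitrary illustrative choice, not a tuned value.
def route(queries, edge_model, cloud_model, threshold=0.8):
    """Answer each query on the edge; escalate low-confidence ones to the
    cloud. Returns answers plus the fraction of queries kept on the edge."""
    answers, kept_on_edge = [], 0
    for q in queries:
        answer, confidence = edge_model(q)
        if confidence >= threshold:
            answers.append(answer)   # edge answer is good enough
            kept_on_edge += 1
        else:
            answers.append(cloud_model(q))  # hard example: ask the cloud
    return answers, kept_on_edge / len(queries)

# Toy stand-ins for real models: short queries are "easy" for the edge.
edge = lambda q: (q.upper(), 0.9 if len(q) < 5 else 0.5)
cloud = lambda q: q.upper() + "!"
ans, edge_ratio = route(["hi", "long query"], edge, cloud)
print(ans, edge_ratio)  # ['HI', 'LONG QUERY!'] 0.5
```

The edge ratio directly bounds cloud token spend: if half the queries never leave the edge, cloud-side token cost is roughly halved, provided routing does not hurt answer quality.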
3. Support for New Large Model Algorithms
Ianvs includes new algorithms to improve LLM performance and usability in various scenarios:
- Personalized LLM Agent Algorithm: This algorithm supports single-task learning using the pretrained Bloom model, enabling personalized LLM operations. Explore the example and review the documentation.
- Unseen Task Processing Algorithm: Supports lifelong learning with pretrained models to handle unseen tasks effectively. Access the example and gain insights from the background documentation.
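A simple baseline behind unseen task processing is to flag low-confidence samples as belonging to an unknown task before deciding how to handle them. The sketch below illustrates that idea only; it is a generic confidence-threshold baseline, not the specific algorithm shipped in this release:

```python
# Hedged sketch: flagging "unseen" samples by prediction confidence, a
# common baseline for unknown-task recognition (illustrative only; not the
# algorithm integrated in Ianvs).
def split_seen_unseen(samples, model, threshold=0.6):
    """Partition samples into (seen, unseen) buckets based on whether the
    model's confidence clears the threshold."""
    seen, unseen = [], []
    for x in samples:
        label, confidence = model(x)
        bucket = seen if confidence >= threshold else unseen
        bucket.append((x, label))
    return seen, unseen

# Toy model: confident on a familiar input, unsure on a novel one.
toy_model = lambda x: ("cat", 0.9) if x == "known" else ("?", 0.3)
seen, unseen = split_seen_unseen(["known", "novel"], toy_model)
print(len(seen), len(unseen))  # 1 1
```

In a lifelong learning pipeline, the "unseen" bucket would then be routed to a dedicated handler (e.g., cloud-side relabeling or model update) rather than answered directly.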
Detailed Pull Requests:
- Government-Specific Large Model Benchmark by @IcyFeather233 in #144
- Smart Coding Benchmark by @safe-b in #159
- Large Language Model Edge Benchmark by @XueSongTap in #150
- Cloud-Edge Collaborative Inference for LLM by @FuryMartin in #156
- Cloud-Edge Collaborative Speculative Decoding for LLM by @FuryMartin in #179
- Personalized LLM Agent by @Frank-lilinjie in #154
- Multimodal Large Model Joint Learning by @aryan0931 in #167
- Unseen task processing by @nailtu30 in #90
- Documentation refining by @AryanNanda17 in #182
New Contributors
- @IcyFeather233 made their first contribution in #113
- @safe-b made their first contribution in #120
- @XueSongTap made their first contribution in #127
- @FuryMartin made their first contribution in #122
- @aryan0931 made their first contribution in #166
- @AryanNanda17 made their first contribution in #171
v0.2.0
What's Changed
This version of Ianvs supports the following unstructured lifelong learning capabilities:
- Support lifelong learning across the entire lifecycle, with decoupled modules for task definition, task assignment, unknown task recognition, unknown task handling, and more.
- Support unknown task recognition, with a corresponding usage example based on semantic segmentation in this example.
- Support multi-task joint inference, with a corresponding usage example based on object detection in this example.
- Provide classic lifelong learning testing metrics, and support for visualizing test results.
- Support lifelong learning system metrics such as BWT and FWT.
- Support visualization of lifelong learning results.
- Provide real-world datasets and rich examples for lifelong learning testing, to better evaluate the effectiveness of lifelong learning algorithms in real environments.
- Provide cloud-robotics datasets on this website.
- Provide cloud-robotics semantic segmentation examples in this example.
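The system metrics mentioned above, BWT (backward transfer) and FWT (forward transfer), are computed from a task-by-task accuracy matrix. A minimal sketch using the standard definitions from the continual learning literature (Lopez-Paz & Ranzato, 2017); variable names and the toy numbers are illustrative, not Ianvs' exact implementation:

```python
# Hedged sketch: standard BWT/FWT definitions over a T x T accuracy matrix
# R, where R[i][j] = accuracy on task j after training on tasks 0..i.
def bwt(R):
    """Backward transfer: how much training on later tasks changed accuracy
    on earlier ones (negative values indicate forgetting)."""
    T = len(R)
    return sum(R[T - 1][j] - R[j][j] for j in range(T - 1)) / (T - 1)

def fwt(R, b):
    """Forward transfer: accuracy on task j before training on it, relative
    to a baseline b[j] (e.g., a randomly initialized model)."""
    T = len(R)
    return sum(R[j - 1][j] - b[j] for j in range(1, T)) / (T - 1)

# Toy 3-task accuracy matrix and random-init baseline.
R = [
    [0.80, 0.20, 0.10],
    [0.75, 0.85, 0.30],
    [0.70, 0.80, 0.90],
]
b = [0.10, 0.10, 0.10]
print(round(bwt(R), 4))   # -0.075 (mild forgetting of earlier tasks)
print(round(fwt(R, b), 4))  # 0.15 (earlier tasks help later ones)
```

A negative BWT quantifies catastrophic forgetting, while a positive FWT shows that knowledge from earlier tasks transfers to tasks not yet trained on.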
The detailed pull requests are as follows:
- Lifelong learning feature and example by @JimmyYang20 in #28
- Edge Intelligence Benchmark for Edge-Cloud Collaborative Lifelong Detection by @iszhyang in #39
- Unknown Task Recognition Algorithm Reproduction by @Frank-lilinjie in #42
- Multi-task Joint Inference by @shifan-Z in #44
- Cloud-Robotic AI Benchmarking for Edge-cloud Collaborative Lifelong Learning by @hsj576 in #65
- Lifelong learning system metrics BWT and FWT by @hsj576 in #67
- Proposal for Cloud-robotics dataset by @hsj576 in #69
New Contributors
- @nailtu30 made their first contribution in #40
- @shifan-Z made their first contribution in #44
- @Sai-Suraj-27 made their first contribution in #49
Full Changelog: v0.1.0...v0.2.0
Ianvs v0.1.0 release
Release the Ianvs distributed synergy AI benchmarking framework.
- Release test environment management and configuration.
- Release test case management and configuration.
- Release test story management and configuration.
- Release the open-source test case generation tool, which uses hyperparameter enumeration to generate multiple test cases from a single configuration file.
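Hyperparameter enumeration of this kind amounts to taking the cross product of every listed value. A minimal sketch of the idea (the dict-based config shape is an assumption for illustration, not the exact Ianvs file format):

```python
# Hedged sketch of hyperparameter enumeration: expand one configuration
# whose values are lists into the cross product of concrete test cases.
from itertools import product

def expand(config):
    """Turn {param: [candidate values]} into a list of concrete
    {param: value} test cases covering every combination."""
    keys = list(config)
    return [dict(zip(keys, values))
            for values in product(*(config[k] for k in keys))]

# One configuration with 2 x 2 candidate values yields 4 test cases.
cases = expand({"learning_rate": [0.1, 0.01], "momentum": [0.5, 0.9]})
print(len(cases))  # 4
```

This is why one benchmark configuration file can drive a whole leaderboard: each generated combination becomes an independent test case.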
Release the PCB-AoI public dataset.
- Release the PCB-AoI public dataset, its corresponding preprocessing, and baseline algorithm projects. Ianvs is the first open-source site for that dataset.
Support two new paradigms in test environments and test cases.
- Test environments and test cases that support the single-task learning paradigm.
- Test environments and test cases that support the incremental learning paradigm.
Release PCB-AoI benchmark cases based on the two new paradigms.
- Release PCB-AoI benchmark cases based on single-task learning, including leaderboards and test reports.
- Release PCB-AoI benchmark cases based on incremental learning, including leaderboards and test reports.
Details are as follows:
- Add ianvs core code by @JimmyYang20 in #1
- Add Ianvs examples and comments. by @jaypume in #2
- Add documentations by @MooreZheng in #3
- Revise key docs like the home page, setup and contribution by @MooreZheng in #9
- Add fossa analysis in the github workflow by @JimmyYang20 in #10
- Modify the hyperlinks of ci and license scan by @JimmyYang20 in #11
- Add templates of lifelong learning and multi-edge-inference by @JimmyYang20 in #12
- Add a deploy ianvs method in quick-start.md by @JimmyYang20 in #31
- Add Proposal for Edge Intelligence Benchmark for Edge-Cloud Collaborative Lifelong Detection by @iszhyang in #35
- Add codes of Edge Intelligence Benchmark for Edge-Cloud Collaborative Lifelong Detection by @iszhyang in #39
- Add pedestrian tracking example. by @yqhok1 in #36
- Modify workflow platform by @luosiqi in #46
- Add new application of singletasklearning: segmentation of cityscapes by @luosiqi in #47
- Fix document and example resources by @JimmyYang20 in #5
- Fix hyperlinks in documentation by @JimmyYang20 in #6
- Fix quick start doc by @JimmyYang20 in #7
New Contributors
- @MooreZheng made their first contribution in #3
- @iszhyang made their first contribution in #35
- @yqhok1 made their first contribution in #36
Full Changelog: https://github.com/kubeedge/ianvs/commits/v0.1.0