[feat]threadpool monitor #500
Merged: mag1c-h merged 1 commit into ModelEngine-Group:feature_store_next from Lijiachen1018:dev_monitor on Dec 13, 2025
7c17201 to a08aabd
a08aabd to 080bb53
eefb49f to 73955b4
mag1c-h approved these changes on Dec 13, 2025
09ef857 into ModelEngine-Group:feature_store_next (3 checks passed)
mag1c-h added a commit that referenced this pull request on Dec 13, 2025

* [Feat] Next Store Interface (#510)
  Define the StoreV1 interface, see issue #490 for details.
* [bugfix]add init device (#511)
  add init device
* [bugfix] clean up shm remnants to fix "BufferOut" error in PCStore (#512)
  PCStore uses shared memory to share data in DRAM. If the service terminates abnormally and residual files of the shared data are not cleaned up, it will cause newly started services to report BufferOut errors.
* [Misc] enable CI in feature branch (#513)
  enable CI in feature branch
* [feat]adapt v1 store (#514)
  * adapt v1 store
  * code style
* set default local_rank_size to 1 (#515)
* [feat]threadpool monitor (#500)
  threadpool monitor
  Co-authored-by: lijiachen19 <[email protected]>
* modify factory_v1 (#516)
* set version to 0.2.0rc1 (#517)
* fix code style

---------
Co-authored-by: Mag1c.H <[email protected]>
Co-authored-by: Lijiachen1018 <[email protected]>
Co-authored-by: lijiachen19 <[email protected]>
Lijiachen1018 added a commit to Lijiachen1018/unified-cache-management that referenced this pull request on Dec 18, 2025

* [Feat] Next Store Interface (ModelEngine-Group#510)
  Define the StoreV1 interface, see issue ModelEngine-Group#490 for details.
* [bugfix]add init device (ModelEngine-Group#511)
  add init device
* [bugfix] clean up shm remnants to fix "BufferOut" error in PCStore (ModelEngine-Group#512)
  PCStore uses shared memory to share data in DRAM. If the service terminates abnormally and residual files of the shared data are not cleaned up, it will cause newly started services to report BufferOut errors.
* [Misc] enable CI in feature branch (ModelEngine-Group#513)
  enable CI in feature branch
* [feat]adapt v1 store (ModelEngine-Group#514)
  * adapt v1 store
  * code style
* set default local_rank_size to 1 (ModelEngine-Group#515)
* [feat]threadpool monitor (ModelEngine-Group#500)
  threadpool monitor
  Co-authored-by: lijiachen19 <[email protected]>
* modify factory_v1 (ModelEngine-Group#516)
* set version to 0.2.0rc1 (ModelEngine-Group#517)
* fix code style

---------
Co-authored-by: Mag1c.H <[email protected]>
Co-authored-by: Lijiachen1018 <[email protected]>
Co-authored-by: lijiachen19 <[email protected]>
Purpose
Add a monitor to the thread pool that keeps watching its workers.
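The PR description does not show the implementation, so the following is only a minimal sketch of the general idea, written in Python for brevity. Every name here (`MonitoredThreadPool`, `restarts`, `check_interval`) is hypothetical and not taken from this repository, and the recovery policy (replacing a worker whose thread has exited) is an assumption; the actual monitor may watch for different conditions.

```python
import queue
import threading


class MonitoredThreadPool:
    """Hypothetical sketch: a pool whose monitor thread replaces dead workers."""

    def __init__(self, num_workers, check_interval=0.05):
        self._tasks = queue.Queue()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._check_interval = check_interval
        self.restarts = 0  # number of workers the monitor has replaced
        self._workers = [self._spawn_worker() for _ in range(num_workers)]
        self._monitor = threading.Thread(target=self._watch, daemon=True)
        self._monitor.start()

    def _spawn_worker(self):
        worker = threading.Thread(target=self._work, daemon=True)
        worker.start()
        return worker

    def _work(self):
        # Poll with a timeout so the worker notices a shutdown request.
        while not self._stop.is_set():
            try:
                task = self._tasks.get(timeout=0.05)
            except queue.Empty:
                continue
            try:
                task()  # an exception escaping here ends this worker thread
            finally:
                self._tasks.task_done()

    def _watch(self):
        # Assumed policy: periodically check each worker and respawn it
        # if its thread is no longer alive.
        while not self._stop.wait(self._check_interval):
            with self._lock:
                for i, worker in enumerate(self._workers):
                    if not worker.is_alive():
                        self._workers[i] = self._spawn_worker()
                        self.restarts += 1

    def submit(self, task):
        self._tasks.put(task)

    def wait(self):
        # Block until every submitted task has been marked done.
        self._tasks.join()

    def shutdown(self):
        self._stop.set()
        self._monitor.join()
        with self._lock:
            workers = list(self._workers)
        for worker in workers:
            worker.join()
```

In this sketch a task that terminates its worker does not stall the pool: queued tasks stay in the queue until the monitor's next check replaces the worker.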
Modifications
Test
Tested with the e2e script pcstore_embed.py.