Performance: Optimize .nemo tar extraction & model config processing #15245
nithinraok
left a comment
@paulirish Thanks for the great PR, really appreciate it.
Made some comments.
To fix the format checker in the CI/CD tests, please run: (we are working on improving the contributing guide)
Reduces load time by ~8.1s (41.1s -> 32.96s) on warm start. Avoids unnecessary deep copies in `nemo.utils.model_utils` and `nemo.core.classes.modelPT`. Enables in-place config updates for `EncDecMultiTaskModel`, `EncDecCTCModelBPE`, and `EncDecHybridRNNTCTCBPEModel`. Updates Serialization and FileIO mixins to use optimized config conversion. Signed-off-by: Paul Irish <[email protected]>
Reduces load time by ~9.8s when combined with config optimizations (32.96s -> 23.14s). On its own, reduces load time by ~15.5s (41.1s -> 25.62s). Prevents `EncDecMultiTaskModel` from re-extracting tarballs when they are already handled by `nemo.utils.model_utils`. Signed-off-by: Paul Irish <[email protected]>
Signed-off-by: nithinraok <[email protected]>
94a633c to a2a3eb4
@nithinraok thank you.
Thanks @paulirish, appreciate it. CI is delayed because of the many open PRs, but this PR will be taken care of. :)
What does this PR do?
Optimizes model loading performance for ASR models, specifically reducing Canary's setup time by ~44% (from 41.1s to 23.1s) through optimized config processing and eliminating redundant archive extractions.
Collection: [ASR]
Changelog
- Prevents `EncDecMultiTaskModel` from re-extracting tarballs when they are already handled by `nemo.utils.model_utils`.
- Updates `nemo.utils.model_utils.maybe_update_config_version` and `convert_model_config_to_dict_config` to support in-place updates via a `make_copy` parameter, avoiding expensive `OmegaConf` deep copies.
- Updates the `Serialization` and `FileIO` mixins in `nemo.core.classes.common` to use non-copying config conversions where safe.
- Enables in-place config updates for `EncDecMultiTaskModel`, `EncDecCTCModelBPE`, and `EncDecHybridRNNTCTCBPEModel`.

Measured performance gains on Canary model load:
- Baseline: 41.1s
- After config optimizations: 33.0s (~20% improvement)
- After just tar extraction fix: 25.6s (~38% improvement)
- Combined: 23.1s (~44% improvement)
These two commits are separate and I'm happy to drop one.
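For anyone who wants to re-derive the percentages, here is a small sketch that recomputes them from the timings quoted in the two commit messages (41.1s baseline, then 32.96s, 25.62s and 23.14s). The helper name `improvement` is just for illustration.

```python
BASELINE_S = 41.1  # warm-start load time before either commit

# Load times after each change, as quoted in the commit messages.
timings_s = {
    "config optimizations": 32.96,
    "tar extraction fix": 25.62,
    "combined": 23.14,
}


def improvement(baseline, after):
    """Return (seconds saved, percent improvement relative to baseline)."""
    saved = baseline - after
    return saved, 100.0 * saved / baseline


for label, after in timings_s.items():
    saved, pct = improvement(BASELINE_S, after)
    print(f"{label}: {saved:.2f}s saved ({pct:.1f}%)")
```

This gives roughly 19.8%, 37.7% and 43.7%. The two individual savings (8.14s and 15.48s) add up to more than the combined 17.96s, so the two changes appear to overlap partly in the work they avoid.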
Usage
No changes to public APIs. Models will load significantly faster.
See #15240 for a more complete repro script
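As a rough illustration of how a `make_copy` flag can avoid deep copies without changing behavior for existing callers, here is a minimal sketch. It uses plain dicts and `copy.deepcopy` as stand-ins for `OmegaConf` objects; `normalize_config` is a hypothetical helper, not a NeMo function, and the assumption that copying stays the default is not confirmed by this PR description.

```python
import copy


def normalize_config(cfg, make_copy=True):
    """Hypothetical helper: fill in a default key, optionally without copying."""
    if make_copy:
        # Existing behavior (assumed default): work on a private deep copy.
        cfg = copy.deepcopy(cfg)
    cfg.setdefault("version", 1)  # mutates whichever object we now hold
    return cfg


original = {"encoder": {"layers": 24}}

# Default call: the caller's dict is left untouched.
copied = normalize_config(original)
assert copied is not original and "version" not in original

# Opt-in fast path: no copy is made and the caller's dict is updated in place.
updated = normalize_config(original, make_copy=False)
assert updated is original and original["version"] == 1
```

The in-place path is only safe when the caller no longer needs the unmodified config, which is presumably why the changelog limits non-copying conversions to places "where safe".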
fixes #15240 cc @nithinraok