feat: implement fit settings in llamacpp extension and overhaul argument builder tests #7442
+670
−26
Describe Your Changes
Introduces support for the `fit` parameter and its associated configurations (`fit_target`, `fit_ctx`) to allow automatic adjustment of arguments to device memory. This change spans the extension settings, guest-js types, and the Rust argument builder.

Key changes:

* Settings & Types: Added `fit`, `fit_target`, and `fit_ctx` to `settings.json` and synchronized these fields across the TypeScript definitions and the Rust `LlamacppConfig` struct.
* Logic Updates:
  * Implemented `add_fit_settings` in the `ArgumentBuilder` to handle the `--fit`, `--fit-target`, and `--fit-ctx` flags.
  * Modified `add_gpu_layers` to use `-1` as the default for loading all layers, while treating `100` as a manual override.
  * Updated several argument methods (batch size, context size, etc.) to only append flags if the values differ from the defaults, reducing command-line clutter.
  * Added a check to exclude `fit` settings when using the `ik` backend fork.
* Testing: Significantly expanded the Rust test suite. Replaced basic assertions with dedicated helper functions (`assert_arg_pair`, `assert_has_flag`, `assert_no_flag`) and added comprehensive test cases for various configurations, including GPU layers, embedding mode, and backend-specific behavior.

Fixes Issues
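For reviewers, the builder and test-helper behavior listed under "Key changes" can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `Config` struct, the `is_ik_fork` field, the field types, the `DEFAULT_CTX_SIZE` value, and the choice to emit `--fit on` are all assumptions made for the example. Only the method and helper names and the flag names come from the description above.

```rust
// Hypothetical, simplified stand-in for the PR's `LlamacppConfig`.
// Field types are assumptions made for this sketch.
#[derive(Clone)]
struct Config {
    fit: bool,
    fit_target: u32,
    fit_ctx: u32,
    ctx_size: u32,
}

// Assumed default, used only to illustrate "skip flags that match the default".
const DEFAULT_CTX_SIZE: u32 = 8192;

struct ArgumentBuilder {
    config: Config,
    is_ik_fork: bool, // hypothetical marker for the `ik` backend fork
    args: Vec<String>,
}

impl ArgumentBuilder {
    fn new(config: Config, is_ik_fork: bool) -> Self {
        Self { config, is_ik_fork, args: Vec::new() }
    }

    fn push_pair(&mut self, flag: &str, value: String) {
        self.args.push(flag.to_string());
        self.args.push(value);
    }

    // Append the flag only when the value differs from the default.
    fn add_ctx_size(&mut self) {
        if self.config.ctx_size != DEFAULT_CTX_SIZE {
            let value = self.config.ctx_size.to_string();
            self.push_pair("--ctx-size", value);
        }
    }

    // Skip all fit flags on the `ik` fork or when fit is disabled.
    fn add_fit_settings(&mut self) {
        if self.is_ik_fork || !self.config.fit {
            return;
        }
        // Assumption: the flag is passed with an explicit "on" value.
        self.push_pair("--fit", "on".to_string());
        let target = self.config.fit_target.to_string();
        self.push_pair("--fit-target", target);
        let ctx = self.config.fit_ctx.to_string();
        self.push_pair("--fit-ctx", ctx);
    }

    fn build(mut self) -> Vec<String> {
        self.add_ctx_size();
        self.add_fit_settings();
        self.args
    }
}

// Test helpers in the spirit of the ones named in the description.
fn assert_arg_pair(args: &[String], flag: &str, value: &str) {
    let pos = args
        .iter()
        .position(|a| a == flag)
        .unwrap_or_else(|| panic!("missing flag {}", flag));
    assert_eq!(args.get(pos + 1).map(String::as_str), Some(value));
}

fn assert_has_flag(args: &[String], flag: &str) {
    assert!(args.iter().any(|a| a == flag), "missing flag {}", flag);
}

fn assert_no_flag(args: &[String], flag: &str) {
    assert!(!args.iter().any(|a| a == flag), "unexpected flag {}", flag);
}
```

Helpers like these let each test state its intent in one line (for example, "the `ik` fork never receives `--fit`") instead of indexing into the argument vector by hand.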
Future Tasks
Self Checklist