-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat: add optional integrity verification for model artifacts #767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add SRI (Subresource Integrity) hash verification for model config, WASM, and tokenizer files via an optional `integrity` field on `ModelRecord`. When specified, downloaded artifacts are verified against cryptographic hashes (SHA-256/384/512) before loading. - Add `ModelIntegrity` interface and `verifyIntegrity()` using Web Crypto API - Add `IntegrityError` class for hash mismatch errors - Verify config, WASM, and tokenizer in engine loading pipeline - Support `onFailure: "error" | "warn"` behavior - Zero new dependencies, fully backwards compatible - 25 unit tests + 8 integration tests Closes mlc-ai#761
Summary of ChangesHello @hamzaydia, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the security of WebLLM by introducing optional Subresource Integrity (SRI) verification for critical model artifacts. Users can now specify cryptographic hashes for configuration, WASM, and tokenizer files within the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces an excellent and much-needed feature for integrity verification of model artifacts. The implementation is robust, well-tested, and backwards-compatible. The new integrity.ts module is clean and the unit tests are very thorough, covering various scenarios and edge cases. I have a few suggestions to improve maintainability by refactoring some duplicated code in engine.ts and cache_util.ts, and a minor suggestion to improve the documentation in README.md.
- Show sha256/sha384/sha512 openssl commands in README hash generation - Add `npx verifyfetch sign` as simpler alternative for hash generation - Extract `verifyTokenizerIfNeeded` helper in cache_util.ts to eliminate duplicated tokenizer verification logic - Unify config merging in engine.ts by separating fetch/verify from override spreading, reducing duplication between integrity and non-integrity code paths
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces an excellent and much-needed feature: optional integrity verification for model artifacts using SRI hashes. The implementation is robust, well-designed, and fully backwards-compatible. The code is clean, follows existing patterns, and is accompanied by comprehensive unit tests and a clear example. The documentation in the README is also very helpful. I have a couple of minor suggestions to further improve the code and documentation, but overall this is a fantastic contribution.
- Always fetch config as arraybuffer and conditionally verify, eliminating the if/else branching for fetch format - Add Windows compatibility note recommending cross-platform `npx verifyfetch sign` or Git Bash/WSL for openssl commands
Summary
Adds optional SRI (Subresource Integrity) hash verification for model artifacts, as requested in #761.
When the
integrityfield is set on aModelRecord, WebLLM verifies downloaded config, WASM, and tokenizer files against the provided SHA-256/384/512 hashes before loading. If a hash does not match, anIntegrityErroris thrown (or a warning is logged whenonFailure: "warn").Usage
Design decisions
crypto.subtle.digest) and the existingloglevelloggerintegrityis optional onModelRecord, all fields withinModelIntegrityare optional, and omitting it entirely preserves existing behaviorarraybufferinstead ofjson(only when integrity is set), WASM verification slots in afterfetchWasmSource(), and tokenizer verification is added via a new optional parameter onasyncLoadTokenizer()prebuiltAppConfig— users generate SRI hashes for their own model deployments@mlc-ai/web-runtime'sfetchTensorCache(). For full model weight verification with resumable downloads and chunked verification,@verifyfetch/webllmcan be used as a drop-in complementChanges
src/integrity.tsverifyIntegrity(),isValidSRI(),ModelIntegrityinterface,SRIString/FileIntegrityMaptypessrc/error.tsIntegrityErrorclass (follows existing error class pattern)src/config.tsintegrity?: ModelIntegrityfield toModelRecordsrc/engine.tsreloadInternal(), pass integrity to tokenizer loadersrc/cache_util.tsasyncLoadTokenizer()src/index.tstests/integrity.test.tstests/cache_util.test.tsasyncLoadTokenizerexamples/integrity-verification/README.mdTest plan
npx rollup -cbuild succeedsnpx eslintandnpx prettier --checkpass on all modified filesCloses #761