@hamzaydia
Summary

Adds optional SRI (Subresource Integrity) hash verification for model artifacts, as requested in #761.

When the integrity field is set on a ModelRecord, WebLLM verifies downloaded config, WASM, and tokenizer files against the provided SHA-256/384/512 hashes before loading. If a hash does not match, an IntegrityError is thrown (or a warning is logged when onFailure: "warn").
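The check described above can be sketched as follows. This is a minimal illustration, assuming a global Web Crypto `crypto.subtle` (browsers and recent Node versions). The name `verifySri`, its parameters, and the plain `Error` are hypothetical; the PR itself exposes `verifyIntegrity()` and throws `IntegrityError`, whose exact signatures are not shown in this conversation.

```typescript
// Sketch only: names and signatures are assumptions, not the PR's actual code.
const ALGORITHMS: Record<string, string> = {
  sha256: "SHA-256",
  sha384: "SHA-384",
  sha512: "SHA-512",
};

// Encode raw digest bytes as base64, as used in SRI strings.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Returns true on match. On mismatch, either throws ("error")
// or logs a warning and returns false ("warn").
async function verifySri(
  data: ArrayBuffer,
  expected: string,
  url: string,
  onFailure: "error" | "warn" = "error",
): Promise<boolean> {
  const dash = expected.indexOf("-");
  const prefix = dash < 0 ? "" : expected.slice(0, dash);
  const algorithm = ALGORITHMS[prefix];
  if (algorithm === undefined) {
    throw new Error(`Unsupported SRI string: ${expected}`);
  }
  const digest = await crypto.subtle.digest(algorithm, data);
  const actual = `${prefix}-${toBase64(new Uint8Array(digest))}`;
  if (actual === expected) {
    return true;
  }
  const message = `Integrity check failed for ${url}: expected ${expected}, got ${actual}`;
  if (onFailure === "warn") {
    console.warn(message);
    return false;
  }
  // The PR throws its dedicated IntegrityError at this point.
  throw new Error(message);
}
```

Comparing full SRI strings rather than raw bytes keeps the mismatch message directly comparable to what the user put in their config.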

Usage

const appConfig = {
  model_list: [{
    model: "https://huggingface.co/mlc-ai/Llama-3.2-1B-Instruct-q4f16_1-MLC",
    model_id: "Llama-3.2-1B-Instruct-q4f16_1-MLC",
    model_lib: "https://raw.githubusercontent.com/user/model-libs/main/model.wasm",
    integrity: {
      config: "sha256-<base64-hash>",
      model_lib: "sha256-<base64-hash>",
      tokenizer: { "tokenizer.json": "sha256-<base64-hash>" },
      onFailure: "error",
    },
  }],
};

Design decisions

  • Zero new runtime dependencies — uses only the Web Crypto API (crypto.subtle.digest) and the existing loglevel logger
  • Fully backwards compatible: integrity is optional on ModelRecord, all fields within ModelIntegrity are optional, and omitting it entirely preserves existing behavior
  • Minimal diff to the loading pipeline — config verification requires fetching as arraybuffer instead of json (only when integrity is set), WASM verification slots in after fetchWasmSource(), and tokenizer verification is added via a new optional parameter on asyncLoadTokenizer()
  • No changes to prebuiltAppConfig — users generate SRI hashes for their own model deployments
  • No weight shard verification — this would require changes to @mlc-ai/web-runtime's fetchTensorCache(). For full model weight verification with resumable downloads and chunked verification, @verifyfetch/webllm can be used as a drop-in complement

Changes

| File | Description |
| --- | --- |
| src/integrity.ts | New. Core module: verifyIntegrity(), isValidSRI(), ModelIntegrity interface, SRIString/FileIntegrityMap types |
| src/error.ts | Add IntegrityError class (follows existing error class pattern) |
| src/config.ts | Add optional integrity?: ModelIntegrity field to ModelRecord |
| src/engine.ts | Verify config + WASM in reloadInternal(), pass integrity to tokenizer loader |
| src/cache_util.ts | Verify tokenizer files in asyncLoadTokenizer() |
| src/index.ts | Export new types and utilities |
| tests/integrity.test.ts | New. 25 unit tests covering all hash algorithms, error paths, edge cases, and known reference hashes |
| tests/cache_util.test.ts | 8 new integration tests verifying the integrity parameter is correctly wired into asyncLoadTokenizer |
| examples/integrity-verification/ | New. Working example demonstrating native integrity usage |
| README.md | New "Integrity Verification" section with usage example and SRI hash generation command |
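
For orientation, the types and format check listed for src/integrity.ts might look roughly like the sketch below. The field names are inferred from the usage example in this description; the actual definitions, and whether `isValidSRI()` also checks digest length as done here, are assumptions.

```typescript
// Hypothetical shapes inferred from the usage example; not copied from the PR.
type SRIString = string;
type FileIntegrityMap = Record<string, SRIString>;

interface ModelIntegrity {
  config?: SRIString;
  model_lib?: SRIString;
  tokenizer?: FileIntegrityMap;
  onFailure?: "error" | "warn";
}

// Base64 length of each digest: 32, 48, and 64 bytes respectively.
const BASE64_LENGTH: Record<string, number> = {
  sha256: 44,
  sha384: 64,
  sha512: 88,
};

// Assumed behavior: accept "<algorithm>-<base64 digest>" for SHA-256/384/512 only.
function isValidSRI(value: string): boolean {
  const match = /^(sha256|sha384|sha512)-([A-Za-z0-9+/]+={0,2})$/.exec(value);
  if (match === null) {
    return false;
  }
  return match[2].length === BASE64_LENGTH[match[1]];
}

// Every field is optional, so an empty object is a valid ModelIntegrity.
const example: ModelIntegrity = {};
```

Validating the string format up front lets a misconfigured hash fail fast, before any artifact is downloaded.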

Test plan

  • All 25 unit tests pass (SHA-256/384/512, known hashes, error paths, edge cases)
  • All 8 integration tests pass (tokenizer verification wiring, error propagation)
  • Full test suite passes (179 tests, 16 suites, 0 failures)
  • npx rollup -c build succeeds
  • npx eslint and npx prettier --check pass on all modified files
  • Existing tests unaffected (backwards compatibility confirmed)

Closes #761

Add SRI (Subresource Integrity) hash verification for model config,
WASM, and tokenizer files via an optional `integrity` field on
`ModelRecord`. When specified, downloaded artifacts are verified
against cryptographic hashes (SHA-256/384/512) before loading.

- Add `ModelIntegrity` interface and `verifyIntegrity()` using Web Crypto API
- Add `IntegrityError` class for hash mismatch errors
- Verify config, WASM, and tokenizer in engine loading pipeline
- Support `onFailure: "error" | "warn"` behavior
- Zero new dependencies, fully backwards compatible
- 25 unit tests + 8 integration tests

Closes mlc-ai#761
@gemini-code-assist
Summary of Changes

Hello @hamzaydia, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security of WebLLM by introducing optional Subresource Integrity (SRI) verification for critical model artifacts. Users can now specify cryptographic hashes for configuration, WASM, and tokenizer files within the ModelRecord, ensuring that downloaded resources have not been tampered with before they are loaded. This feature improves the trustworthiness of model deployments without introducing new runtime dependencies or breaking existing functionality.

Highlights

  • Optional Subresource Integrity (SRI) Verification: Introduces optional SRI hash verification for critical model artifacts, including configuration files, WebAssembly (WASM) model libraries, and tokenizer files, to ensure their authenticity and integrity.
  • New integrity Field in ModelRecord: Adds a new optional integrity field to the ModelRecord interface, allowing users to specify SHA-256/384/512 hashes for individual artifacts. This field is fully backwards compatible, preserving existing behavior if omitted.
  • No New Runtime Dependencies: The integrity verification feature is implemented using the native Web Crypto API (crypto.subtle.digest), ensuring no additional runtime dependencies are introduced to the project.
  • New IntegrityError Class: A dedicated IntegrityError class has been added to handle cases where integrity verification fails, providing clear error messages including the URL, expected hash, and actual hash.
  • Comprehensive Testing and Example: Includes a new integrity.ts module for core verification logic, a new example (examples/integrity-verification/) demonstrating usage, and extensive unit and integration tests covering various scenarios, algorithms, and error paths.
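
As an illustration of the IntegrityError highlight, an error class carrying the URL, expected hash, and actual hash could be sketched like this. The field names and message wording are assumptions; the PR's class in src/error.ts may differ.

```typescript
// Sketch only: field names and message format are hypothetical.
class IntegrityError extends Error {
  readonly url: string;
  readonly expected: string;
  readonly actual: string;

  constructor(url: string, expected: string, actual: string) {
    super(
      `Integrity verification failed for ${url}: expected ${expected}, got ${actual}`,
    );
    this.name = "IntegrityError";
    this.url = url;
    this.expected = expected;
    this.actual = actual;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, IntegrityError.prototype);
  }
}

// Callers can branch on the error type to distinguish tampering
// from ordinary network or parsing failures.
function describeFailure(err: unknown): string {
  if (err instanceof IntegrityError) {
    return `hash mismatch for ${err.url}`;
  }
  return "other failure";
}
```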

Changelog
  • README.md
    • Added a new 'Integrity Verification' section.
    • Provided usage examples for the integrity field in ModelRecord.
    • Included instructions for generating SRI hashes using openssl.
    • Mentioned @verifyfetch/webllm for advanced integrity features.
  • examples/integrity-verification/package.json
    • New file for the integrity verification example.
    • Defines project metadata and dependencies for the example.
  • examples/integrity-verification/src/integrity_verification.html
    • New HTML file for the integrity verification example.
    • Provides a basic UI to demonstrate the feature.
  • examples/integrity-verification/src/integrity_verification.ts
    • New TypeScript file containing the logic for the integrity verification example.
    • Demonstrates how to configure ModelRecord with integrity hashes and handles IntegrityError.
  • src/cache_util.ts
    • Imported ModelIntegrity and verifyIntegrity.
    • Modified asyncLoadTokenizer to accept an optional integrity parameter.
    • Added calls to verifyIntegrity for tokenizer.json and tokenizer.model if corresponding hashes are provided in the integrity object.
  • src/config.ts
    • Imported ModelIntegrity.
    • Added an optional integrity?: ModelIntegrity field to the ModelRecord interface.
  • src/engine.ts
    • Imported verifyIntegrity.
    • Modified reloadInternal to conditionally fetch mlc-chat-config.json as an ArrayBuffer and verify its integrity if modelRecord.integrity.config is present.
    • Added a call to verifyIntegrity for the WASM model library (model_lib) if modelRecord.integrity.model_lib is present.
    • Passed the modelRecord.integrity object to asyncLoadTokenizer.
  • src/error.ts
    • Added a new IntegrityError class, extending Error, to represent integrity verification failures.
  • src/index.ts
    • Exported verifyIntegrity, isValidSRI, ModelIntegrity, SRIString, FileIntegrityMap from src/integrity.ts.
    • Exported IntegrityError from src/error.ts.
  • src/integrity.ts
    • New file defining types (SRIString, FileIntegrityMap, ModelIntegrity) and core functions (verifyIntegrity, isValidSRI) for Subresource Integrity.
    • verifyIntegrity uses crypto.subtle.digest to compute hashes and compares them, throwing IntegrityError or logging a warning based on onFailure setting.
    • isValidSRI validates the format of an SRI string.
  • tests/cache_util.test.ts
    • Added new tests to verify that asyncLoadTokenizer correctly calls verifyIntegrity for tokenizer files when integrity settings are provided, and skips it when not.
    • Included tests for error propagation when verifyIntegrity fails.
  • tests/integrity.test.ts
    • New file containing comprehensive unit tests for isValidSRI and verifyIntegrity.
    • Tests cover various SHA algorithms (256, 384, 512), valid and invalid SRI formats, hash mismatches, error messages, onFailure behavior ("error" vs. "warn"), empty buffers, and large data.
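
The tokenizer wiring described for src/cache_util.ts (verify tokenizer.json or tokenizer.model only when a matching hash is provided) could be sketched as below. The helper name `verifyTokenizerIfNeeded` comes from a later commit in this PR, but its signature here, and the injected `verify` callback, are assumptions made so the sketch stands alone.

```typescript
// Sketch only: the signature is hypothetical.
type FileIntegrityMap = Record<string, string>;
type Verify = (data: ArrayBuffer, expected: string, url: string) => Promise<void>;

// Verifies one tokenizer file if, and only if, a hash is configured for it.
// Returns true when a check ran, false when it was skipped.
async function verifyTokenizerIfNeeded(
  fileName: string,
  data: ArrayBuffer,
  url: string,
  tokenizerIntegrity: FileIntegrityMap | undefined,
  verify: Verify,
): Promise<boolean> {
  const expected = tokenizerIntegrity?.[fileName];
  if (expected === undefined) {
    return false; // no hash for this file: keep the existing behavior
  }
  await verify(data, expected, url); // rejects on mismatch in "error" mode
  return true;
}
```

Keying the map by file name means a model that ships only one of the two tokenizer formats needs a hash only for the file it actually uses.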
Activity
  • Implemented optional Subresource Integrity (SRI) verification for model artifacts, addressing feature request #761 ([FEAT] Optionally Support integrity in model and wasm config).
  • Developed new core integrity verification logic in src/integrity.ts.
  • Integrated integrity checks into the model loading pipeline for config, WASM, and tokenizer files.
  • Introduced a new IntegrityError class for specific verification failures.
  • Created a new example (examples/integrity-verification/) to demonstrate usage.
  • Updated README.md with documentation and usage instructions for the new feature.
  • Added 25 new unit tests in tests/integrity.test.ts covering all hash algorithms, error paths, edge cases, and known reference hashes.
  • Added 8 new integration tests in tests/cache_util.test.ts to verify tokenizer integrity wiring and error propagation.
  • Ensured all 179 tests across 16 suites pass with 0 failures.
  • Confirmed successful build with npx rollup -c.
  • Verified code quality with npx eslint and npx prettier --check on modified files.
  • Confirmed existing tests remain unaffected, ensuring backwards compatibility.
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces an excellent and much-needed feature for integrity verification of model artifacts. The implementation is robust, well-tested, and backwards-compatible. The new integrity.ts module is clean and the unit tests are very thorough, covering various scenarios and edge cases. I have a few suggestions to improve maintainability by refactoring some duplicated code in engine.ts and cache_util.ts, and a minor suggestion to improve the documentation in README.md.

- Show sha256/sha384/sha512 openssl commands in README hash generation
- Add `npx verifyfetch sign` as simpler alternative for hash generation
- Extract `verifyTokenizerIfNeeded` helper in cache_util.ts to
  eliminate duplicated tokenizer verification logic
- Unify config merging in engine.ts by separating fetch/verify from
  override spreading, reducing duplication between integrity and
  non-integrity code paths
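
For readers who prefer a script over the openssl commands mentioned above, SRI strings can also be produced with Node's built-in `node:crypto` module. This is a sketch, not part of the PR; the function name `sriHash` is hypothetical. It computes the same value as hashing the file in binary mode and base64-encoding the digest.

```typescript
import { createHash } from "node:crypto";

type SriAlgorithm = "sha256" | "sha384" | "sha512";

// Build an SRI string ("<algorithm>-<base64 digest>") for raw file bytes.
function sriHash(data: Uint8Array, algorithm: SriAlgorithm = "sha256"): string {
  const digest = createHash(algorithm).update(data).digest("base64");
  return `${algorithm}-${digest}`;
}

// Hash of empty input, a widely published reference value for SHA-256.
const emptySri = sriHash(new Uint8Array(0));
```

In practice the bytes would come from the artifact itself, for example `sriHash(readFileSync("model.wasm"), "sha384")` with `readFileSync` imported from `node:fs`.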
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces an excellent and much-needed feature: optional integrity verification for model artifacts using SRI hashes. The implementation is robust, well-designed, and fully backwards-compatible. The code is clean, follows existing patterns, and is accompanied by comprehensive unit tests and a clear example. The documentation in the README is also very helpful. I have a couple of minor suggestions to further improve the code and documentation, but overall this is a fantastic contribution.

- Always fetch config as arraybuffer and conditionally verify,
  eliminating the if/else branching for fetch format
- Add Windows compatibility note recommending cross-platform
  `npx verifyfetch sign` or Git Bash/WSL for openssl commands
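
The first change in this commit (always fetch the config as an ArrayBuffer, then verify only if a hash is present) could look roughly like the sketch below. The function name `loadConfig`, the injected `verify` and `fetchBytes` parameters, and the error handling are all assumptions for illustration; the real logic lives in `reloadInternal()` in src/engine.ts.

```typescript
// Sketch only: names and parameters are hypothetical.
type Verify = (data: ArrayBuffer, expected: string, url: string) => Promise<void>;
type FetchBytes = (url: string) => Promise<ArrayBuffer>;

// Default fetcher: one code path regardless of whether integrity is set.
const defaultFetchBytes: FetchBytes = async (url) => {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Failed to fetch ${url}: HTTP ${response.status}`);
  }
  return response.arrayBuffer();
};

async function loadConfig(
  url: string,
  expected: string | undefined,
  verify: Verify,
  fetchBytes: FetchBytes = defaultFetchBytes,
): Promise<Record<string, unknown>> {
  const bytes = await fetchBytes(url);
  if (expected !== undefined) {
    // Hash the exact bytes received, before any parsing.
    await verify(bytes, expected, url);
  }
  // Decode and parse the same bytes that were verified.
  return JSON.parse(new TextDecoder("utf-8").decode(bytes));
}
```

Hashing the raw bytes matters because parsing and re-serializing JSON would not reproduce the original byte sequence, so a hash over a parsed object could not match the published one.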