
Commit ee00410

Release v0.11.13
- Fix iOS embeddings with XNNPACK + SentencePiece
- Add jsDelivr CDN support for web modules
- Update documentation
1 parent 9dfaf44 commit ee00410

187 files changed (+212898, -31864 lines)


CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
## 0.11.13
- **iOS Embeddings Fix**: XNNPACK + SentencePiece integration for better results on iOS
- 🌐 **Web CDN**: Modules available via jsDelivr (`@0.11.13/web/*.js`)

## 0.11.12
- 🌐 **Web VectorStore**: Full RAG support on web with SQLite WASM
- Uses wa-sqlite with OPFS storage (10x faster than IndexedDB)

README.md

Lines changed: 163 additions & 103 deletions
@@ -118,8 +118,134 @@ await FlutterGemma.installModel(modelType: ModelType.general)
2. Run `flutter pub get` to install.

## Setup

> **⚠️ Important:** Complete platform-specific setup before using the plugin.

1. **Download a model and, optionally, LoRA weights:** Obtain a pre-trained Gemma model (recommended: 2b or 2b-it) [from Kaggle](https://www.kaggle.com/models/google/gemma/frameworks/tfLite/)
   * For **multimodal support**, download [Gemma 3 Nano models](https://huggingface.co/google/gemma-3n-E2B-it-litert-preview) or [Gemma 3 Nano in LitertLM format](https://huggingface.co/google/gemma-3n-E2B-it-litert-lm), which support vision input
   * Optionally, [fine-tune a model for your specific use case](https://www.kaggle.com/code/juanmerinobermejo/llm-pr-fine-tuning-with-gemma-2b?scriptVersionId=169776634)
   * If you have LoRA weights, you can use them to customize the model's behavior without retraining the entire model.
   * [There is an article that describes all of these approaches](https://medium.com/@denisov.shureg/fine-tuning-gemma-with-lora-for-on-device-inference-android-ios-web-with-separate-lora-weights-f05d1db30d86)
2. **Platform-specific setup:**

**iOS**

* **Set minimum iOS version** in `Podfile`:
```ruby
platform :ios, '16.0'  # Required for MediaPipe GenAI
```

* **Enable file sharing** in `Info.plist`:
```plist
<key>UIFileSharingEnabled</key>
<true/>
```

* **Add a network access description** in `Info.plist` (for development):
```plist
<key>NSLocalNetworkUsageDescription</key>
<string>This app requires local network access for model inference services.</string>
```

* **Enable performance optimization** in `Info.plist` (optional):
```plist
<key>CADisableMinimumFrameDurationOnPhone</key>
<true/>
```

* **Add memory entitlements** in `Runner.entitlements` (for large models):
```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>com.apple.developer.kernel.extended-virtual-addressing</key>
  <true/>
  <key>com.apple.developer.kernel.increased-memory-limit</key>
  <true/>
  <key>com.apple.developer.kernel.increased-debugging-memory-limit</key>
  <true/>
</dict>
</plist>
```

* **Change the linking type** of pods to static in `Podfile`:
```ruby
use_frameworks! :linkage => :static
```

* **For embedding models**, add `-force_load` to the `Podfile`'s `post_install` hook:
```ruby
post_install do |installer|
  installer.pods_project.targets.each do |target|
    flutter_additional_ios_build_settings(target)

    # Required for embedding models (TensorFlow Lite SelectTfOps)
    if target.name == 'Runner'
      target.build_configurations.each do |config|
        sdk = config.build_settings['SDKROOT']
        if sdk.nil? || !sdk.include?('simulator')
          config.build_settings['OTHER_LDFLAGS'] ||= ['$(inherited)']
          config.build_settings['OTHER_LDFLAGS'] << '-force_load'
          config.build_settings['OTHER_LDFLAGS'] << '$(PODS_ROOT)/TensorFlowLiteSelectTfOps/Frameworks/TensorFlowLiteSelectTfOps.xcframework/ios-arm64/TensorFlowLiteSelectTfOps.framework/TensorFlowLiteSelectTfOps'
        end
      end
    end
  end
end
```
**Android**

* If you want to use the GPU for model inference, add OpenCL support to `AndroidManifest.xml`. If you plan to use only the CPU, you can skip this step.

Add the following to `AndroidManifest.xml`, above the `</application>` tag:

```xml
<uses-native-library
    android:name="libOpenCL.so"
    android:required="false"/>
<uses-native-library android:name="libOpenCL-car.so" android:required="false"/>
<uses-native-library android:name="libOpenCL-pixel.so" android:required="false"/>
```

* **For release builds with ProGuard/R8 enabled**, the plugin automatically includes the necessary ProGuard rules. If you encounter `UnsatisfiedLinkError` or missing classes in release builds, ensure your `proguard-rules.pro` includes:

```proguard
# MediaPipe
-keep class com.google.mediapipe.** { *; }
-dontwarn com.google.mediapipe.**

# Protocol Buffers
-keep class com.google.protobuf.** { *; }
-dontwarn com.google.protobuf.**

# RAG functionality
-keep class com.google.ai.edge.localagents.** { *; }
-dontwarn com.google.ai.edge.localagents.**
```
**Web**

* **Authentication:** For gated models (Gemma 3 Nano, Gemma 3 1B/270M), you need to configure a HuggingFace token. See the [HuggingFace Authentication](#huggingface-authentication) section.
* The web platform currently supports only GPU-backend models; CPU-backend models are not yet supported by MediaPipe.
* **Multimodal support** (images) is fully supported on the web platform.
* **Model formats**: Use `.litertlm` files for optimal web compatibility (recommended for multimodal models).

* Add the dependencies to the `index.html` file in the `web` folder:
```html
<script type="module">
  import { FilesetResolver, LlmInference } from 'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai@0.10.25';
  window.FilesetResolver = FilesetResolver;
  window.LlmInference = LlmInference;
</script>
```
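The snippet above exposes the MediaPipe classes as `window.FilesetResolver` and `window.LlmInference`. If you want to confirm from Dart that the module script has finished loading before running inference, a minimal sketch using `dart:js_interop` could look like the following; the bindings and the nullable-`JSAny` handling are illustrative assumptions, not part of the plugin's API:

```dart
import 'dart:js_interop';

// Read back the globals assigned by the index.html snippet above.
// Explicit @JS names keep the Dart identifiers independent of the JS ones.
@JS('FilesetResolver')
external JSAny? get filesetResolver;

@JS('LlmInference')
external JSAny? get llmInference;

/// True once the @mediapipe/tasks-genai module script has loaded and both
/// globals are populated (module scripts load asynchronously, so this can
/// still be false during early app startup).
bool get mediaPipeGenAiLoaded => filesetResolver != null && llmInference != null;
```
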
## Quick Start
> **⚠️ Important:** Complete [platform setup](#setup) before running this code.

### 1. Install a Model (One Time)
@@ -551,107 +677,6 @@ await FlutterGemma.installModel(
**Important:** On web, `FileSource` only works with URLs or asset paths, not local file system paths.

(This hunk removes the old Setup section from its previous location after the FileSource note; the same content, extended with the embedding-model force_load step, now appears earlier in the README, immediately after the installation steps.)
## Migration from Legacy to Modern API 🔄

If you're upgrading from the Legacy API, here are common migration patterns:
@@ -1496,7 +1521,17 @@ await embeddingModel.close();
### Web Setup (Embeddings + VectorStore)

(Removed the old note "**For web platform, you need to build JavaScript modules:**"; it is superseded by the two options below.)

**Option 1: Use CDN (Recommended for most users)**

Add script tags to your `index.html`:
```html
<!-- Load from jsDelivr CDN (version 0.11.13) -->
<script src="https://cdn.jsdelivr.net/gh/DenisovAV/flutter_gemma@0.11.13/web/cache_api.js"></script>
<script type="module" src="https://cdn.jsdelivr.net/gh/DenisovAV/flutter_gemma@0.11.13/web/litert_embeddings.js"></script>
<script type="module" src="https://cdn.jsdelivr.net/gh/DenisovAV/flutter_gemma@0.11.13/web/sqlite_vector_store.js"></script>
```

**Option 2: Build locally (For development or customization)**

1. Navigate to the `web/rag` directory in the flutter_gemma package
2. Follow the detailed setup guide: [`web/rag/README.md`](web/rag/README.md)
@@ -1560,7 +1595,7 @@ final embeddingModel = await FlutterGemmaPlugin.instance.createEmbeddingModel(
- ✅ Each embedding model consists of both a model file (`.tflite`) and a tokenizer file (`.model`)
- ✅ Different sequence length options allow trade-offs between accuracy and performance
- ✅ Modern API provides separate progress tracking for model and tokenizer downloads
- ✅ **VectorStore (RAG) is available on ALL platforms**: Android/iOS use native SQLite, Web uses SQLite WASM (wa-sqlite + OPFS). This replaces the old note that VectorStore was available only on Android and iOS.
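To make the semantic search behind the VectorStore concrete, here is a small, self-contained Dart sketch of cosine similarity over embedding vectors. It is purely illustrative and independent of the plugin's actual VectorStore API, which handles storage and retrieval for you:

```dart
import 'dart:math' as math;

/// Cosine similarity between two embedding vectors: values near 1.0 mean the
/// texts that produced them are semantically close. This is the comparison a
/// vector store performs during retrieval.
double cosineSimilarity(List<double> a, List<double> b) {
  assert(a.length == b.length, 'Embeddings must have the same dimension');
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA == 0 || normB == 0) return 0.0;
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}

/// Indices of the [topK] stored embeddings most similar to [query].
List<int> topKMatches(List<double> query, List<List<double>> stored,
    {int topK = 3}) {
  final indices = List<int>.generate(stored.length, (i) => i);
  indices.sort((a, b) => cosineSimilarity(query, stored[b])
      .compareTo(cosineSimilarity(query, stored[a])));
  return indices.take(topK).toList();
}
```
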
### VectorStore Optimization (v0.11.7)

@@ -1769,6 +1804,7 @@ Function calling is currently supported by the following models:
| **Streaming Responses** | ✅ Full | ✅ Full | ✅ Full | Real-time generation |
| **LoRA Support** | ✅ Full | ✅ Full | ✅ Full | Fine-tuned weights |
| **Text Embeddings** | ✅ Full | ✅ Full | ✅ Full | EmbeddingGemma, Gecko |
| **VectorStore (RAG)** | ✅ SQLite | ✅ SQLite | ✅ SQLite WASM | Semantic search, RAG |
| **File Downloads** | ✅ Background | ✅ Background | ✅ In-memory | Platform-specific |
| **Asset Loading** | ✅ Full | ✅ Full | ✅ Full | All source types |
| **Bundled Resources** | ✅ Full | ✅ Full | ✅ Full | Native bundles |
@@ -1857,6 +1893,7 @@ await FlutterGemma.instance.modelManager.clearCache();
- **Memory entitlements:** Required for large models (see Setup section)
- **Linking:** Static linking required (`use_frameworks! :linkage => :static`)
- **Storage:** Local file system in app documents directory
- **Embedding models:** Require force_load for TensorFlowLiteSelectTfOps in Podfile (see Setup section)
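If local model storage becomes a problem (for example, a corrupted download or a full documents directory), the model cache can be cleared from Dart and the model re-installed. A minimal sketch built around the `clearCache` call shown in the hunk header above; the import path and the surrounding function are assumptions:

```dart
import 'package:flutter_gemma/flutter_gemma.dart';

/// Removes cached model files so the next install starts from a clean state.
Future<void> resetLocalModels() async {
  await FlutterGemma.instance.modelManager.clearCache();
}
```
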
You can find the full, complete example in the `example` folder.

@@ -1897,6 +1934,29 @@ The full and complete example you can find in `example` folder
- Clean and reinstall pods: `cd ios && pod install --repo-update`
- Check that all required entitlements are in `Runner.entitlements`

**iOS Embedding Models:**
For embedding models on iOS, you must add `-force_load` to your Podfile's `post_install` hook:

```ruby
post_install do |installer|
  installer.pods_project.targets.each do |target|
    flutter_additional_ios_build_settings(target)

    # Required for embedding models
    if target.name == 'Runner'
      target.build_configurations.each do |config|
        sdk = config.build_settings['SDKROOT']
        if sdk.nil? || !sdk.include?('simulator')
          config.build_settings['OTHER_LDFLAGS'] ||= ['$(inherited)']
          config.build_settings['OTHER_LDFLAGS'] << '-force_load'
          config.build_settings['OTHER_LDFLAGS'] << '$(PODS_ROOT)/TensorFlowLiteSelectTfOps/Frameworks/TensorFlowLiteSelectTfOps.xcframework/ios-arm64/TensorFlowLiteSelectTfOps.framework/TensorFlowLiteSelectTfOps'
        end
      end
    end
  end
end
```
## Advanced Usage

### ModelThinkingFilter (Advanced)

example/ios/Podfile

Lines changed: 13 additions & 0 deletions
@@ -39,6 +39,19 @@ end
post_install do |installer|
  installer.pods_project.targets.each do |target|
    flutter_additional_ios_build_settings(target)

    # Force load TensorFlowLiteSelectTfOps for embedding models
    if target.name == 'Runner'
      target.build_configurations.each do |config|
        # Only apply to device builds, not simulator
        sdk = config.build_settings['SDKROOT']
        if sdk.nil? || !sdk.include?('simulator')
          config.build_settings['OTHER_LDFLAGS'] ||= ['$(inherited)']
          config.build_settings['OTHER_LDFLAGS'] << '-force_load'
          config.build_settings['OTHER_LDFLAGS'] << '$(PODS_ROOT)/TensorFlowLiteSelectTfOps/Frameworks/TensorFlowLiteSelectTfOps.xcframework/ios-arm64/TensorFlowLiteSelectTfOps.framework/TensorFlowLiteSelectTfOps'
        end
      end
    end
  end
end

example/ios/Podfile.lock

Lines changed: 1 addition & 1 deletion
@@ -79,7 +79,7 @@ EXTERNAL SOURCES:
SPEC CHECKSUMS:
  background_downloader: 50e91d979067b82081aba359d7d916b3ba5fadad
  Flutter: cabc95a1d2626b1b06e7179b784ebcf0c0cde467
- flutter_gemma: 04b8b00e853fbf7fdb973c7506dfdd704092b249
+ flutter_gemma: dc4a0a5e6bdba4cf05655c08f0fefd552022968a
  image_picker_ios: 7fe1ff8e34c1790d6fff70a32484959f563a928a
  integration_test: 4a889634ef21a45d28d50d622cf412dc6d9f586e
  large_file_handler: b37481e9b4972562ffcdc8f75700f47cd592bcec

example/lib/vector_store_test_screen.dart

Lines changed: 1 addition & 1 deletion
@@ -739,7 +739,7 @@ class _VectorStoreTestScreenState extends State<VectorStoreTestScreen> {
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
-       title: const Text('VectorStore Tests (v0.11.10)'),
+       title: const Text('VectorStore Tests'),
        backgroundColor: Colors.deepPurple,
      ),
      body: Column(

example/web/index.html

Lines changed: 4 additions & 4 deletions
@@ -45,13 +45,13 @@
    </script>

    <!-- Cache API for persistent storage -->
-   <script src="https://cdn.jsdelivr.net/gh/DenisovAV/flutter_gemma@v0.11.12_2/web/cache_api.js"></script>
+   <script src="https://cdn.jsdelivr.net/gh/DenisovAV/flutter_gemma@0.11.13/web/cache_api.js"></script>

    <!-- LiteRT.js Embeddings (bundled with Vite) -->
-   <script type="module" src="https://cdn.jsdelivr.net/gh/DenisovAV/flutter_gemma@v0.11.12_2/web/litert_embeddings.js"></script>
+   <script type="module" src="https://cdn.jsdelivr.net/gh/DenisovAV/flutter_gemma@0.11.13/web/litert_embeddings.js"></script>

-   <!-- SQLite WASM VectorStore (local for testing) -->
-   <script type="module" src="sqlite_vector_store.js"></script>
+   <!-- SQLite WASM VectorStore -->
+   <script type="module" src="https://cdn.jsdelivr.net/gh/DenisovAV/flutter_gemma@0.11.13/web/sqlite_vector_store.js"></script>
  </head>
  <body>
    <script src="flutter_bootstrap.js" async></script>
