3 files changed: +13 -4 lines

@@ -15,8 +15,9 @@ It provides 48 passages from the dataset for users to choose from.
![demo gif](media/distilbert_qa.gif "Demo running offline on a Samsung Galaxy S8")

> Available models:
- > * "original" converted DistilBERT (266MB)
- > * FP16 post-training-quantized DistilBERT (67MB)
+ > * "original" converted DistilBERT (254MB)
+ > * FP16 post-training-quantized DistilBERT (131MB)
+ > * "hybrid" (8-bits precision weights) post-training-quantized DistilBERT (64MB)

### Coming soon: GPT-2, quantization... and much more!
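(A rough sanity check on the updated size figures, assuming the files are dominated by weight storage rather than anything stated in the repository: FP16 stores weights in half the bytes of float32, so 254 MB / 2 ≈ 127 MB, close to the listed 131 MB; 8-bit weights take roughly a quarter, 254 MB / 4 ≈ 64 MB, matching the hybrid model.)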
@@ -81,6 +82,7 @@ To choose which model to use in the app:
```java
"https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-uncased-distilled-squad-384.tflite": "model.tflite", // <- "original" converted DistilBERT (default)
// "https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-uncased-distilled-squad-384-fp16.tflite": "model.tflite", // <- fp16 quantized version of DistilBERT
+ // "https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-uncased-distilled-squad-384-8bits.tflite": "model.tflite", // <- hybrid quantized version of DistilBERT
```

## Models generation

@@ -3,7 +3,8 @@ apply plugin: 'de.undercouch.download'
task downloadLiteModel {
    def downloadFiles = [
        'https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-uncased-distilled-squad-384.tflite': 'model.tflite',
-       // 'https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-uncased-distilled-squad-384-fp16.tflite': 'model.tflite', // FP16 version
+       // 'https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-uncased-distilled-squad-384-fp16.tflite': 'model.tflite', // FP16 quantization version
+       // 'https://s3.amazonaws.com/models.huggingface.co/bert/distilbert-base-uncased-distilled-squad-384-8bits.tflite': 'model.tflite', // hybrid quantization version
    ]
    downloadFiles.each { key, value ->
        download {

# For normal conversion:
converter.target_spec.supported_ops = [tf.lite.OpsSet.SELECT_TF_OPS]

- # For FP16 conversion:
+ # For conversion with FP16 quantization:
+ # converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
+ # converter.target_spec.supported_types = [tf.float16]
+ # converter.optimizations = [tf.lite.Optimize.DEFAULT]
+ # converter.experimental_new_converter = True
+
+ # For conversion with hybrid quantization:
# converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
# converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
# converter.experimental_new_converter = True
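The diff above only shows the converter flags that differ between the FP16 and hybrid variants. For orientation, below is a minimal end-to-end sketch of how those flags slot into a full TFLite conversion. The SavedModel path, the use of `from_saved_model`, and the output filename are illustrative assumptions, not taken from the repository's actual conversion script.

```python
import tensorflow as tf

# Assumption: a SavedModel export of DistilBERT distilled/fine-tuned on SQuAD
# (sequence length 384). The path is hypothetical, for illustration only.
converter = tf.lite.TFLiteConverter.from_saved_model("distilbert_squad_384_savedmodel")

# Fall back to select TensorFlow ops for layers without built-in TFLite kernels.
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,
    tf.lite.OpsSet.SELECT_TF_OPS,
]
converter.experimental_new_converter = True

# FP16 post-training quantization: weights are stored as float16.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

# For the "hybrid" (8-bit weight) variant, omit the float16 line above and use:
# converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]

tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```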