Questions about text_generator_main.cc #735

@liamsun2019

Hi authors,
I have several questions about text_generator_main.cc.

  1. The calculation of kv_cache_max_size
    int kv_cache_max_size = kv_cache_k_0->dims->data[1];
    For gemma3-1b, kv_cache_k_0->dims->data[1] is always 1, so the subsequent computation of decode_steps:
    int decode_steps =
        std::min(max_decode_steps, kv_cache_max_size - prefill_seq_size);
    yields a negative value, and MINIMAL_CHECK(decode_steps > 0) fails.

  2. As a workaround, I simply set kv_cache_max_size = kv_cache_k_0->dims->data[2] to get past the decode_steps check. Then I ran the following command:
    ./text_generator_main --tflite_model=gemma3-1b_q8_ekv1280.tflite --sentencepiece_model=tokenizer.model --prompt="What is Tensorflow?" --max_decode_steps=256 --start_token="<bos>" --stop_token="<eos>" --num_threads=2
    Here, gemma3-1b_q8_ekv1280.tflite was generated by ai_edge_torch and quantized with "full_int8_dynamic_recipe". The results look normal.
    As a comparison, I also tested gemma3-1B-it-int4.tflite which was downloaded from https://www.kaggle.com/models/google/gemma-3/tfLite
    The command was similar:
    ./text_generator_main --tflite_model=gemma3-1B-it-int4.tflite --sentencepiece_model=tokenizer.model --prompt="What is Tensorflow?" --max_decode_steps=256 --start_token="<bos>" --stop_token="<eos>" --num_threads=2
    The results looked abnormal:
    100010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010010
    One thing I must point out: the original Bazel build of text_generator_main does not support int4 quantization because of the TensorFlow version it uses. I therefore built text_generator_main against TensorFlow 2.20.0 with CMake.
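
To make question 1 concrete, here is a minimal sketch of the decode_steps calculation. It is not the actual code in text_generator_main.cc: the helper name ComputeDecodeSteps and its seq_axis parameter are hypothetical, and the cache shape {1, 1, 1280, 256} and prefill size 128 used in the note below are assumed values for illustration. The only thing observed is that dims->data[1] is 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the number of decode steps that fit in the
// KV cache. The result is <= 0 when the cache cannot hold the prefill tokens.
// `seq_axis` is the index of the sequence-length dimension in the KV cache
// tensor shape; which axis that is depends on how the model was exported.
int ComputeDecodeSteps(const std::vector<int>& kv_cache_dims,
                       std::size_t seq_axis, int prefill_seq_size,
                       int max_decode_steps) {
  assert(seq_axis < kv_cache_dims.size());
  const int kv_cache_max_size = kv_cache_dims[seq_axis];
  return std::min(max_decode_steps, kv_cache_max_size - prefill_seq_size);
}
```

With the assumed shape {1, 1, 1280, 256}, a prefill size of 128 and max_decode_steps of 256, reading axis 1 gives std::min(256, 1 - 128) = -127, which trips the check, while reading axis 2 gives std::min(256, 1280 - 128) = 256. Whether axis 2 is really the sequence dimension for this model is exactly what I am unsure about.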

Any suggestions are appreciated. Big thanks.
