@thariq-nugrohotomo thariq-nugrohotomo commented Oct 30, 2023

Using the pretrained model, when I pass `cls` or `bos` as the initial decoder token, the first decoded token is rarely correct. But when I use `eos`, the output is correct, or at least matches the output returned by `model.generate()`.

In the official code from Microsoft, they fall back to `eos` when the start token is not specified: https://github.com/microsoft/unilm/blob/6f60612e7cc86a2a1ae85c47231507a587ab4e01/trocr/generator.py#L84

Code excerpt to manually inspect the first decoded token:

import torch

# Try processor.tokenizer.bos_token_id here to compare.
decoder_start_token_id = processor.tokenizer.eos_token_id
outputs = model(pixel_values, torch.tensor([[decoder_start_token_id]]))
pred_ids = torch.argmax(outputs.logits, -1)
print(processor.tokenizer.batch_decode(pred_ids))

Switch `eos_token_id` to `bos_token_id` and observe how the output changes.
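The fallback referenced above can be sketched in isolation. This is a minimal illustration of the behaviour in unilm's `generator.py`, not the actual code; `pick_decoder_start_token_id` and `DummyTokenizer` are hypothetical names used only for this sketch.

```python
class DummyTokenizer:
    """Stand-in for a real tokenizer; only the token ids matter here."""
    bos_token_id = 0
    eos_token_id = 2

def pick_decoder_start_token_id(tokenizer, configured_id=None):
    # Mirror the unilm generator's behaviour: when no decoder start
    # token is explicitly configured, fall back to eos rather than bos.
    if configured_id is not None:
        return configured_id
    return tokenizer.eos_token_id

tok = DummyTokenizer()
print(pick_decoder_start_token_id(tok))      # no token configured -> eos (2)
print(pick_decoder_start_token_id(tok, 0))   # explicit bos is respected -> 0
```

This matches what the experiment above suggests: the pretrained checkpoint was trained with `eos` as the decoder start token, so seeding decoding with `bos` or `cls` puts the model out of distribution for the first step.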

@thariq-nugrohotomo thariq-nugrohotomo marked this pull request as ready for review October 30, 2023 09:46