@Hemanth21k

Fixed the "substring not found" error by setting skip_special_tokens=True in the ans_tokenizer.decode call in _extract_answers.

@Hemanth21k Hemanth21k changed the title Update pipelines.py Fixed substring not found error Oct 11, 2021

@deangeckt deangeckt left a comment


agree, opened an issue about this as well (can close after this): #90

also i'd add, in _prepare_inputs_for_qg_from_answers_hl():
    if answer_text not in sent:
        continue
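
To illustrate the guard suggested above, here is a minimal sketch of why it helps. Python's str.index raises "ValueError: substring not found" when the answer is not in the sentence, so checking membership first avoids the crash. The function highlight_answers and its data layout are hypothetical, simplified stand-ins and not the actual code of _prepare_inputs_for_qg_from_answers_hl.

```python
def highlight_answers(sents, answers_per_sent):
    """Wrap each answer found in its sentence with <hl> markers.

    Hypothetical, simplified stand-in for the highlighting step;
    the real pipeline method may differ in details.
    """
    inputs = []
    for sent, answers in zip(sents, answers_per_sent):
        for answer_text in answers:
            answer_text = answer_text.strip()
            # Guard: str.index raises ValueError("substring not found")
            # when the answer does not occur in the sentence, so skip it.
            if not answer_text or answer_text not in sent:
                continue
            start = sent.index(answer_text)
            end = start + len(answer_text)
            highlighted = f"{sent[:start]}<hl> {answer_text} <hl>{sent[end:]}"
            inputs.append({"answer": answer_text, "source_text": highlighted})
    return inputs


sents = ["Python was created by Guido van Rossum."]
# The second answer carries a leftover "<pad>" prefix and is not in the sentence.
answers = [["Guido van Rossum", "<pad> Guido"]]
result = highlight_answers(sents, answers)
```

With the guard in place, the answer that still carries a stray token is skipped instead of aborting the whole batch.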

@iamdarkangel

@Hemanth21k Thanks a lot for this fix. I spent a lot of time debugging the issue but was unable to resolve it until I found this. Keep up the good work.


- dec = [self.ans_tokenizer.decode(ids, skip_special_tokens=False) for ids in outs]
+ dec = [self.ans_tokenizer.decode(ids, skip_special_tokens=True) for ids in outs]
  answers = [item.split('<sep>') for item in dec]


Sorry, I'm wondering if you set skip_special_tokens=True, will item.split('<sep>') still work? Will <sep> be skipped in decode?
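
One way to reason about this question is with a toy model of decoding. The sketch below is not the transformers implementation; the vocabulary, ids, and the helpers toy_decode and split_answers are all hypothetical. It assumes that <sep> was added to the tokenizer as a regular token rather than registered as a special token, in which case skip_special_tokens=True would drop only markers such as <pad> and </s> and leave <sep> in place. If <sep> were registered as a special token, it would be dropped too and the split would stop working, so this is worth checking against the actual tokenizer.

```python
# Toy vocabulary with hypothetical ids; not taken from any real tokenizer.
VOCAB = {0: "<pad>", 1: "</s>", 2: "<sep>", 3: "Paris", 4: "1889"}

# Assumption: only <pad> and </s> are special; <sep> is a regular added token.
SPECIAL_IDS = {0, 1}


def toy_decode(ids, skip_special_tokens):
    """Join tokens with spaces, optionally dropping ids marked as special."""
    tokens = [
        VOCAB[i]
        for i in ids
        if not (skip_special_tokens and i in SPECIAL_IDS)
    ]
    return " ".join(tokens)


def split_answers(decoded):
    """Split on <sep> and drop empty pieces (illustrative helper)."""
    return [part.strip() for part in decoded.split("<sep>") if part.strip()]


ids = [0, 3, 2, 4, 2, 1]
raw = toy_decode(ids, skip_special_tokens=False)
clean = toy_decode(ids, skip_special_tokens=True)

raw_answers = split_answers(raw)      # still contains "<pad>" and "</s>" debris
clean_answers = split_answers(clean)  # only the answer strings remain
```

Under that assumption, the split still works with skip_special_tokens=True, and the answers no longer carry a "<pad>" prefix that would fail the later substring lookup.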
