This repository was archived by the owner on Oct 31, 2023. It is now read-only.

Error in Related Work section of paper #4

@texttheater

Since Google Scholar alerted me to the citation and I was curious, I checked this preprint of the Deep-EOS paper. It says there:

Further high-performers such as Elephant (Evang et al., 2013) or Cutter (Graën et al., 2018) follow a sequence labeling approach. However, they require a prior language-dependent tokenization of the input text.

I read this as saying that the input to Elephant is already-tokenized text, on which sentence boundary detection is then performed. That is not true: Elephant performs tokenization and sentence boundary detection jointly. It is true that this joint scenario requires tokenized training data. However, Elephant could also be trained and used on data that is only sentence-segmented and not tokenized.
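
To illustrate what "jointly" means here, the following is a minimal sketch, not Elephant's actual code. It assumes a per-character labeling scheme with four labels: `S` (first character of a sentence-initial token), `T` (first character of any other token), `I` (inside a token) and `O` (outside any token). The function name `decode` and the example labels are hypothetical and written by hand; in a real system the labels would come from a trained sequence labeler.

```python
def decode(text, labels):
    """Turn one label per character into a list of sentences, each a list of tokens."""
    if len(text) != len(labels):
        raise ValueError("need exactly one label per character")
    sentences = []
    for ch, label in zip(text, labels):
        if label == "S":
            # A sentence boundary and a token boundary at the same position.
            sentences.append([ch])
        elif label == "T":
            # A token boundary inside the current sentence.
            if not sentences:
                sentences.append([])
            sentences[-1].append(ch)
        elif label == "I":
            # Continue the current token.
            if not sentences or not sentences[-1]:
                raise ValueError("'I' label without a preceding token start")
            sentences[-1][-1] += ch
        elif label != "O":
            raise ValueError(f"unknown label: {label!r}")
        # 'O' characters (e.g. whitespace) belong to no token and are skipped.
    return sentences


# Raw, untokenized input text with hand-written labels for illustration.
print(decode("Hi there. Bye.", "SIOTIIIITOSIIT"))
```

Both kinds of boundary are read off the same label sequence over the raw characters, so no tokenization step has to run before sentence boundary detection.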
