Word and abbreviation disambiguation #4
Open
Description
Hi!
Some Russian words, when they appear at the end of a sentence, look exactly like an abbreviation followed by a period. For example:
...муж. (муж, "husband", vs. муж., the abbreviation for "male")
...жен. (жён, "wives", vs. жен., the abbreviation for "female")
In such cases, an expression like this:
nltk.sent_tokenize('А во-вторых, то, что твоя жена решила, что ей не нравится спать с мужчинами, не означает, что ты плохой муж. Хотя я бы этим особо не хвастался.', language='russian')
gives an incorrect result. I am sure Russian has many more ambiguous cases like this, beyond the two mentioned above.
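One possible way to think about the ambiguity is sketched below, using only the standard library. This is not how NLTK's tokenizer works and not a proposed patch. The word lists, the `split_sentences` name, and the capitalization heuristic are all assumptions made for illustration: a token that can be either a word or an abbreviation is treated as a sentence end only when the next token starts with an uppercase letter.

```python
import re

# Hypothetical word lists for illustration only; not taken from NLTK's data.
AMBIGUOUS = {"муж", "жен"}           # may be a full word or an abbreviation
ABBREVIATIONS = {"г", "ул", "им"}    # assumed to always be abbreviations

# A word, a terminator, whitespace, then a peek at the next character.
_BOUNDARY = re.compile(r"(\w+)([.!?])\s+(?=(\S))")

def split_sentences(text):
    """Split text into sentences with a simple capitalization heuristic."""
    sentences, start = [], 0
    for m in _BOUNDARY.finditer(text):
        word, mark, nxt = m.group(1).lower(), m.group(2), m.group(3)
        if mark == ".":
            if word in ABBREVIATIONS:
                continue  # never split after a known abbreviation
            if word in AMBIGUOUS and not nxt.isupper():
                continue  # lowercase continuation: read it as an abbreviation
        sentences.append(text[start:m.end(2)])
        start = m.end()
    if start < len(text):
        sentences.append(text[start:])
    return sentences

example = ("А во-вторых, то, что твоя жена решила, что ей не нравится спать "
           "с мужчинами, не означает, что ты плохой муж. "
           "Хотя я бы этим особо не хвастался.")
parts = split_sentences(example)
```

With the example from this issue, `parts` contains two sentences, the first ending in "муж.". The heuristic is deliberately naive: it would wrongly split when an abbreviation is followed by a proper noun, so real disambiguation would need more context than capitalization alone.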