Collatinus OSX hangs when tagging certain abbreviations #51

@bnagy

[copy also sent via email]

Hello,

I apologise for writing to you in English, but my French grammar is
horrible. Nevertheless, I manage fairly well at reading, so don't
hesitate to reply in French if you like :)

I have encountered some bugs while using Collatinus for OSX 11.1 full.
I have been using the TCP server and the statistical tagger through a
custom Python wrapper. Overall, it works very well, and I have tagged
~1.8 million sentences. However, certain words send the server into
what looks like an infinite loop (100% CPU utilisation, and it no
longer responds correctly to further tagging requests).

Based on experimentation, I think the main issues are with
abbreviations. Here is the list of words I have discovered so far:

Cn, Sex, Post, Pro, Cap, Ser, Oct, Ap, Kal, Tib, St, Pl

You should be able to replicate the issue by sending a remote tag
request with the client, e.g.:

/Applications/Collatinus_11.1.app/Contents/MacOS/Client_C11 -P3 "Ap"
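
For anyone reproducing this from a script, a minimal Python sketch along these lines may help. It wraps the same client call in a timeout so that a hang shows up as a result instead of blocking the caller. The client path and the -P3 flag are copied from the command above; the function names, the 10-second limit and the status strings are my own assumptions, not part of Collatinus. The probe stops at the first timeout because, as described above, the server no longer answers correctly once it is stuck.

```python
import subprocess

# Path and "-P3" flag are taken from the command in this report.
# Everything else (names, timeout, status strings) is illustrative.
CLIENT = "/Applications/Collatinus_11.1.app/Contents/MacOS/Client_C11"
SUSPECTS = ["Cn", "Sex", "Post", "Pro", "Cap", "Ser",
            "Oct", "Ap", "Kal", "Tib", "St", "Pl"]


def run_with_timeout(cmd, timeout_s=10.0):
    """Run cmd and return (status, stdout).

    status is "finished" if the process exited within timeout_s seconds,
    "timeout" if it had to be killed, "not_found" if the binary is missing.
    """
    try:
        done = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this.
        return "timeout", None
    except FileNotFoundError:
        return "not_found", None
    return "finished", done.stdout


def probe(words, client=CLIENT, timeout_s=10.0):
    """Send each word as its own tag request; stop at the first timeout."""
    results = []
    for word in words:
        status, _ = run_with_timeout([client, "-P3", word], timeout_s)
        results.append((word, status))
        if status == "timeout":
            # The server is presumably stuck now; later results would be noise.
            break
    return results
```

Calling probe(SUSPECTS) with the server running should return a list of (word, status) pairs ending at the first word that hangs. Note that the timeout only kills the client process; the server would presumably still need to be restarted before probing the next word.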

Please let me know if you would like any more information. I'd be
happy to test any updated builds on my dataset.

Thank you for the software!
