Token.get_struct_attr and Lexeme.get_struct_attr (and basically all the Cython methods) work internally with string store hashes rather than strings, so getting an integer back is expected: the return type is attr_t, which is uint64_t.

The underlying problem above is that LEMMA is only a Token attribute, not a Lexeme attribute. The only attribute that's stored on both underneath is NORM. Token.get_struct_attr backs off to Lexeme.get_struct_attr for any attribute it doesn't handle itself, and Lexeme.get_struct_attr in turn returns 0 for any attribute it doesn't know, so asking a Lexeme for LEMMA gives 0.
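To make the back-off easier to follow, here is a rough pure-Python sketch of the lookup chain described above. It is a toy model, not spaCy's Cython code: the classes ToyStringStore, ToyLexeme and ToyToken, the attribute IDs, and the way IDs are assigned to strings are all made up for illustration. Only the behaviour it mimics (integers instead of strings, Token falling back to Lexeme, Lexeme returning 0 for unknown attributes) comes from the explanation above.

```python
# Toy model only: hypothetical names and IDs, not spaCy's implementation.
from dataclasses import dataclass

# Hypothetical attribute IDs (spaCy's real IDs differ).
NORM, LEMMA, LOWER = 1, 2, 3


class ToyStringStore:
    """Maps strings to integer IDs and back; 0 stands for 'no value'."""

    def __init__(self):
        self._ids = {"": 0}
        self._strings = {0: ""}

    def add(self, text):
        # Assign the next free integer the first time a string is seen.
        if text not in self._ids:
            new_id = len(self._ids)
            self._ids[text] = new_id
            self._strings[new_id] = text
        return self._ids[text]

    def __getitem__(self, key):
        return self._strings[key]


@dataclass
class ToyLexeme:
    norm: int
    lower: int  # note: no lemma field on the lexeme


@dataclass
class ToyToken:
    lex: ToyLexeme
    lemma: int
    norm: int


def lexeme_get_struct_attr(lex, attr_id):
    if attr_id == NORM:
        return lex.norm
    if attr_id == LOWER:
        return lex.lower
    return 0  # unknown attribute (including LEMMA): silently 0


def token_get_struct_attr(tok, attr_id):
    if attr_id == LEMMA:
        return tok.lemma
    if attr_id == NORM:
        return tok.norm
    # Anything the token does not store itself is delegated to the lexeme.
    return lexeme_get_struct_attr(tok.lex, attr_id)


strings = ToyStringStore()
lex = ToyLexeme(norm=strings.add("running"), lower=strings.add("running"))
tok = ToyToken(lex=lex, lemma=strings.add("run"), norm=lex.norm)

lemma_id = token_get_struct_attr(tok, LEMMA)      # an integer, not "run"
lemma_text = strings[lemma_id]                    # resolve it via the store
lemma_on_lexeme = lexeme_get_struct_attr(lex, LEMMA)  # falls through to 0
```

From Python, the usual way to get the string in spaCy itself is token.lemma_, or looking up the integer token.lemma in nlp.vocab.strings.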

Answer selected by adrianeboyd
Labels
feat / doc Feature: Doc, Span and Token objects
2 participants