You could use a custom extension or even just your own dict for this. Something like:

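abbreviations = {"nyc": "New York City"}  # add more entries as needed

out = []
for tok in doc:  # doc is a spaCy Doc, e.g. the result of nlp(text)
    # Use the expansion if the token is a known abbreviation, else keep the text
    out.append(abbreviations.get(tok.text, tok.text))
    # Keep the whitespace that followed the token in the original text
    out.append(tok.whitespace_)
print("".join(out))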

If you use a custom extension, you could set an underscore property, like tok._.expanded.
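
A minimal sketch of the extension approach, assuming a blank English pipeline and a one-entry lookup table. The names `ABBREVIATIONS`, `get_expanded`, and `expand` are made up for illustration; `Token.set_extension` with a `getter` is spaCy's standard way to register a computed underscore attribute.

```python
import spacy
from spacy.tokens import Token

# Hypothetical lookup table; extend with your own abbreviations.
ABBREVIATIONS = {"nyc": "New York City"}


def get_expanded(tok):
    """Return the expansion for a known abbreviation, else the token text."""
    return ABBREVIATIONS.get(tok.text, tok.text)


# Register the getter as tok._.expanded.
# force=True lets this script be re-run without a "already exists" error.
Token.set_extension("expanded", getter=get_expanded, force=True)


def expand(doc):
    """Rebuild the text with expansions, keeping the original whitespace."""
    return "".join(tok._.expanded + tok.whitespace_ for tok in doc)


nlp = spacy.blank("en")  # tokenizer-only pipeline, no trained model needed
doc = nlp("I moved to nyc last year.")
print(expand(doc))
```

Note that the lookup here is case-sensitive, so "NYC" would not match the "nyc" key. Keying the dict on lowercase forms and looking up `tok.lower_` instead of `tok.text` is one way around that.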

Answer selected by polm