Hi @ghinch,

There is no built-in attribute for sentences. However, the key point is that doc.sents yields Span objects. That is, you can use the Span for each sentence if you need to store the information within a spaCy object. A similar issue was answered here by one of spaCy's maintainers.

This documentation explains how to add custom extension attributes. Here is an implementation that uses a custom Span extension attribute:

import spacy
from spacy.tokens import Span

nlp = spacy.load("en_core_web_sm")

Span.set_extension("unique_id", default=-1)
doc = nlp("This is sentence one. This is sentence two.")

for sent_i, sent in enumerate(doc.sents):
    sent._.unique_id = se…
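
For reference, here is a minimal self-contained sketch of the same idea, under a few assumptions of mine: it uses a blank English pipeline with the rule-based sentencizer in place of en_core_web_sm (so no model download is needed), and it uses each sentence's position in the doc as the ID value, which is only one possible choice. As far as I know, custom Span attributes are stored on the parent Doc rather than on the Span object itself, so the values should still be visible when iterating over doc.sents again.

```python
import spacy
from spacy.tokens import Span

# Assumption: a blank pipeline plus the rule-based sentencizer stands in for
# en_core_web_sm, so this sketch runs without a downloaded model.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# Register the custom attribute once; force=True avoids an error on re-runs.
Span.set_extension("unique_id", default=-1, force=True)

doc = nlp("This is sentence one. This is sentence two.")

# Illustrative choice: use the sentence's position in the doc as its ID.
for sent_i, sent in enumerate(doc.sents):
    sent._.unique_id = sent_i

# Iterate over doc.sents a second time and read the stored values back.
ids = [(sent.text, sent._.unique_id) for sent in doc.sents]
print(ids)
```

If you need IDs that are unique across many documents, you could assign something like a running counter or a UUID string instead of the per-doc index.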

Answer selected by ghinch