Can't create char_span if length of char is 1 #10937

@Artem-Haholkin-deepsee

Description

When I create a Span from a Doc with char_span, I want a span that is one character long, but the call returns None instead of a Span.

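    import spacy

    # Load the transformer pipeline with the components not needed for NER disabled.
    nlp = spacy.load(
        "en_core_web_trf",
        disable=[
            "tok2vec",
            "tagger",
            "morphologizer",
            "parser",
            "attribute_ruler",
            "lemmatizer",
        ],
    )

    # Add one single-token, case-insensitive pattern per entity string.
    # list_of_ents, text, start and end are defined elsewhere.
    ruler = nlp.add_pipe("entity_ruler", before="ner")
    ruler.add_patterns([
        {"label": "pattern", "pattern": [{"LOWER": c}]}
        for c in list_of_ents
    ])

    doc = nlp(text)
    # start and end are character offsets into text.
    span = doc.char_span(start, end)  # returns None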

As I understand it, something is wrong with how tokens and characters are handled.
When I do doc[start:end], it throws a "not enough tokens to unpack" error, but I expected the indices to refer to characters, not tokens.
According to the documentation, indexing a Doc returns tokens, but I think it should work differently for char_span.
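
For anyone reading along: as far as I know, doc.char_span(start_idx, end_idx, alignment_mode="strict") returns None when the character offsets do not coincide with token boundaries, while doc[start:end] slices by token index. The sketch below is not spaCy code. It is a simplified pure-Python model of that alignment rule, assuming whitespace tokenization; token_offsets and char_span here are hypothetical stand-ins written only for illustration.

```python
import re


def token_offsets(text):
    """Return (start, end) character offsets of whitespace-separated tokens."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def char_span(text, start, end, alignment_mode="strict"):
    """Toy stand-in for Doc.char_span: return the covered text, or None."""
    offsets = token_offsets(text)
    if alignment_mode == "strict":
        # Both offsets must sit exactly on token boundaries.
        starts = {s for s, _ in offsets}
        ends = {e for _, e in offsets}
        if start < end and start in starts and end in ends:
            return text[start:end]
        return None
    if alignment_mode == "contract":
        # Keep only tokens that lie completely inside [start, end).
        covered = [(s, e) for s, e in offsets if s >= start and e <= end]
    elif alignment_mode == "expand":
        # Keep every token that overlaps [start, end) at least partially.
        covered = [(s, e) for s, e in offsets if s < end and e > start]
    else:
        raise ValueError(f"unknown alignment_mode: {alignment_mode!r}")
    if not covered:
        return None
    return text[covered[0][0]:covered[-1][1]]


text = "I like a cat"
print(char_span(text, 7, 8))            # the one-character token "a"
print(char_span(text, 3, 4))            # inside "like", so strict mode gives None
print(char_span(text, 3, 4, "expand"))  # widened to the whole token "like"
```

In this model a one-character span works when the character is a token of its own, and fails in strict mode when it falls inside a longer token. If the real offsets point inside a token, passing alignment_mode="expand" or alignment_mode="contract" to doc.char_span may be what is needed.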

Environment

OS: macOS Monterey 12.4
Python: Anaconda 3.8.12
spaCy: 3.0.6
