Evaluate a trained NER pipeline on a new dataset (label mapping) #10094
Hello everyone,

Currently, I do:

```python
nlp = get_nlp_model(config.language)     # load the right model
docs = read(config.path, config.format)  # read the dataset as List[Doc]

texts = [doc.text for doc in docs]
predictions = list(nlp.pipe(texts))      # predict based on the raw text

# Map the labels to a common name space by replacing doc.ents with a list of
# spans whose .label_ is mapped to the new label.
remapped_predictions = [
    map_document(doc, config.backward_label_mapping) for doc in predictions
]
remapped_docs = [
    map_document(doc, config.forward_label_mapping) for doc in docs
]

# Create examples: Example(predicted, reference)
examples = [
    Example(pred, ref) for pred, ref in zip(remapped_predictions, remapped_docs)
]

spacy_report = language.evaluate(examples)
```

Is this the correct way of doing this? I have the issue that I get an error when running it.
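For context, the two mapping dicts come from my config and map both label schemes into a common one. The label names below are made up purely for illustration:

```python
# Hypothetical labels, for illustration only -- the real mappings live in my config.
# Model's label scheme -> common scheme (applied to the predictions):
backward_label_mapping = {"PER": "PERSON", "LOC": "GPE"}
# Dataset's label scheme -> common scheme (applied to the reference docs):
forward_label_mapping = {"person": "PERSON", "location": "GPE"}
```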
More context: the helper functions used above.

```python
from typing import Dict, List

from spacy.tokens import Doc, Span


def map_span_labels(
    spans: List[Span], label_mapping: Dict[str, str]
) -> List[Span]:
    """Re-maps the labels of the spans according to the label_mapping.

    Args:
        spans (List[Span]): spans to re-map
        label_mapping (Dict[str, str]): mapping from the old labels to the new labels

    Returns:
        List[Span]: new spans with the re-mapped labels
    """
    new_spans = []
    for span in spans:
        # Build a new Span on the same doc; raises KeyError if a label
        # is missing from the mapping.
        new_span = Span(span.doc, span.start, span.end, label_mapping[span.label_])
        new_spans.append(new_span)
    return new_spans


def map_document(doc: Doc, label_mapping: Dict[str, str]) -> Doc:
    """Changes the label of each entity span in the document based on the label_mapping.

    Args:
        doc (Doc): document whose entities should be re-labelled
        label_mapping (Dict[str, str]): mapping from the old labels to the new labels

    Returns:
        Doc: same document object, but with the labels changed
    """
    mapped_ents = map_span_labels(doc.ents, label_mapping)
    doc.ents = mapped_ents
    return doc
```
Hello,

Unfortunately, your code doesn't show how the `Language` object is created, and that is likely the cause of the error: you load the pipeline into `nlp`, but then call `language.evaluate`.

You can try running the scorer directly on the list of examples instead of calling `language.evaluate`, because `evaluate` re-runs the whole pipeline on the data before scoring. An example of calling the scorer is here: #10056 (comment)
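For reference, a minimal sketch of scoring pre-computed examples directly with spaCy's `Scorer` (assuming `nlp` is your loaded pipeline and `examples` is the list you built above):

```python
from spacy.scorer import Scorer

# Score the already-predicted examples without re-running the pipeline.
scorer = Scorer(nlp)
scores = scorer.score(examples)

# NER metrics appear under ents_p / ents_r / ents_f,
# with per-label breakdowns under ents_per_type.
print(scores["ents_f"], scores["ents_per_type"])
```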