Is there a way to extract the original annotation from .spacy file? #10869
-
I am trying to create a function to evaluate some annotations of my original test data, so I would like to know whether it is possible to convert a .spacy file back to the annotation format. Example of the annotation format:
What I am able to do so far:
Afterward, I can access the entities text and label using:
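A minimal sketch of that access pattern, assuming the file is read with `DocBin` and a blank English pipeline. The helper name `read_entities`, the path `test.spacy`, and the sample sentence are hypothetical and only serve to make the example self-contained:

```python
import tempfile
from pathlib import Path

import spacy
from spacy.tokens import DocBin


def read_entities(path, vocab):
    """Load a .spacy file and list (text, label) pairs for each Doc's entities."""
    doc_bin = DocBin().from_disk(path)
    return [
        [(ent.text, ent.label_) for ent in doc.ents]
        for doc in doc_bin.get_docs(vocab)
    ]


# Build a tiny .spacy file so the sketch runs on its own (hypothetical data).
nlp = spacy.blank("en")
doc = nlp.make_doc("Steve Gates founded the company.")
doc.ents = [doc.char_span(0, 11, label="Name")]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "test.spacy"
    DocBin(docs=[doc]).to_disk(path)
    entities = read_entities(path, nlp.vocab)

print(entities)
```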
Doing that, I get this output: [('Steve Gates', 'Name')
Is there any optimized way of transforming the .spacy file back to its original annotation format?
I have one function that evaluates the training and data files, which uses spaCy's evaluate. The one that I am creating now is for simple outputs that the user wants to test (e.g., simple annotations like the one in the example above), in which the user adds the annotation and I am able to evaluate the output using the following function:
Perhaps using the ... Is there a way to get the original annotation using the generated .spacy file? Many thanks.
-
The original annotation format is not saved anywhere in the Doc, so in the general case, no, the original annotations can't be generated right away. Depending on the format there might be third-party tools to produce it from Docs, like spacy_conll. All the relevant information should be saved in the Doc though, so generating it usually isn't hard. Your input annotation format looks pretty simple, so you should just be able to create it from your Docs with short code like this.
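As a hedged sketch of what such a conversion could look like: the code below assumes the annotation format is the common `(text, {"entities": [(start_char, end_char, label)]})` tuple. That format is an assumption, and the function name `docs_to_annotations` and the sample sentence are hypothetical.

```python
import spacy
from spacy.tokens import DocBin


def docs_to_annotations(docs):
    """Rebuild (text, {"entities": [...]}) tuples from Doc objects.

    Assumes character-offset entity annotations; adjust to your own format.
    """
    annotations = []
    for doc in docs:
        entities = [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]
        annotations.append((doc.text, {"entities": entities}))
    return annotations


# Round-trip one Doc through DocBin in memory (hypothetical sample data).
nlp = spacy.blank("en")
doc = nlp.make_doc("Steve Gates founded the company.")
doc.ents = [doc.char_span(0, 11, label="Name")]

data = DocBin(docs=[doc]).to_bytes()
docs = list(DocBin().from_bytes(data).get_docs(nlp.vocab))

annotations = docs_to_annotations(docs)
print(annotations)
```

For a file on disk, the same function should work on `DocBin().from_disk(path).get_docs(nlp.vocab)`, where `path` points at your own .spacy file.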
I don't really understand how this relates to the ...
-
Many thanks, I will test the code that you wrote. The objective of doing something like that is to get one or many of my original documents in the ... Then you might ask why I don't use the original test dataset and simply run my conversion functions again. That would be a good question as well; however, I really wanted to know whether there was a way to convert back to the annotation format. 😅 In the end, I guess my whole workaround would be the same as getting a specific number of documents from my ...
Perhaps this would be the better way in the end... Well, I will try both ways and find out which is better. Once again, many thanks.