Does spaCy cover all the functionality that e.g. UiPath Document Understanding, Google Document AI & Microsoft Syntex have? #12880
Hi everyone, I'm completely new to this framework, but I love the idea of digitizing documents and extracting important data from text and from text in images (e.g. scanned documents). The framework is huge and I'm not entirely sure I'm in the right place for the task I want to automate. Basically the title is my question, but I can also give a use case: I have multiple unstructured/semi-structured documents that consist of free text (letters). Extracting data from them was possible with UiPath Document Understanding, but I want to know whether it would also be possible with spaCy. I appreciate any help! :)
Replies: 1 comment 1 reply
No, not 100%. spaCy is an NLP library that focuses just on text analysis. It doesn't include all the functionality for this kind of document processing out of the box, like representing the location of the text on a page or dealing with OCR errors.
Many of the tasks you mentioned could probably be implemented using spaCy, but you would be writing rules and/or training statistical models from scratch for these particular tasks.
Please see: https://spacy.io/usage/spacy-101
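To give a rough idea of what "writing rules" could look like, here is a minimal sketch using spaCy's rule-based `Matcher`. It assumes the letter text has already been extracted by a separate OCR tool (spaCy does not do that step), and the "invoice number" pattern, the `extract_invoice_numbers` helper and the sample sentence are made up for illustration, not taken from your documents.

```python
import spacy
from spacy.matcher import Matcher

# A blank English pipeline only tokenizes, so no trained model is required.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Hypothetical rule: the word "invoice", an optional "number",
# then a token made up of digits only.
matcher.add(
    "INVOICE_NUMBER",
    [[
        {"LOWER": "invoice"},
        {"LOWER": "number", "OP": "?"},
        {"IS_DIGIT": True},
    ]],
)


def extract_invoice_numbers(text):
    """Return the digit token of every match of the rule above."""
    doc = nlp(text)
    # Each match is (match_id, start, end); the last token of the span is the number.
    return [doc[end - 1].text for _, start, end in matcher(doc)]


letter = "Dear Ms. Smith, regarding invoice number 20391, we kindly ask for payment."
print(extract_invoice_numbers(letter))
```

Real letters would likely need more patterns, and noisy OCR output may break exact rules like this one, which is where training a statistical component might become worthwhile.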