Makes the ONNX version of detectron2 the default model. This means users can use it without the pain of installing detectron2. I also cleaned up a few things.
README.md: 29 additions & 2 deletions
@@ -21,7 +21,7 @@ Run `pip install unstructured-inference`.
 ### Detectron2
 
-[Detectron2](https://github.com/facebookresearch/detectron2) is required for most inference tasks
+[Detectron2](https://github.com/facebookresearch/detectron2) is required for using models from the [layoutparser model zoo](#using-models-from-the-layoutparser-model-zoo)
 but is not automatically installed with this package.
 For MacOS and Linux, build from source with:
 ```shell
@@ -66,6 +66,33 @@ Once the model has detected the layout and OCR'd the document, the text extracted
 page of the sample document will be displayed.
 You can convert a given element to a `dict` by running the `.to_dict()` method.
 
+## Models
+
+The inference pipeline operates by finding text elements in a document page using a detection model, then extracting the contents of the elements using direct extraction (if available), OCR, and optionally table inference models.
+
+We offer several detection models including [Detectron2](https://github.com/facebookresearch/detectron2) and [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX).
+
+### Using a non-default model
+
+When doing inference, an alternate model can be used by passing the model object to the ingestion method via the `model` parameter. The `get_model` function can be used to construct one of our out-of-the-box models from a keyword, e.g.:
+
+```python
+from unstructured_inference.models.base import get_model
+from unstructured_inference.inference.layout import DocumentLayout
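The diff viewer collapses the body of this example. As a hedged sketch of how these imports are typically used — the `"yolox"` keyword, the file name, and the exact keyword argument are assumptions for illustration, not taken from this diff:

```python
from unstructured_inference.models.base import get_model
from unstructured_inference.inference.layout import DocumentLayout

# Construct one of the out-of-the-box models from a keyword
# ("yolox" is an assumed example; see get_model for the supported names).
model = get_model("yolox")

# Pass the model object to the ingestion method via the `model` parameter.
layout = DocumentLayout.from_file("sample-doc.pdf", model=model)

# Each detected element supports .to_dict(), as noted earlier in the README.
for element in layout.pages[0].elements:
    print(element.to_dict())
```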
+The `UnstructuredDetectronModel` class in `unstructured_inference.models.detectron2` uses the `faster_rcnn_R_50_FPN_3x` model pretrained on DocLayNet, but by using different construction parameters, any model in the `layoutparser` [model zoo](https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html) can be used. `UnstructuredDetectronModel` is a light wrapper around the `layoutparser` `Detectron2LayoutModel` object, and accepts the same arguments. See the [layoutparser documentation](https://layout-parser.readthedocs.io/en/latest/api_doc/models.html#layoutparser.models.Detectron2LayoutModel) for details.
+
+### Using your own model
+
+Any detection model can be used in the `unstructured_inference` pipeline by wrapping the model in the `UnstructuredObjectDetectionModel` class. To integrate with the `DocumentLayout` class, a subclass of `UnstructuredObjectDetectionModel` must have a `predict` method that accepts a `PIL.Image.Image` and returns a list of `LayoutElement`s, and an `initialize` method, which loads the model and prepares it for inference.
+
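A minimal, self-contained sketch of that contract, using stand-in types only — `LayoutElement` below is a simplified placeholder, and a real wrapper would subclass `UnstructuredObjectDetectionModel` rather than stand alone:

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class LayoutElement:
    """Stand-in for unstructured_inference's LayoutElement (fields illustrative)."""
    coordinates: tuple
    text: str = ""
    type: str = "Text"


class MyDetectionModel:
    """Shape of a custom model wrapper: per the text above, it needs an
    `initialize` method (load the model, prepare it for inference) and a
    `predict` method (PIL image in, list of LayoutElement out)."""

    def initialize(self, model_path: str) -> None:
        # A real implementation would load weights or an inference session here.
        self.model = ("loaded", model_path)

    def predict(self, image: Any) -> List[LayoutElement]:
        # A real implementation would run object detection on the image.
        return [LayoutElement(coordinates=(0, 0, 100, 20), type="Title")]
```

The `DocumentLayout` machinery only relies on these two methods, so any detection backend can sit behind `predict`.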
+## API
+
 To build the Docker container, run `make docker-build`. Note that Apple hardware with an M1 chip
 has trouble building `Detectron2` on Docker and for best results you should build it on Linux. To
 run the API locally, use `make start-app-local`. You can stop the API with `make stop-app-local`.
@@ -90,7 +117,7 @@ start the API with hot reloading. The API will run at `http://localhost:8000`.
 
 View the swagger documentation at `http://localhost:5000/docs`.