
Commit 5a8237f

enhancement: make detectron2_onnx default (#108)
Makes the ONNX version of detectron2 the default model. This means users can use it without the pain of installing detectron2. I also cleaned up a few things.
1 parent 1c9d9c7 commit 5a8237f
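In practice, the new default means layout detection runs without a Detectron2 source build. A minimal usage sketch, assuming the sample document shipped in the repo's `sample-docs` directory and an installed `onnxruntime`:

```python
from unstructured_inference.inference.layout import DocumentLayout

# With "detectron2_onnx" as the default model, no `model` argument (and no
# detectron2 source build) is needed; the ONNX weights are fetched on first use.
layout = DocumentLayout.from_file("sample-docs/layout-parser-paper.pdf")

# The layout renders as the extracted text (this is how the tests inspect it).
print(layout)
```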

12 files changed (+78, -20 lines)

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion

@@ -34,7 +34,7 @@ jobs:
 python${{ env.PYTHON_VERSION }} -m venv .venv
 source .venv/bin/activate
 make install-ci
-
+
 lint:
 runs-on: ubuntu-latest
 needs: setup

CHANGELOG.md

Lines changed: 2 additions & 2 deletions

@@ -1,7 +1,7 @@
-## 0.4.5-dev1
+## 0.4.5
 
 * Preserve image format in PIL.Image.Image when loading
-* Added ONNX version of Detectron2
+* Added ONNX version of Detectron2 and made it the default model
 
 ## 0.4.4

README.md

Lines changed: 29 additions & 2 deletions

@@ -21,7 +21,7 @@ Run `pip install unstructured-inference`.
 
 ### Detectron2
 
-[Detectron2](https://github.com/facebookresearch/detectron2) is required for most inference tasks
+[Detectron2](https://github.com/facebookresearch/detectron2) is required for using models from the [layoutparser model zoo](#using-models-from-the-layoutparser-model-zoo)
 but is not automatically installed with this package.
 For MacOS and Linux, build from source with:
 ```shell
@@ -66,6 +66,33 @@ Once the model has detected the layout and OCR'd the document, the text extracte
 page of the sample document will be displayed.
 You can convert a given element to a `dict` by running the `.to_dict()` method.
 
+## Models
+
+The inference pipeline operates by finding text elements in a document page using a detection model, then extracting the contents of the elements using direct extraction (if available), OCR, and optionally table inference models.
+
+We offer several detection models, including [Detectron2](https://github.com/facebookresearch/detectron2) and [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX).
+
+### Using a non-default model
+
+When doing inference, an alternate model can be used by passing the model object to the ingestion method via the `model` parameter. The `get_model` function can be used to construct one of our out-of-the-box models from a keyword, e.g.:
+```python
+from unstructured_inference.models.base import get_model
+from unstructured_inference.inference.layout import DocumentLayout
+
+model = get_model("yolox")
+layout = DocumentLayout.from_file("sample-docs/layout-parser-paper.pdf", model=model)
+```
+
+### Using models from the layoutparser model zoo
+
+The `UnstructuredDetectronModel` class in `unstructured_inference.models.detectron2` uses the `faster_rcnn_R_50_FPN_3x` model pretrained on DocLayNet, but by using different construction parameters, any model in the `layoutparser` [model zoo](https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html) can be used. `UnstructuredDetectronModel` is a light wrapper around the `layoutparser` `Detectron2LayoutModel` object, and accepts the same arguments. See the [layoutparser documentation](https://layout-parser.readthedocs.io/en/latest/api_doc/models.html#layoutparser.models.Detectron2LayoutModel) for details.
+
+### Using your own model
+
+Any detection model can be used in the `unstructured_inference` pipeline by wrapping the model in the `UnstructuredObjectDetectionModel` class. To integrate with the `DocumentLayout` class, a subclass of `UnstructuredObjectDetectionModel` must have a `predict` method that accepts a `PIL.Image.Image` and returns a list of `LayoutElement`s, and an `initialize` method, which loads the model and prepares it for inference.
+
+## API
+
 To build the Docker container, run `make docker-build`. Note that Apple hardware with an M1 chip
 has trouble building `Detectron2` on Docker and for best results you should build it on Linux. To
 run the API locally, use `make start-app-local`. You can stop the API with `make stop-app-local`.
@@ -90,7 +117,7 @@ start the API with hot reloading. The API will run at `http://localhost:8000`.
 
 View the swagger documentation at `http://localhost:5000/docs`.
 
-## YoloX model
+### YoloX model
 
 For using the YoloX model the endpoints are:
 ```
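The "Using your own model" section added to the README above only states the required interface; the following is a minimal sketch of such a wrapper. The placeholder detector and the `LayoutElement` constructor arguments shown are assumptions for illustration, not part of the repo:

```python
from typing import List

from PIL.Image import Image

from unstructured_inference.inference.layoutelement import LayoutElement
from unstructured_inference.models.unstructuredmodel import UnstructuredObjectDetectionModel


class MyDetectionModel(UnstructuredObjectDetectionModel):
    """Illustrative wrapper satisfying the interface described in the README."""

    def initialize(self, weights_path: str = ""):
        # Load and store whatever your detector needs; a placeholder here.
        self.model = None  # e.g. an onnxruntime session or a torch module

    def predict(self, x: Image) -> List[LayoutElement]:
        # Run the detector on the page image and convert each raw detection into
        # a LayoutElement. The corner coordinates and keyword names passed to
        # LayoutElement below are assumptions about its constructor.
        detections: List[dict] = []  # replace with real inference on `x`
        return [
            LayoutElement(d["x1"], d["y1"], d["x2"], d["y2"], text=None, type=d["label"])
            for d in detections
        ]
```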

setup.py

Lines changed: 5 additions & 4 deletions

@@ -18,6 +18,7 @@
 limitations under the License.
 """
 from setuptools import setup, find_packages
+from typing import List
 
 from unstructured_inference.__version__ import __version__
 
@@ -27,11 +28,11 @@ def load_requirements(file_list=None):
         file_list = ["requirements/base.in"]
     if isinstance(file_list, str):
         file_list = [file_list]
-    requirements = []
+    requirements: List[str] = []
     for file in file_list:
-        if not file.startswith("#"):
-            with open(file, encoding="utf-8") as f:
-                requirements.extend(f.readlines())
+        with open(file, encoding="utf-8") as f:
+            requirements.extend(f.readlines())
+    requirements = [req for req in requirements if not req.startswith("#")]
     return requirements
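Assembled from the hunks above, the revised helper reads roughly as follows (a reconstruction for readability; the license header and the surrounding `setup()` call are omitted):

```python
from typing import List


def load_requirements(file_list=None):
    if file_list is None:
        file_list = ["requirements/base.in"]
    if isinstance(file_list, str):
        file_list = [file_list]
    requirements: List[str] = []
    for file in file_list:
        with open(file, encoding="utf-8") as f:
            requirements.extend(f.readlines())
    # Comment lines are now filtered per requirement line; the old code checked
    # the *file name* for a leading "#", so comments were never actually skipped.
    requirements = [req for req in requirements if not req.startswith("#")]
    return requirements
```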

test_unstructured_inference/inference/test_layout.py

Lines changed: 2 additions & 1 deletion

@@ -119,7 +119,8 @@ def test_read_pdf(monkeypatch, mock_page_layout):
     monkeypatch.setattr(detectron2, "is_detectron2_available", lambda *args: True)
 
     with patch.object(layout, "load_pdf", return_value=(layouts, images)):
-        doc = layout.DocumentLayout.from_file("fake-file.pdf")
+        model = layout.get_model("detectron2_lp")
+        doc = layout.DocumentLayout.from_file("fake-file.pdf", model=model)
 
     assert str(doc).startswith("A Catchy Title")
     assert str(doc).count("A Catchy Title") == 2  # Once for each page

test_unstructured_inference/models/test_detectron2.py

Lines changed: 2 additions & 2 deletions

@@ -18,15 +18,15 @@ def test_load_default_model(monkeypatch):
     monkeypatch.setattr(detectron2, "Detectron2LayoutModel", MockDetectron2LayoutModel)
 
     with patch.object(detectron2, "is_detectron2_available", return_value=True):
-        model = models.get_model()
+        model = models.get_model("detectron2_lp")
 
     assert isinstance(model.model, MockDetectron2LayoutModel)
 
 
 def test_load_default_model_raises_when_not_available():
     with patch.object(detectron2, "is_detectron2_available", return_value=False):
         with pytest.raises(ImportError):
-            models.get_model()
+            models.get_model("detectron2_lp")
 
 
 @pytest.mark.parametrize("config_path, model_path", [("asdf", "diufs"), ("dfaw", "hfhfhfh")])
unstructured_inference/__version__.py

Lines changed: 1 addition & 1 deletion

@@ -1 +1 @@
-__version__ = "0.4.5-dev1"  # pragma: no cover
+__version__ = "0.4.5"  # pragma: no cover

unstructured_inference/models/base.py

Lines changed: 5 additions & 0 deletions

@@ -14,11 +14,16 @@
     UnstructuredYoloXModel,
 )
 
+DEFAULT_MODEL = "detectron2_onnx"
+
 
 def get_model(model_name: Optional[str] = None) -> UnstructuredModel:
     """Gets the model object by model name."""
     # TODO(alan): These cases are similar enough that we can probably do them all together with
     # importlib
+    if model_name is None:
+        model_name = DEFAULT_MODEL
+
     if model_name in DETECTRON2_MODEL_TYPES:
         model: UnstructuredModel = UnstructuredDetectronModel()
         model.initialize(**DETECTRON2_MODEL_TYPES[model_name])
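A small sketch of the resulting lookup behavior, using only the model keys visible in this commit:

```python
from unstructured_inference.models.base import get_model

# No argument: model_name falls back to DEFAULT_MODEL ("detectron2_onnx"), so the
# ONNX Detectron2 model is initialized without a detectron2 installation.
default_model = get_model()

# The layoutparser-based Detectron2 weights remain available under the explicit
# "detectron2_lp" key introduced in this commit (this path still requires detectron2).
lp_model = get_model("detectron2_lp")
```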

unstructured_inference/models/detectron2.py

Lines changed: 3 additions & 3 deletions

@@ -11,7 +11,7 @@
 
 from unstructured_inference.logger import logger
 from unstructured_inference.inference.layoutelement import LayoutElement
-from unstructured_inference.models.unstructuredmodel import UnstructuredModel
+from unstructured_inference.models.unstructuredmodel import UnstructuredObjectDetectionModel
 from unstructured_inference.utils import LazyDict, LazyEvaluateInfo
 
 
@@ -29,7 +29,7 @@
 # NOTE(alan): Entries are implemented as LazyDicts so that models aren't downloaded until they are
 # needed.
 MODEL_TYPES = {
-    None: LazyDict(
+    "detectron2_lp": LazyDict(
         model_path=LazyEvaluateInfo(
             hf_hub_download,
             "layoutparser/detectron2",
@@ -56,7 +56,7 @@
 }
 
 
-class UnstructuredDetectronModel(UnstructuredModel):
+class UnstructuredDetectronModel(UnstructuredObjectDetectionModel):
     """Unstructured model wrapper for Detectron2LayoutModel."""
 
     def predict(self, x: Image):

unstructured_inference/models/detectron2onnx.py

Lines changed: 2 additions & 2 deletions

@@ -6,7 +6,7 @@
 
 from unstructured_inference.logger import logger
 from unstructured_inference.inference.layoutelement import LayoutElement
-from unstructured_inference.models.unstructuredmodel import UnstructuredModel
+from unstructured_inference.models.unstructuredmodel import UnstructuredObjectDetectionModel
 from unstructured_inference.utils import LazyDict, LazyEvaluateInfo
 import onnxruntime
 import numpy as np
@@ -37,7 +37,7 @@
 }
 
 
-class UnstructuredDetectronONNXModel(UnstructuredModel):
+class UnstructuredDetectronONNXModel(UnstructuredObjectDetectionModel):
     """Unstructured model wrapper for detectron2 ONNX model."""
 
     # The model was trained and exported with this shape
