Commit fd35cb3

Merge branch 'main' into pprados/fix_password
2 parents ce13fa1 + ab25fb9 commit fd35cb3

11 files changed, +180 -196 lines changed

CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -1,4 +1,17 @@
+## 0.8.3
+
+* fix: removed `layoutelement.from_lp_textblock()` and related tests as it's not used
+* fix: update requirements to drop `layoutparser` lib
+* fix: update `README.md` to remove layoutparser model zoo support note
+
+## 0.8.2
+
+* fix: fix bug when an empty list is passed into `TextRegions.from_list` triggers `IndexError`
+* fix: fix bug when concatenate a list of `LayoutElements` the class id mapping is no properly
+  updated
+
 ## 0.8.1
+
 * fix: fix list index out of range error caused by calling LayoutElements.from_list() with empty list
 
 ## 0.8.0

README.md

Lines changed: 0 additions & 4 deletions
@@ -72,10 +72,6 @@ model = get_model("yolox")
 layout = DocumentLayout.from_file("sample-docs/layout-parser-paper.pdf", detection_model=model)
 ```
 
-### Using models from the layoutparser model zoo
-
-The `UnstructuredDetectronModel` class in `unstructured_inference.modelts.detectron2` uses the `faster_rcnn_R_50_FPN_3x` model pretrained on DocLayNet, but by using different construction parameters, any model in the `layoutparser` [model zoo](https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html) can be used. `UnstructuredDetectronModel` is a light wrapper around the `layoutparser` `Detectron2LayoutModel` object, and accepts the same arguments. See [layoutparser documentation](https://layout-parser.readthedocs.io/en/latest/api_doc/models.html#layoutparser.models.Detectron2LayoutModel) for details.
-
 ### Using your own model
 
 Any detection model can be used for in the `unstructured_inference` pipeline by wrapping the model in the `UnstructuredObjectDetectionModel` class. To integrate with the `DocumentLayout` class, a subclass of `UnstructuredObjectDetectionModel` must have a `predict` method that accepts a `PIL.Image.Image` and returns a list of `LayoutElement`s, and an `initialize` method, which loads the model and prepares it for inference.
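
The wrapper contract described in the unchanged README context above (an `initialize` method that prepares the model and a `predict` method that takes an image and returns layout elements) can be sketched roughly as follows. This is only an illustration and not code from the commit: `ObjectDetectionModelStub` and `LayoutElementStub` are hypothetical stand-ins for the library's `UnstructuredObjectDetectionModel` and `LayoutElement`, `WholePageModel` is an invented toy detector, and the `SimpleNamespace` page is a placeholder for a `PIL.Image.Image` so the sketch runs without the library or Pillow installed.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, List, Optional, Tuple


@dataclass
class LayoutElementStub:
    """Hypothetical stand-in for the library's LayoutElement."""

    bbox: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels
    type: str


class ObjectDetectionModelStub:
    """Hypothetical stand-in for UnstructuredObjectDetectionModel."""

    def initialize(self, **kwargs: Any) -> None:
        raise NotImplementedError

    def predict(self, image: Any) -> List[LayoutElementStub]:
        raise NotImplementedError


class WholePageModel(ObjectDetectionModelStub):
    """Toy detector that labels the entire page as a single region."""

    def __init__(self) -> None:
        self.label: Optional[str] = None

    def initialize(self, label: str = "Text") -> None:
        # A real model would load weights here; this one only stores a label.
        self.label = label

    def predict(self, image: Any) -> List[LayoutElementStub]:
        if self.label is None:
            raise RuntimeError("call initialize() before predict()")
        # PIL.Image.Image exposes .size as (width, height).
        width, height = image.size
        return [LayoutElementStub(bbox=(0, 0, width, height), type=self.label)]


# Usage: the SimpleNamespace stands in for a PIL image of a US Letter page.
page = SimpleNamespace(size=(612, 792))
model = WholePageModel()
model.initialize()
elements = model.predict(page)
```

In real use the subclass would presumably derive from the library's own base class and return the library's own element type; the stubs here only mirror the two-method shape the README describes.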

requirements/base.in

Lines changed: 3 additions & 1 deletion
@@ -1,5 +1,4 @@
 -c constraints.in
-layoutparser
 python-multipart
 huggingface-hub
 numpy<2
@@ -12,3 +11,6 @@ timm
 # NOTE(alan): Pinned because this is when the most recent module we import appeared
 transformers>=4.25.1
 rapidfuzz
+pandas
+scipy
+pdfplumber

requirements/base.txt

Lines changed: 41 additions & 57 deletions
@@ -4,58 +4,54 @@
 #
 #    pip-compile requirements/base.in
 #
-certifi==2024.8.30
+certifi==2024.12.14
     # via requests
 cffi==1.17.1
     # via cryptography
-charset-normalizer==3.3.2
+charset-normalizer==3.4.1
     # via
     #   pdfminer-six
     #   requests
 coloredlogs==15.0.1
     # via onnxruntime
 contourpy==1.3.0
     # via matplotlib
-cryptography==43.0.1
+cryptography==44.0.0
     # via pdfminer-six
 cycler==0.12.1
     # via matplotlib
-filelock==3.16.0
+filelock==3.16.1
     # via
     #   huggingface-hub
     #   torch
     #   transformers
-flatbuffers==24.3.25
+flatbuffers==24.12.23
     # via onnxruntime
-fonttools==4.53.1
+fonttools==4.55.3
     # via matplotlib
-fsspec==2024.9.0
+fsspec==2024.12.0
     # via
     #   huggingface-hub
     #   torch
-huggingface-hub==0.24.7
+huggingface-hub==0.27.1
     # via
     #   -r requirements/base.in
     #   timm
     #   tokenizers
     #   transformers
 humanfriendly==10.0
     # via coloredlogs
-idna==3.8
+idna==3.10
     # via requests
-importlib-resources==6.4.5
+importlib-resources==6.5.2
     # via matplotlib
-iopath==0.1.10
-    # via layoutparser
-jinja2==3.1.4
+jinja2==3.1.5
     # via torch
 kiwisolver==1.4.7
     # via matplotlib
-layoutparser==0.3.4
-    # via -r requirements/base.in
-markupsafe==2.1.5
+markupsafe==3.0.2
     # via jinja2
-matplotlib==3.9.2
+matplotlib==3.9.4
     # via -r requirements/base.in
 mpmath==1.3.0
     # via sympy
@@ -65,7 +61,6 @@ numpy==1.26.4
     # via
     #   -r requirements/base.in
     #   contourpy
-    #   layoutparser
     #   matplotlib
     #   onnx
     #   onnxruntime
@@ -74,107 +69,96 @@ numpy==1.26.4
     #   scipy
     #   torchvision
     #   transformers
-onnx==1.16.2
+onnx==1.17.0
     # via -r requirements/base.in
 onnxruntime==1.19.2
     # via -r requirements/base.in
-opencv-python==4.10.0.84
-    # via
-    #   -r requirements/base.in
-    #   layoutparser
-packaging==24.1
+opencv-python==4.11.0.86
+    # via -r requirements/base.in
+packaging==24.2
     # via
     #   huggingface-hub
     #   matplotlib
     #   onnxruntime
     #   transformers
-pandas==2.2.2
-    # via layoutparser
-pdf2image==1.17.0
-    # via layoutparser
+pandas==2.2.3
+    # via -r requirements/base.in
 pdfminer-six==20231228
     # via pdfplumber
-pdfplumber==0.11.4
-    # via layoutparser
-pillow==10.4.0
+pdfplumber==0.11.5
+    # via -r requirements/base.in
+pillow==11.1.0
     # via
-    #   layoutparser
     #   matplotlib
-    #   pdf2image
     #   pdfplumber
     #   torchvision
-portalocker==2.10.1
-    # via iopath
-protobuf==5.28.1
+protobuf==5.29.3
     # via
     #   onnx
     #   onnxruntime
 pycparser==2.22
     # via cffi
-pyparsing==3.1.4
+pyparsing==3.2.1
     # via matplotlib
-pypdfium2==4.30.0
+pypdfium2==4.30.1
     # via pdfplumber
 python-dateutil==2.9.0.post0
     # via
     #   matplotlib
     #   pandas
-python-multipart==0.0.9
+python-multipart==0.0.20
     # via -r requirements/base.in
 pytz==2024.2
     # via pandas
 pyyaml==6.0.2
     # via
     #   huggingface-hub
-    #   layoutparser
     #   timm
     #   transformers
-rapidfuzz==3.9.7
+rapidfuzz==3.11.0
     # via -r requirements/base.in
-regex==2024.9.11
+regex==2024.11.6
     # via transformers
 requests==2.32.3
     # via
     #   huggingface-hub
     #   transformers
-safetensors==0.4.5
+safetensors==0.5.2
     # via
     #   timm
     #   transformers
 scipy==1.13.1
-    # via layoutparser
-six==1.16.0
+    # via -r requirements/base.in
+six==1.17.0
     # via python-dateutil
-sympy==1.13.2
+sympy==1.13.1
     # via
     #   onnxruntime
     #   torch
-timm==1.0.9
+timm==1.0.13
     # via -r requirements/base.in
-tokenizers==0.19.1
+tokenizers==0.21.0
     # via transformers
-torch==2.4.1
+torch==2.5.1
     # via
     #   -r requirements/base.in
     #   timm
     #   torchvision
-torchvision==0.19.1
+torchvision==0.20.1
     # via timm
-tqdm==4.66.5
+tqdm==4.67.1
     # via
     #   huggingface-hub
-    #   iopath
     #   transformers
-transformers==4.44.2
+transformers==4.48.0
     # via -r requirements/base.in
 typing-extensions==4.12.2
     # via
     #   huggingface-hub
-    #   iopath
     #   torch
-tzdata==2024.1
+tzdata==2024.2
     # via pandas
-urllib3==2.2.3
+urllib3==2.3.0
     # via requests
-zipp==3.20.2
+zipp==3.21.0
     # via importlib-resources
