Commit d3600dd

qued and MthwRobinson authored
build(deps): update inference version (#662)
Updated to the latest version of unstructured-inference. detectron2 is now implemented with onnxruntime, yay!

Co-authored-by: Matt Robinson <[email protected]>
1 parent d23e0d6 commit d3600dd
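
Since this commit drops the source install of detectron2 in favour of onnxruntime, a quick environment check can be useful after upgrading. The following is only a rough sketch: the helper names `available` and `backend_report` are hypothetical and not part of `unstructured`, and the import name `unstructured_inference` is assumed for the unstructured-inference package.

```python
from importlib.util import find_spec


def available(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


def backend_report(modules=("onnxruntime", "detectron2", "unstructured_inference")):
    """Map each module name to whether it is installed in this environment."""
    return {name: available(name) for name in modules}


if __name__ == "__main__":
    # After this commit, onnxruntime is expected to be present with the
    # local-inference extra; detectron2 itself is optional (assumption).
    for name, ok in backend_report().items():
        print(f"{name}: {'installed' if ok else 'missing'}")
```

If `detectron2` shows as missing, that should only matter when custom LayoutParser Model Zoo models are needed, per the updated installation docs in this commit.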

28 files changed: +400 -881 lines changed

.github/workflows/ci.yml

Lines changed: 0 additions & 2 deletions
@@ -132,7 +132,6 @@ jobs:
 - name: Test
   run: |
     source .venv/bin/activate
-    make install-detectron2
     sudo apt-get update
     sudo apt-get install -y libmagic-dev poppler-utils libreoffice
     make install-pandoc
@@ -173,7 +172,6 @@ jobs:
   DISCORD_TOKEN: ${{ secrets.DISCORD_TOKEN }}
 run: |
   source .venv/bin/activate
-  make install-detectron2
   sudo apt-get update
   sudo apt-get install -y libmagic-dev poppler-utils libreoffice pandoc
   sudo add-apt-repository -y ppa:alex-p/tesseract-ocr5

.gitignore

Lines changed: 0 additions & 1 deletion
@@ -151,7 +151,6 @@ tramp
 
 ## https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
 .vscode/*
-!.vscode/settings.json
 !.vscode/tasks.json
 !.vscode/launch.json
 !.vscode/extensions.json

CHANGELOG.md

Lines changed: 2 additions & 1 deletion
@@ -1,7 +1,8 @@
-## 0.6.12-dev1
+## 0.7.0
 
 ### Enhancements
 
+* Installing `detectron2` from source is no longer required when using the `local-inference` extra.
 * Updates `.pptx` parsing to include text in tables.
 
 ### Features

Dockerfile

Lines changed: 1 addition & 2 deletions
@@ -29,8 +29,7 @@ RUN python3.8 -m pip install pip==${PIP_VERSION} && \
 pip install --no-cache -r requirements/ingest-slack.txt && \
 pip install --no-cache -r requirements/ingest-wikipedia.txt && \
 pip install --no-cache -r requirements/local-inference.txt && \
-scl enable devtoolset-9 bash && \
-pip install --no-cache "detectron2@git+https://github.com/facebookresearch/detectron2.git@e2ce8dc#egg=detectron2"
+scl enable devtoolset-9 bash
 
 COPY example-docs example-docs
 COPY unstructured unstructured

Makefile

Lines changed: 1 addition & 4 deletions
@@ -103,7 +103,7 @@ install-detectron2: install-tensorboard
 
 ## install-local-inference: installs requirements for local inference
 .PHONY: install-local-inference
-install-local-inference: install install-unstructured-inference install-detectron2
+install-local-inference: install install-unstructured-inference
 
 .PHONY: install-pandoc
 install-pandoc:
@@ -116,9 +116,6 @@ pip-compile:
 	pip-compile --upgrade requirements/base.in
 	# Extra requirements for huggingface staging functions
 	pip-compile --upgrade requirements/huggingface.in
-	# NOTE(robinson) - We want the dependencies for detectron2 in the requirements.txt, but not
-	# the detectron2 repo itself. If detectron2 is in the requirements.txt file, an order of
-	# operations issue related to the torch library causes the install to fail
 	pip-compile --upgrade requirements/test.in
 	pip-compile --upgrade requirements/dev.in
 	pip-compile --upgrade requirements/build.in

docker/ubuntu-22/Dockerfile

Lines changed: 0 additions & 1 deletion
@@ -16,7 +16,6 @@ SHELL ["/bin/bash", "-c"]
 RUN source ~/.bashrc && pyenv virtualenv 3.8.15 unstructured && \
 source ~/.pyenv/versions/unstructured/bin/activate && \
 make install-ci && \
-make install-detectron2 && \
 make install-ingest-s3 && \
 make install-ingest-azure && \
 make install-ingest-github && \

docs/source/installing.rst

Lines changed: 1 addition & 2 deletions
@@ -17,8 +17,7 @@ installation.
 * ``libreoffice`` (MS Office docs)
 * ``pandocs`` (EPUBs, RTFs and Open Office docs)
 
-* If you are parsing PDFs, run the following to install the ``detectron2`` model, which ``unstructured`` uses for layout detection:
-* ``pip install "detectron2@git+https://github.com/facebookresearch/detectron2.git@e2ce8dc#egg=detectron2"``
+* Follow the instructions `here <https://github.com/Unstructured-IO/unstructured-inference#detectron2>`_ to install ``detectron2``. This is required if you would like to use custom models from the `LayoutParser Model Zoo <https://github.com/Unstructured-IO/unstructured-inference#using-models-from-the-layoutparser-model-zoo>`_.
 
 At this point, you should be able to run the following code:
 

requirements/base.txt

Lines changed: 8 additions & 4 deletions
@@ -4,7 +4,7 @@
 #
 # pip-compile requirements/base.in
 #
-anyio==3.6.2
+anyio==3.7.0
     # via httpcore
 argilla==1.7.0
     # via -r requirements/base.in
@@ -30,12 +30,14 @@ click==8.1.3
     #   typer
 commonmark==0.9.1
     # via rich
-cryptography==40.0.2
+cryptography==41.0.0
     # via pdfminer-six
-deprecated==1.2.13
+deprecated==1.2.14
     # via argilla
 et-xmlfile==1.1.0
     # via openpyxl
+exceptiongroup==1.1.1
+    # via anyio
 h11==0.14.0
     # via httpcore
 httpcore==0.16.3
@@ -117,6 +119,8 @@ sniffio==1.3.0
     #   anyio
     #   httpcore
     #   httpx
+tabulate==0.9.0
+    # via unstructured (setup.py)
 tqdm==4.65.0
     # via
     #   argilla
@@ -138,7 +142,7 @@ wrapt==1.14.1
     #   deprecated
 xlrd==2.0.1
     # via -r requirements/base.in
-xlsxwriter==3.1.1
+xlsxwriter==3.1.2
     # via python-pptx
 zipp==3.15.0
     # via importlib-metadata
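
For reviewers who want the pin changes in this file at a glance, the hunks above can be summarised with a few lines of Python. This is a sketch only: `parse_pins` and `changed_pins` are hypothetical helpers, and the `OLD`/`NEW` snippets contain just the pins visible in the hunks above, not the full requirements file.

```python
def parse_pins(text):
    """Collect `name==version` pins, skipping comments and blank lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def changed_pins(old, new):
    """Return (bumped, added): version changes and newly pinned packages."""
    bumped = {n: (old[n], new[n]) for n in old.keys() & new.keys() if old[n] != new[n]}
    added = {n: new[n] for n in new.keys() - old.keys()}
    return bumped, added


# Pins taken from the removed and added lines of the hunks above.
OLD = """
anyio==3.6.2
cryptography==40.0.2
deprecated==1.2.13
xlsxwriter==3.1.1
"""
NEW = """
anyio==3.7.0
cryptography==41.0.0
deprecated==1.2.14
exceptiongroup==1.1.1
tabulate==0.9.0
xlsxwriter==3.1.2
"""

if __name__ == "__main__":
    bumped, added = changed_pins(parse_pins(OLD), parse_pins(NEW))
    for name in sorted(bumped):
        print(f"{name}: {bumped[name][0]} -> {bumped[name][1]}")
    for name in sorted(added):
        print(f"{name}: new pin {added[name]}")
```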

requirements/dev.txt

Lines changed: 7 additions & 2 deletions
@@ -4,7 +4,7 @@
 #
 # pip-compile requirements/dev.in
 #
-anyio==3.6.2
+anyio==3.7.0
     # via
     #   -c requirements/base.txt
     #   jupyter-server
@@ -54,6 +54,11 @@ defusedxml==0.7.1
     # via nbconvert
 distlib==0.3.6
     # via virtualenv
+exceptiongroup==1.1.1
+    # via
+    #   -c requirements/base.txt
+    #   -c requirements/test.txt
+    #   anyio
 executing==1.2.0
     # via stack-data
 fastjsonschema==2.17.1
@@ -265,7 +270,7 @@ pyyaml==6.0
     #   -c requirements/test.txt
     #   jupyter-events
     #   pre-commit
-pyzmq==25.0.2
+pyzmq==25.1.0
     # via
     #   ipykernel
     #   jupyter-client

requirements/ingest-azure.txt

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ charset-normalizer==3.1.0
     #   -c requirements/base.txt
     #   aiohttp
     #   requests
-cryptography==40.0.2
+cryptography==41.0.0
     # via
     #   -c requirements/base.txt
     #   azure-identity
