
Commit c295a50

feat!: remove api (#111)
Removed code related to the unstructured-inference API, since we now do inference purely through the general unstructured API. If we were to serve an inference-only API again, we wouldn't do it through this code or store the code in this repo.
1 parent 5a8237f commit c295a50
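
With the FastAPI endpoints removed, layout inference is done by calling the library directly (or through the general unstructured API). Below is a minimal sketch of direct usage; the `DocumentLayout.from_file` entry point, its attributes, and the sample file path are assumptions based on the library's documented usage and are not part of this diff:

```
# Minimal sketch of running layout inference without the removed HTTP API.
# The import path and the from_file/pages/elements attributes are assumptions
# based on the library's documented usage, not something added by this commit.
from unstructured_inference.inference.layout import DocumentLayout

layout = DocumentLayout.from_file("sample-docs/loremipsum.pdf")
for page in layout.pages:
    for element in page.elements:
        print(element)
```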

File tree

13 files changed, +84 -518 lines changed


.gitignore

Lines changed: 1 addition & 0 deletions

@@ -76,6 +76,7 @@ target/
 
 # Jupyter Notebook
 .ipynb_checkpoints
+nbs/
 
 # IPython
 profile_default/

CHANGELOG.md

Lines changed: 2 additions & 1 deletion

@@ -1,7 +1,8 @@
-## 0.4.5
+## 0.5.0
 
 * Preserve image format in PIL.Image.Image when loading
 * Added ONNX version of Detectron2 and make default model
+* Remove API code, we don't serve this as a standalone API any more
 
 ## 0.4.4
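
The 0.5.0 entry above makes the ONNX Detectron2 model the default, and it can be exercised without the removed API layer. A minimal sketch follows; the `get_model` helper and the `detectron2_onnx` model name are assumptions not shown in this diff, while predicting on a `PIL.Image.Image` matches the interface described in the README:

```
# Sketch of selecting the ONNX Detectron2 model noted in the 0.5.0 changelog.
# get_model and the "detectron2_onnx" name are assumptions, not part of this commit.
from PIL import Image

from unstructured_inference.models.base import get_model

model = get_model("detectron2_onnx")
elements = model.predict(Image.open("sample-docs/test-image.jpg"))
print(elements)
```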

Dockerfile

Lines changed: 0 additions & 57 deletions
This file was deleted.

Makefile

Lines changed: 0 additions & 37 deletions

@@ -54,43 +54,6 @@ pip-compile:
 	pip-compile --upgrade requirements/test.in
 	pip-compile --upgrade requirements/dev.in
 
-##########
-# Docker #
-##########
-
-# Docker targets are provided for convenience only and are not required in a standard development environment
-
-# Note that the current working directory is mounted under
-# /home/notebook-user/local/ when the image is started with
-# docker-start-api
-
-.PHONY: docker-build
-docker-build:
-	PIP_VERSION=${PIP_VERSION} ./scripts/docker-build.sh
-
-.PHONY: docker-start-api
-docker-start-api:
-	docker run -p 8000:8000 --mount type=bind,source=$(realpath .),target=/home/notebook-user/local -t --rm --entrypoint uvicorn unstructured-inference-dev:latest ${PACKAGE_NAME}.api:app --log-config logger_config.yaml --host 0.0.0.0 --port 8000
-
-
-#########
-# Local #
-########
-
-## run-app-dev: runs the FastAPI api with hot reloading
-.PHONY: run-app-dev
-run-app-dev:
-	PYTHONPATH=. uvicorn unstructured_inference.api:app --log-config logger_config.yaml --reload
-
-## start-app-local: runs FastAPI in the container with hot reloading
-.PHONY: start-app-local
-start-app-local:
-	docker run --name=ml-inference-container -p 127.0.0.1:5000:5000 ml-inference-dev
-
-## stop-app-local: stops the container
-.PHONY: stop-app-local
-stop-app-local:
-	docker stop ml-inference-container | xargs docker rm
 
 #################
 # Test and Lint #

README.md

Lines changed: 0 additions & 59 deletions

@@ -91,65 +91,6 @@ The `UnstructuredDetectronModel` class in `unstructured_inference.models.detect
 
 Any detection model can be used in the `unstructured_inference` pipeline by wrapping the model in the `UnstructuredObjectDetectionModel` class. To integrate with the `DocumentLayout` class, a subclass of `UnstructuredObjectDetectionModel` must have a `predict` method that accepts a `PIL.Image.Image` and returns a list of `LayoutElement`s, and an `initialize` method, which loads the model and prepares it for inference.
 
-## API
-
-To build the Docker container, run `make docker-build`. Note that Apple hardware with an M1 chip
-has trouble building `Detectron2` on Docker and for best results you should build it on Linux. To
-run the API locally, use `make start-app-local`. You can stop the API with `make stop-app-local`.
-The API will run at `http:/localhost:5000`.
-You can then `POST` a PDF file to the API endpoint to see its layout with the command:
-```
-curl -X 'POST' 'http://localhost:5000/layout/default/pdf' -F 'file=@<your_pdf_file>' | jq -C . | less -R
-```
-
-You can also choose the types of elements you want to return from the output of PDF parsing by
-passing a list of types to the `include_elems` parameter. For example, if you only want to return
-`Text` elements and `Title` elements, you can curl:
-```
-curl -X 'POST' 'http://localhost:5000/layout/default/pdf' \
--F 'file=@<your_pdf_file>' \
--F include_elems=Text \
--F include_elems=Title \
-| jq -C | less -R
-```
-If you are using an Apple M1 chip, use `make run-app-dev` instead of `make start-app-local` to
-start the API with hot reloading. The API will run at `http:/localhost:8000`.
-
-View the swagger documentation at `http://localhost:5000/docs`.
-
-### YoloX model
-
-For using the YoloX model the endpoints are:
-```
-http://localhost:8000/layout/yolox/pdf
-http://localhost:8000/layout/yolox/image
-```
-For example:
-```
-curl -X 'POST' 'http://localhost:8000/layout/yolox/image' \
--F 'file=@sample-docs/test-image.jpg' \
-| jq -C | less -R
-
-curl -X 'POST' 'http://localhost:8000/layout/yolox/pdf' \
--F 'file=@sample-docs/loremipsum.pdf' \
-| jq -C | less -R
-```
-
-If your PDF file doesn't have text embedded you can force the use of OCR with
-the parameter force_ocr=True:
-```
-curl -X 'POST' 'http://localhost:8000/layout/yolox/pdf' \
--F 'file=@sample-docs/loremipsum-flat.pdf' \
--F force_ocr=true
-| jq -C | less -R
-```
-
-or in local:
-
-```
-layout = yolox_local_inference(filename, type="pdf")
-```
-
 ## Security Policy
 
 See our [security policy](https://github.com/Unstructured-IO/unstructured-inference/security/policy) for
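
The retained README paragraph above spells out the interface a custom detection model must expose: an `initialize` method that loads the model and a `predict` method that accepts a `PIL.Image.Image` and returns a list of `LayoutElement`s. A minimal sketch of such a wrapper follows; the import paths and the `model_path` parameter are assumptions, and the method bodies are stubs:

```
# Stub wrapper illustrating the UnstructuredObjectDetectionModel interface
# described in the README. Import paths are assumptions and may differ by version.
from typing import List

from PIL import Image

from unstructured_inference.inference.layout import LayoutElement
from unstructured_inference.models.unstructuredmodel import UnstructuredObjectDetectionModel


class MyDetectionModel(UnstructuredObjectDetectionModel):
    def initialize(self, model_path: str):
        # Load and prepare the underlying detector; model_path is a hypothetical parameter.
        self.model = None

    def predict(self, image: Image.Image) -> List[LayoutElement]:
        # Run the detector on the page image and convert detections to LayoutElements.
        # Returns an empty list here as a stub.
        return []
```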

requirements/base.in

Lines changed: 0 additions & 2 deletions

@@ -1,7 +1,5 @@
-fastapi
 layoutparser[layoutmodels,tesseract]
 python-multipart
-uvicorn
 huggingface-hub
 opencv-python!=4.7.0.68
 onnxruntime

requirements/base.txt

Lines changed: 25 additions & 41 deletions

@@ -6,52 +6,44 @@
 #
 antlr4-python3-runtime==4.9.3
     # via omegaconf
-anyio==3.6.2
-    # via starlette
-certifi==2022.12.7
+certifi==2023.5.7
     # via requests
 cffi==1.15.1
     # via cryptography
 charset-normalizer==3.1.0
     # via
     #   pdfminer-six
     #   requests
-click==8.1.3
-    # via uvicorn
 coloredlogs==15.0.1
     # via onnxruntime
 contourpy==1.0.7
     # via matplotlib
-cryptography==40.0.1
+cryptography==40.0.2
     # via pdfminer-six
 cycler==0.11.0
     # via matplotlib
 effdet==0.3.0
     # via layoutparser
-fastapi==0.95.0
-    # via -r requirements/base.in
-filelock==3.10.7
+filelock==3.12.0
     # via
     #   huggingface-hub
     #   torch
     #   transformers
-flatbuffers==23.3.3
+flatbuffers==23.5.9
     # via onnxruntime
-fonttools==4.39.3
+fonttools==4.39.4
     # via matplotlib
-h11==0.14.0
-    # via uvicorn
-huggingface-hub==0.13.3
+fsspec==2023.5.0
+    # via huggingface-hub
+huggingface-hub==0.14.1
     # via
     #   -r requirements/base.in
     #   timm
     #   transformers
 humanfriendly==10.0
     # via coloredlogs
 idna==3.4
-    # via
-    #   anyio
-    #   requests
+    # via requests
 importlib-resources==5.12.0
     # via matplotlib
 iopath==0.1.10
@@ -70,7 +62,7 @@ mpmath==1.3.0
     # via sympy
 networkx==3.1
     # via torch
-numpy==1.24.2
+numpy==1.24.3
     # via
     #   contourpy
     #   layoutparser
@@ -90,20 +82,20 @@ opencv-python==4.7.0.72
     # via
     #   -r requirements/base.in
     #   layoutparser
-packaging==23.0
+packaging==23.1
     # via
     #   huggingface-hub
     #   matplotlib
     #   onnxruntime
     #   pytesseract
     #   transformers
-pandas==2.0.0
+pandas==2.0.1
     # via layoutparser
 pdf2image==1.16.3
     # via layoutparser
 pdfminer-six==20221105
     # via pdfplumber
-pdfplumber==0.8.0
+pdfplumber==0.9.0
     # via layoutparser
 pillow==9.5.0
     # via
@@ -115,14 +107,12 @@ pillow==9.5.0
     #   torchvision
 portalocker==2.7.0
     # via iopath
-protobuf==4.22.1
+protobuf==4.23.1
     # via onnxruntime
 pycocotools==2.0.6
     # via effdet
 pycparser==2.21
     # via cffi
-pydantic==1.10.7
-    # via fastapi
 pyparsing==3.0.9
     # via matplotlib
 pytesseract==0.3.10
@@ -142,36 +132,34 @@ pyyaml==6.0
     #   omegaconf
     #   timm
     #   transformers
-regex==2023.3.23
+regex==2023.5.5
     # via transformers
-requests==2.28.2
+requests==2.30.0
     # via
     #   huggingface-hub
     #   torchvision
     #   transformers
+safetensors==0.3.1
+    # via timm
 scipy==1.10.1
     # via layoutparser
 six==1.16.0
     # via python-dateutil
-sniffio==1.3.0
-    # via anyio
-starlette==0.26.1
-    # via fastapi
-sympy==1.11.1
+sympy==1.12
     # via
     #   onnxruntime
     #   torch
-timm==0.6.13
+timm==0.9.2
     # via effdet
-tokenizers==0.13.2
+tokenizers==0.13.3
     # via transformers
-torch==2.0.0
+torch==2.0.1
     # via
     #   effdet
     #   layoutparser
     #   timm
     #   torchvision
-torchvision==0.15.1
+torchvision==0.15.2
     # via
     #   effdet
     #   layoutparser
@@ -181,21 +169,17 @@ tqdm==4.65.0
     #   huggingface-hub
     #   iopath
     #   transformers
-transformers==4.27.4
+transformers==4.29.2
     # via -r requirements/base.in
 typing-extensions==4.5.0
     # via
     #   huggingface-hub
     #   iopath
-    #   pydantic
-    #   starlette
     #   torch
 tzdata==2023.3
     # via pandas
-urllib3==1.26.15
+urllib3==2.0.2
     # via requests
-uvicorn==0.21.1
-    # via -r requirements/base.in
 wand==0.6.11
     # via pdfplumber
 zipp==3.15.0

requirements/dev.in

Lines changed: 5 additions & 0 deletions

@@ -3,3 +3,8 @@
 jupyter
 ipython
 pip-tools
+# NOTE(alan): Pinned to prevent errors that occur with newer versions, see
+# https://discourse.jupyter.org/t/jupyter-notebook-zmq-message-arrived-on-closed-channel-error/17869
+jupyter_client==7.3.4
+jupyter_server==1.23.6
+tornado==6.1
