Commit abb0174

Integration with the Google Cloud Vision API (#2902)
This PR adds a third OCR provider, alongside Tesseract and Paddle: the [Google Cloud Vision API](https://cloud.google.com/vision).

It is used like the other OCR providers: set the `OCR_AGENT` environment variable to the dotted path of the OCR agent class (`unstructured.partition.utils.ocr_models.google_vision_ocr.OCRAgentGoogleVision`). You also need to supply credentials for the Google APIs, for instance by setting the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.

---------

Co-authored-by: christinestraub <[email protected]>
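The configuration described above could be sketched as follows. This is a minimal illustration, not code from the PR: the helper name `configure_google_vision_ocr` and the credentials path are hypothetical, and only the two environment variable names and the class path come from the commit message.

```python
import os

# Dotted path to the OCR agent class, as given in the commit message.
GOOGLE_VISION_AGENT = (
    "unstructured.partition.utils.ocr_models.google_vision_ocr.OCRAgentGoogleVision"
)


def configure_google_vision_ocr(credentials_path: str) -> None:
    """Select Google Cloud Vision as the OCR provider via environment variables.

    `credentials_path` is a hypothetical path to a Google service-account
    JSON key file; substitute your own.
    """
    os.environ["OCR_AGENT"] = GOOGLE_VISION_AGENT
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = credentials_path


# Placeholder path, for illustration only.
configure_google_vision_ocr("/path/to/service-account.json")
```

It is probably safest to set both variables before importing or calling any `unstructured` partitioner, since it is an assumption here (not stated in the commit) when the library reads them. After that, an OCR-backed call such as `partition_pdf(filename="example.pdf", strategy="ocr_only")` should route OCR through the Vision API.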
1 parent 05ff975 commit abb0174

24 files changed: +261 -36 lines changed

CHANGELOG.md

Lines changed: 3 additions & 1 deletion
@@ -1,9 +1,11 @@
-## 0.13.4-dev0
+## 0.13.4-dev1
 
 ### Enhancements
 
 ### Features
 
+* **Add integration with the Google Cloud Vision API**. Adds a third OCR provider, alongside Tesseract and Paddle: the Google Cloud Vision API.
+
 ### Fixes
 
 * **Remove ElementMetadata.section field.**. This field was unused, not populated by any partitioners.

requirements/base.txt

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ dataclasses-json==0.6.4
 # via -r ./base.in
 dataclasses-json-speakeasy==0.5.11
 # via unstructured-client
-emoji==2.11.0
+emoji==2.11.1
 # via -r ./base.in
 filetype==1.2.0
 # via -r ./base.in

requirements/dev.txt

Lines changed: 1 addition & 1 deletion
@@ -185,7 +185,7 @@ jupyterlab==4.1.6
 # via notebook
 jupyterlab-pygments==0.3.0
 # via nbconvert
-jupyterlab-server==2.26.0
+jupyterlab-server==2.27.0
 # via
 # jupyterlab
 # notebook

requirements/extra-paddleocr.txt

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ idna==3.7
 # via
 # -c ./base.txt
 # requests
-imageio==2.34.0
+imageio==2.34.1
 # via
 # imgaug
 # scikit-image

requirements/extra-pdf-image.in

Lines changed: 1 addition & 0 deletions
@@ -13,3 +13,4 @@ unstructured-inference==0.7.27
 # unstructured fork of pytesseract that provides an interface to allow for multiple output formats
 # from one tesseract call
 unstructured.pytesseract>=0.3.12
+google-cloud-vision

requirements/extra-pdf-image.txt

Lines changed: 38 additions & 0 deletions
@@ -6,6 +6,8 @@
 #
 antlr4-python3-runtime==4.9.3
 # via omegaconf
+cachetools==5.3.3
+# via google-auth
 certifi==2024.2.2
 # via
 # -c ././deps/constraints.txt
@@ -43,6 +45,24 @@ fsspec==2024.3.1
 # via
 # huggingface-hub
 # torch
+google-api-core[grpc]==2.18.0
+# via google-cloud-vision
+google-auth==2.29.0
+# via
+# google-api-core
+# google-cloud-vision
+google-cloud-vision==3.7.2
+# via -r ./extra-pdf-image.in
+googleapis-common-protos==1.63.0
+# via
+# google-api-core
+# grpcio-status
+grpcio==1.62.2
+# via
+# google-api-core
+# grpcio-status
+grpcio-status==1.62.2
+# via google-api-core
 huggingface-hub==0.22.2
 # via
 # timm
@@ -147,11 +167,26 @@ pillow-heif==0.16.0
 # via -r ./extra-pdf-image.in
 portalocker==2.8.2
 # via iopath
+proto-plus==1.23.0
+# via
+# google-api-core
+# google-cloud-vision
 protobuf==4.23.4
 # via
 # -c ././deps/constraints.txt
+# google-api-core
+# google-cloud-vision
+# googleapis-common-protos
+# grpcio-status
 # onnx
 # onnxruntime
+# proto-plus
+pyasn1==0.6.0
+# via
+# pyasn1-modules
+# rsa
+pyasn1-modules==0.4.0
+# via google-auth
 pycocotools==2.0.7
 # via
 # -c ././deps/constraints.txt
@@ -195,8 +230,11 @@ regex==2024.4.16
 requests==2.31.0
 # via
 # -c ./base.txt
+# google-api-core
 # huggingface-hub
 # transformers
+rsa==4.9
+# via google-auth
 safetensors==0.4.3
 # via
 # timm

requirements/ingest/astra.txt

Lines changed: 1 addition & 3 deletions
@@ -46,9 +46,7 @@ hpack==4.0.0
 httpcore==1.0.5
 # via httpx
 httpx[http2]==0.27.0
-# via
-# astrapy
-# httpx
+# via astrapy
 hyperframe==6.0.1
 # via h2
 idna==3.7

requirements/ingest/azure.txt

Lines changed: 1 addition & 3 deletions
@@ -80,9 +80,7 @@ portalocker==2.8.2
 pycparser==2.22
 # via cffi
 pyjwt[crypto]==2.8.0
-# via
-# msal
-# pyjwt
+# via msal
 requests==2.31.0
 # via
 # -c ./ingest/../base.txt

requirements/ingest/box.txt

Lines changed: 1 addition & 3 deletions
@@ -9,9 +9,7 @@ attrs==23.2.0
 boxfs==0.3.0
 # via -r ./ingest/box.in
 boxsdk[jwt]==3.9.2
-# via
-# boxfs
-# boxsdk
+# via boxfs
 certifi==2024.2.2
 # via
 # -c ./ingest/../base.txt

requirements/ingest/chroma.txt

Lines changed: 1 addition & 3 deletions
@@ -214,9 +214,7 @@ urllib3==1.26.18
 # kubernetes
 # requests
 uvicorn[standard]==0.29.0
-# via
-# chromadb
-# uvicorn
+# via chromadb
 uvloop==0.19.0
 # via uvicorn
 watchfiles==0.21.0