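In case anyone else lands here and misses the thread above: I had to call uploaded_file.seek(0) before reading the file in fitz.open(stream=uploaded_file.read(), filetype="pdf"). If something has already read the upload, the stream position sits at the end and read() returns nothing, so the rewind has to come first.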
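
To illustrate why the rewind matters, here is a minimal sketch that uses io.BytesIO as a stand-in for the uploaded file. It assumes the upload object behaves like a standard binary file object. The helper name read_all_from_start and the sample bytes are made up for the example and are not part of PyMuPDF or of the original thread.

```python
import io


def read_all_from_start(stream):
    """Rewind a seekable binary stream and return its full contents."""
    stream.seek(0)
    return stream.read()


# Stand-in for an uploaded file (hypothetical sample bytes, not a real PDF).
buf = io.BytesIO(b"%PDF-1.4 example bytes")

first = buf.read()    # consumes the stream; position is now at the end
second = buf.read()   # nothing left to read, so this is empty
rewound = read_all_from_start(buf)  # seek(0) makes the full content readable again
```

Passing an empty byte string such as `second` as the stream would give the PDF parser nothing to work with, which is why the seek(0) call comes before read().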

My final function looks like this:

import fitz  # PyMuPDF


def get_details_from_pcpt(uploaded_file):
    # Rewind first: an earlier read leaves the stream position at the end.
    uploaded_file.seek(0)
    doc = fitz.open(stream=uploaded_file.read(), filetype="pdf")
    page = doc[0]  # first page
    words = page.get_text("words")
    print(f"{words=}")
    # Output (truncated in the original post):
    # words=[
    #     (145.29600524902344, 255.85806274414062, 159.1846160888672, 269.5967102050781, 'SC', 0, 0, 0),
    #     (145.58399963378906, 272.92205810546875, 166.14193725585938, 286.66070556640625, 'SCS', 1, 0, 0),
    #     (145.29600524902344, 304.89007568359375, 151.965332…
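
For anyone puzzling over the tuples printed above: each entry from get_text("words") is laid out as (x0, y0, x1, y1, word, block_no, line_no, word_no). Below is a small sketch that gives those fields names. The Word class is a helper invented for this example, not something PyMuPDF provides, and the sample list just copies the two complete tuples from the output above so it runs without a PDF.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """Hypothetical wrapper naming the fields of one get_text("words") tuple."""
    x0: float
    y0: float
    x1: float
    y1: float
    text: str
    block_no: int
    line_no: int
    word_no: int


# The two complete tuples from the printed output above.
sample = [
    (145.29600524902344, 255.85806274414062, 159.1846160888672, 269.5967102050781, 'SC', 0, 0, 0),
    (145.58399963378906, 272.92205810546875, 166.14193725585938, 286.66070556640625, 'SCS', 1, 0, 0),
]

words = [Word(*w) for w in sample]
texts = [w.text for w in words]

for w in words:
    # Bounding box corners are (x0, y0) and (x1, y1) in page coordinates.
    print(w.text, w.block_no, round(w.x0, 1), round(w.y0, 1))
```

Inside the function above, the same comprehension could be applied directly to the words list returned by page.get_text("words").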

Answer selected by mnoah66