Hey Danny,
Good news!
We found the source of this issue:
it was the "PDF_EXTRACT_IMAGES" parameter in our .env file, which needs to be set to false.

This parameter gets read into the "extract_images" argument of the PyPDFLoader document loader.

When this was set to True, the loader.load() call in main.py in the rag_api repo took 45 seconds or more per page. With it set to False, the same call completes in half a second or less.
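
For anyone else hitting this, here is a rough sketch of how a flag like this typically ends up in the loader. This is not the actual rag_api code: `env_flag` and `loader_kwargs` are hypothetical helper names, and the set of strings treated as "true" is an assumption. Only the `PDF_EXTRACT_IMAGES` variable and the `extract_images` argument come from what we found above.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean flag.

    Hypothetical helper: the accepted truthy strings are an assumption,
    not necessarily what rag_api does.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def loader_kwargs() -> dict:
    """Build keyword arguments for the PDF loader from the environment.

    Hypothetical helper: defaults to False so that slow image extraction
    is opt-in rather than on by accident.
    """
    return {"extract_images": env_flag("PDF_EXTRACT_IMAGES", default=False)}


# Assumed usage with LangChain's PyPDFLoader (needs langchain-community
# and pypdf installed; "example.pdf" is a placeholder path):
#
#   from langchain_community.document_loaders import PyPDFLoader
#   loader = PyPDFLoader("example.pdf", **loader_kwargs())
#   docs = loader.load()
```

Wrapping the loader.load() call with time.perf_counter() before and after is a quick way to compare the two settings on your own PDFs.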

Thank you so much for posting the video of the upload on your side.
It was exactly the confirmation we needed that the problem was definitely coming from our side!

Answer selected by ggomp2885
This discussion was converted from issue #4071 on September 17, 2024 13:40.