Closed
Labels
vision: Related to the multimodal feature that can ingest figures and answer questions based off images
Description
This is reproducible with the sample data. When running prepdocs, you'll see that several pages aren't represented in the uploaded sections: no section has a sourcepage equal to that page number, and so no section has an imageEmbedding for that page. As a result, some answers may be lower quality, because retrieval can't find the relevant matching image.
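To make the gap easy to spot, here is a minimal sketch of a check for pages that no section claims. It assumes each section is a dict whose `sourcepage` value ends in `#page=N` with 1-based page numbers. That format, the `missing_pages` helper, and the `slides.pdf` filename are assumptions for illustration, not prepdocs' actual output.

```python
import re


def missing_pages(sections, total_pages):
    """Return the 1-based page numbers that no section lists as its sourcepage."""
    covered = set()
    for section in sections:
        # Assumed format: "<filename>#page=<1-based page number>"
        match = re.search(r"#page=(\d+)$", section["sourcepage"])
        if match:
            covered.add(int(match.group(1)))
    return [page for page in range(1, total_pages + 1) if page not in covered]


# Hypothetical sections for a 4-page document
sections = [
    {"sourcepage": "slides.pdf#page=1"},
    {"sourcepage": "slides.pdf#page=1"},
    {"sourcepage": "slides.pdf#page=3"},
]
print(missing_pages(sections, total_pages=4))  # [2, 4]
```

Any page this returns has no section, and therefore no imageEmbedding, pointing at it.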
Possible approaches:
- For certain document types, like slides, never chunk sections across pages. This was my original idea, but then I realized our sample document is a slide deck exported as a PDF, so I can't rely on a PPT-dependent condition. Thus, this isn't a full solution.
- Never let sections span pages. This may not work well for many PDFs, like research papers, that legitimately have sections spanning pages.
- Associate multiple sourcepage values with a single section. @mattgotteiner says that's possible by picking a delimiter. Not sure whether multiple imageEmbedding values would also be possible. Otherwise, we'd have to pick whichever imageEmbedding we thought was best.
- ...? Your idea here!
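As a rough sketch of the third approach (multiple sourcepage values joined by a delimiter), the snippet below works out which pages a section's character span overlaps and joins them into one field. Everything here is hypothetical: the `;` delimiter, the `page_offsets` list of per-page start offsets, the `#page=N` format, and both helper names. It also leaves the imageEmbedding question open.

```python
from bisect import bisect_right

DELIMITER = ";"  # hypothetical choice; any character that can't appear in a sourcepage works


def pages_for_span(page_offsets, start, end):
    """Return 0-based indices of the pages overlapped by the span [start, end).

    page_offsets[i] is the character offset at which page i begins,
    in ascending order, with page_offsets[0] == 0.
    """
    first = bisect_right(page_offsets, start) - 1
    last = bisect_right(page_offsets, max(start, end - 1)) - 1
    return list(range(first, last + 1))


def sourcepage_field(filename, page_offsets, start, end):
    """Join every page a section touches into a single delimited string."""
    pages = pages_for_span(page_offsets, start, end)
    return DELIMITER.join(f"{filename}#page={page + 1}" for page in pages)


# Hypothetical document: three pages starting at offsets 0, 100, and 250
page_offsets = [0, 100, 250]
print(sourcepage_field("slides.pdf", page_offsets, 80, 120))
# slides.pdf#page=1;slides.pdf#page=2
```

A consumer would split the field on the delimiter to recover every page the section covers. Whether the index can hold one imageEmbedding per listed page is still the open question from the list above.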