Making inline images embeddable and searchable #1724

@emreonal12

Description

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [X] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

I want to use the GPT-4 Turbo with Vision functionality to embed and index both the text and the inline images from PDF files, so that both can be searched afterwards.

The docs here (https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/docs/gpt4v.md) describe the general pipeline, mentioning that:

  • documents are split into per-page PNG images
  • text is extracted using OCR
  • embeddings are generated for both the text and the images (see the sketch below for my mental model of the image call)
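
For reference, my current guess at the image-embedding step is a page-level call like the sketch below, written against the Azure AI Vision multimodal embeddings API (`retrieval:vectorizeImage`). The endpoint/key placeholders and the `api-version`/`model-version` values are my assumptions, not something I pulled from this repo:

```python
import requests

# Placeholders -- substitute your own Azure AI Vision resource values.
AZURE_VISION_ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
AZURE_VISION_KEY = "<your-key>"

def embed_page_image(png_bytes: bytes) -> list[float]:
    """Embed a whole page PNG via the multimodal embeddings API.

    If the demo embeds entire page renderings (rather than cropped
    inline figures), I assume each page PNG goes through a call
    roughly like this one.
    """
    url = (
        f"{AZURE_VISION_ENDPOINT}/computervision/retrieval:vectorizeImage"
        "?api-version=2024-02-01&model-version=2023-04-15"  # versions assumed
    )
    resp = requests.post(
        url,
        headers={
            "Ocp-Apim-Subscription-Key": AZURE_VISION_KEY,
            "Content-Type": "application/octet-stream",
        },
        data=png_bytes,
    )
    resp.raise_for_status()
    return resp.json()["vector"]  # the image embedding vector
```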

It is unclear to me whether the entire page PNG is embedded, or whether this refers to embedding only the inline images (e.g. figures and charts) extracted from the PDF or page PNG. If inline images are embedded instead of the whole page, how does the Azure OCR tool detect them and separate them from the surrounding unstructured text? Is such a feature offered?
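
To make the question concrete: if inline figures are embedded individually, I would expect a cropping step like the one below, where some layout/OCR step supplies figure bounding boxes. The `figure_boxes` input is hypothetical -- whether any Azure OCR/layout service actually produces it is exactly what I am asking:

```python
from PIL import Image

def crop_figures(page_png_path: str,
                 figure_boxes: list[tuple[int, int, int, int]]) -> list[Image.Image]:
    """Crop detected inline figures out of a rendered page PNG.

    figure_boxes: (left, top, right, bottom) pixel boxes, hypothetically
    produced by a layout-analysis/OCR step that can localize figures --
    the capability this issue is asking about.
    """
    page = Image.open(page_png_path)
    return [page.crop(box) for box in figure_boxes]
```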

Any clarifications would be greatly appreciated.
