
Commit 53feab1: update

1 parent 0974707 commit 53feab1

File tree

1 file changed

+175
-2
lines changed


articles/ai-foundry/openai/how-to/responses.md

Lines changed: 175 additions & 2 deletions
@@ -65,7 +65,7 @@ Not every model is available in the regions supported by the responses API. Chec
> - Images can't be uploaded as a file and then referenced as input. Coming soon.
>
> There's a known issue with the following:
- > - PDF as an input file is not yet supported.
+ > - PDF as an input file is now supported, but setting the file upload purpose to `user_data` is not currently supported.
> - Performance when background mode is used with streaming. The issue is expected to be resolved soon.

### Reference documentation
@@ -694,6 +694,179 @@ response = client.responses.create(
print(response)
```

## File input

> [!NOTE]
> - All extracted text and images are put into the model's context. Make sure you understand the pricing and token usage implications of using PDFs as input.
>
> - You can upload up to 100 pages and 32 MB of total content in a single request to the API, across multiple file inputs.
>
> - Only models that support both text and image inputs, such as `gpt-4o`, `gpt-4o-mini`, or `o1`, can accept PDF files as input.
>
> - A `purpose` of `user_data` is currently not supported. As a temporary workaround, you'll need to set the purpose to `assistants`.

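You can check the size limit client-side before spending tokens on a request. A minimal sketch, assuming only the 32 MB cap stated in the note (the constant and helper name are illustrative, not part of the API; checking the 100-page limit would additionally require a PDF library):

```python
# Pre-flight check against the 32 MB per-request cap noted above.
# MAX_REQUEST_BYTES and fits_in_request are illustrative names, not API surface.
MAX_REQUEST_BYTES = 32 * 1024 * 1024  # 32 MB across all file inputs in one request

def fits_in_request(*file_contents: bytes) -> bool:
    """Return True if the combined payloads stay under the documented cap."""
    return sum(len(data) for data in file_contents) <= MAX_REQUEST_BYTES

small_pdf = b"%PDF-1.7 example bytes"
print(fits_in_request(small_pdf))  # True for a tiny file
```
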
### Convert PDF to Base64 and analyze

```python
import base64

from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/",
    azure_ad_token_provider=token_provider,
    api_version="preview"
)

with open("PDF-FILE-NAME.pdf", "rb") as f:  # assumes the PDF is in the same directory as the executing script
    data = f.read()

base64_string = base64.b64encode(data).decode("utf-8")

response = client.responses.create(
    model="gpt-4o-mini",  # model deployment name
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "filename": "PDF-FILE-NAME.pdf",
                    "file_data": f"data:application/pdf;base64,{base64_string}",
                },
                {
                    "type": "input_text",
                    "text": "Summarize this PDF",
                },
            ],
        },
    ]
)

print(response.output_text)
```

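The `file_data` value above is a standard base64 data URL. A quick round-trip check of the format (the helper name is illustrative, not part of the SDK):

```python
import base64

def to_pdf_data_url(data: bytes) -> str:
    """Wrap raw PDF bytes in the data URL format the API expects."""
    return f"data:application/pdf;base64,{base64.b64encode(data).decode('utf-8')}"

raw = b"%PDF-1.7 minimal example"
url = to_pdf_data_url(raw)

# The prefix identifies the media type; everything after the comma
# decodes back to the original bytes.
assert url.startswith("data:application/pdf;base64,")
assert base64.b64decode(url.split(",", 1)[1]) == raw
```
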
### Upload PDF and analyze

Upload the PDF file. A `purpose` of `user_data` is currently not supported. As a workaround, set the purpose to `assistants`.

```python
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE-NAME.openai.azure.com/",
    azure_ad_token_provider=token_provider,
    api_version="2024-10-21"
)

# Upload a file with a purpose of "assistants"
file = client.files.create(
    file=open("nucleus_sampling.pdf", "rb"),  # assumes a .pdf file in the same directory as the executing script
    purpose="assistants"
)

print(file.model_dump_json(indent=2))
file_id = file.id
```

**Output:**

```json
{
  "id": "assistant-KaVLJQTiWEvdz8yJQHHkqJ",
  "bytes": 4691115,
  "created_at": 1752174469,
  "filename": "nucleus_sampling.pdf",
  "object": "file",
  "purpose": "assistants",
  "status": "processed",
  "expires_at": null,
  "status_details": null
}
```

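If you're capturing that output in a script rather than using `file.id` directly, the identifier can be parsed from the JSON. A minimal sketch using the sample output above:

```python
import json

# Abridged upload response, taken from the sample output above.
upload_json = """{
  "id": "assistant-KaVLJQTiWEvdz8yJQHHkqJ",
  "bytes": 4691115,
  "filename": "nucleus_sampling.pdf",
  "purpose": "assistants",
  "status": "processed"
}"""

file_info = json.loads(upload_json)
file_id = file_info["id"]
print(file_id)  # assistant-KaVLJQTiWEvdz8yJQHHkqJ
```
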
You will then take the value of `id` and pass it to the model for processing under `file_id`:

```python
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    base_url="https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1",
    azure_ad_token_provider=token_provider,
    api_version="preview"
)

response = client.responses.create(
    model="gpt-4o-mini",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_id": "assistant-KaVLJQTiWEvdz8yJQHHkqJ"
                },
                {
                    "type": "input_text",
                    "text": "Summarize this PDF",
                },
            ],
        },
    ]
)

print(response.output_text)
```

```bash
curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/files?api-version=2024-10-21 \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -F purpose="assistants" \
  -F file="@your_file.pdf"

curl https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses?api-version=preview \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AZURE_OPENAI_AUTH_TOKEN" \
  -d '{
    "model": "gpt-4.1",
    "input": [
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_id": "assistant-123456789"
                },
                {
                    "type": "input_text",
                    "text": "ASK SOME QUESTION RELATED TO UPLOADED PDF"
                }
            ]
        }
    ]
  }'
```

## Using remote MCP servers
You can extend the capabilities of your model by connecting it to tools hosted on remote Model Context Protocol (MCP) servers. These servers are maintained by developers and organizations and expose tools that can be accessed by MCP-compatible clients, such as the Responses API.
@@ -963,7 +1136,7 @@ while response.status in {"queued", "in_progress"}:
print(f"Final status: {response.status}\nOutput:\n{response.output_text}")
```

- You can cancel an in-progress background task using the cancel endpoint. Canceling is idempotent—subsequent calls will return the final response object.
+ You can cancel an in-progress background task using the `cancel` endpoint. Canceling is idempotent—subsequent calls will return the final response object.

```bash
curl -X POST https://YOUR-RESOURCE-NAME.openai.azure.com/openai/v1/responses/resp_1234567890/cancel?api-version=preview \
```
