Idea

One could call it a "lazy evaluation model" or a "progressive disclosure pattern".
The idea is to keep the Input minimal (in line with the task), but when additional info is needed, allow the Model (LLM) to request it, grow the Input, and try the call again (from the beginning).
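To make the "try the call again from the beginning" part concrete, here is a minimal sketch of the loop I have in mind; ask_model and fetch_info are hypothetical placeholders, not part of any library:

```python
# Hypothetical sketch: ask_model() and fetch_info() are placeholders, not a real API.
def answer(task: str, max_rounds: int = 3) -> str:
    context: list[str] = []                # grows only when the model asks for more
    for _ in range(max_rounds):
        reply = ask_model(task, context)   # one full API call, from the beginning
        if reply.wants_more_info:          # the model chose to call a tool
            context.append(fetch_info(reply.request))  # grow the Input
            continue                       # ...and try the call again
        return reply.text                  # the model chose to just respond
    raise RuntimeError("model kept asking for more info")
```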
Problem:
Instructor [1] forces the Output to always use tool_calls, and I think it feels more natural to do something like:
```
If extra info is needed:
    call the tool that brings it
otherwise:
    just respond (plain text output)
```
I do process the text-response with Instructor, but in a 2-step process that I won't describe here.
While exploring the possibility of tweaking Instructor to allow this dual-response mode, I had to extract and verify some concepts/knowledge that I present below.

[1] https://python.useinstructor.com/ and https://github.com/567-labs/instructor

Conclusion:

No: I couldn't find a way, using Instructor, to allow a real dual-response mode. I know I could specify response_model=Union[Function, str], but that is not what I am looking for. Nonetheless, the lessons I learnt could be helpful for others, or for future-me.

Instructor

If you give Instructor response_model=UserInfo, it takes the details from the class definition and converts them into the tools and tool_choice entries of the API call:

https://platform.openai.com/docs/api-reference/responses/create#responses-create-tool_choice
https://platform.openai.com/docs/api-reference/responses/create#responses-create-tools
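For reference, a minimal sketch of that kind of call (the model name and prompt are my own illustration):

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel, Field

class UserInfo(BaseModel):
    "Ad-hoc model for user information"
    name: str = Field(description="User's name")
    age: int = Field(description="User's age")

client = instructor.from_openai(OpenAI())

# Instructor turns UserInfo into a `tools` entry and forces it via `tool_choice`.
user = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "John is 30 years old."}],
    response_model=UserInfo,
)
print(user)  # UserInfo(name='John', age=30)
```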
Note that the output of UserInfo.model_json_schema() is very similar, but not the same:

```python
{'description': 'Ad-hoc model for user information',
 'properties': {'name': {'description': "User's name", 'title': 'Name', 'type': 'string'},
                'age': {'description': "User's age", 'title': 'Age', 'type': 'integer'}},
 'required': ['name', 'age'],
 'title': 'UserInfo',
 'type': 'object'}
```
In particular, in the API the description moves up one level, to sit under function > description, whereas in the json_schema it was at the same level as properties.
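The request payload itself is not reproduced here; reconstructing it from that description (and from the Union example below), it should look roughly like this:

```python
'tool_choice': {'type': 'function', 'function': {'name': 'UserInfo'}},
'tools': [{'type': 'function',
           'function': {'name': 'UserInfo',
                        'description': 'Ad-hoc model for user information',
                        'parameters': {'properties': {'name': {'description': "User's name",
                                                               'title': 'Name',
                                                               'type': 'string'},
                                                      'age': {'description': "User's age",
                                                              'title': 'Age',
                                                              'type': 'integer'}},
                                       'required': ['name', 'age'],
                                       'type': 'object'}}}]
```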
As we will see later, regardless of what you provide in response_model (List, Union, ...), Instructor will always force the response to be a call to a function. This is specified in the tool_choice entry of the request. If you want the Model to be able to choose, then you need:

```python
'tool_choice': "auto",
```
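For comparison, this is roughly what a raw Chat Completions call with a free choice looks like (a sketch; the tool schema is abbreviated):

```python
from openai import OpenAI

client = OpenAI()

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "How old is John?"}],
    tools=[{"type": "function",
            "function": {"name": "GetUserInfo",
                         "description": "Function to request user information.",
                         "parameters": {"type": "object",
                                        "properties": {"name": {"type": "string"}},
                                        "required": ["name"]}}}],
    tool_choice="auto",  # the model may call the tool OR just answer in plain text
)

msg = resp.choices[0].message
if msg.tool_calls:   # the model decided it needs more info
    print(msg.tool_calls[0].function.arguments)
else:                # the model decided to just respond
    print(msg.content)
```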
Providing a description for our class
The (original) function description ('Correctly extracted UserInfo with all the required parameters with correct types') may not be very helpful, but that is because it was set automatically by Instructor. We can modify it by adding a docstring to the class (more info at https://python.useinstructor.com/concepts/fields/#general-notes-on-json-schema-generation):
```python
class UserInfo(BaseModel):
    "Ad-hoc model for user information"  # <- Our ad-hoc description
    name: str = Field(description="User's name")
    age: int = Field(description="User's age")
```
This description will end up in the API call at the same level as name and parameters (not inside parameters, which is where Pydantic places it).
Note: Pydantic introduces 'title' keys in many places. The API does not need or expect them, so I remove them to reduce cognitive load, tokens, garbage, ... But that is another story.
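(For completeness, a minimal sketch of how those 'title' keys can be stripped; this helper is mine, not part of Instructor or Pydantic:)

```python
def strip_titles(schema):
    """Recursively drop 'title' keys from a JSON-schema dict (hypothetical helper).
    Caveat: it would also drop a property literally named 'title'."""
    if isinstance(schema, dict):
        return {k: strip_titles(v) for k, v in schema.items() if k != "title"}
    if isinstance(schema, list):
        return [strip_titles(item) for item in schema]
    return schema

clean = strip_titles(UserInfo.model_json_schema())
```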
Using Union
When you provide response_model=Union[UserInfo, GetUserInfo], the API gets:
```python
'tool_choice': {'type': 'function', 'function': {'name': 'Response'}},
'tools': [{'type': 'function',
           'function': {'name': 'Response',
                        'description': 'Correctly Formatted and Extracted Response.',
                        'parameters': {'$defs': {'GetUserInfo': {'description': 'Function to request user information.',
                                                                 'properties': {'name': {'description': "User's name from who to get information.",
                                                                                         'title': 'Name',
                                                                                         'type': 'string'}},
                                                                 'required': ['name'],
                                                                 'title': 'GetUserInfo',
                                                                 'type': 'object'},
                                                 'UserInfo': {'description': 'Model for storing/returning user information.',
                                                              'properties': {'name': {'description': "User's name",
                                                                                      'title': 'Name',
                                                                                      'type': 'string'},
                                                                             'age': {'description': "User's age",
                                                                                     'title': 'Age',
                                                                                     'type': 'integer'}},
                                                              'required': ['name', 'age'],
                                                              'title': 'UserInfo',
                                                              'type': 'object'}},
                                       'properties': {'content': {'anyOf': [{'$ref': '#/$defs/UserInfo'},
                                                                            {'$ref': '#/$defs/GetUserInfo'}],
                                                                  'title': 'Content'}},
                                       'required': ['content'],
                                       'type': 'object'}}}]
```
or, in other words, it does as if we had defined:

```python
class Response(BaseModel):
    "Correctly Formatted and Extracted Response."
    content: Union[UserInfo, GetUserInfo]
```
Allowing Object OR Text
If you provide response_model=Union[UserInfo, str], it does the same as in the two-objects case (creating a Response class that joins them), but one of them is a str, equivalent to:

```python
class Response(BaseModel):
    "Correctly Formatted and Extracted Response."
    content: Union[UserInfo, str]
```
So if you don't want that Response class to be created for you, provide a SINGLE class in response_model; that way you can set the description (or anything else) yourself.
Something similar but different happens when you use Maybe (https://python.useinstructor.com/concepts/maybe/#defining-the-model): you give the model the choice of responding with the Object, or with something else if anything is wrong.
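A sketch of that pattern, following the linked docs (instructor.Maybe wraps the model in result/error/message fields, if I read them right):

```python
MaybeUser = instructor.Maybe(UserInfo)

reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "The weather is nice today."}],
    response_model=MaybeUser,
)

if reply.error:
    print(reply.message)   # e.g. why no user could be extracted
else:
    print(reply.result)    # a UserInfo instance
```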
Here ends my research into tweaking Instructor. I ended up creating my own function to do that initial API call, with the possibility of either calling an auxiliary function or just responding.
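That function is not reproduced here, but here is a hedged sketch of the shape it could take, reusing the GetUserInfo tool from above; lookup_user is a hypothetical stand-in for a real data source:

```python
import json
from openai import OpenAI

client = OpenAI()

def lookup_user(name: str) -> str:
    """Hypothetical auxiliary function; stands in for a real data source."""
    return json.dumps({"name": name, "age": 30})

def dual_mode_call(messages: list, tools: list, max_rounds: int = 3) -> str:
    """Either answer in plain text, or call the auxiliary tool,
    grow the Input, and try again from the beginning."""
    for _ in range(max_rounds):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=tools,
            tool_choice="auto",      # the crucial part that Instructor overrides
        )
        msg = resp.choices[0].message
        if not msg.tool_calls:       # plain-text response: we are done
            return msg.content
        messages.append(msg)         # keep the assistant's tool request
        for call in msg.tool_calls:  # answer it, growing the Input
            args = json.loads(call.function.arguments)
            messages.append({"role": "tool",
                             "tool_call_id": call.id,
                             "content": lookup_user(**args)})
    raise RuntimeError("no final answer after max_rounds")
```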
[openai API] tool_choice: Optional[string or object]
https://platform.openai.com/docs/api-reference/chat/create#chat-create-tool_choice

Controls which (if any) tool is called by the model.
none means the model will not call any tool and instead generates a message.
auto means the model can pick between generating a message or calling one or more tools.
required means the model must call one or more tools.
none is the default when no tools are present. auto is the default if tools are present.
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. <- This is what Instructor does, always: it forces the output to be an Object.
Q: But what happens if you want it to be "auto"?

If on your call you specify tool_choice="auto", it gets overwritten by Instructor (no surprise). Using hooks to overwrite it does not work either; my guess is that hooks let you SEE only, not MODIFY.
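For reference, a sketch of what such a hook attempt could look like, using Instructor's completion:kwargs hook event (mutating the kwargs locally has no effect on the request, consistent with hooks being observational):

```python
def force_auto(*args, **kwargs):
    # Runs before the API call; we can inspect the kwargs here...
    kwargs["tool_choice"] = "auto"   # ...but mutating them does not change the request

client.on("completion:kwargs", force_auto)
```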