Commit 21a79bf

Update column search
1 parent 5329613 commit 21a79bf

3 files changed, 41 additions and 93 deletions

text_2_sql/autogen/src/autogen_text_2_sql/custom_agents/sql_schema_selection_agent.py

Lines changed: 1 addition & 21 deletions
@@ -101,28 +101,8 @@ async def on_messages_stream(
                 if schema not in final_schemas:
                     final_schemas.append(schema)

-        columns_for_filter = {}
-        values_for_filter = {}
-        for filter, column_value_result in zip(
-            loaded_entity_result["filter_conditions"], column_value_results
-        ):
-            columns_for_filter[filter] = []
-            values_for_filter[filter] = []
-            for column in column_value_result:
-                if column["Column"] not in columns_for_filter[filter]:
-                    columns_for_filter[filter].append(column["Column"])
-
-                if column["Value"] not in values_for_filter[filter]:
-                    values_for_filter[filter].append(column["Value"])
-
-        num_all_values = [len(filter) for filter in values_for_filter]
-        num_all_columns = [len(filter) for filter in columns_for_filter]
-
         final_results = {
-            "MANDATORY_DISAMBIGUATION": max(num_all_values) > 3
-            or max(num_all_columns) > 3,
-            "COLUMN_OPTIONS_FOR_FILTERS": columns_for_filter,
-            "VALUE_OPTIONS_FOR_FILTERS": values_for_filter,
+            "COLUMN_OPTIONS_AND_VALUES_FOR_FILTERS": column_value_results,
             "SCHEMA_OPTIONS": final_schemas,
         }

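Because the agent now forwards `column_value_results` unchanged, any per-filter de-duplication of columns and values is left to whoever consumes the payload. The following is a minimal sketch of that aggregation, not code from the repository. It assumes each entry in `column_value_results` is a dict of the form `{filter_text: [{"Column": ..., "Value": ...}, ...]}`, which is the shape suggested by the `ai_search.py` change in this commit; the function name `summarise_filter_options` and the sample rows are hypothetical.

```python
from typing import Any


def summarise_filter_options(
    column_value_results: list[dict[str, list[dict[str, Any]]]],
) -> dict[str, dict[str, list[Any]]]:
    """Collect distinct columns and values per filter, keeping first-seen order."""
    summary: dict[str, dict[str, list[Any]]] = {}
    for result in column_value_results:
        for filter_text, rows in result.items():
            entry = summary.setdefault(filter_text, {"columns": [], "values": []})
            for row in rows:
                if row["Column"] not in entry["columns"]:
                    entry["columns"].append(row["Column"])
                if row["Value"] not in entry["values"]:
                    entry["values"].append(row["Value"])
    return summary


# Hypothetical sample data: one filter term with three search hits.
sample = [
    {
        "bike": [
            {"Column": "Product.Category", "Value": "Mountain Bikes"},
            {"Column": "Product.Category", "Value": "Road Bikes"},
            {"Column": "Product.Name", "Value": "Mountain Bikes"},
        ]
    },
]
options = summarise_filter_options(sample)
```

Counting `len(options[term]["columns"])` or `len(options[term]["values"])` would then give the per-filter cardinality that the removed `MANDATORY_DISAMBIGUATION` flag was meant to reflect.
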
text_2_sql/text_2_sql_core/src/text_2_sql_core/connectors/ai_search.py

Lines changed: 4 additions & 2 deletions
@@ -163,10 +163,12 @@ async def get_column_values(

         logging.info("Column Values: %s", column_values)

+        filter_to_column = {text: column_values}
+
         if as_json:
-            return json.dumps(column_values, default=str)
+            return json.dumps(filter_to_column, default=str)
         else:
-            return column_values
+            return filter_to_column

     async def get_entity_schemas(
         self,

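The effect of this hunk is that callers receive the hits keyed by the filter text that produced them, rather than a bare list. A minimal standalone sketch of the new return shape follows; `wrap_column_values` is a hypothetical helper that mirrors only the lines shown in the diff, and the sample hit is invented for illustration.

```python
import json
from datetime import date


def wrap_column_values(text: str, column_values: list, as_json: bool = True):
    """Key the search hits by the filter text that produced them."""
    filter_to_column = {text: column_values}
    if as_json:
        # default=str stringifies values json cannot encode natively (e.g. dates).
        return json.dumps(filter_to_column, default=str)
    return filter_to_column


# Hypothetical hit containing a non-JSON-native value.
hits = [{"Column": "SalesOrderHeader.OrderDate", "Value": date(2008, 6, 1)}]
as_text = wrap_column_values("June 2008", hits)
as_dict = wrap_column_values("June 2008", hits, as_json=False)
```

Note that the JSON form and the dict form differ for non-native types: the date above becomes a string in `as_text` but stays a `date` object in `as_dict`.
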
text_2_sql/text_2_sql_core/src/text_2_sql_core/prompts/sql_disambiguation_agent.yaml

Lines changed: 36 additions & 70 deletions
@@ -13,30 +13,20 @@ system_message:
     </scope_of_user_query>

     <instructions>
-      - If 'MANDATORY_DISAMBIGUATION' is True, you must perform disambiguation on the terms with high cardinality. It is mandatory.
+      - For every filter extracted from the user's question, you must:

-      - For every intent and filter condition in the question, map them to the columns in the schemas and the appropriate filter value. Use the whole context of the question and information already provided to do so.
+        - If it is not a datetime or numerical filter, map it to:
+          - A value from 'COLUMN_OPTIONS_FOR_FILTERS'
+          - And a value from 'VALUE_OPTIONS_FOR_FILTERS'

-      - Do not ask for information already included in the question, schema, or what can reasonably be inferred from the question.
+        - If the filter is a datetime or numerical filter, map it to:
+          - A column from 'SCHEMA_OPTIONS'

-      - Only ask a follow-up question for Date and Numerical values if you are unsure which column to use or what the value means, e.g., does 100 in currency refer to 100 USD or 100 EUR.
+        - Use the whole context of the question and information already provided to assist with your mapping.

-      <clear_context_handling>
-        If the context of the question makes the mapping explicit, and the appropriate filter values can be found in 'column_values' directly map the terms to the relevant column FQN without generating disambiguation questions.
-
-        When evaluating questions:
-
-        Use the 'column_values' property to check for possible matching columns and compare these to the context of the question. ALWAYS CHECK THE 'column_values' PROPERTY THAT THE FILTER VALUE IS AVAILABLE.
-
-        If there are multiple values in 'column_values' that could match the filter, ask for clarification or to narrow down the filter value or column to use. If in doubt, use disambiguation questions to clarify.
-
-        Always consider the temporal and contextual phrases (e.g., \"June 2008\") in the question. If the context implies a direct match to a date column, do not request clarification unless multiple plausible columns exist.
-        For geographical or categorical terms (e.g., \"country\"), prioritize unique matches or add context to narrow down ambiguities based on the schema.
-
-        If all mappings are clear, output the JSON with mappings only.
-
-        <example>
-          Question: \"What are the total number of sales within 2008 for the mountain bike product line?\"
+      <successful_mapping_entry>
+        - If you can map it to an column and potential filter value:
+          - Only map if you are reasonably sure of the user's intention.
           {
             \"filter_mapping\": {
               \"bike\": [
@@ -52,74 +42,50 @@ system_message:
                 }
               ]
             },
-            \"intent_mapping\": {
-              \"total number of sales\": \"SalesLT.SalesOrderHeader.SalesOrderID\"
-            }
           }
-        </example>
-      </clear_context_handling>
-
-      <disambiguation_handling>
-        If the term is ambiguous, there are multiple matching columns/questions in 'column_values', or the question lacks enough context to infer the correct mapping, then ask for clarification.
-
-        For ambiguous terms, evaluate the question context and schema relationships to narrow down matches.
-        Populate the 'questions' field with the identified filter and relevant FQN, matching columns, and possible filter values.
-        Include a clarification question in the 'question' field to request more information from the user.
-        If the clarification is not related to a column or a filter value, populate the 'user_choices' field with the possible choices they can select.
-
-        Prioritize clear disambiguation based on:
-        - Direct matches within schemas.
-        - Additional context provided by the question (e.g., temporal, categorical, or domain-specific keywords).
-
-        Return all disambiguation questions in the 'questions' array. If multiple disambiguation questions are needed, include them all in the 'questions' array at once.
-
-        <example>
-          User question: \"What country did we sell the most in June 2008?\"
-          Schema contains multiple columns potentially related to \"country.\"
+      <successful_mapping_entry>

-          If disambiguation is needed:
+      <unsuccessful_mapping_entry>
+        - If you cannot map it to a column, add en entry to the disambiguation list with the clarification question you need from the user:
+          - If there are multiple possible options, or you are unsure how it maps, make sure to ask a clarification question.

           {
-            \"questions\": [
+            \"disambiguation\": [
               {
                 \"question\": \"What do you mean by 'country'?\",
                 \"matching_columns\": [
                   \"Sales.Country\",
                   \"Customers.Country\"
                 ],
                 \"matching_filter_values\": [],
-                \"user_choices\": []
+                \"other_user_choices\": []
               }
             ]
           }
-        </example>

-        <example 2>
-          User question: \"What are the total sales for the mountain bike product line?\"
-          'column_values' contains multiple columns potentially related to \"mountain bike.\"
-
-          If disambiguation is needed:
-          {
-            \"questions\": [
-              {
-                \"question\": \"What do you mean by 'mountain bike'?\",
-                \"matching_columns\": [
-                  \"vProductModelCatalogDescription.Category\",
-                  \"vProductModelCatalogDescription.ProductLine\"
-                ],
-                \"matching_filter_values\": [],
-                \"user_choices\": []
-              }
-            ]
-          }
-        </example>
-        Always include either the 'matching_columns', 'matching_filter_values' or `user_choices` field in the 'questions' array.
-      </disambiguation_handling>
-    </instructions>
+        <rules_for_disambiguation_questions>
+          - Do not ask for information already included in the question, schema, or what can reasonably be inferred from the question.
+        </rules_for_disambiguation_questions>
+      <unsuccessful_mapping_entry>
+
+      - For every intent extracted from the user's question:
+        - If you need to ask any clarification questions, add it to the clarification question list:
+
+          {
+            \"clarification\": [
+              {
+                \"question\": \"What do the sales to customers or businesses?\",
+                \"other_user_choices\": [
+                  \"customers\",
+                  \"businesses\",
+                ]
+              }
+            ]
+          }

     <output_format>
       If all mappings are clear, output the 'mapping' JSON only.
-      If disambiguation is required, output the disambiguation JSON followed by \"TERMINATE.\"
+      If disambiguation or clarification is required, output the JSON request followed by \"TERMINATE.\"
       Do not provide explanations or reasoning in the output.
     </output_format>
   "

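The `<output_format>` rules imply two reply shapes: a bare mapping JSON, or a disambiguation/clarification JSON followed by "TERMINATE." A caller could tell them apart roughly as sketched below. This is an illustrative assumption about how a consumer might handle the reply, not code from the repository; `parse_agent_reply` is a hypothetical name, and the sketch assumes the reply contains nothing besides the JSON and the optional trailing marker.

```python
import json


def parse_agent_reply(reply: str) -> tuple[dict, bool]:
    """Return (payload, needs_user_input) for a disambiguation-agent reply."""
    text = reply.strip()
    needs_user_input = False
    # Accept the marker with or without its trailing full stop.
    for marker in ("TERMINATE.", "TERMINATE"):
        if text.endswith(marker):
            text = text[: -len(marker)].rstrip()
            needs_user_input = True
            break
    payload = json.loads(text)
    return payload, needs_user_input


# Hypothetical reply built from the disambiguation example in the prompt.
request = {
    "disambiguation": [
        {
            "question": "What do you mean by 'country'?",
            "matching_columns": ["Sales.Country", "Customers.Country"],
            "matching_filter_values": [],
            "other_user_choices": [],
        }
    ]
}
payload, needs_user_input = parse_agent_reply(json.dumps(request) + "\nTERMINATE.")
mapping, mapping_needs_input = parse_agent_reply('{"filter_mapping": {}}')
```

The clarification example in the prompt has a trailing comma after `\"businesses\"`. If the model reproduces it, `json.loads` will raise `json.JSONDecodeError`, so a real consumer would likely want to catch that error.
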