`text_2_sql/GETTING_STARTED.md` (3 additions, 1 deletion)
```diff
@@ -5,7 +5,9 @@ To get started, perform the following steps:
 1. Setup Azure OpenAI in your subscription with **gpt-4o-mini** & an embedding model, alongside a SQL Server sample database, AI Search and a storage account.
 2. Clone this repository and deploy the AI Search text2sql indexes from `deploy_ai_search`.
 3. Run `uv sync` within the text_2_sql directory to install dependencies.
+   - Install the optional dependencies if you need a database connector other than TSQL.
+   - See the supported connectors in `text_2_sql_core/src/text_2_sql_core/connectors`.
 4. Create your `.env` file based on the provided sample `.env.example`. Place this file in the same location as the `.env.example`.
 5. Generate a data dictionary for your target server using the instructions in the **Running** section of `data_dictionary/README.md`.
-6. Upload these data dictionaries to the relevant containers in your storage account. Wait for them to be automatically indexed with the included skillsets.
+6. Upload the generated data dictionary files to the relevant containers in your storage account. Wait for them to be automatically indexed with the included skillsets.
 7. Navigate to the `autogen` directory to view the AutoGen implementation. Follow the steps in `Iteration 5 - Agentic Vector Based Text2SQL.ipynb` to get started.
```
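Step 4 above only asks for a `.env` file placed next to `.env.example`. As a quick sanity check that the file is actually picked up, the snippet below is a minimal sketch: it assumes the settings are readable with `python-dotenv` and that the file lives in the `text_2_sql` directory, while the repository itself may wire up configuration differently.

```python
# Minimal sketch: confirm the .env placed next to .env.example is discoverable.
# Assumes python-dotenv and the text_2_sql directory; adjust the path to your layout.
from pathlib import Path

from dotenv import load_dotenv

env_path = Path("text_2_sql") / ".env"  # assumed location, alongside .env.example
if load_dotenv(env_path):
    print(f"Loaded settings from {env_path}")
else:
    print(f"No settings loaded from {env_path}; check that the file exists and is not empty")
```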
A further hunk in this diff updates the tool factory, replacing `FunctionToolAlias` with `FunctionTool`:

```diff
         tool_name (str): The name of the tool to retrieve.

     Returns:
-        FunctionToolAlias: The tool."""
+        FunctionTool: The tool."""

     if tool_name == "sql_query_execution_tool":
-        return FunctionToolAlias(
+        return FunctionTool(
             sql_helper.query_execution_with_limit,
             description="Runs an SQL query against the SQL Database to extract information",
         )
     elif tool_name == "sql_get_entity_schemas_tool":
-        return FunctionToolAlias(
+        return FunctionTool(
             sql_helper.get_entity_schemas,
             description="Gets the schema of a view or table in the SQL Database by selecting the most relevant entity based on the search term. Extract key terms from the user input and use these as the search term. Several entities may be returned. Only use when the provided schemas in the message history are not sufficient to answer the question.",
         )
     elif tool_name == "sql_get_column_values_tool":
-        return FunctionToolAlias(
+        return FunctionTool(
             sql_helper.get_column_values,
             description="Gets the values of a column in the SQL Database by selecting the most relevant entity based on the search term. Several entities may be returned. Use this to get the correct value to apply against a filter for a user's question.",
```
`text_2_sql/data_dictionary/README.md` (13 additions, 21 deletions)
```diff
@@ -207,10 +207,6 @@ This avoids having to index the fact tables, saving storage, and allows us to st

 ## Automatic Generation

-> [!IMPORTANT]
->
-> - The data dictionary generation scripts have been moved to `text_2_sql_core`. Documentation will be updated shortly.
-
 Manually creating the `entities.json` is a time-consuming exercise. To speed up generation, a mixture of SQL queries and an LLM can be used to generate an initial version. Existing comments and descriptions in the database can be combined with sample values to generate the necessary descriptions. Manual input can then be used to tweak it for the use case and any improvements.

 `./text_2_sql_core/data_dictionary/data_dictionary_creator.py` contains a utility class that handles the automatic generation and selection of schemas from the source SQL database. It must be subclassed to the appropriate engine to handle engine-specific queries and connection details.
```
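The exact interface of the utility class in `data_dictionary_creator.py` is not shown in this diff, so the following is only an illustrative sketch of what an engine-specific subclass might look like: the base class name, attribute, and method names are assumptions rather than the confirmed API.

```python
# Illustrative sketch only: the real base class lives in
# text_2_sql_core/data_dictionary/data_dictionary_creator.py, and its actual
# method/attribute names may differ from the assumptions used here.
from text_2_sql_core.data_dictionary.data_dictionary_creator import DataDictionaryCreator


class MyEngineDataDictionaryCreator(DataDictionaryCreator):
    """Hypothetical subclass for a database engine without a pre-built script."""

    # Engine-specific SQL for listing the tables and views to include in the dictionary.
    extract_entities_query = """
        SELECT table_schema, table_name, table_type
        FROM information_schema.tables
        WHERE table_type IN ('BASE TABLE', 'VIEW');
    """

    async def query_database(self, query: str) -> list[dict]:
        """Run a query with this engine's driver and return rows as dictionaries."""
        raise NotImplementedError("Wire up the engine-specific connection here.")
```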
```diff
@@ -222,28 +218,24 @@ The following Databases have pre-built scripts for them:

 If there is no pre-built script for your database engine, take one of the above as a starting point and adjust it.

 ## Running

-Fill out the `.env` template with connection details to your chosen database.
-
-Package and install the `text_2_sql_core` library. See [build](https://docs.astral.sh/uv/concepts/projects/build/) if you want to build as a wheel and install on an agent. Or you can run from within a `uv` environment.
-
-`data_dictionary <DATABASE ENGINE>`
-
-You can pass the following command line arguments:
-
-- `--output_directory` or `-o`: Optional directory that the script will write the output files to.
-- `--single_file` or `-s`: Optional flag that writes all schemas to a single file.
-- `--generate_definitions` or `-gen`: Optional flag that uses OpenAI to generate descriptions.
-
-If you need control over the following, run the file directly:
-
-- `entities`: A list of entities to extract. Defaults to None.
-- `excluded_entities`: A list of entities to exclude.
-- `excluded_schemas`: A list of schemas to exclude.
+1. Create your `.env` file based on the provided sample `.env.example`. Place this file in the same location as the `.env.example`.
+2. Package and install the `text_2_sql_core` library. See [build](https://docs.astral.sh/uv/concepts/projects/build/) if you want to build as a wheel and install on an agent, or run from within a `uv` environment.
+3. Run `data_dictionary <DATABASE ENGINE>`.
+   - You can pass the following command line arguments:
+     - `--output_directory` or `-o`: Optional directory that the script will write the output files to.
+     - `--single_file` or `-s`: Optional flag that writes all schemas to a single file.
+     - `--generate_definitions` or `-gen`: Optional flag that uses OpenAI to generate descriptions.
+   - If you need control over the following, run the file directly:
+     - `entities`: A list of entities to extract. Defaults to None.
+     - `excluded_entities`: A list of entities to exclude.
+     - `excluded_schemas`: A list of schemas to exclude.
+4. Upload the generated data dictionary files to the relevant containers in your storage account. Wait for them to be automatically indexed with the included skillsets.
```
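Step 3 above covers the `data_dictionary` CLI; when you need the `entities`, `excluded_entities`, or `excluded_schemas` controls, the README says to run the file directly. The sketch below shows one way that might look, reusing the hypothetical subclass from the earlier sketch; only those three parameter names come from the README, while the class, module, and entry-point method are assumptions.

```python
# Sketch of running the generator directly for entity-level control.
# Only entities / excluded_entities / excluded_schemas are named in the README;
# the subclass, its module, and create_data_dictionary() are assumptions.
import asyncio

from my_engine_creator import MyEngineDataDictionaryCreator  # hypothetical module holding the sketch above


async def main() -> None:
    creator = MyEngineDataDictionaryCreator(
        entities=None,                          # None: extract every entity
        excluded_entities=["SalesStaging"],     # example tables/views to skip
        excluded_schemas=["audit"],             # example schemas to skip
    )
    await creator.create_data_dictionary()      # assumed entry point


if __name__ == "__main__":
    asyncio.run(main())
```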