Commit b5e2a74

Update readme
1 parent 99ab16d commit b5e2a74

1 file changed

+5
-0
lines changed

text_2_sql/data_dictionary/README.md

Lines changed: 5 additions & 0 deletions
@@ -85,12 +85,17 @@ Below is a sample entry for a view / table that we wish to expose to the LLM. T

A full data dictionary must be built for all the views / tables you wish to expose to the LLM. The metadata provided directly influences the accuracy of the Text2SQL component.

## Indexing

`./deploy_ai_search/text_2_sql.py` & `./deploy_ai_search/text_2_sql_query_cache.py` contain the scripts to deploy and index the data dictionary for use within the plugin. See instructions in `./deploy_ai_search/README.md`.

## Automatic Generation

> [!IMPORTANT]
>
> - The data dictionary generation scripts have been moved to `text_2_sql_core`. Documentation will be updated shortly.

Manually creating the `entities.json` is a time-consuming exercise. To speed up generation, a mixture of SQL queries and an LLM can be used to generate an initial version. Existing comments and descriptions in the database can be combined with sample values to generate the necessary descriptions. Manual input can then be used to tweak it for the use case and make any further improvements.
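
As a rough illustration of combining existing database comments with sample values, the sketch below builds a prompt that could be sent to an LLM to draft a column description. It is not the repository's implementation: the function name `build_column_description_prompt`, its parameters, and the `orders` / `status` example are all hypothetical, and the LLM call itself is left out.

```python
from typing import Optional, Sequence


def build_column_description_prompt(
    entity: str,
    column: str,
    data_type: str,
    existing_comment: Optional[str],
    sample_values: Sequence[object],
) -> str:
    """Combine known metadata into a prompt asking an LLM for a draft description.

    Hypothetical helper for illustration only; not part of text_2_sql_core.
    """
    lines = [
        f"Write a one-sentence description of the column '{column}' "
        f"in the table or view '{entity}'.",
        f"Data type: {data_type}",
    ]
    # Reuse any comment already stored in the database rather than discarding it.
    if existing_comment:
        lines.append(f"Existing database comment: {existing_comment}")
    # Sample values give the LLM concrete context about what the column holds.
    if sample_values:
        samples = ", ".join(repr(value) for value in sample_values)
        lines.append(f"Sample values: {samples}")
    return "\n".join(lines)


# Example with made-up metadata for a hypothetical 'orders' table.
prompt = build_column_description_prompt(
    entity="orders",
    column="status",
    data_type="TEXT",
    existing_comment="Current fulfilment state",
    sample_values=["shipped", "pending"],
)
print(prompt)
```

The draft returned by the LLM would still need manual review before it is written into `entities.json`.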

`./text_2_sql_core/data_dictionary/data_dictionary_creator.py` contains a utility class that handles the automatic generation and selection of schemas from the source SQL database. It must be subclassed for the appropriate engine to handle engine-specific queries and connection details.
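
The subclassing pattern might look roughly like the sketch below. The real interface of the utility class is not shown here, so every name in this example (`DataDictionaryCreatorSketch`, `SQLiteDataDictionaryCreator`, `list_entities`, `list_columns`, `create`, and the output keys) is an assumption made for illustration. SQLite from the Python standard library stands in for the target engine so that the example is self-contained.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, List


class DataDictionaryCreatorSketch(ABC):
    """Hypothetical stand-in for the base utility class; the real API may differ."""

    @abstractmethod
    def list_entities(self) -> List[str]:
        """Return the names of the tables / views to document (engine-specific)."""

    @abstractmethod
    def list_columns(self, entity: str) -> List[Dict[str, str]]:
        """Return column names and data types for one entity (engine-specific)."""

    def create(self) -> List[Dict[str, object]]:
        # Engine-independent logic: walk every entity and collect its columns.
        return [
            {"entity": name, "columns": self.list_columns(name)}
            for name in self.list_entities()
        ]


class SQLiteDataDictionaryCreator(DataDictionaryCreatorSketch):
    """Example subclass holding the SQLite-specific queries and connection."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self.connection = connection

    def list_entities(self) -> List[str]:
        rows = self.connection.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type IN ('table', 'view') AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
        return [row[0] for row in rows]

    def list_columns(self, entity: str) -> List[Dict[str, str]]:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        rows = self.connection.execute(f'PRAGMA table_info("{entity}")')
        return [{"name": row[1], "data_type": row[2]} for row in rows]


# Usage against an in-memory database with a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
entities = SQLiteDataDictionaryCreator(conn).create()
print(entities)
```

Keeping the engine-specific queries in the subclass means the shared traversal logic in the base class does not need to change when another engine is added.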
