-# 🧠 GenAI SQL Tools Suite
+# GenAI SQL Tools Suite

 A production-ready suite of modular, asynchronous tools for analyzing, refactoring, commenting, and auditing SQL code using Azure OpenAI (GPT-4o).

 > Designed with **security**, **performance**, **auditability**, and **HIPAA/HITECH compliance** in mind.

 ---

-## 📁 Project Structure
+## Project Structure

 ```
 sql_tools/
@@ -48,47 +48,47 @@ sql_tools/

 ---

-## ✅ Features
-
-- 🧹 Data Masking and Anonymization - Automatically masks sensitive data such as emails, phone numbers, credit card numbers, and SSNs.
-- ⚙️ Modular task engine (comment, analyze, refactor, audit, explain, test)
-- 🔍 Query Simulation and Validation
-- 📋 Centralized prompt management via `prompts/index.yaml`
-- ⚡ Asynchronous OpenAI integration using `httpx`
-- 🧼 Sanitized output with `--sanitize`
-- 🔁 Directory recursion and batching with `--recursive`
-- 🧪 Preview results before modifying files with `--dry-run`
-- 🔐 Backup and HIPAA-safe logging
-- 🔀 Git integration: auto-stage files with `--git`
-- 📤 Export to separate files using `--output`
-- 🌐 Support for multiple SQL dialects using nl_to_sql task (e.g., T-SQL, PostgreSQL, Oracle)
-- 🚀 Natural Language to SQL Conversion
-- 🔐 Enhanced security audits with SQL injection and role misuse detection
-- 📊 Performance benchmarking and optimization
-- 🔧 Data masking and anonymization
-- 🎨 SQL Style Guide Enforcement
-- 🛠 Dynamic SQL Detection
-- 🧑🏫 SQL Education Mode (Interactive Tutorials)
+## Features
+
+- Data Masking and Anonymization - Automatically masks sensitive data such as emails, phone numbers, credit card numbers, and SSNs.
+- Modular task engine (comment, analyze, refactor, audit, explain, test)
+- Query Simulation and Validation
+- Centralized prompt management via `prompts/index.yaml`
+- Asynchronous OpenAI integration using `httpx`
+- Sanitized output with `--sanitize`
+- Directory recursion and batching with `--recursive`
+- Preview results before modifying files with `--dry-run`
+- Backup and HIPAA-safe logging
+- Git integration: auto-stage files with `--git`
+- Export to separate files using `--output`
+- Support for multiple SQL dialects using nl_to_sql task (e.g., T-SQL, PostgreSQL, Oracle)
+- Natural Language to SQL Conversion
+- Enhanced security audits with SQL injection and role misuse detection
+- Performance benchmarking and optimization
+- Data masking and anonymization
+- SQL Style Guide Enforcement
+- Dynamic SQL Detection
+- SQL Education Mode (Interactive Tutorials)

 ---

-## 🧪 CLI Usage
+## CLI Usage

-## 🔒 Mask Sensitive Data in SQL Queries
+## Mask Sensitive Data in SQL Queries

 ```bash
 python app.py --task=mask --path=example.sql --output=masked_example.sql
 ```

-### 🔍 What It Does:
+### What It Does:
 - **Task**: `mask` automatically identifies and masks sensitive data such as:
   - Email addresses.
   - Phone numbers.
   - Credit card numbers.
   - Social Security Numbers (SSNs).
 - **`--output=...`**: Writes the masked SQL to a new file.

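The diff does not show how the `mask` task finds sensitive values, so the following is only a minimal sketch of one plausible approach: regex substitution over the SQL text. The pattern list, the replacement strings, and the `mask_sql` function name are all assumptions made for illustration, not the suite's actual code.

```python
import re

# Hypothetical patterns for illustration only; the real task may use
# different rules (or the model itself) to locate sensitive values.
# Order matters: SSNs and card numbers are masked before phone numbers.
MASK_PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "***@***.***"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "***-**-****"),
    (re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"), "****-****-****-****"),
    (re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"), "***-***-****"),
]


def mask_sql(sql: str) -> str:
    """Return the SQL text with each matched value replaced by a fixed mask."""
    for pattern, replacement in MASK_PATTERNS:
        sql = pattern.sub(replacement, sql)
    return sql


if __name__ == "__main__":
    example = (
        "INSERT INTO patients (email, phone, ssn) "
        "VALUES ('jane.doe@example.com', '555-123-4567', '123-45-6789');"
    )
    print(mask_sql(example))
```

Fixed-shape masks keep the statement syntactically valid, but simple patterns like these will miss unusual formats, so a sketch of this kind is no substitute for a reviewed compliance control.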
-### 🔧 Enforce SQL Style Guide
+### Enforce SQL Style Guide
 ```bash
 python app.py --task=style_enforce --path=example.sql --sql_dialect=PostgreSQL --output=styled_example.sql
 ```
@@ -98,57 +98,57 @@ python app.py --task=style_enforce --path=example.sql --sql_dialect=PostgreSQL -
 - **`--sql_dialect=...`**: Specifies the SQL dialect (e.g., PostgreSQL, T-SQL).
 - **`--output=...`**: Writes the styled SQL to a new file.

-### 🔧 Comment a SQL file
+### Comment a SQL file
 ```bash
 python app.py --task=comment --path=example.sql
 ```

-### 🧼 Clean and save to new file
+### Clean and save to new file
 ```bash
 python app.py --task=comment --path=example.sql --sanitize --output=cleaned_example.sql
 ```

-### 🔍 Preview refactored query (no overwrite)
+### Preview refactored query (no overwrite)
 ```bash
 python app.py --task=refactor --path=query.sql --dry-run
 ```

-### 🗃️ Process all .sql files in folder (with backups)
+### Process all .sql files in folder (with backups)
 ```bash
 python app.py --task=analyze --path=./sql_scripts --recursive --backup
 ```

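The behaviour behind `--recursive`, `--backup`, and `--dry-run` is not visible in this diff. As a rough sketch of how such flags could fit together, the helpers below collect `.sql` files and guard every write. The function names and the `.bak` suffix are assumptions for illustration only.

```python
import shutil
from pathlib import Path


def find_sql_files(root: Path, recursive: bool) -> list[Path]:
    """Collect .sql files under root, descending into subfolders if asked."""
    if root.is_file():
        return [root]
    pattern = "**/*.sql" if recursive else "*.sql"
    return sorted(root.glob(pattern))


def write_with_backup(path: Path, new_text: str, backup: bool, dry_run: bool) -> bool:
    """Write new_text to path; return True only if the file was modified."""
    if dry_run:
        # Preview mode: leave the file on disk untouched.
        return False
    if backup:
        # Hypothetical naming scheme: query.sql -> query.sql.bak
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(new_text, encoding="utf-8")
    return True
```

Checking `dry_run` before anything else means a preview run never creates backup files either.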
-### 🔐 Run security audit and stage for Git
+### Run security audit and stage for Git
 ```bash
 python app.py --task=audit --path=query.sql --git
 ```

-### 🧪 Generate SQL test cases
+### Generate SQL test cases
 ```bash
 python app.py --task=test --path=example.sql --dry-run
 ```

-### 🚀 Natural Language to SQL Conversion (inline query)
+### Natural Language to SQL Conversion (inline query)
 ```bash
 python app.py --task=nl_to_sql --path="list all patients diagnosed with diabetes last month" --sql_dialect="PostgreSQL" --schema_path="schema/schema.json" --dry-run
 ```

-### 🚀 Natural Language to SQL Conversion (query file)
+### Natural Language to SQL Conversion (query file)
 ```bash
 python app.py --task=nl_to_sql --path=queries/nl_query.txt --sql_dialect="T-SQL" --schema_path="schema/HealthClaimsDW.json" --output=output/generated_query.sql
 ```

-### 📊 Benchmark SQL query performance
+### Benchmark SQL query performance
 ```bash
 python app.py --task=benchmark --path=example.sql --dry-run
 ```

-### 📈 Visualize query execution plan
+### Visualize query execution plan
 ```bash
 python app.py --task=visualize --path=example.sql
 ```

-### 🔍 Dynamic SQL Detection
+### Dynamic SQL Detection

 Detect dynamic SQL patterns and analyze risks/optimizations:

@@ -165,7 +165,7 @@ python app.py --task=dynamic_sql --path="queries/" --recursive

 ---

-### 👩🏫 SQL Learning Mode (Interactive Tutorials)
+### SQL Learning Mode (Interactive Tutorials)
 The sql_learn_mode.py script provides an interactive platform for learning SQL concepts through quizzes, practice, and conversational guidance. It leverages an AI client (BaseAIClient) for generating dynamic SQL content, such as quiz questions and feedback on queries.

 ### Run the Learning Mode directly:
@@ -177,7 +177,7 @@ Optionally, you can **Set as startup file** from within VS Professional and clic

 ---

-## 🛠 Configuration
+## Configuration

 Edit `core/config_loader.py` to match your Azure OpenAI deployment:

@@ -190,15 +190,15 @@ AOPAI_DEPLOY_MODEL = "gpt-4o-dev"

 ---

-## 📦 Install Requirements
+## Install Requirements

 ```bash
 pip install -r requirements.txt
 ```

 ---

-## 📋 Centralized Prompt Management
+## Centralized Prompt Management

 All prompts are defined in a single `index.yaml` file, which maps specific tasks to their associated prompt templates. This design enables:

@@ -208,7 +208,7 @@ All prompts are defined in a single `index.yaml` file, which maps specific tasks

 Each task class dynamically loads its associated prompt using metadata from this file.

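The loading code itself is not part of this diff, so the sketch below only illustrates the general idea of a task resolving its prompt from a central index. The dictionary stands in for the parsed contents of `index.yaml` (for example, what PyYAML's `yaml.safe_load` would return). Only the key `commenter.add_comments` comes from the README; the `template` field, its text, and the `CommentTask` class are hypothetical.

```python
from dataclasses import dataclass

# Stand-in for the parsed index file. The "template" field and its wording
# are assumptions; the real entry layout is not shown in this diff.
PROMPT_INDEX = {
    "commenter.add_comments": {
        "template": "Add explanatory comments to this {dialect} query:\n{sql}",
    },
}


@dataclass
class CommentTask:
    """Hypothetical task class that looks up its prompt by key."""

    prompt_key: str = "commenter.add_comments"

    def build_prompt(self, index: dict, **params: str) -> str:
        try:
            entry = index[self.prompt_key]
        except KeyError:
            raise KeyError(f"No prompt registered for task '{self.prompt_key}'") from None
        # Fill the template placeholders with the caller's values.
        return entry["template"].format(**params)


if __name__ == "__main__":
    print(CommentTask().build_prompt(PROMPT_INDEX, dialect="T-SQL", sql="SELECT 1;"))
```

Keeping the lookup key on the task and the text in the index means a prompt can be reworded without touching task code, which is the benefit the section describes.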
-### 🧾 Example `index.yaml` Entry
+### Example `index.yaml` Entry

 ```yaml
 commenter.add_comments:
@@ -237,7 +237,7 @@ commenter.add_comments:

 ---

-## 🔐 Security & Compliance
+## Security & Compliance

 - Logs are stored per task under the `logs/` directory
 - Safe use of T-SQL comments (`--`, `/* ... */`)
@@ -246,21 +246,21 @@ commenter.add_comments:

 ---

-## 📄 License
+## License

 This project is licensed under the [MIT License](./LICENSE).

 ---

-## 🤝 Contributing (Comming Soon)
+## Contributing (Coming Soon)

 - *Fork and open a PR*
 - *Follow modular design and solid principles*
 - *Ensure proper logging, error handling, and secure configuration*

 ---

-## 🙌 Authors & Acknowledgments
+## Authors & Acknowledgments

 - Vision and engineering by **Hans Esquivel**
 - Powered by **Python & Azure OpenAI**