This project lets users upload their own SQL databases and query them in natural language, powered by large language models (LLMs) like Google Gemini.
- Upload any database (SQLite, MySQL, PostgreSQL, etc.) and query it with natural language.
- Automatic schema analysis and dynamic table/column selection for each question.
- Mini-schema optimization: only the relevant tables/columns are sent to the LLM for each query, keeping it fast and scalable for large databases (see the sketch after this list).
- Persistent conversation memory (with SQLite) for context-aware, multi-turn chat.
- Summarization for long conversations.
- Performance monitoring and timing logs for each major step (table selection, LLM call, SQL execution).
- Robust fallback: If semantic table selection fails, falls back to keyword-based selection.
- Configurable LLM model: easily change the Gemini model in `main.py` via `config.MODEL_NAME`.
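
To make the mini-schema idea concrete, here is a minimal sketch of the kind of prompt-side schema pruning described above. It is illustrative only; the function name, schema format, and example tables are assumptions, not the project's actual code:

```python
def build_mini_schema(full_schema, relevant_tables):
    """Render only the tables selected for the current question as a
    compact schema string to include in the LLM prompt."""
    return "\n".join(
        f"CREATE TABLE {table} ({', '.join(full_schema[table])});"
        for table in relevant_tables
    )

full_schema = {
    "movies":  ["id INTEGER", "title TEXT", "year INTEGER"],
    "actors":  ["id INTEGER", "name TEXT"],
    "roles":   ["movie_id INTEGER", "actor_id INTEGER"],
    "reviews": ["id INTEGER", "movie_id INTEGER", "rating REAL"],
}

# For "How many movies has Tom Hanks been in?", only three tables are
# relevant, so `reviews` never enters the prompt at all.
print(build_mini_schema(full_schema, ["movies", "actors", "roles"]))
```
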
1. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   ```
2. **Set up your LLM provider**
   - Google Gemini (default):
     - Get a Google Gemini API key.
     - Set it in your environment or `.env` file:

       ```bash
       GOOGLE_API_KEY=your-key-here
       ```

     - The default model is `gemini-1.5-pro-002`. You can change this in `main.py`:

       ```python
       config.MODEL_NAME = "gemini-1.5-pro-002"  # or another supported model
       ```

   - Other providers:
     - (Planned) You can add support for OpenAI, local models, or user-supplied keys in the future.
3. **Run the app**

   ```bash
   python main.py
   ```

   The Gradio UI will be available at http://localhost:7860.
4. **Upload or connect to your database**
   - Use the UI to upload a SQLite file or enter a connection string for MySQL, PostgreSQL, etc. (example strings below).
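
   Example connection strings, assuming the app accepts SQLAlchemy-style URLs (an assumption; adjust to whatever format the connection field expects):

   ```text
   sqlite:///path/to/database.db
   mysql+pymysql://user:password@localhost:3306/mydb
   postgresql+psycopg2://user:password@localhost:5432/mydb
   ```
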
5. **Ask questions!**
   - Try queries like "How many movies has Tom Hanks been in? Name them."
- **Gemini API Quota:**
  - The free tier has strict per-minute and per-day limits. If you see 429 errors, you have exceeded your quota: wait for the limit to reset, use a different key, or upgrade your plan (a backoff sketch follows).
  - For deployment, consider letting users supply their own API keys or supporting multiple providers.
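
  A simple way to survive transient 429s is exponential backoff around the LLM call. This is a minimal sketch, not the project's code; `call_llm` is a hypothetical wrapper for whatever function makes the Gemini request:

  ```python
  import time

  def call_with_backoff(call_llm, prompt, max_retries=4):
      """Retry a rate-limited LLM call with exponential backoff.
      Assumes `call_llm` raises an exception whose message contains
      '429' when the quota is exceeded."""
      for attempt in range(max_retries):
          try:
              return call_llm(prompt)
          except Exception as exc:
              if "429" not in str(exc) or attempt == max_retries - 1:
                  raise  # not a quota error, or out of retries
              wait = 2 ** attempt  # 1s, 2s, 4s, ...
              print(f"[quota] 429 received; retrying in {wait}s")
              time.sleep(wait)
  ```
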
- **Semantic Table Selection:**
  - If semantic selection fails (e.g., due to vectorstore/embedding issues), the system falls back to keyword-based selection and logs the error (a sketch of the fallback follows).
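
  The fallback amounts to a try/except around the semantic path. A minimal sketch with hypothetical names (`semantic_select` stands in for the embedding-based selector):

  ```python
  def select_tables(question, table_names, semantic_select):
      """Prefer embedding-based table selection; fall back to simple
      keyword matching if the vectorstore/embedding step raises."""
      try:
          return semantic_select(question)
      except Exception as exc:
          print(f"[fallback] semantic selection failed ({exc}); using keywords")
          q = question.lower()
          matches = [t for t in table_names if t.lower() in q]
          return matches or table_names  # worst case: keep all tables
  ```
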
- **LLM Model Configuration:**
  - Change the model by editing `config.MODEL_NAME` in `main.py`.
- **Performance:**
  - Timing logs are printed for table selection, mini-schema construction, the LLM call, and SQL execution to help you profile and optimize (a timing sketch follows).
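
  Timing like this is easy to reproduce with a small context manager; a sketch, not the project's actual logging code:

  ```python
  import time
  from contextlib import contextmanager

  @contextmanager
  def timed(step):
      """Print how long a named step took, e.g. '[timing] LLM call: 1.82s'."""
      start = time.perf_counter()
      try:
          yield
      finally:
          print(f"[timing] {step}: {time.perf_counter() - start:.2f}s")

  # Usage around each major step:
  with timed("table selection"):
      time.sleep(0.1)  # stand-in for the real call
  ```
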
Planned improvements:

- Add support for user-supplied API keys in the UI.
- Add multi-provider fallback (Gemini, OpenAI, local models).
- Integrate dynamic few-shot example selection for each question.
- Implement caching for table selection and LLM outputs (see the sketch below).
- Add more robust error handling and user feedback for quota issues.
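
One possible shape for the planned LLM-output caching, sketched here as an assumption rather than existing code:

```python
import hashlib

_llm_cache = {}

def cached_llm_call(prompt, call_llm):
    """Memoize LLM outputs by a hash of the full prompt, so repeated
    questions over the same mini-schema skip the API call (and the quota)."""
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _llm_cache:
        _llm_cache[key] = call_llm(prompt)
    return _llm_cache[key]
```
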
Contributions and feedback are welcome!