Convert natural language queries into SQL using LangChain and Google's Gemini model. This notebook demonstrates integration with Cloud SQL, query generation, automated execution, and response rephrasing. Ideal for AI-powered database interactions and automated analytics workflows.

VisvaV/Natural-Language-to-SQL-Query-Generation-with-Gemini

Natural Language to SQL Query Generation using Gemini and LangChain

Project Overview

This project demonstrates how to convert natural language questions into SQL queries using Google's Gemini AI and LangChain. It integrates with Google Cloud SQL to execute the generated queries automatically, so users can query a database in plain English.

Key Features

  • Natural Language Processing (NLP): Converts plain English into SQL queries.
  • AI Integration: Uses Google's Gemini AI model (gemini-2.0-flash) to generate SQL queries.
  • Automated Query Execution: Automatically executes generated queries on Google Cloud SQL.
  • Result Rephrasing: Restates raw query results as a readable natural-language answer.
  • Error Handling: Comprehensive diagnostics and SQL query sanitization.

Technical Stack

  • Python 3
  • LangChain
  • Gemini AI (Google Generative AI)
  • Google Cloud SQL
  • SQLAlchemy
  • PyMySQL

Setup Instructions

1. Clone the Repository

git clone https://github.com/VisvaV/Natural-Language-to-SQL-Query-Generation-with-Gemini
cd Natural-Language-to-SQL-Query-Generation-with-Gemini

2. Install Dependencies

pip install langchain langchain-openai langchain-google-genai sqlalchemy pymysql google-cloud-sql-connector langsmith

3. Configure Environment Variables

Set these variables:

export GOOGLE_API_KEY='your_google_api_key'
export LANGSMITH_API_KEY='your_langsmith_api_key'
export LANGCHAIN_ENDPOINT='https://api.smith.langchain.com'
export LANGCHAIN_PROJECT='your_langchain_project_name'
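
Before running the notebook, it can help to confirm that these variables are visible to Python. The following is a minimal sketch, not part of the project: the variable names come from the export commands above, while the helper name `missing_env_vars` is hypothetical.

```python
import os

# Variable names are taken from the export commands above.
REQUIRED_VARS = (
    "GOOGLE_API_KEY",
    "LANGSMITH_API_KEY",
    "LANGCHAIN_ENDPOINT",
    "LANGCHAIN_PROJECT",
)


def missing_env_vars(env=os.environ):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.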

4. Google Cloud Authentication

Place your Google Cloud credentials JSON file in the project directory and set the path:

import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/your/credentials.json"

Database Schema Description

Include your database schema description CSV in the repository root as database_table_descriptions.csv. This file helps the model accurately generate queries.

database_table_descriptions.csv

This CSV should describe tables, columns, data types, constraints, and relationships.
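
One way to turn such a CSV into text for the prompt is sketched below. The column names (`table_name`, `column_name`, `data_type`, `description`) and the helper `build_table_info` are assumptions made for illustration; the layout of the real file may differ.

```python
import csv
import io
from collections import defaultdict


def build_table_info(csv_file):
    """Group schema rows by table and render them as prompt-ready text."""
    tables = defaultdict(list)
    for row in csv.DictReader(csv_file):
        tables[row["table_name"]].append(
            f"  - {row['column_name']} ({row['data_type']}): {row['description']}"
        )
    return "\n".join(
        f"Table {name}:\n" + "\n".join(columns) for name, columns in tables.items()
    )


# Hypothetical sample content; a real file would be opened with
# open("database_table_descriptions.csv", newline="").
sample = io.StringIO(
    "table_name,column_name,data_type,description\n"
    "customers,id,INT,Primary key\n"
    "customers,name,VARCHAR(100),Customer full name\n"
    "orders,customer_id,INT,Foreign key to customers.id\n"
)
table_info = build_table_info(sample)
print(table_info)
```

The resulting string can then be supplied as the `table_info` input when the prompt is built.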

Workflow and Concepts Explained

LangChain Pipelines

LangChain orchestrates the pipeline: it formats the user's question into a prompt, calls the Gemini model, and passes the generated SQL to the database. Workflow steps:

  1. Prompt Templating
  2. SQL Query Generation by Gemini
  3. Query Cleaning
  4. Execution on Google Cloud SQL
  5. Result Rephrasing
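
The five steps above can be sketched end to end in plain Python. This is an illustrative outline under stated assumptions, not the project's LangChain code: `fake_model` stands in for the Gemini call, an in-memory SQLite database stands in for Google Cloud SQL, and all function names are hypothetical. The cleaning step assumes the model may wrap its answer in Markdown fences or prefix it with a `SQLQuery:` label.

```python
import re
import sqlite3

FENCE = "`" * 3  # Markdown code fence, built this way to keep the source readable


def build_prompt(question, table_info):
    # Step 1: prompt templating.
    return f"Schema:\n{table_info}\n\nWrite one SQL query answering: {question}"


def fake_model(prompt):
    # Step 2: stand-in for the Gemini call; returns a canned, fenced answer.
    return f"{FENCE}sql\nSELECT COUNT(*) FROM customers;\n{FENCE}"


def clean_sql(raw):
    # Step 3: drop Markdown fences and a leading "SQLQuery:" label.
    text = re.sub(re.escape(FENCE) + r"(?:sql\b)?", "", raw, flags=re.IGNORECASE)
    text = re.sub(r"^\s*SQLQuery:\s*", "", text, flags=re.IGNORECASE)
    return text.strip()


def run_query(conn, sql):
    # Step 4: the project runs this on Google Cloud SQL; SQLite stands in here.
    return conn.execute(sql).fetchall()


def rephrase(question, rows):
    # Step 5: stand-in for the model call that words the final answer.
    return f"Q: {question} | A: {rows[0][0]}"


def answer(question, table_info, conn, model=fake_model):
    sql = clean_sql(model(build_prompt(question, table_info)))
    return sql, rephrase(question, run_query(conn, sql))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)", [("Ada",), ("Lin",)])
sql, reply = answer("How many customers are there?", "customers(id, name)", conn)
print(sql)
print(reply)
```

Keeping each step as a separate function mirrors how the stages are chained in the notebook and makes it possible to test the cleaning step without calling the model.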

Prompt Engineering

Including detailed schema descriptions and a clear task definition in the prompt improves the accuracy of the generated queries.
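
A prompt of this kind might look like the sketch below. The wording of the template, the MySQL dialect, and the helper `render_prompt` are assumptions for illustration; only the `question` and `table_info` key names come from this README.

```python
# Hypothetical template: states the task first, then the schema, then the question.
PROMPT_TEMPLATE = """You are a MySQL expert. Using only the tables described below,
write one syntactically correct SQL query that answers the question.
Return the SQL only, with no explanation.

Schema:
{table_info}

Question: {question}
SQL:"""


def render_prompt(question, table_info):
    """Fill the template; both arguments are plain strings."""
    return PROMPT_TEMPLATE.format(
        question=question.strip(), table_info=table_info.strip()
    )


prompt = render_prompt(
    "Which customer placed the most orders?",
    "Table customers: id (INT), name (VARCHAR)\n"
    "Table orders: id (INT), customer_id (INT)",
)
print(prompt)
```

Restricting the model to the listed tables and asking for SQL only tends to reduce invented column names and makes the output easier to clean.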

Error Handling and Debugging

The project includes detailed error handling:

  • KeyError: Avoided by using consistent input key names (input, question, table_info) across the chain.
  • SQL Syntax Errors: Surfaced through SQLAlchemy when a generated query is invalid or does not match the schema.
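
A small guard can turn a bare KeyError into a more descriptive message before the chain runs. This is a minimal sketch: the key names come from the list above, but which keys a given step actually requires, and the helper `validate_inputs`, are assumptions.

```python
# Key names come from the list above; which keys a given chain step
# actually requires is an assumption made for this sketch.
REQUIRED_KEYS = ("question", "table_info")


def validate_inputs(inputs, required=REQUIRED_KEYS):
    """Raise a descriptive KeyError if any required key is missing."""
    missing = [key for key in required if key not in inputs]
    if missing:
        raise KeyError(
            f"Missing input keys: {missing}. Received keys: {sorted(inputs)}"
        )
    return inputs


# Passes: both required keys are present.
validate_inputs({"question": "How many orders?", "table_info": "orders(id)"})

# Fails: the caller used "input" where "question" was expected.
try:
    validate_inputs({"input": "How many orders?"})
except KeyError as err:
    print(err)
```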

Troubleshooting

Common issues:

  • Credential Errors: Verify Google Cloud credentials and paths.
  • KeyError Issues: Ensure correct input keys.
  • SQL Query Errors: Verify queries against provided schema.

Use Cases

Applicable for:

  • Business Intelligence Automation
  • AI-powered Database Interfaces
  • Automated Analytics and Reporting
  • Quick Data Exploration

Future Enhancements

  • Implement caching mechanisms.
  • Support additional SQL dialects and databases.
  • Optimize prompt engineering for accuracy.

Contribution

Contributions are welcome. Submit issues or pull requests via GitHub.
