1+ <!DOCTYPE html>
2+ < html lang ="en ">
3+ < head >
4+ < meta charset ="UTF-8 " />
5+ < meta name ="viewport " content ="width=device-width,initial-scale=1 " />
6+ < title > LLMSQL Project — Text-to-SQL Benchmark</ title >
7+
8+ < link rel ="stylesheet " href ="_static/styles/front_page.css ">
9+
10+
11+ </ head >
12+ < body >
13+
14+ <!-- Sidebar -->
15+ < div class ="sidebar ">
16+ < div class ="sidebar-content ">
17+ < a class ="sidebar-button " href ="https://llmsql.github.io/llmsql-benchmark/docs " target ="_blank " rel ="noopener "> 📚 Documentation</ a >
18+ < input type ="text " placeholder ="Search... " class ="sidebar-search "/>
19+ </ div >
20+ </ div >
21+
22+ <!-- Right panel -->
23+ < aside class ="on-this-page " aria-label ="On this page ">
24+ < h4 > ON THIS PAGE</ h4 >
25+ < ul >
26+ < li > < a href ="#description "> Description</ a > </ li >
27+ < li > < a href ="#improvements "> Key improvements</ a > </ li >
28+ < li > < a href ="#documentation "> Documentation</ a > </ li >
29+ < li > < a href ="#quick-start "> Quick Start</ a > </ li >
30+ < li > < a href ="#links "> Resources</ a > </ li >
32+ < li > < a href ="#leaderboard "> Leaderboard</ a > </ li >
33+ < li > < a href ="#citation "> Citation</ a > </ li >
34+ </ ul >
35+ </ aside >
36+
37+ <!-- Main content -->
38+ < main >
39+ < div class ="center-content ">
40+ < h1 > Welcome to LLMSQL Project</ h1 >
41+
42+ < div class ="badges " aria-hidden ="true ">
43+ < img src ="https://img.shields.io/badge/pypi-v0.1.11-555 " alt ="pypi v0.1.11 ">
44+ < img src ="https://img.shields.io/github/stars/LLMSQL/llmsql-benchmark?style=social " alt ="GitHub stars ">
45+ < img src ="https://img.shields.io/badge/docs-online-blue " alt ="docs online ">
46+ < img src ="https://img.shields.io/badge/dataset-HuggingFace-orange " alt ="dataset HuggingFace ">
47+ </ div >
48+
49+ < p style ="margin-top:10px; "> LLMSQL is a Python package for evaluating LLMs on Text-to-SQL tasks, with inference through vLLM or HuggingFace Transformers.</ p >
50+ </ div >
51+
52+ < h2 id ="description "> 💡 Description</ h2 >
53+ < p > < strong > LLMSQL Benchmark</ strong > is an < strong > open-source framework</ strong > providing a < strong > modernized, cleaned, and extended version of the original WikiSQL dataset</ strong > , specifically designed for evaluating and fine-tuning < strong > Large Language Models (LLMs)</ strong > on < strong > Text-to-SQL</ strong > tasks.</ p >
54+
55+ < h2 id ="improvements "> Key improvements</ h2 >
56+ < ul >
57+ < li > < strong > Data Cleaning:</ strong > Fixed errors (type mismatches, case sensitivity) that caused 41% of queries to return empty results.</ li >
58+ < li > < strong > LLM-Ready Format:</ strong > Replaced numeric placeholders with standard SQL, improving training consistency.</ li >
59+ </ ul >
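< p > To illustrate what "numeric placeholders" means: the original WikiSQL annotations store each query as column and operator indices rather than SQL text. The sketch below converts one such record into a plain SQL string. The operator tables follow the original WikiSQL release; the function name, the sample record, and the < code > players</ code > table are hypothetical, and this is not necessarily how LLMSQL performs its own conversion.</ p >

```python
# Operator tables as published with the original WikiSQL dataset.
AGG_OPS = ["", "MAX", "MIN", "COUNT", "SUM", "AVG"]
COND_OPS = ["=", ">", "<"]


def quote_ident(name: str) -> str:
    """Quote a table or column name as an SQL identifier."""
    return '"' + name.replace('"', '""') + '"'


def quote_value(value) -> str:
    """Render a condition value as an SQL literal."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def logical_form_to_sql(query: dict, columns: list[str], table: str) -> str:
    """Turn a WikiSQL-style record (indices only) into an SQL string.

    Hypothetical helper for illustration; not part of the llmsql package.
    """
    select_col = quote_ident(columns[query["sel"]])
    agg = AGG_OPS[query["agg"]]
    select_expr = f"{agg}({select_col})" if agg else select_col
    sql = f"SELECT {select_expr} FROM {quote_ident(table)}"

    # Each condition is [column index, operator index, value].
    conds = [
        f"{quote_ident(columns[col])} {COND_OPS[op]} {quote_value(val)}"
        for col, op, val in query["conds"]
    ]
    if conds:
        sql += " WHERE " + " AND ".join(conds)
    return sql


# Made-up record: "sel", "agg" and the condition entries are all indices.
example = {"sel": 1, "agg": 3, "conds": [[0, 0, "Toronto"]]}
columns = ["City", "Player"]
sql = logical_form_to_sql(example, columns, "players")
print(sql)
```

< p > A model trained on the index form has to learn the operator tables implicitly, whereas the SQL string can be read, executed, and compared directly.</ p >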
60+
61+ < h2 id ="documentation "> 📚 Documentation</ h2 >
62+ < div class ="note-box ">
63+ < p > < strong > Note:</ strong > Documentation pages (installation guide, API reference) are < strong > under construction</ strong > . See < strong > Quick Start</ strong > below.</ p >
64+ </ div >
65+
66+ < h2 id ="quick-start "> ⚡ Quick Start</ h2 >
67+ < div class ="custom-highlight-box ">
68+ < p > < strong > ⚠️ WARNING — Reproducibility</ strong > </ p >
69+
70+ < p >
71+ vLLM and HuggingFace Transformers may produce < strong > different results</ strong > even with the same
72+ settings (e.g., temperature=0). This is due to differences in implementation, computation precision,
73+ and batching mechanisms.
74+ </ p >
75+
76+ < p > < strong > Recommendation:</ strong > when comparing model quality, use the < strong > same backend</ strong >
77+ (either only vLLM or only Transformers).</ p >
78+
79+ < p > < strong > Sources:</ strong > < br />
80+ • vLLM FAQ:
81+ < a href ="https://docs.vllm.ai/en/latest/usage/faq/ " target ="_blank " rel ="noopener "> FAQ</ a > < br />
82+ • Model Support Policy:
83+ < a href ="https://docs.vllm.ai/en/latest/models/supported_models/#embedding " target ="_blank " rel ="noopener ">
84+ Supported Models
85+ </ a >
86+ </ p >
87+ </ div >
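< p > One reason two backends can disagree at temperature=0 is that floating-point addition is not associative, so summing the same numbers in a different order (for example, under a different batching or kernel layout) can change the last bits of a logit. The toy sketch below shows that effect flipping a greedy token choice. It is only an illustration in plain Python floats; the logits and the < code > greedy_token</ code > helper are made up and do not reproduce the actual vLLM or Transformers kernels.</ p >

```python
def greedy_token(logits):
    """Return the index of the largest logit (ties go to the lowest index)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# The same three contributions, accumulated in two different orders.
left_to_right = (0.1 + 0.2) + 0.3   # 0.6000000000000001
right_to_left = 0.1 + (0.2 + 0.3)   # 0.6

# Token 0 has a fixed score of 0.6; token 1's score is the accumulated sum.
logits_a = [0.6, left_to_right]
logits_b = [0.6, right_to_left]

print(left_to_right == right_to_left)  # False
print(greedy_token(logits_a))          # 1
print(greedy_token(logits_b))          # 0
```

< p > Once a single token differs, every later token is conditioned on a different prefix, which is why small numeric differences can lead to visibly different generations.</ p >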
88+
89+ < h3 > Installation</ h3 >
90+ < pre > < code > pip install llmsql</ code > </ pre >
91+
92+ < h3 > Recommended Workflow (vLLM)</ h3 >
93+ < pre > < code > pip install "llmsql[vllm]"
94+ llmsql evaluate --model gpt-4 --dataset llmsql_dev</ code > </ pre >
95+
96+ < h3 > Evaluation API (Python)</ h3 >
97+ < pre > < code > from llmsql import LLMSQLEvaluator
98+
99+ evaluator = LLMSQLEvaluator(workdir_path="llmsql_workdir")
100+ report = evaluator.evaluate(outputs_path="path_to_your_outputs.jsonl")
101+ print(report)
102+ </ code > </ pre >
103+
104+ < h2 id ="links "> 🔗 Resources</ h2 >
105+ < table >
106+ < thead > < tr > < th > Resource</ th > < th > Details</ th > </ tr > </ thead >
107+ < tbody >
108+ < tr > < td > 📦 < strong > PyPI Project</ strong > </ td > < td > < a href ="https://pypi.org/project/llmsql/ "> llmsql on PyPI</ a > </ td > </ tr >
109+ < tr > < td > 💾 < strong > Dataset on Hugging Face</ strong > </ td > < td > < a href ="https://huggingface.co/datasets/llmsql-bench/llmsql-benchmark "> llmsql-bench dataset</ a > </ td > </ tr >
110+ < tr > < td > 💻 < strong > Source Code</ strong > </ td > < td > < a href ="https://github.com/LLMSQL/llmsql-benchmark "> GitHub repo</ a > </ td > </ tr >
111+ </ tbody >
112+ </ table >
113+
114+ < h2 id ="leaderboard "> 📊 Leaderboard [in progress]</ h2 >
115+ < div class ="custom-highlight-box ">
116+ < p > The official leaderboard is still < strong > in progress</ strong > and currently empty. Submit your model results to be the first in the ranking!</ p >
117+ </ div >
118+
119+ < h2 id ="citation "> 📄 Citation</ h2 >
120+ < pre > < code > @inproceedings{llmsql_bench,
121+ title={LLMSQL: Upgrading WikiSQL for the LLM Era of Text-to-SQL},
122+ author={Pihulski, Dzmitry and Charchut, Karol and Novogrodskaia, Viktoria and Koco{\'n}, Jan},
123+ booktitle={2025 IEEE ICDMW},
124+ year={2025},
125+ organization={IEEE}
126+ }
127+ </ code > </ pre >
128+
129+ < div class ="center-content small ">
130+ 💬 Made with ❤️ by the LLMSQL Team< br >
131+ </ div >
132+ </ main >
133+
134+ < script src ="_static/scripts/front_page.js "> </ script >
135+
136+ </ body >
137+ </ html >