@@ -97,8 +97,6 @@ You need the following to run the QueryCraft pipeline:
97
97
98
98
====
99
99
100
-
======
101
-
102
100
## Features/Modules in QueryCraft
103
101
The QueryCraft pipeline is built of 8 modules/components.
104
102
@@ -169,7 +167,8 @@ Configure your environment and services by editing the `simpleConfig.ini` and `
169
167
`SuperKnowa-QueryCraft` can run the whole pipeline (Context Retriever -> Fine-tuning -> Inference -> Query Correction -> Evaluation -> Query Analysis dashboard) end to end, or you can run each component individually.
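Running one component on its own can be scripted. The following is a minimal sketch, not part of QueryCraft: `run_component` is a hypothetical helper, and it assumes that `runQueryCraft.sh` reads the component name (such as `dataIngestion` or `evaluation`, as used in the steps of this README) from standard input. It also checks that the launcher script and `simpleConfig.ini` exist before starting.

```bash
#!/bin/sh
# Hypothetical helper (not shipped with QueryCraft): run a single component
# without typing its name at the prompt.
# Assumption: runQueryCraft.sh reads the component name from standard input.
run_component() {
    component="$1"
    script="${2:-runQueryCraft.sh}"   # launcher script, default in current directory
    config="${3:-simpleConfig.ini}"   # config file that must be edited beforehand

    if [ -z "$component" ]; then
        echo "usage: run_component <component> [script] [config]" >&2
        return 2
    fi
    if [ ! -f "$script" ]; then
        echo "missing launcher script: $script" >&2
        return 1
    fi
    if [ ! -f "$config" ]; then
        echo "missing config file: $config" >&2
        return 1
    fi

    # Feed the component name to the launcher as if it had been typed.
    printf '%s\n' "$component" | sh "$script"
}
```

For example, `run_component dataIngestion` would start only the data ingestion step, provided the stdin assumption holds for your copy of the script.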
170
168
171
169
## Step by Step instructions
172
-
Please read the [detailed documentation](/document/Setting%20up%20environment.md) for step by step instructions
170
+
You need a GPU environment to run and fine-tune an LLM with the QueryCraft framework. Read [Setting up environment](/document/Setting%20up%20environment.md) for more details.
171
+
Please read the [detailed documentation](/document/) for step-by-step instructions.
173
172
174
173
## Step 0. Instruct dataset
175
174
@@ -179,26 +178,19 @@ There are three options for using your dataset to finetune/evaluate the Text to
179
178
1. Curate the golden query dataset using our annotation tool: <https://annotator.superknowa.tsglwatson.buildlab.cloud/>
180
179
1. Use the example datasets provided below for testing: Spider and KaggleDBQA
181
180
182
-
### Golden Query Annotation:
183
-
1. Go to our annotation tool. <https://annotator.superknowa.tsglwatson.buildlab.cloud/>
184
-
185
-

186
-
187
-
2. Click on the Instruction Manual and follow the instructions for curating the golden queries dataset. <https://annotator.superknowa.tsglwatson.buildlab.cloud/documentation>
Please read [Step 0. Golden Query Dataset Annotation](/document/Step%200.%20Golden%20Query%20Dataset%20Annotation.md) for step-by-step instructions on preparing the instruct dataset.
190
182
191
183
## Step 1. Data Ingestion
192
184
You have 3 options for Data Ingestion.
193
185
1. Bring Your Own Data
194
186
- If you have both a database and an instruct set (golden queries), use them directly
195
187
- If you have only a database and no instruct set, use the annotation tool above
196
188
2. Use the example set
197
-
This comes with both source dataset and instruct Db
189
+
- This comes with both the source dataset and the instruct DB
198
190
199
-
Read the detailed steps for Data ingestion in [documentation](/image/readme.md)
191
+
Please read [Step 1. Data Ingestion](/document/Step%201.%20Data%20Ingestion.md) for step-by-step instructions.
200
192
201
-
Run the Data Ingestion module of the QueryCraft pipeline using the `runQueryCraft.sh`, file with the `dataIngestion` option after setting the `simpleConfig.ini` file to insert `salary.csv` into the `querycraft_db2_testing_13march` table in db2.
193
+
Run the Data Ingestion module of the QueryCraft pipeline using the `runQueryCraft.sh` file with the `dataIngestion` option, after setting up the `simpleConfig.ini` file, to insert `salary.csv` into a table in db2.
202
194
203
195
```bash
204
196
sh runQueryCraft.sh
@@ -210,6 +202,8 @@ dataIngestion
210
202
211
203
## Step 2. Context Retriever
212
204
205
+
Please read [Step 2. Context Retriever](/document/Step%202.%20Context%20Retriever.md) for step-by-step instructions.
206
+
213
207
Execute the context retriever using the following command.
214
208
215
209
```bash
@@ -224,6 +218,8 @@ contextRetriever
224
218
225
219
## Step 3. Fine-Tuning
226
220
221
+
Please read [Step 3. Finetuning](/document/Step%203.%20Finetuning.md) for step-by-step instructions.
222
+
227
223
To start fine-tuning your LLM for the Text to SQL task, run the command below.
228
224
229
225
```bash
@@ -238,6 +234,8 @@ Follow the prompts to specify your dataset and model configuration.
238
234
239
235
## Step 4. Inference
240
236
237
+
Please read [Step 4. Inference](/document/Step%204.%20Inference.md) for step-by-step instructions.
238
+
241
239
To generate SQL queries using your fine-tuned or pre-trained model, execute:
242
240
243
241
```bash
@@ -250,6 +248,8 @@ inference
250
248
251
249
## Step 5. Query Correction
252
250
251
+
Please read [Step 5. Query Correction](/document/Step%205.%20Query%20Correction.md) for step-by-step instructions.
252
+
253
253
```bash
254
254
sh runQueryCraft.sh
255
255
```
@@ -260,6 +260,8 @@ querycorrection
260
260
261
261
## Step 6. Evaluation
262
262
263
+
Please read [Step 6. Evaluation](/document/Step%206.%20Evaluation.md) for step-by-step instructions.
264
+
263
265
Evaluate the performance of your model against the SQLite database or DB2 by running the below command:
264
266
265
267
```bash
@@ -273,6 +275,8 @@ evaluation
273
275
274
276
## Step 7. Query Analysis Dashboard
275
277
278
+
Please read [Step 7. Query Analysis](/document/Step%207.%20Query%20Analysis.md) for step-by-step instructions.
279
+
276
280
For a visual analysis of your fine-tuning experiments and generated SQL queries, launch the Streamlit dashboard:
277
281
278
282
```bash
@@ -290,7 +294,9 @@ queryanalysisDashboard
290
294
291
295
292
296
## Step 8. Run pipeline (all)
293
-
To run all components together, you can change the required parameters in `simpleConfig.ini`. You must set the default path as shown in the designated section below.
297
+
Please read [Step 8. Run Full Pipeline](/document/Step%208.%20Run%20Full%20Pipeline.md) for step-by-step instructions.
298
+
299
+
To run all components together, you can change the required parameters in `simpleConfig.ini`. You must set the default path as shown in the designated section below.