The idea behind this repo comes from this LlamaIndex project. The main changes are:
- Using Groq Llama 3.2 11B for Q&A and Llama 3 70B for evaluation
- Testing LangChain's TextSplitter encapsulated in the LlamaIndex framework
- Making 3 runs of the Q&A script, storing the statistics output at `output/qa_output.md`
- Removing the deprecated content and question datasets
- Asking the model to create test cases based on a User Story with different chunk sizes (besides the Q&A statistics part), storing the model outputs at `output/ct_gen_output.md`
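The effect of varying the chunk size can be illustrated with a naive character-based splitter. This is a simplified stand-in for LangChain's TextSplitter (the real splitters respect separators and token boundaries), shown only to make the chunk-size/chunk-count trade-off concrete:

```python
def split_text(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Naive fixed-width splitter: slide a window of chunk_size
    characters, advancing by chunk_size - overlap each step."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "lorem ipsum " * 100  # 1200-character stand-in for a PDF's extracted text
for size in (128, 256, 512):
    chunks = split_text(doc, size)
    print(f"chunk_size={size} -> {len(chunks)} chunks")
```

Smaller chunks mean more (but narrower) pieces of context are retrieved per question, which is exactly the variable the test case generation runs explore.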
To run it:

1. Set your own `GROQ_API_KEY` environment variable. You can create your key here.
2. Run the installation command below:

   ```
   pip install llama-index llama-index-embeddings-huggingface llama-index-llms-groq spacy langchain
   ```
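On Linux/macOS, the environment variable can be set like this (the key value below is a placeholder, not a real key):

```shell
# Replace the placeholder with your actual key from the Groq console.
# Never commit a real key to version control.
export GROQ_API_KEY="gsk_your_key_here"

# Confirm the variable is visible to child processes
echo "${GROQ_API_KEY:+GROQ_API_KEY is set}"
```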
3. (Optional) Set the context for the Q&A script by replacing the PDF file in `data/qa`.
4. (Optional) Set the context for Test Case Generation by replacing the PDF file in `data/ct_gen/context`.

   4.1. If you do this, remember to also replace the JSON file in `data/ct_gen/us` with your own made-up User Story, keeping the same field names.
5. Run the main script.

   5.1. To evaluate question answering, use the command below:

   ```
   python main.py -qa
   ```

   5.2. To generate test cases, use the command below:

   ```
   python main.py -tc
   ```
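For reference, the `-qa`/`-tc` dispatch in `main.py` could look like the sketch below. This is hypothetical: the README does not show the script's internals, so the use of `argparse` and the function name are assumptions; only the two flags come from the instructions above.

```python
import argparse

# Hypothetical sketch of main.py's flag handling; names beyond the
# -qa/-tc flags themselves are illustrative, not the repo's actual code.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-qa", action="store_true",
                      help="evaluate question answering")
    mode.add_argument("-tc", action="store_true",
                      help="generate test cases")
    return parser

# Simulate `python main.py -qa`
args = build_parser().parse_args(["-qa"])
print("mode:", "qa" if args.qa else "tc")
```

Making the two flags mutually exclusive and required means exactly one mode must be chosen per run, matching how steps 5.1 and 5.2 present them as alternatives.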