Here is a breakdown of the files included:
reduce2 - reduces audio quality to make the audio easier to process
Convert2 - converts raw audio data to text, then to prompt-response pairs
CorpusCounter - computes simple analytics on the raw dataset
Final - where the model is defined and fine-tuned
FinalTest - where the model is tested (getting responses from the fine-tuned GPT)
The actual model weights could not be pushed to GitHub.
We have also hidden the dataset for ethical reasons.
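
Since the dataset is not included, here is a minimal sketch of the kind of processing Convert2 and CorpusCounter describe: turning transcribed text into prompt-response pairs and computing simple corpus counts. It is not the code from those files. The function names (make_pairs, corpus_stats), the pairing rule (each utterance is the prompt for the one that follows), and the sample utterances are all assumptions made for illustration.

```python
from collections import Counter


def make_pairs(utterances):
    """Pair each utterance with the one that follows it.

    Assumption: consecutive utterances form a prompt and its response.
    The actual pairing rule used in Convert2 may differ.
    """
    return [
        {"prompt": utterances[i], "response": utterances[i + 1]}
        for i in range(len(utterances) - 1)
    ]


def corpus_stats(utterances, top_n=3):
    """Compute simple counts over whitespace-separated, lowercased words."""
    words = [w.lower() for u in utterances for w in u.split()]
    counts = Counter(words)
    return {
        "utterances": len(utterances),
        "words": len(words),
        "unique_words": len(counts),
        "top_words": counts.most_common(top_n),
    }


if __name__ == "__main__":
    # Hypothetical transcribed utterances, standing in for the hidden dataset.
    sample = ["hello there", "hi how are you", "fine thanks"]
    for pair in make_pairs(sample):
        print(pair)
    print(corpus_stats(sample))
```

Pairs in this shape can be written out one per line as JSON, which is a common input format for fine-tuning; whether Final expects that format is not stated here.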