Exploration project for different RAG components and setups.
Comparison of two otherwise identical RAG setups: one cross-lingual (data in DE, queries in DE/FR/IT) and one multilingual (data and queries in DE/FR/IT). Both use Swiss train operation regulation documents, which contain very specific terminology.
- explore_notebook.ipynb: data loading, chunking, and exploration, including terminology density
- rag_notebook.ipynb: pilot RAG setup to test different generation models (only DE data)
- compare_embeddings_notebook.ipynb: comparison of two multilingual embedding models to test embedding quality (queries in DE/FR/IT)
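
The terminology density mentioned for explore_notebook.ipynb could be measured along the following lines. This is only a minimal sketch: it assumes density means the share of tokens in a chunk that appear in a domain glossary, which may differ from the notebook's definition. The function name `terminology_density`, the glossary, and the sample chunk are hypothetical.

```python
import re


def terminology_density(chunk: str, glossary: set[str]) -> float:
    """Share of tokens in `chunk` that appear in `glossary` (case-insensitive)."""
    tokens = re.findall(r"\w+", chunk.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in glossary)
    return hits / len(tokens)


# Hypothetical glossary and chunk, for illustration only (not taken from the project data)
glossary = {"rangierbewegung", "fahrdienstleiter", "zugfahrt"}
chunk = "Der Fahrdienstleiter erteilt die Zustimmung zur Rangierbewegung."

density = terminology_density(chunk, glossary)
print(f"terminology density: {density:.2f}")
```

A single-token glossary match like this misses multi-word terms and inflected forms; it is meant as a rough first signal, not a linguistic analysis.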
The results and insights feed into the final project, CAS Project tRAG.
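
One possible way to compare two embedding models on DE/FR/IT queries is recall@k over a small set of labelled query-chunk pairs. The sketch below is an assumption about how such a comparison could look, not the method used in compare_embeddings_notebook.ipynb. The model names, the toy vectors, and the helper `recall_at_k` are hypothetical; in practice the vectors would come from the embedding models themselves.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall_at_k(query_vecs, chunk_vecs, relevant_idx, k=1):
    """Fraction of queries whose relevant chunk is among the top-k most similar chunks."""
    if not query_vecs:
        return 0.0
    hits = 0
    for query, relevant in zip(query_vecs, relevant_idx):
        ranked = sorted(
            range(len(chunk_vecs)),
            key=lambda i: cosine(query, chunk_vecs[i]),
            reverse=True,
        )
        if relevant in ranked[:k]:
            hits += 1
    return hits / len(query_vecs)


# Toy 2-d vectors standing in for real embeddings (hypothetical values).
# Each model embeds the same chunks and queries in its own vector space.
models = {
    "model_a": {
        "chunks": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
        "queries": [[0.9, 0.1], [0.1, 0.9]],
    },
    "model_b": {
        "chunks": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
        "queries": [[0.9, 0.1], [0.9, 0.2]],
    },
}
relevant_idx = [0, 1]  # index of the correct chunk for each query

scores = {
    name: recall_at_k(vecs["queries"], vecs["chunks"], relevant_idx, k=1)
    for name, vecs in models.items()
}
print(scores)
```

Running the same evaluation separately per query language (DE, FR, IT) would show whether a model's retrieval quality drops when the query language differs from the document language.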