This repository contains the material for the "Training and Fine-Tuning Large Language Models" workshop held at DHS 2024. The workshop is divided into several modules, each focusing on different aspects of working with large language models (LLMs).
- Module_01_Install_Requirements.ipynb: A notebook to set up the required environment and dependencies for the exercises in this module.
- Module_01_LC1_Simple_Word_Embedding_Models_Exercise.ipynb: Introduction to simple word embedding models, with hands-on exercises to understand how they work.
- Module_01_LC2_Contextual_Embeddings_and_Semantic_Search_Engines_with_Transformers_Exercise.ipynb: Explores the use of contextual embeddings and the implementation of semantic search engines using Transformer models.
- Module_01_LC3_Real_World_Applications_with_Fine_tuned_Transformers_Exercise.ipynb: Demonstrates real-world applications of fine-tuned Transformers in various domains.
- Module_01_LC4_Prompt_Engineering_with_Local_Open_LLMs_Exercise.ipynb: Covers techniques for prompt engineering with locally hosted open-source LLMs.
- Module_01_LC5_BONUS_Comparing_Llama_3_1_vs_GPT_4o_mini_Walkthrough.ipynb: A bonus walkthrough comparing Llama 3.1 and GPT-4o mini models in different tasks.
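The embedding and semantic-search exercises in this module revolve around one core operation: ranking documents by cosine similarity between their embedding vectors and a query embedding. A minimal sketch in plain Python (the toy 3-dimensional vectors below are invented for illustration; the notebooks use real Transformer embeddings with hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, doc_vecs, top_k=2):
    """Return the indices of the top_k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_k]]

# Toy "embeddings": docs 0 and 1 point in nearly the same direction as the query.
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.05, 0.0]
print(semantic_search(query, docs))  # → [0, 1]
```

The same ranking logic underlies the semantic search engines built in LC2; only the embedder changes.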
- Module_02_Install_Requirements.ipynb: A notebook to install necessary dependencies for this module.
- Module_02_LC1_Pre-training_GPT-2_on_Custom_Data_Exercise.ipynb: Hands-on exercise for pre-training GPT-2 on custom datasets, focusing on the customization and specialization of models.
- Module_02_LC2_Full-fine-tuning_BERT_for_Classification_Exercise.ipynb: Covers the process of fully fine-tuning a BERT model for classification tasks, with practical examples.
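Both pre-training GPT-2 and fully fine-tuning BERT minimize a cross-entropy loss; for causal pre-training it is the average negative log-probability of each actual next token. A toy illustration in plain Python (the logits and vocabulary are invented; the notebooks compute this over real model outputs):

```python
import math

def next_token_loss(logits_per_step, target_ids):
    """Average cross-entropy of predicting each next token.

    logits_per_step[t] holds raw scores over the vocabulary at position t;
    target_ids[t] is the token that actually came next in the training text.
    """
    total = 0.0
    for logits, target in zip(logits_per_step, target_ids):
        # softmax over the vocabulary, then -log of the true token's probability
        exps = [math.exp(x) for x in logits]
        z = sum(exps)
        total += -math.log(exps[target] / z)
    return total / len(target_ids)

# Toy run: 3 positions, vocabulary of 4 tokens (all scores invented).
logits = [[2.0, 0.1, 0.1, 0.1],   # model strongly favours token 0
          [0.1, 2.0, 0.1, 0.1],   # favours token 1
          [0.1, 0.1, 0.1, 2.0]]   # favours token 3
targets = [0, 1, 3]               # and those are the actual next tokens
print(next_token_loss(logits, targets))  # low loss: confident, correct guesses
```

When the targets disagree with the model's favoured tokens, the same function returns a much larger loss, which is exactly the signal gradient descent pushes down during pre-training.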
- Module_03_Install_Requirements.ipynb: A notebook to set up the environment for parameter-efficient fine-tuning.
- Module_03_LC1_Parameter-Efficient_fine-tuning_BERT_for_Classification_with_QLoRA_Exercise.ipynb: Exercise on fine-tuning BERT for classification tasks using QLoRA for parameter efficiency.
- Module_03_LC2_Parameter-Efficient_fine-tuning_BERT_for_Named_Entity_Recognition_QLoRA_Exercise.ipynb: Focuses on using QLoRA for fine-tuning BERT in named entity recognition (NER) tasks.
- Module_03_LC3_Parameter-Efficient_fine-tuning_Switching_LoRA_Adapters_Walkthrough.ipynb: Walkthrough of switching LoRA adapters to demonstrate flexible and efficient model fine-tuning.
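LoRA, used throughout this module, freezes each base weight matrix W and learns only a low-rank update BA; the parameter savings are easy to see by counting. A small back-of-the-envelope sketch (the 768×768 shape matches a BERT-base attention projection; QLoRA additionally stores the frozen W in 4-bit precision):

```python
def full_ft_params(d_in, d_out):
    """Parameters trained when fully fine-tuning one weight matrix W."""
    return d_out * d_in

def lora_trainable_params(d_in, d_out, rank):
    """Parameters trained by LoRA: B is (d_out x rank), A is (rank x d_in)."""
    return d_out * rank + rank * d_in

# One BERT-base attention projection is 768 x 768.
d = 768
full = full_ft_params(d, d)                  # 589,824 trainable weights
lora = lora_trainable_params(d, d, rank=8)   # 12,288 trainable weights
print(f"full: {full:,}  lora(r=8): {lora:,}  ratio: {full // lora}x")  # 48x fewer
```

Because the frozen W is shared, swapping task-specific behaviour (as in the LC3 walkthrough) only requires loading a different small B, A pair rather than a whole model.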
- Module_04_Install_Requirements.ipynb: Setup notebook for installing dependencies required in this module.
- Module_04_LC1_Supervised_Fine_tuning_TinyLlama_1B_for_Text2SQL_Exercise.ipynb: Supervised fine-tuning exercise of TinyLlama 1B for converting text into SQL queries.
- Module_04_LC2_Dataset_Preparation_for_RAG_fine-tuning_Exercise.ipynb: A guide to preparing datasets for Retrieval-Augmented Generation (RAG) fine-tuning.
- Module_04_LC3_Fine-tune_Embedder_Model_for_RAG_Exercise.ipynb: Exercise on fine-tuning an embedder model specifically for RAG tasks.
- Module_04_LC4_Supervised_Fine_tuning_Llama_3_LLM_for_RAG_Exercise.ipynb: Supervised fine-tuning of the Llama 3 LLM for RAG tasks.
- Module_04_LC5_Building_Custom_RAG_Systems_with_Fine_tuned_Models_Exercise.ipynb: Comprehensive exercise on building custom RAG systems using fine-tuned models.
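The RAG exercises above share one skeleton: embed the query, retrieve the most similar passages, and splice them into the generation prompt. A minimal stand-in in plain Python (the bag-of-characters embedder here is a deliberately crude placeholder; the notebooks use a fine-tuned Transformer embedder and a fine-tuned LLM for generation):

```python
def embed(text):
    """Placeholder embedder: a bag-of-characters vector over a-z.
    (The notebooks use a fine-tuned Transformer embedder instead.)"""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def retrieve(query, passages, top_k=1):
    """Return the top_k passages most similar to the query (dot product)."""
    q = embed(query)
    scored = sorted(passages,
                    key=lambda p: -sum(a * b for a, b in zip(q, embed(p))))
    return scored[:top_k]

def build_prompt(query, passages):
    """Splice the retrieved passages into the LLM prompt as grounding context."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"

passages = ["Llamas are camelids native to South America.",
            "SQL is a language for querying databases."]
print(build_prompt("Tell me about llamas",
                   retrieve("Tell me about llamas", passages)))
```

The LC5 exercise assembles this same retrieve-then-generate loop, but with the embedder from LC3 and the generator from LC4 in place of these stand-ins.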
- Module_05_Install_Requirements.ipynb: Installs the necessary tools and libraries for the exercises in this module.
- Module_05_Aligning_GPT2_to_Positive_Content_Generation_with_RLHF_and_PPO_Exercise.ipynb: Exercise on aligning GPT-2 toward positive content generation using Reinforcement Learning from Human Feedback (RLHF) and Proximal Policy Optimization (PPO).
- Module_05_Aligning_Llama_3_LLM_with_human_preferences_using_DPO_Exercise.ipynb: Aligns Llama 3 LLM with human preferences using Direct Preference Optimization (DPO).
- Module_05_Aligning_Llama_3_LLM_with_human_preferences_using_ORPO_Walkthrough.ipynb: A detailed walkthrough on using Odds Ratio Preference Optimization (ORPO) for aligning Llama 3 LLM.
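Unlike the RLHF/PPO exercise, the DPO notebook needs no separate reward model: its loss is computed directly from the policy's and a frozen reference model's log-probabilities on chosen/rejected response pairs. A sketch of that per-pair loss in plain Python (the log-probability values are invented; in the notebook they come from Llama 3 and its frozen copy):

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the summed log-probability of a full response under the
    trainable policy or the frozen reference model; beta scales the implicit
    reward (the policy's log-prob drift away from the reference).
    """
    chosen_reward = beta * (policy_chosen - ref_chosen)
    rejected_reward = beta * (policy_rejected - ref_rejected)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): small when the chosen response is favoured more
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy log-probs (invented): the policy has drifted toward the chosen answer...
good = dpo_loss(policy_chosen=-5.0, policy_rejected=-9.0,
                ref_chosen=-6.0, ref_rejected=-6.0)
# ...versus a policy that drifted toward the rejected answer.
bad = dpo_loss(policy_chosen=-9.0, policy_rejected=-5.0,
               ref_chosen=-6.0, ref_rejected=-6.0)
print(good < bad)  # → True: favouring the chosen response lowers the loss
```

ORPO, covered in the final walkthrough, pursues the same goal with an odds-ratio penalty folded into the supervised loss, which removes even the frozen reference model.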