ContextLab
diff --git a/.github/workflows/build-slides.yml b/.github/workflows/build-slides.yml
Lines changed: 166 additions & 134 deletions
diff --git a/.gitignore b/.gitignore
Lines changed: 13 additions & 0 deletions
diff --git a/admin/DartmouthRuzicka-Bold.ttf b/admin/DartmouthRuzicka-Bold.ttf
223 KB
diff --git a/admin/DartmouthRuzicka-BoldItalic.ttf b/admin/DartmouthRuzicka-BoldItalic.ttf
217 KB
diff --git a/admin/DartmouthRuzicka-Regular.ttf b/admin/DartmouthRuzicka-Regular.ttf
217 KB
diff --git a/admin/DartmouthRuzicka-RegularItalic.ttf b/admin/DartmouthRuzicka-RegularItalic.ttf
206 KB
diff --git a/admin/syllabus.md b/admin/syllabus.md
Lines changed: 46 additions & 17 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -172,3 +172,16 @@ demos/11-analogies/data/glove.6B.*.txt
 node_modules/
 package-lock.json
 demos/15-chatbot-evolution/alice-aiml-original/
+
+# LaTeX auxiliary files
+*.aux
+*.log
+*.nav
+*.out
+*.snm
+*.toc
+*.vrb
+*.fdb_latexmk
+*.fls
+*.synctex.gz
+luatex.*/
--- a/admin/syllabus.md
+++ b/admin/syllabus.md
@@ -1,10 +1,12 @@
 ---
-title: "PSYC 51.17: Models of Language and Conversation"
+title: "PSYC 51.07: Models of Language and Communication"
 geometry: margin=1in
 header-includes:
   - \usepackage{fontspec}
   - \usepackage{booktabs}
-  - \setmainfont{Berkeley Mono}
+  - \directlua{luaotfload.add_fallback("emojifallback", {"NotoColorEmoji:mode=harf"})}
+  - \defaultfontfeatures{RawFeature={fallback=emojifallback}}
+  - \setmainfont{Fira Code}
 output: pdf
 ---
 
@@ -109,22 +111,26 @@ We strive to create an inclusive learning environment where all students feel su
 ### Week 1: Introduction & String Manipulation (January 5--9)
 
 **Monday, January 5** (Lecture 1): Course Introduction, Is ChatGPT Conscious?
+
   - Topics: Course overview, capabilities of LLMs, consciousness debate
   - Discussion: What is consciousness? Can machines be conscious?
   - Reading: [Fedorenko et al. (2024)](https://www.nature.com/articles/s41593-024-01711-5); [Schrimpf et al. (2021)](https://www.pnas.org/doi/10.1073/pnas.2105646118)
   - Slides: [\href{https://contextlab.github.io/llm-course/slides/week1/lecture1.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week1/lecture1.html}{HTML}]
 
 **Wednesday, January 7** (Lecture 2): Pattern Matching & ELIZA
+
   - Topics: String operations in Python, regular expressions, pattern matching
   - Reading: [Weizenbaum (1966)](https://dl.acm.org/doi/10.1145/365153.365168)
   - Slides: [\href{https://contextlab.github.io/llm-course/slides/week1/lecture2.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week1/lecture2.html}{HTML}]
 
 **Thursday, January 8** (X-hour 1): ELIZA Deep Dive
+
   - Topics: Extended discussion of pattern matching, implementation strategies
   - Hands-on: Start Assignment 1
   - Demo: [\href{https://contextlab.github.io/llm-course/slides/week1/xhour_eliza_demo.html}{Interactive Notebook}]
 
 **Friday, January 9** (Lecture 3): ELIZA Implementation & The ELIZA Effect
+
   - Topics: Implementing ELIZA from scratch, psychological implications
   - **📝 Assignment 1 Released:** [\href{https://github.com/ContextLab/llm-course/blob/main/assignments/Assignment\%201\%3A\%20ELIZA/README.md}{Building the ELIZA Chatbot}]
   - Reading: [Natale (2021)](https://www.tandfonline.com/doi/full/10.1080/24701475.2020.1814847)
@@ -133,21 +139,25 @@ We strive to create an inclusive learning environment where all students feel su
 ### Week 2: Computational Linguistics (January 12--16)
 
 **Monday, January 12** (Lecture 4): Data Cleaning & Preprocessing
+
   - Topics: Web scraping with Beautiful Soup, data cleaning, text normalization
   - Reading: HuggingFace NLP Course Chapter 2
   - Slides: [\href{https://contextlab.github.io/llm-course/slides/week2/lecture4.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week2/lecture4.html}{HTML}]
 
 **Wednesday, January 14** (Lecture 5): Tokenization
+
   - Topics: Byte-Pair Encoding (BPE), WordPiece, SentencePiece
   - Reading: [Sennrich et al. (2016)](https://aclanthology.org/P16-1162/); [Kudo & Richardson (2018)](https://aclanthology.org/D18-2012/)
   - Slides: [\href{https://contextlab.github.io/llm-course/slides/week2/lecture5.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week2/lecture5.html}{HTML}]
 
 **Thursday, January 15** (X-hour 2): Text Classification Workshop
+
   - Topics: Building classifiers, feature engineering for text
   - Hands-on: Explore different classification approaches
   - Demo: [\href{https://contextlab.github.io/llm-course/slides/week2/xhour_classification_demo.html}{Interactive Notebook}]
 
 **Friday, January 16** (Lecture 6): POS Tagging & Sentiment Analysis
+
   - Topics: Part-of-speech tagging, named entity recognition, sentiment analysis
   - **📝 Assignment 2 Released:** [\href{https://github.com/ContextLab/llm-course/blob/main/assignments/Assignment\%202\%3A\%20SPAM\%20classifier/README.md}{SPAM Classifier}]
   - **✅ Assignment 1 Due**
@@ -158,16 +168,19 @@ We strive to create an inclusive learning environment where all students feel su
 **Monday, January 19**: Martin Luther King Jr. Day (No Class)
 
 **Wednesday, January 21** (Lecture 7): Classic Embeddings
+
   - Topics: Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA)
   - Reading: [Landauer & Dumais (1997)](https://psycnet.apa.org/record/1997-02478-006); [Blei et al. (2003)](https://www.jmlr.org/papers/v3/blei03a)
   - Slides: [\href{https://contextlab.github.io/llm-course/slides/week3/lecture7.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week3/lecture7.html}{HTML}]
 
 **Thursday, January 22** (X-hour 3): Embeddings Workshop
+
   - Topics: Implementing classic embeddings (LSA, LDA)
   - Hands-on: Compare embedding methods on real data
   - Demo: [\href{https://contextlab.github.io/llm-course/slides/week3/xhour_embeddings_demo.html}{Interactive Notebook}]
 
 **Friday, January 23** (Lecture 8): Word Embeddings
+
   - Topics: Word2Vec (CBOW and Skip-gram), GloVe, FastText
   - **📝 Assignment 3 Released:** [\href{https://github.com/ContextLab/llm-course/blob/main/assignments/Assignment\%203\%3A\%20Wikipedia/README.md}{Wikipedia Embeddings Comparison}]
   - **✅ Assignment 2 Due**
@@ -177,71 +190,83 @@ We strive to create an inclusive learning environment where all students feel su
 ### Week 4: Text Embeddings II (January 26--30)
 
 **Monday, January 26** (Lecture 9): Contextual Embeddings
+
   - Topics: ELMo, Universal Sentence Encoder, BERT embeddings
   - Reading: [Peters et al. (2018)](https://aclanthology.org/N18-1202/); [Cer et al. (2018)](https://arxiv.org/abs/1803.11175)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week3-4/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week3-4/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week3-4/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week3-4/lecture.html}{HTML}]
 
 **Wednesday, January 28** (Lecture 10): Dimensionality Reduction
+
   - Topics: PCA, t-SNE, UMAP for visualizing embeddings
   - Reading: [McInnes et al. (2018)](https://arxiv.org/abs/1802.03426)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week3-4/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week3-4/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week3-4/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week3-4/lecture.html}{HTML}]
 
 **Friday, January 30** (Lecture 11): Cognitive Models of Semantic Representation
+
   - Topics: Distributional semantics, neural representation of meaning
   - Reading: [Anderson et al. (2016)](https://www.jneurosci.org/content/36/45/11444)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week3-4/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week3-4/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week3-4/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week3-4/lecture.html}{HTML}]
 
 ### Week 5: Transformers & Attention (February 2--6)
 
 **Monday, February 2** (Lecture 12): Attention Mechanisms
+
   - Topics: Sequence-to-sequence models, attention mechanism fundamentals
   - **✅ Assignment 3 Due**
   - Reading: [Bahdanau et al. (2015)](https://arxiv.org/abs/1409.0473); [Vaswani et al. (2017)](https://arxiv.org/abs/1706.03762)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week5-6/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.html}{HTML}]
 
 **Wednesday, February 4** (Lecture 13): Transformer Architecture
+
   - Topics: Multi-head attention, positional encoding, transformer blocks
   - Reading: [Vaswani et al. (2017)](https://arxiv.org/abs/1706.03762); HuggingFace NLP Course Chapter 3
-  - Slides: [\href{https://contextlab.github.io/llm-course/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week5-6/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.html}{HTML}]
 
 **Friday, February 6** (Lecture 14): Training Transformers
+
   - Topics: Pre-training objectives, masked language modeling, next token prediction
   - **📝 Assignment 4 Released:** [\href{https://github.com/ContextLab/llm-course/blob/main/assignments/Assignment\%204\%3A\%20Customer\%20Service\%20Chatbot/README.md}{Context-Aware Customer Service Chatbot}]
-  - Slides: [\href{https://contextlab.github.io/llm-course/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week5-6/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.html}{HTML}]
 
 ### Week 6: Encoder Models (February 9--13)
 
 **Monday, February 9** (Lecture 15): BERT Deep Dive
+
   - Topics: BERT architecture, bidirectional pre-training, fine-tuning
   - Reading: [Devlin et al. (2019)](https://aclanthology.org/N19-1423/); HuggingFace NLP Course Chapter 4
-  - Slides: [\href{https://contextlab.github.io/llm-course/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week5-6/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.html}{HTML}]
 
 **Wednesday, February 11** (Lecture 16): BERT Variants
+
   - Topics: RoBERTa, ALBERT, DistilBERT, and other encoder models
   - Reading: [Liu et al. (2019)](https://arxiv.org/abs/1907.11692); [Sanh et al. (2019)](https://arxiv.org/abs/1910.01108)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week5-6/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.html}{HTML}]
 
 **Friday, February 13** (Lecture 17): Applications of Encoder Models
+
   - Topics: Classification, NER, question answering with BERT
   - **✅ Assignment 4 Due**
-  - Slides: [\href{https://contextlab.github.io/llm-course/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week5-6/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week5-6/lecture.html}{HTML}]
 
 ### Week 7: Decoder Models & GPT (February 16--20)
 
 **Monday, February 16** (Lecture 18): GPT Architecture
+
   - Topics: Autoregressive language models, GPT-1 and GPT-2
   - Reading: [Radford et al. (2018)](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf); [Radford et al. (2019)](https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week7/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week7/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week7/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week7/lecture.html}{HTML}]
 
 **Wednesday, February 18** (Lecture 19): Scaling Up to GPT-3 and Beyond
+
   - Topics: GPT-3, in-context learning, few-shot prompting, GPT-4 and Claude
   - **📝 Assignment 5 Released:** [\href{https://github.com/ContextLab/llm-course/blob/main/assignments/Assignment\%205\%3A\%20GPT/README.md}{Build and Train a GPT Model}]
   - Reading: [Brown et al. (2020)](https://arxiv.org/abs/2005.14165); [OpenAI (2023)](https://arxiv.org/abs/2303.08774)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week7/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week7/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week7/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week7/lecture.html}{HTML}]
 
 **Friday, February 20** (Lecture 20): Implementing GPT from Scratch
+
   - Topics: Building GPT architecture, training considerations
-  - Slides: [\href{https://contextlab.github.io/llm-course/week7/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week7/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week7/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week7/lecture.html}{HTML}]
 
 ### Week 8: No Classes (February 23--27)
 
@@ -250,24 +275,28 @@ We strive to create an inclusive learning environment where all students feel su
 ### Week 9: Advanced Topics (March 2--6)
 
 **Monday, March 2** (Lecture 21): Retrieval Augmented Generation (RAG)
+
   - Topics: Vector databases, retrieval mechanisms, RAG architectures
   - **✅ Assignment 5 Due**
   - Reading: [Lewis et al. (2020)](https://arxiv.org/abs/2005.11401)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week9/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week9/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week9/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week9/lecture.html}{HTML}]
 
 **Wednesday, March 4** (Lecture 22): Mixture of Experts & Efficiency
+
   - Topics: MoE architectures, model compression, distillation
   - Reading: [Fedus et al. (2022)](https://arxiv.org/abs/2101.03961); [Jiang et al. (2024)](https://arxiv.org/abs/2401.04088)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week9/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week9/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week9/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week9/lecture.html}{HTML}]
 
 **Friday, March 6** (Lecture 23): Ethics, Bias, and Safety
+
   - Topics: Bias in LLMs, alignment, safety considerations
   - Reading: [Bender et al. (2021)](https://dl.acm.org/doi/10.1145/3442188.3445922)
-  - Slides: [\href{https://contextlab.github.io/llm-course/week9/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/week9/lecture.html}{HTML}]
+  - Slides: [\href{https://contextlab.github.io/llm-course/slides/week9/lecture.pdf}{PDF}][\href{https://contextlab.github.io/llm-course/slides/week9/lecture.html}{HTML}]
 
 ### Week 10: Final Projects (March 9)
 
 **Monday, March 9** (Lecture 24): Final Project Presentations & Wrap-up
+
   - Final project presentations (all teams)
   - Course wrap-up and reflections
   - Last day of classes
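
The recurring link change in the `admin/syllabus.md` hunks (week-level slide URLs gaining a `slides/` path segment) can be sketched as a small script. This is only an illustration of the rewrite, not how the change was actually made; the helper name `add_slides_prefix` and the regex are assumptions, and the base URL is copied from the links in the diff.

```python
import re

# Base URL as it appears in the syllabus links above.
BASE = "https://contextlab.github.io/llm-course/"

# Matches old-style links where the week directory directly follows the base,
# e.g. .../llm-course/week3-4/lecture.pdf. Links that already contain
# "slides/" do not match, so the rewrite is safe to run more than once.
_OLD_LINK = re.compile(re.escape(BASE) + r"(week[\d-]+/lecture\.(?:pdf|html))")


def add_slides_prefix(text: str) -> str:
    """Hypothetical helper: insert 'slides/' after the site base in old-style links."""
    return _OLD_LINK.sub(lambda m: BASE + "slides/" + m.group(1), text)


old_line = (
    r"  - Slides: [\href{https://contextlab.github.io/llm-course/week7/lecture.pdf}{PDF}]"
    r"[\href{https://contextlab.github.io/llm-course/week7/lecture.html}{HTML}]"
)
new_line = add_slides_prefix(old_line)
print(new_line)
```

Anchoring the pattern on the base URL followed immediately by `week` leaves the Week 1 to 3 links, which already point under `slides/`, unchanged.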