xufengduan/Awesome-Language-MI-Survey

🔎 Mechanistic Interpretability for Understanding Language Abilities in Large Language Models: A Survey

We will continue to update this repository.

If you enjoy or benefit from the project, a star ⭐ on GitHub would be greatly appreciated and will help you stay informed about future updates.

📖 Table of Contents

📖 Overview

[Figure: survey overview graph (MI Survey - graph.pdf)]

📖 Taxonomy

[Figure: taxonomy]

📖 Paper List

1. Benchmarks

2. Probing

3. Vocabulary Projection (Logit Lens)

4. Sparse AutoEncoders (SAE)

5. Activation Patching

6. Neuron Analysis

7. Circuit Discovery

📧 Contact

Feel free to open an issue or contact us if you have any questions or want to include your work in this list!

Authors: Xufeng Duan (xufengduan@cuhk.edu.hk) and Zhaoqian Yao (zhaoqianyao@cuhk.edu.hk)
