ayushi-agarwall/most-influential-ai-research

Most Influential AI Research

A curated collection of foundational and influential papers in artificial intelligence, spanning 1943 to 2025 and covering foundational theory, deep learning, NLP, computer vision, and modern foundation models.

Papers Collection

| Year | Title | Citations | Key Contribution | Authors |
|------|-------|-----------|------------------|---------|
| 1943 | A Logical Calculus of the Ideas Immanent in Nervous Activity | Classic (~25k) | First formal model of artificial neurons | Warren McCulloch, Walter Pitts |
| 1950 | Computing Machinery and Intelligence | Classic (~15k) | Turing Test; framing machine intelligence | Alan Turing |
| 1956 | The Logic Theorist | Classic | First AI program; symbolic reasoning as search | Allen Newell, Herbert Simon, Cliff Shaw |
| 1958 | Perceptron: A Probabilistic Model for Information Storage | Classic (~8k) | Early neural network learning rule | Frank Rosenblatt |
| 1959 | Programs with Common Sense (Advice Taker) | Classic | Knowledge-based reasoning, symbolic AI | John McCarthy |
| 1961 | Steps Toward Artificial Intelligence | Classic (~5k) | Search, heuristics, symbolic AI agenda | Marvin Minsky |
| 1969 | Perceptrons | Classic (~10k) | Limits of single-layer perceptrons | Marvin Minsky, Seymour Papert |
| 1975 | A Framework for Representing Knowledge | Classic (~8k) | Frames; structured knowledge representation | Marvin Minsky |
| 1977 | Knowledge Representation and Reasoning | Classic | Formal KR foundations | John McCarthy |
| 1986 | Induction of Decision Trees | ~20k | ID3; decision tree learning | J. Ross Quinlan |
| 1986 | Learning Representations by Back-Propagating Errors | ~60k | Backpropagation for multilayer nets | Rumelhart, Hinton, Williams |
| 1986 | Explanation-Based Learning | Classic (~3k) | Symbolic learning from explanation | Tom Mitchell et al. |
| 1989 | Q-learning | ~40k | Model-free reinforcement learning | Christopher Watkins |
| 1989 | A Tutorial on Hidden Markov Models | ~35k | Sequence modeling, speech recognition | Lawrence Rabiner |
| 1992 | Reinforcement Learning: An Introduction | Very High (~50k+) | Formal RL framework | Richard Sutton, Andrew Barto |
| 1993 | Keeping the neural networks simple by minimizing the description length of the weights | ~3k | MDL principle for neural nets | Geoffrey E. Hinton, Drew van Camp |
| 1995 | Artificial Intelligence: A Modern Approach | Very High (~40k+) | Unified rational-agent view | Stuart Russell, Peter Norvig |
| 1995 | Support-Vector Networks | ~50k | Margin-based learning (SVMs) | Corinna Cortes, Vladimir Vapnik |
| 1997 | Long Short-Term Memory | ~80k | Solved long-term dependency problem | Sepp Hochreiter, Jürgen Schmidhuber |
| 1998 | Boosting the Margin | ~25k | AdaBoost theory | Robert Schapire et al. |
| 1998 | Gradient-Based Learning Applied to Document Recognition | ~52k | Convolutional nets & backprop for vision | Yann LeCun et al. |
| 2001 | Random Forests | ~90k | Ensemble learning | Leo Breiman |
| 2003 | Latent Dirichlet Allocation | ~45k | Probabilistic topic modeling | David Blei, Andrew Ng, Michael Jordan |
| 2004 | A Tutorial Introduction to the Minimum Description Length Principle | ~6k | MDL principle overview | Peter Grünwald |
| 2006 | A Fast Learning Algorithm for Deep Belief Nets | ~35k | Deep unsupervised pretraining | Geoffrey Hinton et al. |
| 2008 | MapReduce: Simplified Data Processing on Large Clusters | ~35k | Distributed data processing paradigm | Jeffrey Dean, Sanjay Ghemawat |
| 2008 | Machine Super Intelligence | ~500 | AIXI and universal intelligence | Shane Legg |
| 2009 | ImageNet: A Large-Scale Hierarchical Image Database | ~45k | Data-driven deep learning era | Jia Deng et al. |
| 2011 | Scikit-Learn: Machine Learning in Python | ~70k | Premier ML library for practitioners | Fabian Pedregosa et al. |
| 2011 | The First Law of Complexodynamics | ~200 | Complexity evolution in systems | Scott Aaronson |
| 2012 | A Few Useful Things to Know About Machine Learning | ~15k | Practical ML principles | Pedro Domingos |
| 2012 | ImageNet Classification with Deep Convolutional Neural Networks (AlexNet) | ~180k | Deep learning breakthrough in vision | Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton |
| 2013 | Kolmogorov Complexity and Algorithmic Randomness | ~800 | Algorithmic information theory | Alexander Shen, Vladimir Uspensky, Nikolay Vereshchagin |
| 2013 | Playing Atari with Deep Reinforcement Learning | ~12k | Deep Q-learning for games | Volodymyr Mnih et al. |
| 2014 | Dropout: A Simple Way to Prevent Neural Networks from Overfitting | ~70k | Regularization for deep nets | Nitish Srivastava et al. |
| 2014 | Adam: A Method for Stochastic Optimization | ~230k | Default deep learning optimizer | Diederik Kingma, Jimmy Ba |
| 2014 | Generative Adversarial Networks | ~130k | Adversarial generative modeling | Ian Goodfellow et al. |
| 2014 | Recurrent Neural Network Regularization | ~7k | RNN training techniques | Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals |
| 2014 | Quantifying the Rise and Fall of Complexity in Closed Systems: the Coffee Automaton | ~100 | Complexity dynamics formalization | Scott Aaronson, Sean Carroll, Lauren Ouellette |
| 2014 | Neural Turing Machines | ~3k | Memory-augmented neural nets | Alex Graves, Greg Wayne, Ivo Danihelka |
| 2014 | DeepFace: Closing the Gap to Human-Level Performance in Face Verification | ~12k | Deep learning for face recognition | Yaniv Taigman et al. |
| 2014 | Neural Machine Translation by Jointly Learning to Align and Translate | ~45k | Attention mechanism for NMT | Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio |
| 2014 | Sequence to Sequence Learning with Neural Networks | ~50k | Seq2seq architecture | Ilya Sutskever, Oriol Vinyals, Quoc Le |
| 2014 | Show and Tell: A Neural Image Caption Generator | ~8k | Image captioning with deep learning | Oriol Vinyals et al. |
| 2014 | DeepSpeech: Scaling up end-to-end speech recognition | ~4k | End-to-end speech recognition | Awni Hannun et al. |
| 2015 | Human-Level Control Through Deep Reinforcement Learning (DQN) | ~40k | Deep RL + perception | Volodymyr Mnih et al. |
| 2015 | Deep Residual Learning for Image Recognition (ResNet) | ~150k | Very deep networks via skip connections | Kaiming He et al. |
| 2015 | Very Deep Convolutional Networks (VGG) | ~95k | Deep convolutional vision models | Karen Simonyan, Andrew Zisserman |
| 2015 | Batch Normalization | ~45k | Faster & stable deep training | Sergey Ioffe, Christian Szegedy |
| 2015 | Deep Learning (Survey) | ~60k | Overview of deep representation learning | Yann LeCun, Yoshua Bengio, Geoffrey Hinton |
| 2015 | Faster R-CNN: Towards Real-Time Object Detection | ~55k | Region proposal for detection | Shaoqing Ren et al. |
| 2015 | The Unreasonable Effectiveness of Recurrent Neural Networks | Blog Post | RNN capabilities and applications | Andrej Karpathy |
| 2015 | Understanding LSTM Networks | Blog Post | LSTM architecture explanation | Christopher Olah |
| 2015 | Pointer Networks | ~3k | Attention-based output mechanism | Oriol Vinyals, Meire Fortunato, Navdeep Jaitly |
| 2015 | Order Matters: Sequence to sequence for sets | ~2k | Set-to-sequence learning | Oriol Vinyals, Samy Bengio, Manjunath Kudlur |
| 2015 | Multi-Scale Context Aggregation by Dilated Convolutions | ~8k | Dilated convolutions | Fisher Yu, Vladlen Koltun |
| 2015 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | ~3k | Multilingual speech recognition | Baidu Research |
| 2015 | A Neural Algorithm of Artistic Style | ~15k | Neural style transfer | Leon Gatys, Alexander Ecker, Matthias Bethge |
| 2015 | Deep Reinforcement Learning with Double Q-learning | ~12k | Improved Q-learning | Hado van Hasselt, Arthur Guez, David Silver |
| 2016 | XGBoost: A Scalable Tree Boosting System | ~35k | Industrial-grade boosting | Tianqi Chen, Carlos Guestrin |
| 2016 | TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems | ~45k | Scalable ML software platform | Martín Abadi et al. |
| 2016 | Identity Mappings in Deep Residual Networks | ~15k | Improved ResNet design | Kaiming He et al. |
| 2016 | WaveNet: A Generative Model for Raw Audio | ~8k | Audio generation with deep learning | Aäron van den Oord et al. |
| 2016 | Neural Architecture Search with Reinforcement Learning | ~6k | Automated neural architecture design | Barret Zoph, Quoc Le |
| 2017 | Attention Is All You Need | ~180k | Transformer architecture | Ashish Vaswani et al. |
| 2017 | Proximal Policy Optimization (PPO) | ~25k | Stable policy-gradient RL | John Schulman et al. |
| 2017 | Neural Message Passing for Quantum Chemistry | ~4k | Graph neural networks for chemistry | Justin Gilmer et al. |
| 2017 | A Simple Neural Network Module for Relational Reasoning | ~3k | Relation networks | Adam Santoro et al. |
| 2017 | Variational Lossy Autoencoder | ~1k | Improved VAE objective | Xi Chen et al. |
| 2017 | A Survey of Deep Reinforcement Learning Techniques | ~2k | Deep RL overview | Yuxi Li |
| 2017 | DeepFM: A Factorization-Machine based Neural Network for CTR Prediction | ~5k | CTR prediction with factorization | Huifeng Guo et al. |
| 2017 | Neural Style Transfer: A Review | ~800 | NST comprehensive survey | Yongcheng Jing et al. |
| 2017 | Deep Reinforcement Learning from Human Preferences | ~3k | RLHF foundation | Paul Christiano et al. |
| 2017 | Deep Learning based Recommender System: A Survey and New Perspectives | ~1.2k | Recommender systems with deep learning | Shuai Zhang, Lina Yao, Aixin Sun, Yi Tay |
| 2017 | Neural Collaborative Filtering | ~5k | Deep learning for collaborative filtering | Xiangnan He et al. |
| 2017 | AlphaGo Zero: Mastering the game of Go without human knowledge | ~15k | Self-play RL without human data | David Silver et al. |
| 2017 | VQ-VAE: Neural Discrete Representation Learning | ~3k | Discrete latent representations | Aäron van den Oord, Oriol Vinyals, Koray Kavukcuoglu |
| 2018 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ~95k | Bidirectional language pretraining | Jacob Devlin et al. |
| 2018 | The Illustrated Transformer | Blog Post | Transformer visualization and explanation | Jay Alammar |
| 2018 | Relational Recurrent Neural Networks | ~1k | Memory and relational reasoning | Adam Santoro et al. |
| 2018 | YOLOv3: An Incremental Improvement | ~40k | Object detection improvements | Joseph Redmon, Ali Farhadi |
| 2019 | GPT-2: Language Models are Unsupervised Multitask Learners | ~25k | Scaling transformers | Alec Radford et al. |
| 2019 | The Bitter Lesson | Essay | Computation and learning vs hand-coding | Rich Sutton |
| 2019 | GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism | ~2k | Model parallelism technique | Yanping Huang et al. |
| 2020 | Language Models are Few-Shot Learners (GPT-3) | ~50k | Emergent in-context learning | Tom Brown et al. |
| 2020 | An Image Is Worth 16×16 Words: Transformers for Image Recognition (ViT) | ~35k | Transformers for vision | Alexey Dosovitskiy et al. |
| 2020 | Scaling Laws for Neural Language Models | ~12k | Predictable scaling behavior | Jared Kaplan et al. |
| 2020 | Denoising Diffusion Probabilistic Models | ~35k | Diffusion-based generation | Jonathan Ho, Ajay Jain, Pieter Abbeel |
| 2020 | Dense Passage Retrieval for Open-Domain Question Answering | ~4k | Dense retrieval for QA | Vladimir Karpukhin et al. |
| 2020 | Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | ~8k | RAG architecture | Patrick Lewis et al. |
| 2021 | Decision Transformer: Reinforcement Learning via Sequence Modeling | ~6k | RL as sequence modeling | Lili Chen et al. |
| 2021 | Highly Accurate Protein Structure Prediction with AlphaFold | ~35k | Solved protein folding | John Jumper et al. |
| 2021 | Zero-Shot Text-to-Image Generation (DALL-E) | ~5k | DALL-E model | Aditya Ramesh et al. |
| 2022 | Training Language Models to Follow Instructions with Human Feedback (InstructGPT) | ~18k | RLHF alignment | Long Ouyang et al. |
| 2022 | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | ~12k | Reasoning via prompting | Jason Wei et al. |
| 2022 | PaLM: Scaling Language Modeling with Pathways | ~8k | Massive-scale LLMs | Aakanksha Chowdhery et al. |
| 2022 | Constitutional AI: Harmlessness from AI Feedback | ~7k | Self-alignment via principles | Yuntao Bai et al. |
| 2022 | Self-Instruct: Aligning language models with self-generated instructions | ~4k | Self-supervised instruction tuning | Yizhong Wang et al. |
| 2022 | Chinchilla: Training Compute-Optimal Large Language Models | ~5k | Optimal scaling laws | Jordan Hoffmann et al. |
| 2022 | Precise Zero-Shot Dense Retrieval Without Relevance Labels | ~1k | Hypothetical document embeddings | Luyu Gao et al. |
| 2023 | Segment Anything | ~8k | Foundation models for vision | Alexander Kirillov et al. |
| 2023 | LLaMA: Open and Efficient Foundation Language Models | ~15k | Open-weight LLM paradigm | Hugo Touvron et al. |
| 2023 | Sparks of Artificial General Intelligence: Early experiments with GPT-4 | ~6k | Emergent general capabilities | Sébastien Bubeck et al. |
| 2023 | Direct Preference Optimization: Your Language Model is Secretly a Reward Model | ~5k | Simpler alignment than RLHF | Rafael Rafailov et al. |
| 2023 | Understanding Deep Learning | Textbook | Comprehensive deep learning textbook | Simon J.D. Prince |
| 2023 | Zephyr: Direct Distillation of LM Alignment | ~1k | Distilled alignment models | Lewis Tunstall et al. |
| 2023 | Lost in the Middle: How Language Models Use Long Contexts | ~2k | Context window utilization | Nelson F. Liu et al. |
| 2023 | Alpaca: A Strong, Replicable Instruction-Following Model | ~3k | Low-cost instruction tuning | Stanford CRFM |
| 2023 | Llama 2: Open Foundation and Fine-Tuned Chat Models | ~10k | Open commercial LLMs | Hugo Touvron et al. |
| 2023 | LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models | ~800 | Extended context training | Yukang Chen et al. |
| 2023 | Are Emergent Abilities of Large Language Models a Mirage? | ~1k | Critique of emergence claims | Rylan Schaeffer et al. |
| 2023 | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | ~2k | State space models for sequences | Albert Gu, Tri Dao |
| 2023 | QLoRA: Efficient Finetuning of Quantized LLMs | ~3k | Quantized model fine-tuning | Tim Dettmers et al. |
| 2023 | Reflexion: Language Agents with Verbal Reinforcement Learning | ~1k | Self-reflection for agents | Noah Shinn et al. |
| 2023 | Explainability for Large Language Models: A Survey | ~800 | LLM interpretability overview | Haiyan Zhao et al. |
| 2024 | Gemini: A Family of Highly Capable Multimodal Models | Rapidly growing (~5k) | Native multimodal foundation models | Google DeepMind |
| 2024 | AlphaFold 3 | ~3k | Molecular & interaction prediction | DeepMind |
| 2024 | MiniGPT-4: Enhancing Vision-Language Understanding | ~2k | Open multimodal LLMs | Deyao Zhu et al. |
| 2024 | Representation Engineering: A Top-Down Approach to AI Transparency | Emerging (~500) | Steering internal representations | Andy Zou et al. |
| 2024 | Better & Faster Large Language Models Via Multi-token Prediction | ~800 | Multi-token prediction training | Fabian Gloeckle et al. |
| 2024 | KAN: Kolmogorov-Arnold Networks | ~1k | Alternative to MLPs | Ziming Liu et al. |
| 2025 | LLMs Will Always Hallucinate, and We Need to Live With This | ~200 | Establishes the mathematical certainty of hallucinations | Banerjee et al. |
| 2025 | DINOv3 | ~200 | Vision Foundation Models | O Siméoni et al. |

License

CC0-1.0 License. See the LICENSE file for details. Individual papers remain under their respective copyrights.
