A curated collection of foundational and influential papers in artificial intelligence, spanning 1943 to 2025 and covering classical theory, deep learning, NLP, computer vision, and modern foundation models.
| Year | Title | Citations | Key Contribution | Authors | Link |
|---|---|---|---|---|---|
| 1943 | A Logical Calculus of the Ideas Immanent in Nervous Activity | Classic (~25k) | First formal model of artificial neurons | Warren McCulloch, Walter Pitts | Link |
| 1950 | Computing Machinery and Intelligence | Classic (~15k) | Turing Test; framing machine intelligence | Alan Turing | Link |
| 1956 | The Logic Theorist | Classic | First AI program; symbolic reasoning as search | Allen Newell, Herbert Simon, Cliff Shaw | Link |
| 1958 | The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain | Classic (~8k) | Early neural network learning rule | Frank Rosenblatt | Link |
| 1959 | Programs with Common Sense (Advice Taker) | Classic | Knowledge-based reasoning, symbolic AI | John McCarthy | Link |
| 1961 | Steps Toward Artificial Intelligence | Classic (~5k) | Search, heuristics, symbolic AI agenda | Marvin Minsky | Link |
| 1969 | Perceptrons | Classic (~10k) | Limits of single-layer perceptrons | Marvin Minsky, Seymour Papert | Link |
| 1975 | A Framework for Representing Knowledge | Classic (~8k) | Frames; structured knowledge representation | Marvin Minsky | Link |
| 1977 | Knowledge Representation and Reasoning | Classic | Formal KR foundations | John McCarthy | Link |
| 1986 | Induction of Decision Trees | ~20k | ID3; decision tree learning | J. Ross Quinlan | Link |
| 1986 | Learning Representations by Back-Propagating Errors | ~60k | Backpropagation for multilayer nets | Rumelhart, Hinton, Williams | Link |
| 1986 | Explanation-Based Learning | Classic (~3k) | Symbolic learning from explanation | Tom Mitchell et al. | Link |
| 1989 | Q-learning | ~40k | Model-free reinforcement learning | Christopher Watkins | Link |
| 1989 | A Tutorial on Hidden Markov Models | ~35k | Sequence modeling, speech recognition | Lawrence Rabiner | Link |
| 1998 | Reinforcement Learning: An Introduction | Very High (~50k+) | Formal RL framework | Richard Sutton, Andrew Barto | Link |
| 1993 | Keeping the neural networks simple by minimizing the description length of the weights | ~3k | MDL principle for neural nets | Geoffrey E. Hinton, Drew van Camp | Link |
| 1995 | Artificial Intelligence: A Modern Approach | Very High (~40k+) | Unified rational-agent view | Stuart Russell, Peter Norvig | Link |
| 1995 | Support-Vector Networks | ~50k | Margin-based learning (SVMs) | Corinna Cortes, Vladimir Vapnik | Link |
| 1997 | Long Short-Term Memory | ~80k | Solved long-term dependency problem | Sepp Hochreiter, Jürgen Schmidhuber | Link |
| 1998 | Boosting the Margin | ~25k | AdaBoost theory | Robert Schapire et al. | Link |
| 1998 | Gradient-Based Learning Applied to Document Recognition | ~52k | Convolutional nets & backprop for vision | Yann LeCun et al. | Link |
| 2001 | Random Forests | ~90k | Ensemble learning | Leo Breiman | Link |
| 2003 | Latent Dirichlet Allocation | ~45k | Probabilistic topic modeling | David Blei, Andrew Ng, Michael Jordan | Link |
| 2004 | A Tutorial Introduction to the Minimum Description Length Principle | ~6k | MDL principle overview | Peter Grünwald | Link |
| 2006 | A Fast Learning Algorithm for Deep Belief Nets | ~35k | Deep unsupervised pretraining | Geoffrey Hinton et al. | Link |
| 2008 | MapReduce: Simplified Data Processing on Large Clusters | ~35k | Distributed data processing paradigm | Jeffrey Dean, Sanjay Ghemawat | Link |
| 2008 | Machine Super Intelligence | ~500 | AIXI and universal intelligence | Shane Legg | Link |
| 2009 | ImageNet: A Large-Scale Hierarchical Image Database | ~45k | Data-driven deep learning era | Jia Deng et al. | Link |
| 2011 | Scikit-Learn: Machine Learning in Python | ~70k | Premier ML library for practitioners | Fabian Pedregosa et al. | Link |
| 2011 | The First Law of Complexodynamics | ~200 | Complexity evolution in systems | Scott Aaronson | Link |
| 2012 | A Few Useful Things to Know About Machine Learning | ~15k | Practical ML principles | Pedro Domingos | Link |
| 2012 | ImageNet Classification with Deep Convolutional Neural Networks (AlexNet) | ~180k | Deep learning breakthrough in vision | Alex Krizhevsky, Ilya Sutskever, Geoffrey Hinton | Link |
| 2013 | Kolmogorov Complexity and Algorithmic Randomness | ~800 | Algorithmic information theory | Alexander Shen, Vladimir Uspensky, Nikolay Vereshchagin | Link |
| 2013 | Playing Atari with Deep Reinforcement Learning | ~12k | Deep Q-learning for games | Volodymyr Mnih et al. | Link |
| 2014 | Dropout: A Simple Way to Prevent Neural Networks from Overfitting | ~70k | Regularization for deep nets | Nitish Srivastava et al. | Link |
| 2014 | Adam: A Method for Stochastic Optimization | ~230k | Default deep learning optimizer | Diederik Kingma, Jimmy Ba | Link |
| 2014 | Generative Adversarial Networks | ~130k | Adversarial generative modeling | Ian Goodfellow et al. | Link |
| 2014 | Recurrent Neural Network Regularization | ~7k | Dropout regularization for LSTMs | Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals | Link |
| 2014 | Quantifying the Rise and Fall of Complexity in Closed Systems: the Coffee Automaton | ~100 | Complexity dynamics formalization | Scott Aaronson, Sean Carroll, Lauren Ouellette | Link |
| 2014 | Neural Turing Machines | ~3k | Memory-augmented neural nets | Alex Graves, Greg Wayne, Ivo Danihelka | Link |
| 2014 | DeepFace: Closing the Gap to Human-Level Performance in Face Verification | ~12k | Deep learning for face recognition | Yaniv Taigman et al. | Link |
| 2014 | Neural Machine Translation by Jointly Learning to Align and Translate | ~45k | Attention mechanism for NMT | Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio | Link |
| 2014 | Sequence to Sequence Learning with Neural Networks | ~50k | Seq2seq architecture | Ilya Sutskever, Oriol Vinyals, Quoc Le | Link |
| 2014 | Show and Tell: A Neural Image Caption Generator | ~8k | Image captioning with deep learning | Oriol Vinyals et al. | Link |
| 2014 | DeepSpeech: Scaling up end-to-end speech recognition | ~4k | End-to-end speech recognition | Awni Hannun et al. | Link |
| 2015 | Human-Level Control Through Deep Reinforcement Learning (DQN) | ~40k | Deep RL + perception | Volodymyr Mnih et al. | Link |
| 2015 | Deep Residual Learning for Image Recognition (ResNet) | ~150k | Very deep networks via skip connections | Kaiming He et al. | Link |
| 2015 | Very Deep Convolutional Networks for Large-Scale Image Recognition (VGG) | ~95k | Deep convolutional vision models | Karen Simonyan, Andrew Zisserman | Link |
| 2015 | Batch Normalization | ~45k | Faster & stable deep training | Sergey Ioffe, Christian Szegedy | Link |
| 2015 | Deep Learning (Survey) | ~60k | Overview of deep representation learning | Yann LeCun, Yoshua Bengio, Geoffrey Hinton | Link |
| 2015 | Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | ~55k | Region proposal for detection | Shaoqing Ren et al. | Link |
| 2015 | The Unreasonable Effectiveness of Recurrent Neural Networks | Blog Post | RNN capabilities and applications | Andrej Karpathy | Link |
| 2015 | Understanding LSTM Networks | Blog Post | LSTM architecture explanation | Christopher Olah | Link |
| 2015 | Pointer Networks | ~3k | Attention-based output mechanism | Oriol Vinyals, Meire Fortunato, Navdeep Jaitly | Link |
| 2015 | Order Matters: Sequence to sequence for sets | ~2k | Set-to-sequence learning | Oriol Vinyals, Samy Bengio, Manjunath Kudlur | Link |
| 2015 | Multi-Scale Context Aggregation by Dilated Convolutions | ~8k | Dilated convolutions | Fisher Yu, Vladlen Koltun | Link |
| 2015 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | ~3k | Multilingual speech recognition | Baidu Research | Link |
| 2015 | A Neural Algorithm of Artistic Style | ~15k | Neural style transfer | Leon Gatys, Alexander Ecker, Matthias Bethge | Link |
| 2015 | Deep Reinforcement Learning with Double Q-learning | ~12k | Improved Q-learning | Hado van Hasselt, Arthur Guez, David Silver | Link |
| 2016 | XGBoost: A Scalable Tree Boosting System | ~35k | Industrial-grade boosting | Tianqi Chen, Carlos Guestrin | Link |
| 2016 | TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems | ~45k | Scalable ML software platform | Martín Abadi et al. | Link |
| 2016 | Identity Mappings in Deep Residual Networks | ~15k | Improved ResNet design | Kaiming He et al. | Link |
| 2016 | WaveNet: A Generative Model for Raw Audio | ~8k | Audio generation with deep learning | Aäron van den Oord et al. | Link |
| 2016 | Neural Architecture Search with Reinforcement Learning | ~6k | Automated neural architecture design | Barret Zoph, Quoc Le | Link |
| 2017 | Attention Is All You Need | ~180k | Transformer architecture | Ashish Vaswani et al. | Link |
| 2017 | Proximal Policy Optimization (PPO) | ~25k | Stable policy-gradient RL | John Schulman et al. | Link |
| 2017 | Neural Message Passing for Quantum Chemistry | ~4k | Graph neural networks for chemistry | Justin Gilmer et al. | Link |
| 2017 | A Simple Neural Network Module for Relational Reasoning | ~3k | Relation networks | Adam Santoro et al. | Link |
| 2017 | Variational Lossy Autoencoder | ~1k | Improved VAE objective | Xi Chen et al. | Link |
| 2017 | Deep Reinforcement Learning: An Overview | ~2k | Comprehensive deep RL survey | Yuxi Li | Link |
| 2017 | DeepFM: A Factorization-Machine based Neural Network for CTR Prediction | ~5k | CTR prediction with factorization | Huifeng Guo et al. | Link |
| 2017 | Neural Style Transfer: A Review | ~800 | NST comprehensive survey | Yongcheng Jing et al. | Link |
| 2017 | Deep Reinforcement Learning from Human Preferences | ~3k | RLHF foundation | Paul Christiano et al. | Link |
| 2017 | Deep Learning based Recommender System: A Survey and New Perspectives | ~1.2k | Recommender systems with deep learning | Shuai Zhang, Lina Yao, Aixin Sun, Yi Tay | Link |
| 2017 | Neural Collaborative Filtering | ~5k | Deep learning for collaborative filtering | Xiangnan He et al. | Link |
| 2017 | AlphaGo Zero: Mastering the game of Go without human knowledge | ~15k | Self-play RL without human data | David Silver et al. | Link |
| 2017 | VQ-VAE: Neural Discrete Representation Learning | ~3k | Discrete latent representations | Aäron van den Oord, Oriol Vinyals, Koray Kavukcuoglu | Link |
| 2018 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | ~95k | Bidirectional language pretraining | Jacob Devlin et al. | Link |
| 2018 | The Illustrated Transformer | Blog Post | Transformer visualization and explanation | Jay Alammar | Link |
| 2018 | Relational Recurrent Neural Networks | ~1k | Memory and relational reasoning | Adam Santoro et al. | Link |
| 2018 | YOLOv3: An Incremental Improvement | ~40k | Object detection improvements | Joseph Redmon, Ali Farhadi | Link |
| 2019 | GPT-2: Language Models are Unsupervised Multitask Learners | ~25k | Scaling transformers | Alec Radford et al. | Link |
| 2019 | The Bitter Lesson | Essay | Computation and learning vs hand-coding | Rich Sutton | Link |
| 2019 | GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism | ~2k | Model parallelism technique | Yanping Huang et al. | Link |
| 2020 | Language Models are Few-Shot Learners (GPT-3) | ~50k | Emergent in-context learning | Tom Brown et al. | Link |
| 2020 | An Image Is Worth 16×16 Words: Transformers for Image Recognition (ViT) | ~35k | Transformers for vision | Alexey Dosovitskiy et al. | Link |
| 2020 | Scaling Laws for Neural Language Models | ~12k | Predictable scaling behavior | Jared Kaplan et al. | Link |
| 2020 | Denoising Diffusion Probabilistic Models | ~35k | Diffusion-based generation | Jonathan Ho, Ajay Jain, Pieter Abbeel | Link |
| 2020 | Dense Passage Retrieval for Open-Domain Question Answering | ~4k | Dense retrieval for QA | Vladimir Karpukhin et al. | Link |
| 2020 | Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | ~8k | RAG architecture | Patrick Lewis et al. | Link |
| 2021 | Decision Transformer: Reinforcement Learning via Sequence Modeling | ~6k | RL as sequence modeling | Lili Chen et al. | Link |
| 2021 | Highly Accurate Protein Structure Prediction with AlphaFold | ~35k | Solved protein folding | John Jumper et al. | Link |
| 2021 | Zero-Shot Text-to-Image Generation (DALL-E) | ~5k | DALL-E model | Aditya Ramesh et al. | Link |
| 2022 | Training Language Models to Follow Instructions with Human Feedback (InstructGPT) | ~18k | RLHF alignment | Long Ouyang et al. | Link |
| 2022 | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | ~12k | Reasoning via prompting | Jason Wei et al. | Link |
| 2022 | PaLM: Scaling Language Modeling with Pathways | ~8k | Massive-scale LLMs | Aakanksha Chowdhery et al. | Link |
| 2022 | Constitutional AI: Harmlessness from AI Feedback | ~7k | Self-alignment via principles | Yuntao Bai et al. | Link |
| 2022 | Self-Instruct: Aligning language models with self-generated instructions | ~4k | Self-supervised instruction tuning | Yizhong Wang et al. | Link |
| 2022 | Chinchilla: Training Compute-Optimal Large Language Models | ~5k | Optimal scaling laws | Jordan Hoffmann et al. | Link |
| 2022 | Precise Zero-Shot Dense Retrieval Without Relevance Labels | ~1k | Hypothetical document embeddings | Luyu Gao et al. | Link |
| 2023 | Segment Anything | ~8k | Foundation models for vision | Alexander Kirillov et al. | Link |
| 2023 | LLaMA: Open and Efficient Foundation Language Models | ~15k | Open-weight LLM paradigm | Hugo Touvron et al. | Link |
| 2023 | Sparks of Artificial General Intelligence: Early experiments with GPT-4 | ~6k | Emergent general capabilities | Sébastien Bubeck et al. | Link |
| 2023 | Direct Preference Optimization: Your Language Model is Secretly a Reward Model | ~5k | Simpler alignment than RLHF | Rafael Rafailov et al. | Link |
| 2023 | Understanding Deep Learning | Textbook | Comprehensive deep learning textbook | Simon J.D. Prince | Link |
| 2023 | Zephyr: Direct Distillation of LM Alignment | ~1k | Distilled alignment models | Lewis Tunstall et al. | Link |
| 2023 | Lost in the Middle: How Language Models Use Long Contexts | ~2k | Context window utilization | Nelson F. Liu et al. | Link |
| 2023 | Alpaca: A Strong, Replicable Instruction-Following Model | ~3k | Low-cost instruction tuning | Stanford CRFM | Link |
| 2023 | Llama 2: Open Foundation and Fine-Tuned Chat Models | ~10k | Open commercial LLMs | Hugo Touvron et al. | Link |
| 2023 | LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models | ~800 | Extended context training | Yukang Chen et al. | Link |
| 2023 | Are Emergent Abilities of Large Language Models a Mirage? | ~1k | Critique of emergence claims | Rylan Schaeffer et al. | Link |
| 2023 | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | ~2k | State space models for sequences | Albert Gu, Tri Dao | Link |
| 2023 | QLoRA: Efficient Finetuning of Quantized LLMs | ~3k | Quantized model fine-tuning | Tim Dettmers et al. | Link |
| 2023 | Reflexion: Language Agents with Verbal Reinforcement Learning | ~1k | Self-reflection for agents | Noah Shinn et al. | Link |
| 2023 | Explainability for Large Language Models: A Survey | ~800 | LLM interpretability overview | Haiyan Zhao et al. | Link |
| 2024 | Gemini: A Family of Highly Capable Multimodal Models | Rapidly growing (~5k) | Native multimodal foundation models | Google DeepMind | Link |
| 2024 | AlphaFold 3 | ~3k | Molecular & interaction prediction | DeepMind | Link |
| 2023 | MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models | ~2k | Open multimodal LLMs | Deyao Zhu et al. | Link |
| 2023 | Representation Engineering: A Top-Down Approach to AI Transparency | Emerging (~500) | Steering internal representations | Andy Zou et al. | Link |
| 2024 | Better & Faster Large Language Models Via Multi-token Prediction | ~800 | Multi-token prediction training | Fabian Gloeckle et al. | Link |
| 2024 | KAN: Kolmogorov-Arnold Networks | ~1k | Alternative to MLPs | Ziming Liu et al. | Link |
| 2025 | LLMs Will Always Hallucinate, and We Need to Live With This | ~200 | Mathematical inevitability of hallucination | Sourav Banerjee et al. | Link |
| 2025 | DINOv3 | ~200 | Self-supervised vision foundation models | Oriane Siméoni et al. | Link |
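Several of the entries above reduce to one core computation. As a single illustrative sketch (not reference code from any listed paper), the scaled dot-product attention at the heart of "Attention Is All You Need" (2017), softmax(QKᵀ/√d_k)·V, can be written in plain Python:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V are lists of vectors (lists of floats); returns one output
    vector per query, each a convex combination of the value vectors.
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out
```

Because the weights come from a softmax, each output row is a convex combination of the value vectors; the query that most closely matches a key draws most of its output from that key's value.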
CC0-1.0 License - See LICENSE file for details. Individual papers remain under their respective copyrights.