Awesome Adaptation of Agentic AI
A curated list of papers on adaptation strategies for agentic AI systems. This repository accompanies the paper "Adaptation of Agentic AI" (ongoing work).
Cite this paper:
@article{jiang2025adaptation,
title = {Adaptation of Agentic AI},
author = {Jiang, Pengcheng and Lin, Jiacheng and Shi, Zhiyi and Wang, Zifeng and He, Luxi and Wu, Yichen and Zhong, Ming and Zhang, Qizheng and Song, Peiyang and Wang, Heng and Xu, Xueqiang and Xu, Hanwen and Han, Pengrui and Zhang, Dylan and Sun, Jiashuo and Yang, Chaoqi and Qian, Kun and Wang, Tian and Hu, Changran and Li, Manling and Li, Quanzheng and Wang, Sheng and Peng, Hao and You, Jiaxuan and Liu, Liyuan and Lu, Pan and Zhang, Yu and Ji, Heng and Choi, Yejin and Song, Dawn and Sun, Jimeng and Han, Jiawei},
howpublished = {https://github.com/pat-jj/Awesome-Adaptation-of-Agentic-AI},
year = {2025}
}
A1: Tool Execution Signaled

| Time | Method | Venue | Task(s) | Tool(s) | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|
| 2025.11 | Orion | arXiv Paper | IR | Retrievers | LFM2 | GRPO |
| 2025.10 | olmOCR2 | arXiv Paper Code | Document OCR | Synthetic Document Verifier | Qwen2.5-VL | SFT, GRPO |
| 2025.10 | ToolExpander | arXiv Paper | Tool-Calling | Various APIs | Qwen2.5 | SFT + GRPO |
| 2025.09 | WebGen-Agent | arXiv Paper Code | Website Generation | VLM, GUI Agent, Code Executor | Various Models | SFT, Step-GRPO |
| 2025.09 | Tool-R1 | arXiv Paper Code | General Tool-Augmented Reasoning, Multimodal QA | Code Execution, Multimedia Tools | Qwen2.5 | GRPO |
| 2025.08 | FTRL | arXiv Paper Code | Multi-Step Tool-Use | Simulated APIs | Qwen3 | GRPO |
| 2025.06 | Router-R1 | NeurIPS'25 Paper Code | Multi-Round Routing | LLM Routing Pool | Qwen2.5, LLaMA3.2 | PPO |
| 2025.05 | R1-Code-Interpreter | arXiv Paper Code | Coding | Code Execution Sandbox | Qwen2.5 | GRPO |
| 2025.05 | Tool-N1 | arXiv Paper Code | Tool-Calling | Various APIs | Qwen2.5 | GRPO |
| 2025.04 | SQL-R1 | NeurIPS'25 Paper Code | Text2SQL Search | SQL Engine | Qwen2.5, OmniSQL | SFT, GRPO |
| 2025.03 | Rec-R1 | TMLR'25 Paper Code | Recommendation Optimization | Recommendation System | Qwen2.5, LLaMA3.2 | GRPO |
| 2025.03 | ReZero | arXiv Paper Code | Web Search, IR | Web Search Engine | LLaMA3.2 | GRPO |
| 2025.03 | Code-R1 | --- Code | Coding | Code Executor | Qwen2.5 | GRPO |
| 2025.02 | DeepRetrieval | COLM'25 Paper Code | Web Search, IR, Text2SQL | Search Engine, Retrievers, SQL exec. | Qwen2.5, LLaMA3.2 | PPO, GRPO |
| 2025.01 | DeepSeek-R1-Zero (Code) | Nature Paper | Coding | Code Executor | DeepSeek-V3-Base | GRPO |
| 2024.10 | RLEF | ICML'25 Paper | Coding | Code Executor | LLaMA3.1 | PPO |
| 2024.05 | LeDex | NeurIPS'24 Paper | Coding | Code Executor | StarCoder, CodeLLaMA | SFT, PPO |


| Time | Method | Venue | Task(s) | Tool(s) | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|
| 2024.10 | LeReT | ICLR'25 Paper Code | IR | Dense Retriever | LLaMA3, Gemma2 | DPO-like (IPO) |
| 2024.10 | ToolFlow | NAACL'25 Paper | Tool-Calling | Various APIs | LLaMA3.1 | SFT |
| 2024.06 | TP-LLaMA | NeurIPS'24 Paper | Tool-Calling | Various APIs | LLaMA2 | SFT, DPO |
| 2024.05 | AutoTools | WWW'25 Paper Code | Automated Tool-Calling | Various APIs | GPT4, LLaMA3, Mistral | SFT |
| 2024.03 | CYCLE | OOPSLA'24 Paper | Coding | Code Executor | CodeGen, StarCoder | SFT |
| 2024.02 | RetPO | NAACL'25 Paper Code | IR | Retriever | LLaMA2-7B | SFT, DPO |
| 2024.02 | CodeAct | ICML'24 Paper Code | Coding | Code Executor | LLaMA2, Mistral | SFT |
| 2024.01 | NExT | ICML'24 Paper | Program Repair | Code Executor | PaLM2 | SFT |
| 2023.07 | ToolLLM | ICLR'24 Paper Code | Tool-Calling, API Planning, Multi-Tool Reasoning | Real-World APIs | LLaMA, Vicuna | SFT |
| 2023.06 | ToolAlpaca | arXiv Paper Code | Multi-Turn Tool-Use | Simulated APIs | Vicuna | SFT |
| 2023.05 | Gorilla | NeurIPS'24 Paper Code | Tool-Calling, API Retrieval | Various APIs | LLaMA | SFT |
| 2023.05 | TRICE | NAACL'24 Paper Code | Math Reasoning, QA, Multilingual QA, Knowledge Retrieval | Calculator, WikiSearch, Atlas QA Model, NLLB Translator | ChatGLM, Alpaca, Vicuna | SFT |
| 2023.02 | Toolformer | NeurIPS'23 Paper Code | QA, Math | Calculator, QA system, Search Engine, Translation System, Calendar | GPT-J | SFT |

A2: Agent Output Signaled

| Time | Method | Venue | Task(s) | Tool(s) | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|
| 2025.10 | TT-SI | arXiv Paper | Tool-Calling | Various APIs | Qwen2.5 | Test-Time Fine-Tuning |
| 2025.10 | A²FM | arXiv Paper Code | Web Navigation, Math, QA | Search Engine, Crawl, Code Executor | Qwen2.5 | APO, GRPO |
| 2025.08 | MedResearcher-R1 | arXiv Paper Code | Medical Multi-Hop QA | Medical Retriever, Web Search API, Document Reader | MedResearcher-R1 | SFT, GRPO |
| 2025.08 | Agent Lightning | arXiv Paper Code | Text-to-SQL, RAG, Math | SQL Executor, Retriever, Calculator | LLaMA3.2 | LightningRL |
| 2025.07 | CodePRM | ACL'25 Paper | Coding | Code Executor | Qwen2.5-Coder | SFT |
| 2025.07 | DynaSearcher | arXiv Paper Code | Multi-Hop QA, RAG | Document Search, KG Search | Qwen2.5, LLaMA3.1 | GRPO |
| 2025.06 | MMSearch-R1 | arXiv Paper Code | Web Browsing, QA, Multimodal Search | Image Search, Web Browsing, Retriever | Qwen2.5 | REINFORCE, SFT |
| 2025.06 | Self-Challenging | arXiv Paper | Web Browsing, Calculation, Retail, Airline | Code Interpreter, Web Browser, Database APIs | LLaMA3.1 | REINFORCE, SFT |
| 2025.05 | StepSearch | EMNLP'25 Paper Code | Multi-Hop QA | Search Engine, Retriever | Qwen2.5 | StePPO |
| 2025.05 | ZeroSearch | arXiv Paper Code | Multi-Hop QA, QA | Search Engine, Web Search | Qwen2.5, LLaMA3.2 | REINFORCE, GRPO, PPO, SFT |
| 2025.05 | AutoRefine | NeurIPS'25 Paper Code | Multi-Hop QA, QA | Retriever | Qwen2.5 | GRPO |
| 2025.04 | ReTool | arXiv Paper Code | Math | Code Interpreter | Qwen2.5 | PPO |
| 2025.04 | ToolRL | arXiv Paper Code | Tool-Calling | Various Tools | Various Models | GRPO |
| 2025.04 | DeepResearcher | arXiv Paper Code | QA, Multi-Hop Reasoning, Deep Research | Web Search API, Web Browser | Qwen2.5 | GRPO |
| 2025.03 | ReSearch | NeurIPS'25 Paper Code | QA | Search Engine, Retriever | Qwen2.5 | GRPO |
| 2025.03 | Search-R1 | COLM'25 Paper Code | QA | Search Engine, Retriever | Qwen2.5 | PPO, GRPO |
| 2025.03 | R1-Searcher | arXiv Paper Code | QA | Retriever | LLaMA3.1, Qwen2.5 | REINFORCE++ |
| 2025.02 | RAS | arXiv Paper Code | QA | Retriever | LLaMA2, LLaMA3.2 | SFT |
| 2025.01 | Agent-R | arXiv Paper Code | Various Tasks | Monte Carlo Tree Search | Qwen2.5, LLaMA3.2 | SFT |
| 2024.06 | Re-ReST | EMNLP'24 Paper Code | Multi-Hop QA, VQA, Sequential Decision, Coding | Various APIs | Various Models | DPO |
| 2024.06 | RPG | EMNLP'24 Paper Code | RAG, QA, Multi-Hop Reasoning | Search Engine, Retriever | LLaMA2, GPT3.5 | SFT |
| 2023.10 | Self-RAG | ICLR'24 Paper Code | RAG, QA, Fact Verification | Retriever | LLaMA2 | SFT |
| 2023.10 | FireAct | arXiv Paper Code | QA | Search API | GPT3.5, LLaMA2, CodeLLaMA | SFT |


| Time | Method | Venue | Task(s) | Tool(s) | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|
| 2025.10 | Empower | arXiv Paper Code | Coding | --- | Gemma3 | SFT |
| 2025.10 | KnowRL | arXiv Paper Code | Knowledge Calibration | --- | LLaMA3.1, Qwen2.5 | REINFORCE++ |
| 2025.10 | GRACE | arXiv Paper Code | Embedding Tasks | --- | Qwen2.5, Qwen3, LLaMA3.2 | GRPO |
| 2025.06 | Magistral | arXiv Paper | Math, Coding | --- | Magistral | PPO, GRPO |
| 2025.05 | EHRMind | arXiv Paper Code | EHR-based Reasoning | --- | LLaMA3 | SFT, GRPO |
| 2025.01 | Kimi k1.5 | arXiv Paper Code | Math, Coding | --- | Kimi k1.5 | GRPO |
| 2025.01 | DeepSeek-R1-Zero (Math) | Nature Paper | Math | --- | DeepSeek-V3-Base | GRPO |
| 2024.09 | SCoRe | ICLR'25 Paper Code | Math, Coding, QA | --- | Gemini1.0 Pro, Gemini1.5 Flash | REINFORCE |
| 2024.07 | RISE | NeurIPS'24 Paper Code | Math | --- | LLaMA2, LLaMA3, Mistral | SFT |
| 2024.06 | TextGrad | Nature Paper Code | Various Tasks | --- | GPT3.5, GPT4o | Prompt Tuning |
| 2023.03 | Self-Refine | NeurIPS'23 Paper Code | Dialogue, Math, Coding | --- | GPT3.5, GPT4, CODEX | Test-Time Prompting |

T1: Agent-Agnostic Tool Adaptation
Foundational Systems and Architectures

| Year.Month | Method Name | Venue | Paper Name |
|---|---|---|---|
| 2021.08 | Neural Operators | JMLR'23 Paper | Neural Operator: Learning Maps Between Function Spaces |
| 2023.09 | HuggingGPT | NeurIPS'23 Paper Code | HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face |
| 2023.08 | ViperGPT | ICCV'23 Paper Code | ViperGPT: Visual Inference via Python Execution for Reasoning |
| 2025.07 | SciToolAgent | Nature Comp. Sci.'25 Paper | SciToolAgent: A Knowledge-Graph-Driven Scientific Agent for Multitool Integration |

Categories and Training Methods

| Year.Month | Method Name | Venue | Paper Name |
|---|---|---|---|
| 2021.01 | CLIP | ICML'21 Paper Code | Learning Transferable Visual Models from Natural Language Supervision |
| 2023.04 | SAM | ICCV'23 Paper Code | Segment Anything |
| 2024.06 | SAM-CLIP | CVPR'24 Paper | SAM-CLIP: Merging Vision Foundation Models Towards Semantic and Spatial Understanding |
| 2022.12 | Whisper | ICML'23 Paper Code | Robust Speech Recognition via Large-Scale Weak Supervision |
| 2024.02 | CodeAct | ICML'24 Paper Code | Executable Code Actions Elicit Better LLM Agents |
| 2020.04 | DPR | EMNLP'20 Paper Code | Dense Passage Retrieval for Open-Domain Question Answering |
| 2020.04 | ColBERT | SIGIR'20 Paper Code | ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT |
| 2021.12 | Contriever | TMLR'22 Paper Code | Unsupervised Dense Information Retrieval with Contrastive Learning |
| 2022.12 | e5 | arXiv Paper Code | Text Embeddings by Weakly-Supervised Contrastive Pre-training |
| 2021.07 | AlphaFold2 | Nature Paper Code | Highly Accurate Protein Structure Prediction with AlphaFold |
| 2023.03 | ESMFold | Science Paper | Evolutionary-Scale Prediction of Atomic-Level Protein Structure with a Language Model |

T2: Agent-Supervised Tool Adaptation

| Time | Method | Venue | Task(s) | Tool Backbone | Agent Backbone | Tuning |
|---|---|---|---|---|---|---|
| 2025.10 | QAgent | arXiv Paper Code | QA, RAG | Qwen2.5-3B | Qwen-7B | GRPO |
| 2025.10 | AgentFlow | arXiv Paper Code | Web Search, Planning, Reasoning, Math | Qwen2.5-7B | Qwen2.5-7B | Flow-GRPO |
| 2025.10 | Advisor Models | arXiv Paper Code | Math, Reasoning | Qwen2.5-7B, Qwen3-8B | GPT-4o-Mini, GPT-5, Claude4-Sonnet, GPT-4.1-Mini | GRPO |
| 2025.10 | AutoGraph-R1 | arXiv Paper Code | KG Construction, RAG | KG Constructor (Qwen2.5-3B/7B) | Frozen RAG Generator (Qwen2.5-7B) | GRPO |
| 2025.10 | MAE | arXiv Paper Code | Math, Coding, Commonsense Reasoning | Qwen2.5-3B | Qwen2.5-3B | REINFORCE++ |
| 2025.09 | Mem-α | arXiv Paper Code | Retrieval, Test-Time Learning, Long-Range Understanding | Qwen3-4B | Qwen3-4B, Qwen3-32B, GPT-4.1-Mini | GRPO |
| 2025.08 | AI-SearchPlanner | arXiv Paper | Web QA | Qwen3-32B | Qwen2.5-7B | PPO |
| 2025.08 | Memento | arXiv Paper Code | Long-Horizon Reasoning, Web Research, QA, Academic Reasoning | Q-function (two-layer MLPs) | GPT-4.1 | Soft Q-Learning |
| 2025.08 | R-Zero | arXiv Paper Code | Math, Reasoning | Qwen3-4B, Qwen3-8B, OctoThinker-3B, OctoThinker-8B | Qwen3-4B, Qwen3-8B, OctoThinker-3B, OctoThinker-8B | GRPO |
| 2025.06 | Sysformer | arXiv Paper | QA, RAG | Small Transformer | LLaMA-2-7B, LLaMA-3.1-8B, Mistral-7B, Phi-3.5-mini, Zephyr-7B-beta | Supervised Learning |
| 2025.05 | s3 | EMNLP'25 Paper Code | QA, RAG | Qwen2.5-7B | Qwen2.5-7B, Qwen2.5-14B, Claude-3-Haiku | PPO |
| 2024.10 | Matryoshka Pilot | NeurIPS'25 Paper Code | Math, Planning, Reasoning | LLaMA3-8B, Qwen2.5-7B | GPT-4o-Mini, GPT-3.5-Turbo | DPO, IDPO |
| 2024.06 | CoBB | EMNLP'24 Paper Code | QA, Math | Mistral-7B-inst-v2 | GPT-3.5-Turbo, Claude-3-Haiku, Phi-3-mini-4k-inst, Gemma-1.1-7B-it, Mistral-7B-inst-v2 | SFT, ORPO |
| 2024.05 | MedAdapter | EMNLP'24 Paper Code | Medical QA, NLI, RQE | BERT-Base-Uncased | GPT-3.5-Turbo | SFT, BPO |
| 2024.03 | BLADE | AAAI'25 Paper Code | Domain-Specific QA | BLOOMZ-1b7 | ChatGPT, ChatGLM, Baichuan, Qwen | SFT, BPO |
| 2024.02 | ARL2 | ACL'24 Paper Code | QA | LLaMA2-7B | GPT-3.5-Turbo | Contrastive Learning |
| 2024.02 | EVOR | EMNLP'24 Paper Code | RAG-based Coding | GPT-3.5-Turbo | GPT-3.5-Turbo, CodeLLaMA | Prompt Engineering |
| 2024.02 | BBox-Adapter | ICML'24 Paper Code | QA | DeBERTa-v3-base (0.1B), DeBERTa-v3-large (0.3B) | GPT-3.5-Turbo, Mixtral-8x7B | Contrastive Learning |
| 2024.01 | Proxy-Tuning | COLM'24 Paper Code | QA, Math, Code | LLaMA2-7B | LLaMA2-70B | Proxy-Tuning |
| 2024.01 | BGM | ACL'24 Paper | QA, Personalized Generation (NQ, HotpotQA, Email, Book) | T5-XXL-11B | PaLM2-S | SFT, PPO |
| 2023.10 | RA-DIT | ICLR'24 Paper | Knowledge-Intensive Tasks (MMLU, NQ, TQA, ELI5, HotpotQA, etc.) | DRAGON+ | LLaMA-65B | SFT, LSR |
| 2023.06 | LLM-R | EACL'24 Paper Code | Zero-Shot NLU (Reading Comprehension, QA, NLI, Paraphrase, Sentiment, Summarization) | E5-base | GPT-Neo-2.7B, LLaMA-13B, GPT-3.5-Turbo | Contrastive Learning |
| 2023.05 | AAR | ACL'23 Paper Code | Zero-Shot Generalization (MMLU, PopQA) | ANCE, Contriever | Flan-T5-Small, InstructGPT | Contrastive Learning |
| 2023.05 | ToolkenGPT | NeurIPS'23 Paper Code | Numerical Reasoning, QA, Plan Generation | Token Embedding | GPT-J 6B, OPT-6.7B, OPT-13B | Proxy-Tuning |
| 2023.03 | UPRISE | EMNLP'23 Paper Code | Zero-Shot NLU (Reading Comprehension, QA, NLI, Paraphrase, Sentiment, Summarization) | GPT-Neo-2.7B | BLOOM-7.1B, OPT-66B, GPT-3-175B | Contrastive Learning |
| 2023.01 | REPLUG | NAACL'24 Paper Code | QA | Contriever | GPT3-175B, PaLM, Codex, LLaMA-13B | Proxy-Tuning, LSR |

We welcome contributions! Please feel free to submit a Pull Request to add new papers or update existing entries.
(ノ◕ヮ◕)ノ*:・゚✧ Keep exploring the awesome world of agentic AI! ✧゚・: *ヽ(◕ヮ◕ヽ)