Yingxuan Yang¹, Mulei Ma², Yuxuan Huang³, Huacan Chai¹, Chenyu Gong², Haoran Geng⁴, Yuanjian Zhou⁵, Ying Wen¹, Meng Fang³, Muhao Chen⁶, Shangding Gu⁴*, Ming Jin⁷, Costas Spanos⁴, Yang Yang², Pieter Abbeel⁴, Dawn Song⁴, Weinan Zhang¹,⁵*, Jun Wang⁸*
¹Shanghai Jiao Tong University, ²The Hong Kong University of Science and Technology, Guangzhou, ³University of Liverpool, ⁴University of California, Berkeley, ⁵Shanghai Innovation Institute, ⁶University of California, Davis, ⁷Virginia Tech, ⁸University College London
* Corresponding authors (Project Lead).
This repository collects research on the Agentic Web and covers a range of related studies. If any authors do not want their paper to be listed here, please feel free to contact [email protected]. (This repository is under active development. We appreciate any constructive comments and suggestions.)
You are more than welcome to update this list! If you find a paper about the agentic web that is not listed here, please
- fork this repository, add it and merge back;
- or report an issue here;
- or email [email protected]
Content
- Agentic Web Development
- Information Retrieval
- Recommendation
- Agent Planning
- Multi-Agent Learning
- Safety and Security
- Benchmark
- Citation
Agentic Web Development
- A Survey of AI Agent Registry Solutions by Aditi Singh, Abul Ehtesham, Ramesh Raskar, Mahesh Lambe, Pradyumna Chari, Jared James Grogan, Abhishek Singh, Saket Kumar. 2025
- Using the NANDA Index Architecture in Practice: An Enterprise Perspective by Sichao Wang, Ramesh Raskar, Mahesh Lambe, Pradyumna Chari, Rekha Singhal, Shailja Gupta, Rajesh Ranjan, Ken Huang. 2025
- Web3 x AI Agents: Landscape, Integrations, and Foundational Challenges by Yiming Shen, Jiashuo Zhang, Zhenzhe Shao, Wenxuan Luo, Yanlin Wang, Ting Chen, Zibin Zheng, Jiachi Chen. 2025
- Plan-and-act: Improving planning of agents for long-horizon tasks by Erdogan, Lutfi Eren, Nicholas Lee, Sehoon Kim, Suhong Moon, Hiroki Furuta, Gopala Anumanchipalli, Kurt Keutzer, and Amir Gholami. 2025
- A survey of webagents: Towards next-generation ai agents for web automation with large foundation models by Ning, Liangbo, Ziran Liang, Zhuohang Jiang, Haohao Qu, Yujuan Ding, Wenqi Fan, Xiao-yong Wei et al. 2025
- WebDancer: Towards Autonomous Information Seeking Agency by Wu, Jialong, Baixuan Li, Runnan Fang, Wenbiao Yin, Liwen Zhang, Zhengwei Tao, Dingchu Zhang et al. 2025
- From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents by Zhang, Weizhi, Yangning Li, Yuanchen Bei, Junyu Luo, Guancheng Wan, Liangwei Yang, Chenxuan Xie et al. 2025
- MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning by Nguyen, Thang, Peter Chin, and Yu-Wing Tai. 2025
- Deep Research Agents: A Systematic Examination And Roadmap by Huang, Yuxuan, Yihang Chen, Haozheng Zhang, Kang Li, Meng Fang, Linyi Yang, Xiaoguang Li et al. 2025
- Smartagent: Chain-of-user-thought for embodied personalized agent in cyber world by Zhang, Jiaqi, Chen Gao, Liyuan Zhang, Yong Li, and Hongzhi Yin. 2025
- ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation by Wang, Shu, Yixiang Fang, Yingli Zhou, Xilin Liu, and Yuchi Ma. 2025
- Macrec: A multi-agent collaboration framework for recommendation by Wang, Zhefan, Yuanqing Yu, Wendi Zheng, Weizhi Ma, and Min Zhang. 2024
- Webarena: A realistic web environment for building autonomous agents by Zhou, Shuyan, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng et al. 2023
- Toolllm: Facilitating large language models to master 16000+ real-world apis by Qin, Yujia, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin et al. 2023
- Api-bank: A comprehensive benchmark for tool-augmented llms by Li, Minghao, Yingxiu Zhao, Bowen Yu, Feifan Song, Hangyu Li, Haiyang Yu, Zhoujun Li, Fei Huang, and Yongbin Li. 2023
- React: Synergizing reasoning and acting in language models by Yao, Shunyu, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2023
- Voyager: An open-ended embodied agent with large language models by Wang, Guanzhi, Yuqi Xie, Yunfan Jiang, Ajay Mandlekar, Chaowei Xiao, Yuke Zhu, Linxi Fan, and Anima Anandkumar. 2023
- Toolformer: Language models can teach themselves to use tools by Schick, Timo, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. 2023
- Swe-bench: Can language models resolve real-world github issues? by Jimenez, Carlos E., John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, and Karthik Narasimhan. 2023
- Hugginggpt: Solving ai tasks with chatgpt and its friends in hugging face by Shen, Yongliang, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, and Yueting Zhuang. 2023
- Training language models to follow instructions with human feedback by Ouyang, Long, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang et al. 2022
- Webgpt: Browser-assisted question-answering with human feedback by Nakano, Reiichiro, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse et al. 2021
- Deep reinforcement learning for list-wise recommendations by Zhao, Xiangyu, Liang Zhang, Long Xia, Zhuoye Ding, Dawei Yin, and Jiliang Tang. 2019
- SlateQ: A Tractable Decomposition for Reinforcement Learning with Recommendation Sets by Ie, Eugene, Vihan Jain, Jing Wang, Sanmit Narvekar, Ritesh Agarwal, Rui Wu, Heng-Tze Cheng, Tushar Chandra, and Craig Boutilier. 2019
Information Retrieval
- Agentic information retrieval by Zhang, Weinan, Junwei Liao, Ning Li, Kounianhua Du, and Jianghao Lin. 2025
- Large language models for generative information extraction: A survey by Xu, Derong, Wei Chen, Wenjun Peng, Chao Zhang, Tong Xu, Xiangyu Zhao, Xian Wu, Yefeng Zheng, Yang Wang, and Enhong Chen. 2024
- Bias and unfairness in information retrieval systems: New challenges in the llm era by Dai, Sunhao, Chen Xu, Shicheng Xu, Liang Pang, Zhenhua Dong, and Jun Xu. 2024
- Retrieval-augmented code generation for universal information extraction by Guo, Yucan, Zixuan Li, Xiaolong Jin, Yantao Liu, Yutao Zeng, Wenxuan Liu, Xiang Li et al. 2024
- Large language models for information retrieval: A survey by Zhu, Yutao, Huaying Yuan, Shuting Wang, Jiongnan Liu, Wenhan Liu, Chenlong Deng, Haonan Chen, Zheng Liu, Zhicheng Dou, and Ji-Rong Wen. 2023
- Inpars-v2: Large language models as efficient dataset generators for information retrieval by Jeronymo, Vitor, Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Roberto Lotufo, Jakub Zavrel, and Rodrigo Nogueira. 2023
- Unified structure generation for universal information extraction by Lu, Yaojie, Qing Liu, Dai Dai, Xinyan Xiao, Hongyu Lin, Xianpei Han, Le Sun, and Hua Wu. 2022
- An Adversarial Imitation Click Model for Information Retrieval by Dai, Xinyi, Jianghao Lin, Weinan Zhang, Shuai Li, Weiwen Liu, Ruiming Tang, Xiuqiang He, Jianye Hao, Jun Wang, and Yong Yu. 2021
- A survey on deep matrix factorizations by De Handschutter, Pierre, Nicolas Gillis, and Xavier Siebert. 2021
- Deepcf: A unified framework of representation learning and matching function learning in recommender system by Deng, Zhi-Hong, Ling Huang, Chang-Dong Wang, Jian-Huang Lai, and Philip S. Yu. 2019
- On the Equilibrium of Query Reformulation and Document Retrieval by Zou, Shihao, Guanyu Tao, Jun Wang, Weinan Zhang, and Dell Zhang. 2018
- Irgan: A minimax game for unifying generative and discriminative information retrieval models by Wang, Jun, Lantao Yu, Weinan Zhang, Yu Gong, Yinghui Xu, Benyou Wang, Peng Zhang, and Dell Zhang. 2017
- Neural collaborative filtering by He, Xiangnan, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017
- DeepFM: a factorization-machine based neural network for CTR prediction by Guo, Huifeng, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017
- Wide & deep learning for recommender systems by Cheng, Heng-Tze, Levent Koc, Jeremiah Harmsen, Tal Shaked, Tushar Chandra, Hrishi Aradhye, Glen Anderson et al. 2016
- Autorec: Autoencoders meet collaborative filtering by Sedhain, Suvash, Aditya Krishna Menon, Scott Sanner, and Lexing Xie. 2015
- Top-k Retrieval using Facility Location Analysis by Zuccon, Guido, Leif Azzopardi, Dell Zhang, and Jun Wang. 2012
- Mean-variance analysis: A new document ranking theory in information retrieval by Wang, Jun. 2009
- Portfolio theory of information retrieval by Wang, Jun, and Jianhan Zhu. 2009
- The probabilistic relevance framework: BM25 and beyond by Robertson, Stephen, and Hugo Zaragoza. 2009
- Matrix factorization techniques for recommender systems by Koren, Yehuda, Robert Bell, and Chris Volinsky. 2009
- Internet advertising and the generalized second-price auction: Selling billions of dollars worth of keywords by Edelman, Benjamin, Michael Ostrovsky, and Michael Schwarz. 2007
- A user-item relevance model for log-based collaborative filtering by Wang, Jun, Arjen P. De Vries, and Marcel JT Reinders. 2006
- Item-based collaborative filtering recommendation algorithms by Sarwar, Badrul, George Karypis, Joseph Konstan, and John Riedl. 2001
- The PageRank citation ranking: Bringing order to the web by Page, Lawrence, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999
- Indexing by latent semantic analysis by Deerwester, Scott, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, and Richard Harshman. 1990
- A statistical interpretation of term specificity and its application in retrieval by Sparck Jones, Karen. 1972
Recommendation
- A survey of large language model empowered agents for recommendation and search: Towards next-generation information retrieval by Zhang, Yu, Shutong Qiao, Jiaqi Zhang, Tzu-Heng Lin, Chen Gao, and Yong Li. 2025
- A survey on llm-powered agents for recommender systems by Peng, Qiyao, Hongtao Liu, Hua Huang, Qing Yang, and Minglai Shao. 2025
- AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems by Shang, Yu, Peijie Liu, Yuwei Yan, Zijing Wu, Leheng Sheng, Yuanqing Yu, Chumeng Jiang et al. 2025
- Deep reinforcement learning based resource allocation for network slicing with massive MIMO by Yan, Dandan, Benjamin K. Ng, Wei Ke, and Chan-Tong Lam. 2024
- Macrec: A multi-agent collaboration framework for recommendation by Wang, Zhefan, Yuanqing Yu, Wendi Zheng, Weizhi Ma, and Min Zhang. 2024
- Raserec: Retrieval-augmented sequential recommendation by Zhao, Xinping, Baotian Hu, Yan Zhong, Shouzheng Huang, Zihao Zheng, Meng Wang, Haofen Wang, and Min Zhang. 2024
- Probing early modification of gravity with Planck, ACT and SPT by Abellán, Guillermo Franco, Matteo Braglia, Mario Ballardini, Fabio Finelli, and Vivian Poulin. 2023
- SlateQ: A Tractable Decomposition for Reinforcement Learning with Recommendation Sets by Ie, Eugene, Vihan Jain, Jing Wang, Sanmit Narvekar, Ritesh Agarwal, Rui Wu, Heng-Tze Cheng, Tushar Chandra, and Craig Boutilier. 2019
- Novelty and diversity metrics for recommender systems: choice, discovery and relevance by Castells, Pablo, Saúl Vargas, and Jun Wang. 2011
Agent Planning
- Plangenllms: A modern survey of llm planning capabilities by Wei, Hui, Zihao Zhang, Shenghua He, Tian Xia, Shijia Pan, and Fei Liu. 2025
- Plan-and-act: Improving planning of agents for long-horizon tasks by Erdogan, Lutfi Eren, Nicholas Lee, Sehoon Kim, Suhong Moon, Hiroki Furuta, Gopala Anumanchipalli, Kurt Keutzer, and Amir Gholami. 2025
- Acpbench: Reasoning about action, change, and planning by Kokel, Harsha, Michael Katz, Kavitha Srinivas, and Shirin Sohrabi. 2025
- Natural plan: Benchmarking llms on natural language planning by Zheng, Huaixiu Steven, Swaroop Mishra, Hugh Zhang, Xinyun Chen, Minmin Chen, Azade Nova, Le Hou et al. 2024
- Adaplanner: Adaptive planning from feedback with language models by Sun, Haotian, Yuchen Zhuang, Lingkai Kong, Bo Dai, and Chao Zhang. 2023
- Toolllm: Facilitating large language models to master 16000+ real-world apis by Qin, Yujia, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin et al. 2023
- Webarena: A realistic web environment for building autonomous agents by Zhou, Shuyan, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng et al. 2023
Multi-Agent Learning
- Realm-bench: A real-world planning benchmark for llms and multi-agent systems by Geng, Longling, and Edward Y. Chang. 2025
- Autogen: Enabling next-gen LLM applications via multi-agent conversations by Wu, Qingyun, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang et al. 2024
- Agentboard: An analytical evaluation board of multi-turn llm agents by Chang, Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, and Junxian He. 2024
- Learning to use tools via cooperative and interactive agents by Shi, Zhengliang, Shen Gao, Xiuyi Chen, Yue Feng, Lingyong Yan, Haibo Shi, Dawei Yin, Pengjie Ren, Suzan Verberne, and Zhaochun Ren. 2024
- Camel: Communicative agents for "mind" exploration of large language model society by Li, Guohao, Hasan Hammoud, Hani Itani, Dmitrii Khizbullin, and Bernard Ghanem. 2023
- Agentverse: Facilitating multi-agent collaboration and exploring emergent behaviors in agents by Chen, Weize, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chen Qian, Chi-Min Chan et al. 2023
- Metagpt: Meta programming for multi-agent collaborative framework by Hong, Sirui, Xiawu Zheng, Jonathan Chen, Yuheng Cheng, Jinlin Wang, Ceyao Zhang, Zili Wang et al. 2023
- Taxai: A dynamic economic simulator and benchmark for multi-agent reinforcement learning by Mi, Qirui, Siyu Xia, Yan Song, Haifeng Zhang, Shenghao Zhu, and Jun Wang. 2023
- A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems by Slumbers, Oliver, David Henry Mguni, Stefano B. Blumberg, Stephen Marcus Mcaleer, Yaodong Yang, and Jun Wang. 2023
- Chatdev: Communicative agents for software development by Qian, Chen, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang et al. 2023
- MarlRank: Multi-agent Reinforced Learning to Rank by Zou, Shihao, Zhonghua Li, Mohammad Akbari, Jun Wang, and Peng Zhang. 2019
- Magent: A many-agent reinforcement learning platform for artificial collective intelligence by Zheng, Lianmin, Jiacheng Yang, Han Cai, Ming Zhou, Weinan Zhang, Jun Wang, and Yong Yu. 2018
- Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising by Jin, Junqi, Chengru Song, Han Li, Kun Gai, Jun Wang, and Weinan Zhang. 2018
Safety and Security
- Securing agentic ai: A comprehensive threat model and mitigation framework for generative ai agents by Narajala, Vineeth Sai, and Om Narayan. 2025
- Open challenges in multi-agent security: Towards secure systems of interacting ai agents by de Witt, Christian Schroeder. 2025
- Model context protocol (mcp): Landscape, security threats, and future research directions by Hou, Xinyi, Yanjie Zhao, Shenao Wang, and Haoyu Wang. 2025
- Ai agents under threat: A survey of key security challenges and future pathways by Deng, Zehang, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, and Yang Xiang. 2025
- Enterprise-grade security for the model context protocol (mcp): Frameworks and mitigation strategies by Narajala, Vineeth Sai, and Idan Habler. 2025
- Position: AI Safety Must Embrace an Antifragile Perspective by Jin, Ming, and Hyunin Lee. 2025
- Red-teaming llm multi-agent systems via communication attacks by He, Pengfei, Yupin Lin, Shen Dong, Han Xu, Yue Xing, and Hui Liu. 2025
- Skin-in-the-game: Decision making via multi-stakeholder alignment in llms by Sel, Bilgehan, Priya Shanmugasundaram, Mohammad Kachuee, Kun Zhou, Ruoxi Jia, and Ming Jin. 2024
- Ai safety in generative ai large language models: A survey by Chua, Jaymari, Yun Li, Shiyi Yang, Chen Wang, and Lina Yao. 2024
- Agent-safetybench: Evaluating the safety of llm agents by Zhang, Zhexin, Shiyao Cui, Yida Lu, Jingzhuo Zhou, Junxiao Yang, Hongning Wang, and Minlie Huang. 2024
- Mart: Improving llm safety with multi-round automatic red-teaming by Ge, Suyu, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, and Yuning Mao. 2023
- Aart: Ai-assisted red-teaming with diverse data generation for new llm-powered applications by Radharapu, Bhaktipriya, Kevin Robinson, Lora Aroyo, and Preethi Lahoti. 2023
- Red teaming language models with language models by Perez, Ethan, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, and Geoffrey Irving. 2022
- Improving alignment of dialogue agents via targeted human judgements by Glaese, Amelia, Nat McAleese, Maja Trębacz, John Aslanides, Vlad Firoiu, Timo Ewalds, Maribeth Rauh et al. 2022
- Adversarial training for high-stakes reliability by Ziegler, Daniel, Seraphina Nix, Lawrence Chan, Tim Bauman, Peter Schmidt-Nielsen, Tao Lin, Adam Scherlis et al. 2022
- Analyzing dynamic adversarial training data in the limit by Wallace, Eric, Adina Williams, Robin Jia, and Douwe Kiela. 2021
- Dynabench: Rethinking benchmarking in NLP by Kiela, Douwe, Max Bartolo, Yixin Nie, Divyansh Kaushik, Atticus Geiger, Zhengxuan Wu, Bertie Vidgen et al. 2021
- Beyond accuracy: Behavioral testing of NLP models with CheckList by Ribeiro, Marco Tulio, Tongshuang Wu, Carlos Guestrin, and Sameer Singh. 2020
- HateCheck: Functional tests for hate speech detection models by Röttger, Paul, Bertram Vidgen, Dong Nguyen, Zeerak Waseem, Helen Margetts, and Janet B. 2020
- Recipes for safety in open-domain chatbots by Xu, Jing, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, and Emily Dinan. 2020
- Counterfactual fairness in text classification through robustness by Garg, Sahaj, Vincent Perot, Nicole Limtiaco, Ankur Taly, Ed H. Chi, and Alex Beutel. 2019
- Avoiding reasoning shortcuts: Adversarial evaluation, training, and model development for multi-hop QA by Jiang, Yichen, and Mohit Bansal. 2019
- Build it break it fix it for dialogue safety: Robustness from adversarial human attack by Dinan, Emily, Samuel Humeau, Bharath Chintagunta, and Jason Weston. 2019
- Adversarial NLI: A new benchmark for natural language understanding by Nie, Yixin, Adina Williams, Emily Dinan, Mohit Bansal, Jason Weston, and Douwe Kiela. 2019
- The malicious use of artificial intelligence: Forecasting, prevention, and mitigation by Brundage, Miles, Shahar Avin, Jack Clark, Helen Toner, Peter Eckersley, Ben Garfinkel, Allan Dafoe et al. 2018
- Measuring and mitigating unintended bias in text classification by Dixon, Lucas, John Li, Jeffrey Sorensen, Nithum Thain, and Lucy Vasserman. 2018
- Adversarial examples for evaluating reading comprehension systems by Jia, Robin, and Percy Liang. 2017
- Concrete problems in AI safety by Amodei, Dario, Chris Olah, Jacob Steinhardt, Paul Christiano, John Schulman, and Dan Mané. 2016
Benchmark
- Safearena: Evaluating the safety of autonomous web agents by Tur, Ada Defne, Nicholas Meade, Xing Han Lù, Alejandra Zambrano, Arkil Patel, Esin Durmus, Spandana Gella, Karolina Stańczak, and Siva Reddy. 2025
- An Illusion of Progress? Assessing the Current State of Web Agents by Xue, Tianci, Weijian Qi, Tianneng Shi, Chan Hee Song, Boyu Gou, Dawn Song, Huan Sun, and Yu Su. 2025
- Workarena: How capable are web agents at solving common knowledge work tasks? by Drouin, Alexandre, Maxime Gasse, Massimo Caccia, Issam H. Laradji, Manuel Del Verme, Tom Marty, Léo Boisvert et al. 2024
- The browsergym ecosystem for web agent research by Le Sellier De Chezelles, Thibault, Sahar Omidi Shayegan, Lawrence Keunho Jang, Xing Han Lù, Ori Yoran, Dehan Kong et al. 2024
- AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks? by Yoran, Ori, Samuel Joseph Amouyal, Chaitanya Malaviya, Ben Bogin, Ofir Press, and Jonathan Berant. 2024
- VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks by Koh, Jing Yu, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, and Daniel Fried. 2024
- St-webagentbench: A benchmark for evaluating safety and trustworthiness in web agents by Levy, Ido, Ben Wiesel, Sami Marreed, Alon Oved, Avi Yaeli, and Segev Shlomov. 2024
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents by Yuan, Tongxin, Zhiwei He, Lingzhong Dong, Yiming Wang, Ruijie Zhao, Tian Xia, Lizhen Xu et al. 2024
- Webcanvas: Benchmarking web agents in online environments by Pan, Yichen, Dehan Kong, Sida Zhou, Cheng Cui, Yifei Leng, Bing Jiang, Hangyu Liu et al. 2024
- WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models by He, Hongliang, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, and Dong Yu. 2024
- Mind2web: Towards a generalist agent for the web by Deng, Xiang, Yu Gu, Boyuan Zheng, Shijie Chen, Sam Stevens, Boshi Wang, Huan Sun, and Yu Su. 2023
- Webshop: Towards scalable real-world web interaction with grounded language agents by Yao, Shunyu, Howard Chen, John Yang, and Karthik Narasimhan. 2022
Citation
If you find this repository useful, please cite the study:
@article{yang2025agentic,
title={Agentic Web: Weaving the Next Web with AI Agents},
author={Yang, Yingxuan and Ma, Mulei and Huang, Yuxuan and Chai, Huacan and Gong, Chenyu and Geng, Haoran and Zhou, Yuanjian and Wen, Ying and Fang, Meng and Chen, Muhao and others},
journal={arXiv preprint arXiv:2507.21206},
year={2025}
}