Idea2Paper is an end-to-end research agent framework that aims to systematically define and analyze the major stages of the contemporary research process, along with the core challenges inherent to each stage. Rather than treating paper writing as a monolithic generation problem, Idea2Paper explicitly decomposes scientific research into structured phases and identifies critical bottlenecks that hinder the transformation of raw ideas into coherent, submission-ready academic narratives. Through this analysis, Idea2Paper highlights that one of the most fundamental yet underexplored challenges lies in research paradigm generation: the process of converting an underspecified research idea into a logically consistent, academically grounded research story. Existing systems often struggle to produce stable and reusable research paradigms, especially when reasoning is performed entirely at runtime and under limited contextual grounding.
To address these challenges in a principled and engineering-oriented manner, Idea2Paper adopts a modular system design. Instead of immediately building a fully end-to-end writing system, the project prioritizes the construction of targeted engineering submodules that tackle specific bottlenecks in the research pipeline. As the first and core engineering submodule, Idea2Story is introduced to directly address the problem of research paradigm generation. Idea2Story focuses on transforming underspecified research ideas into complete, coherent, and submission-ready scientific narrative skeletons. By providing a structured research story as an intermediate representation, Idea2Story establishes a stable foundation for downstream stages such as method development, experiment design, and paper writing.
Idea2Paper : https://www.researchgate.net/publication/400280248_Idea2Paper_What_Should_an_End-to-End_Research_Agent_Really_Do
Idea2Story : https://arxiv.org/abs/2601.20833
Idea2Story introduces a pre-computation-driven framework that shifts literature understanding from runtime reasoning to offline knowledge graph construction, enabling more efficient and reliable autonomous scientific discovery.
- Knowledge-Driven: Uses ICLR data to build a comprehensive knowledge graph.
- Auditable Review: Implements an anchored multi-agent review system for objective feedback.
- Automated Refinement: Includes RAG deduplication and intelligent revision to enhance novelty.
- Knowledge Graph: Built from ICLR data with Idea/Pattern/Domain/Paper nodes.
- Advanced Retrieval: Three-path retrieval (Idea/Domain/Paper) with two-stage ranking (Jaccard + Embedding).
- Idea2Story Generation: From pattern selection to story generation, anchored review, and smart correction.
- Anchored Multi-Agent Review: Uses real review statistics as anchors for relative comparisons, producing deterministic and auditable 1-10 scores.
- Comprehensive Logging: Per-run structured logs for full reproducibility and auditing.
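The two-stage ranking mentioned above (Jaccard + Embedding) can be illustrated with a minimal sketch. It assumes a cheap Jaccard filter over token sets followed by a cosine-similarity rerank over embeddings. The function names, the candidate keys (`id`, `tokens`, `vec`), and the `coarse_k` cutoff are hypothetical and are not taken from the project's code.

```python
import math


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def two_stage_rank(query_tokens, query_vec, candidates, coarse_k=2):
    """Rank candidates in two stages (illustrative assumption, not the project API).

    candidates: list of dicts with hypothetical keys 'id', 'tokens', 'vec'.
    """
    # Stage 1: keep the coarse_k candidates with the highest lexical overlap.
    coarse = sorted(
        candidates,
        key=lambda c: jaccard(query_tokens, c["tokens"]),
        reverse=True,
    )[:coarse_k]
    # Stage 2: rerank only the survivors by embedding similarity.
    return sorted(coarse, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)


# Toy data: in practice the vectors would come from an embedding model.
toy_candidates = [
    {"id": "a", "tokens": {"graph", "retrieval", "ranking"}, "vec": [1.0, 0.0]},
    {"id": "b", "tokens": {"graph", "retrieval"}, "vec": [0.6, 0.8]},
    {"id": "c", "tokens": {"protein", "folding"}, "vec": [0.0, 1.0]},
]
ranked = two_stage_rank({"graph", "retrieval", "knowledge"}, [0.0, 1.0], toy_candidates)
print([c["id"] for c in ranked])
```

In this toy run, candidate `c` is dropped by the lexical filter even though its embedding matches the query best, which shows the trade-off a coarse first stage makes to keep the embedding comparison cheap.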
- `Paper-KG-Pipeline/output/final_story.json`: Final structured Story (title/abstract/problem/method/contribs/experiments).
- `Paper-KG-Pipeline/output/pipeline_result.json`: Full pipeline trace (reviews, corrections, audits).
- `log/run_.../`: Structured logs for every run.
- Python 3.10+
`pip install -r Paper-KG-Pipeline/requirements.txt`

Note: The embedding model is currently fixed to `Qwen/Qwen3-Embedding-8B` (SiliconFlow) and cannot be changed yet. We plan to support more embedding models and providers in future updates.
- Copy `.env.example` to `.env` and fill in `SILICONFLOW_API_KEY`.
- (Optional) Copy `i2p_config.example.json` to `i2p_config.json` to tweak settings.
`python Paper-KG-Pipeline/scripts/idea2story_pipeline.py "your research idea"`

Run a minimal local UI to launch the pipeline and view only the high-level stages and final results (no raw logs on screen):

`python frontend/server/app.py --host 127.0.0.1 --port 8080`

Open in your browser: http://127.0.0.1:8080/
- Run the same pipeline entrypoint (`idea2story_pipeline.py`) from a web page.
- Configure `SILICONFLOW_API_KEY`, `LLM_API_URL`, `LLM_MODEL` for the current run (not persisted by the server).
- Toggle Novelty / Verification.
- Download the current run logs as a zip.
For more details, see frontend/README.md.
output/
├── final_story.json       # Final generated paper story
├── pipeline_result.json   # Full pipeline results
└── log.json               # Detailed logs
Check final_story.json for the result and pipeline_result.json for the full process.
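As a quick sanity check on a finished run, a short script along these lines can confirm that the story file contains the expected sections. This is only a sketch: the JSON key names are assumed to match the field list given earlier (title/abstract/problem/method/contribs/experiments), and the helper `missing_story_fields` is hypothetical, not part of the project.

```python
import json
from pathlib import Path

# Assumed key names, taken from the field list in this README; the real
# keys in final_story.json may differ.
EXPECTED_FIELDS = ("title", "abstract", "problem", "method", "contribs", "experiments")


def missing_story_fields(path):
    """Return the expected fields that are absent or empty in the story file."""
    story = json.loads(Path(path).read_text(encoding="utf-8"))
    return [field for field in EXPECTED_FIELDS if not story.get(field)]


story_path = Path("Paper-KG-Pipeline/output/final_story.json")
if story_path.exists():
    missing = missing_story_fields(story_path)
    print("Missing fields:", missing if missing else "none")
else:
    print(f"{story_path} not found; run the pipeline first.")
```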
See the User Guide for advanced configuration, troubleshooting, and detailed usage examples.
Got questions? You can contact us through any of the options listed below.
Discord: https://discord.gg/FfXtbREb

Scan the QR code with WeChat to communicate
Instead of arbitrary scores, this project uses anchored comparisons. We select anchor papers with known scores, ask LLMs to compare your target against these anchors (better/tie/worse), and then deterministically fit a final numeric score. This ensures the review process is auditable and grounded in real-world data.
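To make the idea concrete, here is a minimal sketch of how better/tie/worse judgments against anchors could be turned into a single deterministic score. The hinge-style loss, the `margin` value, the 0.1-step grid over the 1-10 range, and the function name `fit_anchored_score` are all assumptions chosen for illustration; the project's actual fitting procedure is described in MULTIAGENT_REVIEW.md and may differ.

```python
def fit_anchored_score(comparisons, margin=0.5):
    """Fit a score in [1, 10] from anchor comparisons (illustrative assumption).

    comparisons: list of (anchor_score, outcome) pairs, where outcome says how
    the target compares to that anchor: 'better', 'tie', or 'worse'.
    """

    def loss(score):
        total = 0.0
        for anchor, outcome in comparisons:
            if outcome == "better":
                # Target should sit at least `margin` above the anchor.
                total += max(0.0, anchor + margin - score)
            elif outcome == "worse":
                # Target should sit at least `margin` below the anchor.
                total += max(0.0, score - (anchor - margin))
            elif outcome == "tie":
                # Target should sit close to the anchor.
                total += abs(score - anchor)
            else:
                raise ValueError(f"unknown outcome: {outcome!r}")
        return total

    # Fixed grid plus min() returning the first minimizer: the same
    # comparisons always yield the same score.
    grid = [i / 10 for i in range(10, 101)]
    return min(grid, key=loss)


comparisons = [(5.0, "better"), (6.0, "tie"), (8.0, "worse")]
print(fit_anchored_score(comparisons))  # 6.0
```

Because the LLM only supplies categorical judgments and the numeric step is a plain search, each final score can be traced back to the specific anchors and outcomes that produced it.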
- Core Code: `Paper-KG-Pipeline/src/idea2paper/`
- Documentation:
| No. | Document | Content | Target Audience |
|---|---|---|---|
| 0 | Project Overview | Overall architecture, core modules, parameter configuration, execution workflow | Everyone |
| 1 | Knowledge Graph Construction | Data sources, node/edge definitions, LLM enhancement, how to run | Developers |
| 2 | Retrieval System | Three-way retrieval strategies, similarity computation, performance optimization | Developers |
| 3 | Idea2Story Pipeline | Pattern selection, Idea fusion, story reflection, critic review | Developers |
- Review Details: MULTIAGENT_REVIEW.md
We welcome PRs and Issues! Please follow the contribution guidelines. Licensed under the MIT License.
- Data Source: ICLR (see KG construction docs)
- Inspiration: Auditable, anchor-centered review processes.
- Community Support: agentAlpha Community
If you find Idea2Story useful, please cite:
@misc{xu2026idea2storyautomatedpipelinetransforming,
title={Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives},
author={Tengyue Xu and Zhuoyang Qian and Gaoge Liu and Li Ling and Zhentao Zhang and Biao Wu and Shuo Zhang and Ke Lu and Wei Shi and Ziqi Wang and Zheng Feng and Yan Luo and Shu Xu and Yongjin Chen and Zhibo Feng and Zhuo Chen and Bruce Yuan and Harry Wang and Kris Chen},
year={2026},
eprint={2601.20833},
archivePrefix={arXiv},
primaryClass={cs.CE},
url={https://arxiv.org/abs/2601.20833}
}
