drhiidden/Idea2Paper

📄 Idea2Paper

Idea2Paper is an end-to-end research agent framework that aims to systematically define and analyze the major stages of the contemporary research process, along with the core challenges inherent to each stage. Rather than treating paper writing as a monolithic generation problem, Idea2Paper explicitly decomposes scientific research into structured phases and identifies critical bottlenecks that hinder the transformation of raw ideas into coherent, submission-ready academic narratives. Through this analysis, Idea2Paper highlights that one of the most fundamental yet underexplored challenges lies in research paradigm generation: the process of converting an underspecified research idea into a logically consistent, academically grounded research story. Existing systems often struggle to produce stable and reusable research paradigms, especially when reasoning is performed entirely at runtime and under limited contextual grounding.

To address these challenges in a principled and engineering-oriented manner, Idea2Paper adopts a modular system design. Instead of immediately building a fully end-to-end writing system, the project prioritizes the construction of targeted engineering submodules that tackle specific bottlenecks in the research pipeline. As the first and core engineering submodule, Idea2Story is introduced to directly address the problem of research paradigm generation. Idea2Story focuses on transforming underspecified research ideas into complete, coherent, and submission-ready scientific narrative skeletons. By providing a structured research story as an intermediate representation, Idea2Story establishes a stable foundation for downstream stages such as method development, experiment design, and paper writing.

Idea2Paper : https://www.researchgate.net/publication/400280248_Idea2Paper_What_Should_an_End-to-End_Research_Agent_Really_Do

Idea2Story : https://arxiv.org/abs/2601.20833

Idea2Story (Core Submodule of Idea2Paper)

Idea2Story introduces a pre-computation–driven framework that shifts literature understanding from runtime reasoning to offline knowledge graph construction, enabling more efficient and reliable autonomous scientific discovery.

🧠 Core Philosophy

  • Knowledge-Driven: Uses ICLR data to build a comprehensive knowledge graph.
  • Auditable Review: Implements an anchored multi-agent review system for objective feedback.
  • Automated Refinement: Includes RAG deduplication and intelligent revision to enhance novelty.

[Figure: Idea2Story pipeline architecture (a core module within Idea2Paper)]

✨ Key Features

  • 🕸️ Knowledge Graph: Built from ICLR data with Idea/Pattern/Domain/Paper nodes.
  • 🎣 Advanced Retrieval: Three-path retrieval (Idea/Domain/Paper) with two-stage ranking (Jaccard + Embedding).
  • 📝 Idea2Story Generation: From pattern selection to story generation, anchored review, and smart correction.
  • 🤖 Anchored Multi-Agent Review: Uses real review statistics as anchors for relative comparisons, producing deterministic and auditable 1-10 scores.
  • 📊 Comprehensive Logging: Per-run structured logs for full reproducibility and auditing.
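
The two-stage ranking mentioned above (Jaccard + Embedding) can be pictured with a minimal sketch. It assumes a cheap token-level Jaccard filter followed by a cosine-similarity rerank over precomputed vectors. The function names, the `(name, text, vector)` candidate layout, and the toy vectors are all hypothetical and are not taken from the project's code.

```python
import math


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity; 0.0 if either vector has zero norm."""
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0


def two_stage_rank(query_text, query_vec, candidates, coarse_k=2, final_k=1):
    """candidates: list of (name, text, vector) tuples (hypothetical layout)."""
    q_tokens = set(query_text.lower().split())
    # Stage 1: cheap lexical filter keeps the coarse_k best Jaccard matches.
    coarse = sorted(
        candidates,
        key=lambda c: jaccard(q_tokens, set(c[1].lower().split())),
        reverse=True,
    )[:coarse_k]
    # Stage 2: rerank the survivors by embedding cosine similarity.
    fine = sorted(coarse, key=lambda c: cosine(query_vec, c[2]), reverse=True)
    return [name for name, _, _ in fine[:final_k]]


# Toy data: the 2-d vectors stand in for real embedding vectors.
candidates = [
    ("A", "graph neural networks for molecules", [1.0, 0.0]),
    ("B", "graph retrieval for research ideas", [0.6, 0.8]),
    ("C", "image classification with transformers", [0.0, 1.0]),
]
top = two_stage_rank("graph retrieval for ideas", [0.5, 0.9], candidates)
print(top)
```

The point of the split is cost: the lexical stage needs no model call, so only the shortlisted candidates have to be compared in embedding space.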

📦 Outputs

  • 📄 Paper-KG-Pipeline/output/final_story.json: Final structured Story (title/abstract/problem/method/contribs/experiments).
  • 🔍 Paper-KG-Pipeline/output/pipeline_result.json: Full pipeline trace (reviews, corrections, audits).
  • 📂 log/run_.../: Structured logs for every run.

🚀 Getting Started

Prerequisites

  • Python 3.10+

Installation

pip install -r Paper-KG-Pipeline/requirements.txt

Note: The embedding model is currently fixed to Qwen/Qwen3-Embedding-8B (SiliconFlow) and cannot be changed yet. We plan to expand this to support more embedding models/providers in future updates.

Configuration

  1. Copy .env.example to .env and fill in SILICONFLOW_API_KEY.
  2. (Optional) Copy i2p_config.example.json to i2p_config.json to tweak settings.
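
A quick preflight check can catch an empty API key before a run starts. The sketch below assumes `.env` holds simple `KEY=VALUE` lines; `parse_env` and `missing_keys` are hypothetical helpers written for illustration, not part of the project.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; skips blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required=("SILICONFLOW_API_KEY",)) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


# An unfilled template: the key is present but has no value yet.
sample = "# copied from .env.example\nSILICONFLOW_API_KEY=\n"
print(missing_keys(parse_env(sample)))
```

In practice you would pass the contents of your own `.env` file, for example `parse_env(open(".env").read())`.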

Usage

python Paper-KG-Pipeline/scripts/idea2story_pipeline.py "your research idea"
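
To launch the same command from another Python program, one option is a thin `subprocess` wrapper. This is a sketch only: it assumes the working directory is the repository root, and `build_command` and `run_pipeline` are hypothetical helper names.

```python
import subprocess
import sys

SCRIPT = "Paper-KG-Pipeline/scripts/idea2story_pipeline.py"


def build_command(idea: str, python: str = sys.executable) -> list[str]:
    """Build the argv list; passing a list avoids shell-quoting problems."""
    if not idea.strip():
        raise ValueError("idea must be a non-empty string")
    return [python, SCRIPT, idea]


def run_pipeline(idea: str) -> int:
    """Run the pipeline from the repository root and return its exit code."""
    return subprocess.run(build_command(idea)).returncode


# Only builds and prints the command; call run_pipeline(...) to execute it.
print(build_command("your research idea", python="python"))
```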

🌐 Frontend (Local Web UI)

Run a minimal local UI that launches the pipeline and shows only the high-level stage and the final results (no raw logs on screen).

Start

python frontend/server/app.py --host 127.0.0.1 --port 8080

Open in your browser:

http://127.0.0.1:8080/

What you can do in the UI

  • Run the same pipeline entrypoint (idea2story_pipeline.py) from a web page.
  • Configure SILICONFLOW_API_KEY, LLM_API_URL, LLM_MODEL for the current run (not persisted by the server).
  • Toggle Novelty / Verification.
  • Download the current run logs as a zip.

For more details, see frontend/README.md.

Output

output/
├── final_story.json # Final generated paper story
├── pipeline_result.json # Full pipeline results
└── log.json # Detailed logs

Check final_story.json for the result and pipeline_result.json for the full process.
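
For a quick look at the result without opening the file, a short script can list the top-level fields of final_story.json. This sketch makes no assumption about the exact key names and only previews whatever is present. The path follows the Outputs list above and may need adjusting for your checkout; `summarize_story` is a hypothetical helper.

```python
import json
from pathlib import Path


def summarize_story(path: Path, preview: int = 80) -> dict[str, str]:
    """Map each top-level key to a short text preview of its value."""
    story = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(story, dict):
        raise ValueError(f"expected a JSON object in {path}")
    summary = {}
    for key, value in story.items():
        # Non-string values (lists, nested objects) are shown as compact JSON.
        text = value if isinstance(value, str) else json.dumps(value, ensure_ascii=False)
        summary[key] = text[:preview]
    return summary


if __name__ == "__main__":
    story_path = Path("Paper-KG-Pipeline/output/final_story.json")
    if story_path.exists():
        for key, text in summarize_story(story_path).items():
            print(f"{key}: {text}")
    else:
        print(f"{story_path} not found; run the pipeline first")
```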

📘 Need More Help?

See the User Guide for advanced configuration, troubleshooting, and detailed usage examples.

Got questions? You can contact us through any of the options listed below.

Discord: https://discord.gg/FfXtbREb


Scan the QR code with WeChat to communicate

🤖 Anchored Multi-Agent Review

Instead of arbitrary scores, this project uses anchored comparisons. We select anchor papers with known scores, ask LLMs to compare your target against these anchors (better/tie/worse), and then deterministically fit a final numeric score. This ensures the review process is auditable and grounded in real-world data.
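
One way to picture the deterministic fit is a small grid search. The sketch below is an illustration under stated assumptions, not the project's actual procedure: it assumes a logistic (Bradley-Terry-style) model in which the probability that the target beats an anchor grows with the score gap, treats a tie as half a win, and picks the score on a fixed 1-10 grid that minimizes the resulting loss. `fit_score`, `tau`, and the grid step are hypothetical.

```python
import math

OUTCOME = {"better": 1.0, "tie": 0.5, "worse": 0.0}


def fit_score(comparisons, tau=1.0, lo=1.0, hi=10.0, step=0.1):
    """comparisons: list of (anchor_score, verdict) pairs, verdict in OUTCOME.

    Returns the grid score in [lo, hi] that minimizes a logistic loss, where
    P(target beats anchor) = sigmoid((score - anchor_score) / tau).
    """
    def loss(score):
        total = 0.0
        for anchor_score, verdict in comparisons:
            y = OUTCOME[verdict]
            p = 1.0 / (1.0 + math.exp(-(score - anchor_score) / tau))
            p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
            total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
        return total

    steps = round((hi - lo) / step)
    grid = [round(lo + i * step, 6) for i in range(steps + 1)]
    # No sampling is involved, and min() returns the first minimum on ties,
    # so the same verdicts always yield the same score.
    return min(grid, key=loss)


# Target judged better than a 4.0 anchor, tied with 6.0, worse than 8.0.
verdicts = [(4.0, "better"), (6.0, "tie"), (8.0, "worse")]
print(fit_score(verdicts))
```

With these symmetric verdicts the fitted score lands on 6.0. Because the inputs are recorded verdicts and the fit is a pure function of them, the final number can be recomputed and audited after the fact.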

📚 Files & Docs

  • Core Code: Paper-KG-Pipeline/src/idea2paper/
  • Documentation:
| No. | Document | Content | Target Audience |
|---|---|---|---|
| 0 | Project Overview | Overall architecture, core modules, parameter configuration, execution workflow | Everyone |
| 1 | Knowledge Graph Construction | Data sources, node/edge definitions, LLM enhancement, how to run | Developers |
| 2 | Retrieval System | Three-way retrieval strategies, similarity computation, performance optimization | Developers |
| 3 | Idea2Story Pipeline | Pattern selection, Idea fusion, story reflection, critic review | Developers |

🤝 Contributing & License

We welcome PRs and Issues! Please follow the contribution guidelines. Licensed under the MIT License.

πŸ™ Credits

  • Data Source: ICLR (see KG construction docs)
  • Inspiration: Auditable, anchor-centered review processes.
  • Community Support: agentAlpha Community

📑 Citation (Idea2Story)

If you find Idea2Story useful, please cite:

@misc{xu2026idea2storyautomatedpipelinetransforming,
  title={Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives},
  author={Tengyue Xu and Zhuoyang Qian and Gaoge Liu and Li Ling and Zhentao Zhang and Biao Wu and Shuo Zhang and Ke Lu and Wei Shi and Ziqi Wang and Zheng Feng and Yan Luo and Shu Xu and Yongjin Chen and Zhibo Feng and Zhuo Chen and Bruce Yuan and Harry Wang and Kris Chen},
  year={2026},
  eprint={2601.20833},
  archivePrefix={arXiv},
  primaryClass={cs.CE},
  url={https://arxiv.org/abs/2601.20833}
}

