diff --git a/README.md b/README.md index 43774cd..6bd84a5 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@
-DataFlow-Agent Logo
+DataFlow-Agent Logo
# DataFlow-Agent @@ -33,19 +33,21 @@ ## 📑 Table of Contents -- [🔥 News](#-news) -- [📐 Architecture](#-项目架构) -- [✨ Core Applications](#-核心应用) - - [Paper2Any - Paper Multimodal Workflow](#1️⃣-paper2any---论文多模态工作流) - - [Easy-DataFlow - Data Governance Pipeline](#2️⃣-easy-dataflow---数据治理管线) - - [DataFlow-Table - Multi-source Data Analysis](#3️⃣-dataflow-table---多源数据分析) -- [🚀 Quick Start](#-快速开始) -- [📂 Project Structure](#-项目结构) -- [🗺️ Roadmap](#️-roadmap) -- [🤝 Contributing](#-贡献) + +- [🔥 News](#news) +- [📐 Architecture](#architecture) +- [✨ Core Applications](#core-apps) + - [Paper2Any - Paper Multimodal Workflow](#paper2any) + - [Easy-DataFlow - Data Governance Pipeline](#easy-dataflow) +- [🚀 Quick Start](#quick-start) +- [📂 Project Structure](#project-structure) +- [🗺️ Roadmap](#roadmap) +- [🤝 Contributing](#contributing) --- + + ## 🔥 News @@ -67,23 +69,26 @@
- Paper2Figure Web UI - Paper2Figure Web UI (2) + Paper2Figure Web UI + Paper2Figure Web UI (2)
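--- + ## 📐 Architecture
-Project architecture diagram -
Three core applications extended from DataFlow-Agent: Paper2Any (Paper Multimodal Workflow), Easy-DataFlow (Data Governance Pipeline), DataFlow-Table (Multi-source Data Analysis) +Project architecture diagram +
Core applications of DataFlow-Agent: Paper2Any (Paper Multimodal Workflow), Easy-DataFlow (Data Governance Pipeline)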
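--- + ## ✨ Core Applications + ### 1️⃣ Paper2Any - Paper Multimodal Workflow > Starting from a paper PDF / image / text, generate **editable** multimodal content such as scientific figures, slide decks, video scripts, and academic posters in one click. @@ -135,6 +140,9 @@ Paper2Any currently includes the following sub-modules: --- +
+Show Paper2Figure Showcase + #### 📸 ShowCase - Paper2Figure ##### Model Architecture Diagram Generation @@ -147,15 +155,15 @@ Paper2Any currently includes the following sub-modules: -Input: paper PDF +Input: paper PDF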
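📄 Paper PDF -Generated model diagram +Generated model diagram
🎨 Generated model architecture diagram -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -166,15 +174,15 @@ Paper2Any currently includes the following sub-modules: -Input: paper PDF +Input: paper PDF
📄 Paper PDF -Generated model diagram +Generated model diagram
🎨 Generated model architecture diagram -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -185,7 +193,7 @@ Paper2Any currently includes the following sub-modules: -Input: paper PDF +Input: paper PDF
📄 Input: key paragraphs @@ -193,7 +201,7 @@ Paper2Any currently includes the following sub-modules:
🎨 Generated model architecture diagram -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -222,15 +230,15 @@ Paper2Any currently includes the following sub-modules: -Input: paper text (Chinese) +Input: paper text (Chinese)
📝 Method section of the paper (Chinese) -Technical roadmap SVG +Technical roadmap SVG
🗺️ Technical roadmap SVG -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -241,15 +249,15 @@ Paper2Any currently includes the following sub-modules: -Input: paper text (English) +Input: paper text (English)
📝 Method section of the paper (English) -Technical roadmap SVG +Technical roadmap SVG
🗺️ Technical roadmap SVG -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -278,15 +286,15 @@ Paper2Any currently includes the following sub-modules: - Input: experimental results screenshot + Input: experimental results screenshot
📄 Input: paper PDF / experimental results screenshot - Output: experimental data plot (basic style) + Output: experimental data plot (basic style)
📈 Output: standard Python-style experimental data plot - Output: experimental data plot (polished style) + Output: experimental data plot (polished style)
🎨 Output: polished experimental data plot @@ -300,6 +308,8 @@ Paper2Any currently includes the following sub-modules: --- +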
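+ #### 🖥️ How to Use **Option 1: Web Frontend (Recommended)** @@ -307,7 +317,7 @@ Paper2Any currently includes the following sub-modules: (The online version is currently available to invited users only.) Online demo: [https://dcai-paper2any.cpolar.top/](https://dcai-paper2any.cpolar.top/)
-Frontend UI +Frontend UI
**Highlights**: @@ -334,6 +344,7 @@ python gradio_app/app.py --- + ### 2️⃣ Easy-DataFlow - Data Governance Pipeline > From task description to executable data-processing pipelines: an AI-driven, end-to-end data governance workflow @@ -350,12 +361,15 @@ python gradio_app/app.py --- +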
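+Show Easy-DataFlow feature screenshots + #### 📸 Feature Demos **Pipeline Recommendation: From task to code**
-Pipeline recommendation +Pipeline recommendation
💻 Analyzes task requirements, automatically recommends the optimal operator combination, and generates executable Python pipeline code
@@ -364,7 +378,7 @@ python gradio_app/app.py **Operator Authoring: AI-assisted development**
-Operator authoring +Operator authoring
⚙️ Uses an LLM to generate operator code from a functional description, with testing and debugging in the same interface
@@ -373,7 +387,7 @@ python gradio_app/app.py **Visual Orchestration: Drag-and-drop building**
-Visual orchestration +Visual orchestration
🎨 Drag and drop operators in a visual interface to assemble data-processing flows freely, WYSIWYG
@@ -382,7 +396,7 @@ python gradio_app/app.py **Prompt Optimization: Automatic tuning**
-Prompt optimization +Prompt optimization
✨ Reuses existing operators, automatically writes DataFlow operator prompt templates, and optimizes the prompts
@@ -391,29 +405,15 @@ python gradio_app/app.py **Web Collection: From web pages to data**
-Web collection +Web collection
📊 Automates web data collection and structured conversion, directly outputting DataFlow-ready data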
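--- -### 3️⃣ DataFlow-Table - Multi-source Data Analysis - -> One-stop access to multi-source data, with automated analysis and insight generation - -#### 🚧 Work in Progress - -DataFlow-Table is under active development. Stay tuned! - -**Planned features**: -- 📥 Multi-source ingestion (database / file / Web / API) -- 🧹 Intelligent cleaning and normalization -- 📊 AI-based automated analysis -- 📝 Natural-language analysis report generation -- 📈 Interactive charts and reports - ---- +
+ ## 🚀 Quick Start ### Requirements @@ -445,6 +445,9 @@ pip install -e . Paper2Any requires extra dependencies (see `requirements-paper.txt`) and a few system/conda tools for rendering and vector graphics processing: +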
+Show: installing Paper2Any extra dependencies + ```bash # Install Paper2Any dependencies pip install -r requirements-paper.txt @@ -457,6 +460,8 @@ sudo apt-get update sudo apt-get install -y inkscape ``` +
+ ### Environment Configuration ```bash @@ -464,11 +469,13 @@ export DF_API_KEY=your_api_key_here export DF_API_URL=xxx # If using a third-party API gateway ``` -Third-party API gateways: +
+Optional: third-party API gateways -[https://api.apiyi.com/](https://api.apiyi.com/) +- https://api.apiyi.com/ +- http://123.129.219.111:3000/ -[http://123.129.219.111:3000/](http://123.129.219.111:3000/) +
--- @@ -481,18 +488,31 @@ export DF_API_URL=xxx **Web Frontend (Recommended)** +> The frontend requires Node.js 18+. + ```bash -# Start backend API +# Start backend API (Terminal 1) cd fastapi_app -uvicorn main:app --host 0.0.0.0 --port 8000 +uvicorn main:app --host 0.0.0.0 --port 9999 -# Start frontend (new terminal) +# Start frontend (Terminal 2) cd frontend-workflow npm install npm run dev +``` + +Visit `http://localhost:3000` + +> [!NOTE] +> `frontend-workflow/vite.config.ts` already proxies `/api` to `http://127.0.0.1:9999` by default. + +
+To change the frontend proxy port (vite.config.ts) + +```ts +import { defineConfig } from 'vite' +import react from '@vitejs/plugin-react' -# Configure dev/DataFlow-Agent/frontend-workflow/vite.config.ts -# Change server.proxy to: export default defineConfig({ plugins: [react()], server: { @@ -501,7 +521,7 @@ export default defineConfig({ allowedHosts: true, proxy: { '/api': { - target: 'http://127.0.0.1:8000', // FastAPI backend address + target: 'http://127.0.0.1:9999', // FastAPI backend address changeOrigin: true, }, }, @@ -509,7 +529,7 @@ export default defineConfig({ }) ``` -Visit `http://localhost:3000` +
> [!TIP] > **Paper2Figure Web Beta Access** @@ -547,15 +567,7 @@ python gradio_app/app.py - 📝 Supports batch processing --- -> [!NOTE] -> **DataFlow-Table**: For multi-source data ingestion and exploratory analysis; currently under development. - -#### 🔍 DataFlow-Table - Data Analysis - -🚧 **Under development, stay tuned!** - ---- - + ## 📂 Project Structure ``` @@ -584,8 +596,15 @@ DataFlow-Agent/ --- + ## 🗺️ Roadmap +> [!NOTE] +> The roadmap tables are long and collapsed by default; click to expand the full content. + +
+Show full Roadmap (tables) + ### 🎓 Paper Series @@ -671,16 +690,6 @@ DataFlow-Agent/ Done - - - - -
📊 DataFlow-Table
Multi-source Data Analysis
In development -In development
-In development
-In development
-In development -
--- @@ -714,12 +723,15 @@ DataFlow-Agent/
-Workflow Editor +Workflow Editor
🎨 Workflow Visual Editor Preview
--- +
+ + ## 🤝 Contributing We welcome all forms of contributions! diff --git a/README_EN.md b/README_EN.md index 51d076a..3007ca8 100644 --- a/README_EN.md +++ b/README_EN.md @@ -1,6 +1,6 @@
-DataFlow-Agent Logo
+DataFlow-Agent Logo
# DataFlow-Agent @@ -33,19 +33,35 @@ English | [中文](README.md) ## 📑 Table of Contents -- [🔥 News](#-news) -- [📐 Architecture](#-architecture) -- [✨ Core Applications](#-core-applications) - - [Paper2Any - Paper Multimodal Workflow](#1%EF%B8%8F%E2%83%A3-paper2any---paper-multimodal-workflow) - - [Easy-DataFlow - Data Governance Pipeline](#2%EF%B8%8F%E2%83%A3-easy-dataflow---data-governance-pipeline) - - [DataFlow-Table - Multi-source Data Analysis](#3%EF%B8%8F%E2%83%A3-dataflow-table---multi-source-data-analysis) -- [🚀 Quick Start](#-quick-start) -- [📂 Project Structure](#-project-structure) -- [🗺️ Roadmap](#%EF%B8%8F-roadmap) -- [🤝 Contributing](#-contributing) +- [⚡ TL;DR](#tldr) +- [🔥 News](#news) +- [📐 Architecture](#architecture) +- [✨ Core Applications](#core-apps) + - [Paper2Any - Paper Multimodal Workflow](#paper2any) + - [Easy-DataFlow - Data Governance Pipeline](#easy-dataflow) +- [🚀 Quick Start](#quick-start) +- [📂 Project Structure](#project-structure) +- [🗺️ Roadmap](#roadmap) +- [🤝 Contributing](#contributing) --- + +## ⚡ TL;DR + +> [!TIP] +> Fastest way to run Paper2Any Web locally: +> +> 1. Install: `pip install -r requirements.txt && pip install -e .` +> 2. Configure: `export DF_API_KEY=...` and `export DF_API_URL=...` +> 3. Backend: `uvicorn fastapi_app.main:app --host 0.0.0.0 --port 9999` +> 4. Frontend (Node.js 18+): `cd frontend-workflow && npm install && npm run dev` +> +> See [🚀 Quick Start](#quick-start) for full steps. + +--- + + ## 🔥 News @@ -67,23 +83,26 @@ One-click generation of multiple editable scientific figures, i
- Paper2Figure Web UI - Paper2Figure Web UI (2) + Paper2Figure Web UI + Paper2Figure Web UI (2)
--- + ## 📐 Architecture
-Project Architecture -
Three core applications extended from DataFlow-Agent: Paper2Any (Paper Multimodal Workflow), Easy-DataFlow (Data Governance Pipeline), DataFlow-Table (Multi-source Data Analysis) +Project Architecture +
Core applications: Paper2Any (Paper Multimodal Workflow), Easy-DataFlow (Data Governance Pipeline)
--- + ## ✨ Core Applications + ### 1️⃣ Paper2Any - Paper Multimodal Workflow > Starting from a paper PDF / image / text, generate **editable** multimodal outputs such as scientific figures, slide decks, video scripts, academic posters, and more. @@ -135,6 +154,9 @@ Paper2Any currently includes the following sub-modules: --- +
+Show Paper2Figure Showcase (image-heavy) + #### 📸 Showcase - Paper2Figure ##### Model Architecture Diagram Generation @@ -147,15 +169,15 @@ Paper2Any currently includes the following sub-modules: -Input: paper PDF +Input: paper PDF
📄 Paper PDF -Generated model diagram +Generated model diagram
🎨 Generated model architecture -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -166,15 +188,15 @@ Paper2Any currently includes the following sub-modules: -Input: paper PDF +Input: paper PDF
📄 Paper PDF -Generated model diagram +Generated model diagram
🎨 Generated model architecture -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -185,7 +207,7 @@ Paper2Any currently includes the following sub-modules: -Input: key paragraphs +Input: key paragraphs
📄 Input key paragraphs @@ -193,7 +215,7 @@ Paper2Any currently includes the following sub-modules:
🎨 Generated model architecture -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -222,15 +244,15 @@ Upload a paper PDF and choose the diagram difficulty (Easy/Medium/Hard). The sys -Input: paper text (Chinese) +Input: paper text (Chinese)
📝 Method section (Chinese) -Roadmap diagram SVG +Roadmap diagram SVG
🗺️ Roadmap diagram SVG -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -241,15 +263,15 @@ Upload a paper PDF and choose the diagram difficulty (Easy/Medium/Hard). The sys -Input: paper text (English) +Input: paper text (English)
📝 Method section (English) -Roadmap diagram SVG +Roadmap diagram SVG
🗺️ Roadmap diagram SVG -PPTX screenshot +PPTX screenshot
📊 Editable PPTX @@ -278,15 +300,15 @@ Paste the method section and select the language (Chinese/English). The system o - Input: experimental results screenshot + Input: experimental results screenshot
📄 Input: paper PDF / results screenshot - Output: standard plot + Output: standard plot
📈 Output: standard Python-style plot - Output: polished plot + Output: polished plot
🎨 Output: publication-ready styled plot @@ -300,6 +322,8 @@ Upload an experimental results screenshot/table. The system extracts key numbers --- +
+ #### 🖥️ How to Use **Option 1: Web Frontend (Recommended)** @@ -308,7 +332,7 @@ The online version is currently available for invited users only: [https://dcai-paper2any.cpolar.top/](https://dcai-paper2any.cpolar.top/)
-Web UI +Web UI
**Highlights**: @@ -335,6 +359,7 @@ Open `http://127.0.0.1:7860` --- + ### 2️⃣ Easy-DataFlow - Data Governance Pipeline > From task description to executable pipelines: an AI-powered end-to-end data governance workflow. @@ -351,12 +376,15 @@ Open `http://127.0.0.1:7860` --- +
+Show Easy-DataFlow screenshots + #### 📸 Feature Demos **Pipeline Recommendation: From task to code**
-Pipeline recommendation +Pipeline recommendation
💻 Analyze requirements and generate an optimal operator chain with runnable Python pipeline code
@@ -365,7 +393,7 @@ Open `http://127.0.0.1:7860` **Operator Authoring: AI-assisted development**
-Operator authoring +Operator authoring
⚙️ Generate operator code from functional descriptions and test/debug in the same UI
@@ -374,7 +402,7 @@ Open `http://127.0.0.1:7860` **Visual Orchestration: Drag-and-drop**
-Visual orchestration +Visual orchestration
🎨 Build pipelines visually by composing operators with a WYSIWYG interface
@@ -383,7 +411,7 @@ Open `http://127.0.0.1:7860` **Prompt Optimization: Automatic tuning**
-Prompt optimization +Prompt optimization
✨ Reuse existing operators to auto-generate DataFlow prompt templates and optimize prompts
@@ -392,29 +420,15 @@ Open `http://127.0.0.1:7860` **Web Collection: Web to data**
-Web collection +Web collection
📊 Automate web collection & structuring into DataFlow-ready datasets
--- -### 3️⃣ DataFlow-Table - Multi-source Data Analysis - -> Connect to multiple data sources and generate automated analysis and insights. - -#### 🚧 Work in Progress - -DataFlow-Table is under active development. Stay tuned! - -**Working features**: -- 📥 Multi-source ingestion (DB / files / web / API) -- 🧹 Intelligent cleaning & normalization -- 📊 AI-driven automated analysis -- 📝 Natural-language reports -- 📈 Interactive charts & dashboards - ---- +
+ ## 🚀 Quick Start ### Requirements @@ -446,6 +460,9 @@ pip install -e . Paper2Any requires extra Python dependencies (see `requirements-paper.txt`) and a few system/conda tools for rendering and vector graphics processing: +
+Show: Paper2Any extra dependencies + ```bash # Install Paper2Any dependencies pip install -r requirements-paper.txt @@ -458,6 +475,8 @@ sudo apt-get update sudo apt-get install -y inkscape ``` +
+ ### Environment Configuration ```bash @@ -466,11 +485,13 @@ export DF_API_URL=xxx # If using third-party API gateway ``` -Third-party API gateways: +
+Optional: Third-party API gateways -[https://api.apiyi.com/](https://api.apiyi.com/) +- https://api.apiyi.com/ +- http://123.129.219.111:3000/ -[http://123.129.219.111:3000/](http://123.129.219.111:3000/) +
--- @@ -483,18 +504,31 @@ Third-party API gateways: **Web Frontend (Recommended)** +> The frontend requires Node.js 18+. + ```bash -# Start backend API +# Start backend API (Terminal 1) cd fastapi_app -uvicorn main:app --host 0.0.0.0 --port 8000 +uvicorn main:app --host 0.0.0.0 --port 9999 -# Start frontend (new terminal) +# Start frontend (Terminal 2) cd frontend-workflow npm install npm run dev +``` + +Visit `http://localhost:3000` + +> [!NOTE] +> `frontend-workflow/vite.config.ts` already proxies `/api` to `http://127.0.0.1:9999` by default. + +
+Change frontend proxy port (vite.config.ts) + +```ts +import { defineConfig } from 'vite' +import react from '@vitejs/plugin-react' -# Configure dev/DataFlow-Agent/frontend-workflow/vite.config.ts -# Modify server.proxy to: export default defineConfig({ plugins: [react()], server: { @@ -503,7 +537,7 @@ export default defineConfig({ allowedHosts: true, proxy: { '/api': { - target: 'http://127.0.0.1:8000', // FastAPI backend address + target: 'http://127.0.0.1:9999', // FastAPI backend address changeOrigin: true, }, }, @@ -511,7 +545,7 @@ export default defineConfig({ }) ``` -Visit `http://localhost:3000` +
> [!TIP] > **Paper2Figure Web Beta Access** @@ -550,15 +584,7 @@ Visit `http://127.0.0.1:7860` --- -> [!NOTE] -> **DataFlow-Table**: For multi-source data ingestion and exploratory analysis, currently under development. - -#### 🔍 DataFlow-Table - Data Analysis - -🚧 **Under development, stay tuned!** - ---- - + ## 📂 Project Structure ``` @@ -587,8 +613,15 @@ DataFlow-Agent/ --- + ## 🗺️ Roadmap +> [!NOTE] +> The roadmap tables are long; they are collapsed by default. + +
+Show full Roadmap (tables) + ### 🎓 Paper Series @@ -674,16 +707,6 @@ DataFlow-Agent/ Done - - - - -
📊 DataFlow-Table
Multi-source Data Analysis
Working -Working
-Working
-Working
-Working -
--- @@ -717,12 +740,15 @@ DataFlow-Agent/
-Workflow Editor +Workflow Editor
🎨 Workflow Visual Editor Preview
--- +
+ + ## 🤝 Contributing We welcome all forms of contributions!