You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+19-4Lines changed: 19 additions & 4 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -62,13 +62,14 @@ After data generation, you can use [LLaMA-Factory](https://github.com/hiyouga/LL
62
62
63
63
## 📌 Latest Updates
64
64
65
+
-**2025.12.1**: Added search support for [NCBI](https://www.ncbi.nlm.nih.gov/) and [RNAcentral](https://rnacentral.org/) databases, enabling extraction of DNA and RNA data from these bioinformatics databases.
65
66
-**2025.10.30**: We support several new LLM clients and inference backends including [Ollama_client](https://github.com/open-sciencelab/GraphGen/blob/main/graphgen/models/llm/api/ollama_client.py), [http_client](https://github.com/open-sciencelab/GraphGen/blob/main/graphgen/models/llm/api/http_client.py), [HuggingFace Transformers](https://github.com/open-sciencelab/GraphGen/blob/main/graphgen/models/llm/local/hf_wrapper.py) and [SGLang](https://github.com/open-sciencelab/GraphGen/blob/main/graphgen/models/llm/local/sglang_wrapper.py).
66
67
-**2025.10.23**: We support VQA(Visual Question Answering) data generation now. Run script: `bash scripts/generate/generate_vqa.sh`.
67
-
-**2025.10.21**: We support PDF as input format for data generation now via [MinerU](https://github.com/opendatalab/MinerU).
68
68
69
69
<details>
70
70
<summary>History</summary>
71
71
72
+
-**2025.10.21**: We support PDF as input format for data generation now via [MinerU](https://github.com/opendatalab/MinerU).
72
73
-**2025.09.29**: We auto-update gradio demo on [Hugging Face](https://huggingface.co/spaces/chenzihong/GraphGen) and [ModelScope](https://modelscope.cn/studios/chenzihong/GraphGen).
73
74
-**2025.08.14**: We have added support for community detection in knowledge graphs using the Leiden algorithm, enabling the synthesis of Chain-of-Thought (CoT) data.
74
75
-**2025.07.31**: We have added Google, Bing, Wikipedia, and UniProt as search back-ends.
@@ -82,9 +83,10 @@ After data generation, you can use [LLaMA-Factory](https://github.com/hiyouga/LL
82
83
We support various LLM inference servers, API servers, inference clients, input file formats, data modalities, output data formats, and output data types.
83
84
Users can flexibly configure according to the needs of synthetic data.
84
85
85
-
| Inference Server | Api Server | Inference Client | Input File Format | Data Modal | Data Format | Data Type |
0 commit comments