Merged
51 commits
9d6cca1
docs: add 1.9.0 post
e06084 Aug 1, 2025
9c16fb2
Merge pull request #140 from e06084/market
shijinpjlab Aug 1, 2025
3061ea8
feat: add GPT-5 Factuality eval
e06084 Aug 11, 2025
033cdbf
x
e06084 Aug 11, 2025
a886c34
Merge pull request #143 from e06084/dev
shijinpjlab Aug 11, 2025
fd0105c
feat: add image artimuse
shijinpjlab Aug 21, 2025
012f41f
feat: fix lint
shijinpjlab Aug 21, 2025
80609f4
feat: example
shijinpjlab Aug 21, 2025
292cad8
feat: example add main
shijinpjlab Aug 21, 2025
012adaf
feat: add artimuse introduction docs
shijinpjlab Aug 21, 2025
474f1dd
feat: add artimuse _metric_info
shijinpjlab Aug 22, 2025
88d2e98
feat: fix lint
shijinpjlab Aug 22, 2025
055d658
Merge pull request #145 from shijinpjlab/dev_0821
e06084 Aug 22, 2025
efae22e
fix: rule_config key_list load in subprocessor
e06084 Aug 23, 2025
04a20e5
Merge pull request #147 from e06084/dev
e06084 Aug 23, 2025
b43d37a
docs: add github star click gif
e06084 Aug 25, 2025
185047b
Merge pull request #148 from e06084/dev
e06084 Aug 25, 2025
69a1c62
feat: add meta-rater model-based evaluations
seancoding-day Aug 25, 2025
1c410da
📚 Auto-update metrics documentation
actions-user Aug 25, 2025
9ac1cb1
Merge pull request #149 from seancoding-day/dev
e06084 Aug 25, 2025
0964447
feat: update clickstar gif
shijinpjlab Aug 25, 2025
b721d7a
feat: fix lint
shijinpjlab Aug 25, 2025
915a704
feat: dynamic_config threshold
shijinpjlab Aug 25, 2025
a398981
Merge pull request #150 from shijinpjlab/dev_0825
e06084 Aug 25, 2025
b9a5319
add math formula comparison
1041206149 Aug 25, 2025
69dadcf
Merge pull request #152 from MigoXLab/main
e06084 Aug 25, 2025
444fe8d
add: _metric_info
1041206149 Aug 26, 2025
15e60fb
x
1041206149 Aug 26, 2025
b34230e
Merge pull request #151 from 1041206149/LLM_math
shijinpjlab Aug 26, 2025
4e65bab
feat: update gradio demo
shijinpjlab Aug 29, 2025
225fe67
Merge pull request #154 from shijinpjlab/dev_0829
e06084 Aug 29, 2025
4f00b1a
docs: update wechat pic
e06084 Aug 29, 2025
a0c720c
Merge pull request #155 from e06084/dev
e06084 Aug 29, 2025
1e2c883
docs: update CONTRIBUTING docs
e06084 Sep 1, 2025
9b4bd08
Merge pull request #159 from e06084/dev
e06084 Sep 1, 2025
95454fe
feat: add artimuse sampled test set
shijinpjlab Sep 3, 2025
7065ae6
feat: add nano
shijinpjlab Sep 4, 2025
1962f63
feat: rename
shijinpjlab Sep 4, 2025
dfed837
feat: gradio title
shijinpjlab Sep 4, 2025
38bb285
Merge pull request #160 from shijinpjlab/dev_0903
shijinpjlab Sep 4, 2025
9111472
docs(artimuse): align with code; add sample reproduction and description
Kylie-dot-s Sep 4, 2025
e2cfde8
docs(artimuse): address review feedback
Kylie-dot-s Sep 4, 2025
65491d4
docs(artimuse): refine overview; correct nano_banana source; dedup ru…
Kylie-dot-s Sep 4, 2025
c03d510
Merge pull request #163 from Kylie-dot-s/docs/artimuse-pr
e06084 Sep 5, 2025
b71774c
docs: update wechat pic
e06084 Sep 8, 2025
1b101c6
Merge pull request #170 from e06084/dev
e06084 Sep 8, 2025
e86fd86
docs: update wechat pic
e06084 Aug 30, 2025
680f7a0
Update vsl.py
Kylie-dot-s Sep 1, 2025
85ee6f5
docs: wechat pic
e06084 Sep 6, 2025
061ffcc
Fix Issue #165: Status of PromptTextHelpful always be bad
sakunkun Sep 8, 2025
5129cf3
📚 Auto-update metrics documentation
actions-user Sep 8, 2025
16 changes: 16 additions & 0 deletions README.md
@@ -36,6 +36,15 @@
</p>


<p align="center">
If you like Dingo, please give us a ⭐ on GitHub!
<br/>
<a href="https://github.com/DataEval/dingo/stargazers" target="_blank">
<img src="docs/assets/clickstar_2.gif" alt="Click Star" width="480">
</a>
</p>


# Introduction

Dingo is a data quality evaluation tool that helps you automatically detect data quality issues in your datasets. Dingo provides a variety of built-in rules and model evaluation methods, and also supports custom evaluation methods. Dingo supports commonly used text datasets and multimodal datasets, including pre-training datasets, fine-tuning datasets, and evaluation datasets. In addition, Dingo supports multiple usage methods, including local CLI and SDK, making it easy to integrate into various evaluation platforms, such as [OpenCompass](https://github.com/open-compass/opencompass).
@@ -183,6 +192,7 @@ Our evaluation system includes:
- **Classification Metrics**: Topic categorization and content classification
- **Multimodality Assessment Metrics**: Image classification and relevance evaluation
- **Rule-Based Quality Metrics**: Automated quality checks using heuristic rules for effectiveness and similarity detection
- **Factuality Assessment Metrics**: Two-stage factuality evaluation based on the GPT-5 System Card
- etc

Most metrics are backed by academic sources to ensure objectivity and scientific rigor.
@@ -217,6 +227,12 @@ For detailed guidance on using Dingo's hallucination detection capabilities, inc

📖 **[View Hallucination Detection Guide →](docs/hallucination_guide.md)**

### Factuality Assessment

For comprehensive guidance on using Dingo's two-stage factuality evaluation system:

📖 **[View Factuality Assessment Guide →](docs/factcheck_guide.md)**

# Rule Groups

Dingo provides pre-configured rule groups for different types of datasets:
15 changes: 15 additions & 0 deletions README_ja.md
@@ -33,6 +33,14 @@
👋 <a href="https://discord.gg/Jhgb2eKWh8" target="_blank">Discord</a>と<a href="./docs/assets/wechat.jpg" target="_blank">WeChat</a>でご参加ください
</p>

<p align="center">
このプロジェクトが役に立ったら、GitHubで⭐を付けてください!
<br/>
<a href="https://github.com/DataEval/dingo/stargazers" target="_blank">
<img src="docs/assets/clickstar_2.gif" alt="Star をクリック" width="480">
</a>
</p>


# はじめに

@@ -178,6 +186,7 @@ Dingoはルールベースおよびプロンプトベースの評価メトリク
- **分類メトリクス**: トピック分類とコンテンツ分類
- **マルチモーダル評価メトリクス**: 画像分類と関連性評価
- **ルールベース品質メトリクス**: ヒューリスティックルールによる効果性と類似性検出を用いた自動品質チェック
- **事実性評価メトリクス**: GPT-5 System Cardに基づく二段階事実性評価
- など

大部分のメトリクスは学術的なソースによって支持されており、客観性と科学的厳密性を保証しています。
@@ -212,6 +221,12 @@ HHEM-2.1-Openローカル推論とLLMベース評価を含む、Dingoの幻覚

📖 **[幻覚検出ガイドを見る →](docs/hallucination_guide.md)**

### 事実性評価

Dingoの二段階事実性評価システムの使用に関する詳細なガイダンス:

📖 **[事実性評価ガイドを見る →](docs/factcheck_guide.md)**

# ルールグループ

Dingoは異なるタイプのデータセット用に事前設定されたルールグループを提供します:
15 changes: 15 additions & 0 deletions README_zh-CN.md
@@ -29,6 +29,14 @@
👋 加入我们 <a href="https://discord.gg/Jhgb2eKWh8" target="_blank">Discord</a> 和 <a href="./docs/assets/wechat.jpg" target="_blank">微信</a>
</p>

<p align="center">
如果觉得有帮助,欢迎在 GitHub 上点个 ⭐ 支持!
<br/>
<a href="https://github.com/DataEval/dingo/stargazers" target="_blank">
<img src="docs/assets/clickstar_2.gif" alt="点击 Star 支持" width="480">
</a>
</p>

</div>


@@ -179,6 +187,7 @@ Dingo通过基于规则和基于提示的评估指标提供全面的数据质量
- **分类指标**:主题分类和内容分类
- **多模态评估指标**:图像分类和相关性评估
- **基于规则的质量指标**:使用启发式规则进行效果性和相似性检测的自动化质量检查
- **事实性评估指标**:基于 GPT-5 System Card 的两阶段事实性评估
- 等等

大部分指标都由学术来源支持,以确保客观性和科学严谨性。
@@ -213,6 +222,12 @@ input_data = {

📖 **[查看幻觉检测指南 →](docs/hallucination_guide.md)**

### 事实性评估

有关使用Dingo两阶段事实性评估系统的详细指导:

📖 **[查看事实性评估指南 →](docs/factcheck_guide.md)**

# 规则组

Dingo为不同类型的数据集提供预配置的规则组:
32 changes: 19 additions & 13 deletions app_gradio/app.py
@@ -48,20 +48,26 @@ def dingo_demo(

     try:
         input_data = {
-            "dataset": dataset_source,
-            "data_format": data_format,
             "input_path": final_input_path,
             "output_path": "" if dataset_source == 'hugging_face' else os.path.dirname(final_input_path),
-            "save_data": True,
-            "save_raw": True,
-
-            "max_workers": max_workers,
-            "batch_size": batch_size,
-
-            "column_content": column_content,
-            "custom_config": {
+            "dataset": {
+                "source": dataset_source,
+                "format": data_format,
+                "field": {
+                    "content": column_content
+                }
+            },
+            "executor": {
                 "rule_list": rule_list,
                 "prompt_list": prompt_list,
+                "result_save": {
+                    "bad": True,
+                    "raw": True
+                },
+                "max_workers": max_workers,
+                "batch_size": batch_size,
+            },
+            "evaluator": {
                 "llm_config": {
                     scene_list: {
                         "model": model,
@@ -72,11 +78,11 @@ def dingo_demo(
             }
         }
         if column_id:
-            input_data['column_id'] = column_id
+            input_data['dataset']['field']['id'] = column_id
         if column_prompt:
-            input_data['column_prompt'] = column_prompt
+            input_data['dataset']['field']['prompt'] = column_prompt
         if column_image:
-            input_data['column_image'] = column_image
+            input_data['dataset']['field']['image'] = column_image

         # print(input_data)
         # exit(0)
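
The hunk above moves the demo's flat `input_data` keys into nested `dataset`, `executor`, and `evaluator` sections. The following is a minimal sketch of that mapping, inferred only from the lines shown in the diff. The helper name `to_nested_config` is hypothetical and not part of Dingo, the `save_data` to `result_save.bad` and `save_raw` to `result_save.raw` correspondence is an assumption based on the removed and added lines, and the fallback defaults are illustrative.

```python
from typing import Any, Dict


def to_nested_config(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Map a flat, old-style demo config onto the nested layout (hypothetical helper)."""
    custom = flat.get("custom_config", {})

    # "content" is always set; the other columns are copied only when provided,
    # mirroring the `if column_id:` style checks in the diff.
    field = {"content": flat["column_content"]}
    for old_key, new_key in (
        ("column_id", "id"),
        ("column_prompt", "prompt"),
        ("column_image", "image"),
    ):
        if flat.get(old_key):
            field[new_key] = flat[old_key]

    return {
        "input_path": flat["input_path"],
        "output_path": flat["output_path"],
        "dataset": {
            "source": flat["dataset"],
            "format": flat["data_format"],
            "field": field,
        },
        "executor": {
            "rule_list": custom.get("rule_list", []),
            "prompt_list": custom.get("prompt_list", []),
            # Assumed mapping: save_data -> bad, save_raw -> raw.
            "result_save": {
                "bad": flat.get("save_data", False),
                "raw": flat.get("save_raw", False),
            },
            "max_workers": flat.get("max_workers", 1),
            "batch_size": flat.get("batch_size", 1),
        },
        "evaluator": {"llm_config": custom.get("llm_config", {})},
    }
```

Calling `to_nested_config` on a dict that uses the removed keys (`dataset`, `data_format`, `column_content`, `custom_config`, and so on) returns a dict shaped like the added lines, which can be handy when comparing old and new configurations side by side.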
2 changes: 1 addition & 1 deletion app_gradio/header.html
@@ -67,7 +67,7 @@
color: #fafafa;
opacity: 0.8;
">
- Dingo: A Comprehensive Data Quality Evaluation Tool.<br>
+ Dingo: A Comprehensive AI Data Quality Evaluation Tool.<br>
</p>
<style>
.link-block {
5 changes: 5 additions & 0 deletions dingo/exec/local.py
@@ -173,6 +173,11 @@ def evaluate(self):
         log.debug("[Summary]: " + str(self.summary))

     def evaluate_single_data(self, group_type, group, data: Data):
+        # Ensure dynamic configs are applied in child processes as well
+        try:
+            Model.apply_config(self.input_args)
+        except Exception as e:
+            raise RuntimeError(f"Failed to apply config in child process: {e}")
         result_info = ResultInfo(
             data_id=data.data_id, prompt=data.prompt, content=data.content
         )
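
The added lines re-apply the configuration at the start of each per-item evaluation, presumably because class-level state set in the parent process is not guaranteed to exist in worker processes. The sketch below illustrates that pattern in isolation. The `Model` class here is a hypothetical stand-in, not Dingo's real `Model`, and the `threshold` setting and length-based check are invented for illustration. It also chains the original exception with `from e`, which is a variation on the diff rather than what the PR does.

```python
from typing import Any, Dict


class Model:
    """Hypothetical stand-in for a registry holding process-wide settings."""

    _applied: Dict[str, Any] = {}

    @classmethod
    def apply_config(cls, input_args: Dict[str, Any]) -> None:
        if not isinstance(input_args, dict):
            raise TypeError("input_args must be a dict")
        # Applying the same settings twice is harmless, so a worker can
        # safely call this before every item it evaluates.
        cls._applied = dict(input_args)


def evaluate_single_data(input_args: Dict[str, Any], content: str) -> bool:
    """Re-apply config first, then run an illustrative check on the content."""
    try:
        Model.apply_config(input_args)
    except Exception as e:
        # `from e` keeps the original error reachable as __cause__.
        raise RuntimeError(f"Failed to apply config in child process: {e}") from e
    threshold = Model._applied.get("threshold", 0.5)
    return len(content) / 100 >= threshold
```

Because `apply_config` is idempotent in this sketch, calling it per item trades a small amount of repeated work for not depending on how the worker process was started.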