Changes from all commits (56 commits)
474a1b5
docs: fix artimuse example link; polish Tuiwen.md layout for Xiaohong…
Kylie-dot-s Sep 5, 2025
fb2732d
chore: satisfy end-of-file-fixer by ensuring newline at EOF in Tuiwen.md
Kylie-dot-s Sep 8, 2025
56e22f0
Merge pull request #164 from Kylie-dot-s/feat/docs-artimuse-tuiwen
e06084 Sep 8, 2025
0ecf9ac
docs: update readme with seo meta
e06084 Sep 9, 2025
57dac84
Merge pull request #174 from e06084/dev
e06084 Sep 9, 2025
8b752ff
docs: update readme with Trust Score badge
e06084 Sep 9, 2025
16e64d3
Merge pull request #175 from e06084/dev
e06084 Sep 9, 2025
feafb75
feat: new config
shijinpjlab Sep 10, 2025
316e9d3
feat: multi-turn example link
shijinpjlab Sep 10, 2025
12f38d4
feat: fix lint
shijinpjlab Sep 10, 2025
1ed74b2
Merge pull request #177 from shijinpjlab/dev_0910
e06084 Sep 10, 2025
7a8089a
add Document Parse Quality Prompt
Sep 23, 2025
bf59cbc
fix ci
Sep 23, 2025
9242ed9
fix ci
Sep 23, 2025
84241b0
📚 Auto-update metrics documentation
actions-user Sep 23, 2025
e64fcd1
fix ci
Sep 23, 2025
e529b17
Merge branch 'main' of https://github.com/chaserRen/dingo-rzf
Sep 23, 2025
49b5bbe
fix: fix Hallucination eval
e06084 Sep 23, 2025
21d347a
Merge pull request #186 from e06084/dev
e06084 Sep 23, 2025
36c99cf
fix: fix Hallucination docs
e06084 Sep 23, 2025
51df8a0
Merge pull request #187 from e06084/dev
e06084 Sep 23, 2025
e934614
Merge pull request #185 from chaserRen/main
e06084 Sep 24, 2025
a023043
add xyz audio rules
Sep 24, 2025
248ab20
📚 Auto-update metrics documentation
actions-user Sep 24, 2025
59f5e0e
fix conflict
Sep 24, 2025
ec59bcb
fix ci
Sep 24, 2025
6b1b6fa
fix conflict
Sep 24, 2025
c852cff
fix ci
Sep 24, 2025
51f1007
fix ci
Sep 24, 2025
f54d9d9
feat: add audio example
e06084 Sep 24, 2025
2b36869
Merge pull request #1 from e06084/pr-189
chaserRen Sep 25, 2025
7037da2
Merge pull request #189 from chaserRen/main
e06084 Sep 25, 2025
fe0c21a
feat: 3h eval with reason result
e06084 Sep 25, 2025
88e42a2
📚 Auto-update metrics documentation
actions-user Sep 25, 2025
c5cd805
Merge pull request #191 from e06084/dev
shijinpjlab Sep 25, 2025
41de8c5
docs: add artimuse x and reddit post
e06084 Sep 26, 2025
b983a2e
Merge pull request #192 from e06084/dev
e06084 Sep 26, 2025
2871a16
feat: vsl add multi dir
e06084 Sep 28, 2025
94b92fa
x
e06084 Sep 28, 2025
e91885f
Merge pull request #195 from e06084/dev
shijinpjlab Sep 29, 2025
0abb686
x
e06084 Sep 28, 2025
318c0fb
feat: local file support gz
e06084 Sep 29, 2025
edc4f30
Merge pull request #197 from e06084/dev
e06084 Sep 29, 2025
7619e1d
feat: add factcheck example
e06084 Sep 29, 2025
f8502c5
Merge pull request #198 from e06084/dev
e06084 Sep 29, 2025
a4d16bf
feat: ModelRes type and name support List
shijinpjlab Oct 9, 2025
4d190f4
feat: RuleImageLabelOverlap RuleImageLabelVisualization
shijinpjlab Oct 9, 2025
0564fa9
feat: fix lint
shijinpjlab Oct 9, 2025
ba0cb43
feat: example data
shijinpjlab Oct 9, 2025
e984a99
feat: example
shijinpjlab Oct 9, 2025
ec7935e
feat: example
shijinpjlab Oct 9, 2025
b17dcdb
feat: unit test
shijinpjlab Oct 9, 2025
433bec1
feat: fix lint
shijinpjlab Oct 9, 2025
0580267
Merge pull request #200 from shijinpjlab/dev_1009
e06084 Oct 9, 2025
a016429
feat: compare old and new webkit methods; optimize prompt
1041206149 Oct 11, 2025
f64048b
feat: optimize webkit extraction prompt
1041206149 Oct 13, 2025
22 changes: 16 additions & 6 deletions README.md
@@ -1,7 +1,18 @@
<div align="center" xmlns="http://www.w3.org/1999/html">
<!-- SEO Meta Information and Structured Data -->
<div itemscope itemtype="https://schema.org/SoftwareApplication" align="center" xmlns="http://www.w3.org/1999/html">
<meta itemprop="name" content="Dingo: A Comprehensive AI Data Quality Evaluation Tool">
<meta itemprop="description" content="Comprehensive AI-powered data quality assessment platform for machine learning datasets, LLM training data validation, hallucination detection, and RAG system evaluation">
<meta itemprop="applicationCategory" content="Data Quality Software">
<meta itemprop="operatingSystem" content="Cross-platform">
<meta itemprop="programmingLanguage" content="Python">
<meta itemprop="url" content="https://github.com/MigoXLab/dingo">
<meta itemprop="downloadUrl" content="https://pypi.org/project/dingo-python/">
<meta itemprop="softwareVersion" content="latest">
<meta itemprop="license" content="Apache-2.0">

<!-- logo -->
<p align="center">
<img src="docs/assets/dingo-logo.png" width="300px" style="vertical-align:middle;">
<img src="docs/assets/dingo-logo.png" width="300px" style="vertical-align:middle;" alt="Dingo AI Data Quality Evaluation Tool Logo">
</p>

<!-- badges -->
@@ -15,8 +26,7 @@
<a href="https://github.com/DataEval/dingo/issues"><img src="https://img.shields.io/github/issues/DataEval/dingo" alt="GitHub issues"></a>
<a href="https://mseep.ai/app/dataeval-dingo"><img src="https://mseep.net/pr/dataeval-dingo-badge.png" alt="MseeP.ai Security Assessment Badge" height="20"></a>
<a href="https://deepwiki.com/MigoXLab/dingo"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a>

[![Trust Score](https://archestra.ai/mcp-catalog/api/badge/quality/DataEval/dingo)](https://archestra.ai/mcp-catalog/dataeval__dingo)
<a href="https://archestra.ai/mcp-catalog/dataeval__dingo"><img src="https://archestra.ai/mcp-catalog/api/badge/quality/DataEval/dingo" alt="Trust Score"></a>
</p>

</div>
@@ -45,7 +55,7 @@
</p>


# Introduction
# Introduction of Dingo

Dingo is a data quality evaluation tool that helps you automatically detect data quality issues in your datasets. Dingo provides a variety of built-in rules and model evaluation methods, and also supports custom evaluation methods. Dingo supports commonly used text datasets and multimodal datasets, including pre-training datasets, fine-tuning datasets, and evaluation datasets. In addition, Dingo supports multiple usage methods, including local CLI and SDK, making it easy to integrate into various evaluation platforms, such as [OpenCompass](https://github.com/open-compass/opencompass).

@@ -61,7 +71,7 @@ Dingo is a data quality evaluation tool that helps you automatically detect data
pip install dingo-python
```

## Example Use Cases
## Example Use Cases of Dingo

### 1. Evaluate LLM chat data

15 changes: 13 additions & 2 deletions README_ja.md
@@ -1,7 +1,18 @@
<div align="center" xmlns="http://www.w3.org/1999/html">
<!-- SEO メタ情報と構造化データ -->
<div itemscope itemtype="https://schema.org/SoftwareApplication" align="center" xmlns="http://www.w3.org/1999/html">
<meta itemprop="name" content="Dingo: AI データ品質評価ツール">
<meta itemprop="description" content="機械学習データセット、LLM学習データ検証、幻覚検出、RAGシステム評価のための包括的なAI駆動データ品質評価プラットフォーム">
<meta itemprop="applicationCategory" content="データ品質ソフトウェア">
<meta itemprop="operatingSystem" content="クロスプラットフォーム">
<meta itemprop="programmingLanguage" content="Python">
<meta itemprop="url" content="https://github.com/MigoXLab/dingo">
<meta itemprop="downloadUrl" content="https://pypi.org/project/dingo-python/">
<meta itemprop="softwareVersion" content="latest">
<meta itemprop="license" content="Apache-2.0">

<!-- logo -->
<p align="center">
<img src="docs/assets/dingo-logo.png" width="300px" style="vertical-align:middle;">
<img src="docs/assets/dingo-logo.png" width="300px" style="vertical-align:middle;" alt="Dingo AI データ品質評価ツール ロゴ">
</p>

<!-- badges -->
19 changes: 15 additions & 4 deletions README_zh-CN.md
@@ -1,7 +1,18 @@
<div align="center" xmlns="http://www.w3.org/1999/html">
<!-- SEO 元信息和结构化数据 -->
<div itemscope itemtype="https://schema.org/SoftwareApplication" align="center" xmlns="http://www.w3.org/1999/html">
<meta itemprop="name" content="Dingo: AI 数据质量评估工具">
<meta itemprop="description" content="全面的AI驱动数据质量评估平台,专为机器学习数据集、LLM训练数据验证、幻觉检测和RAG系统评估而设计">
<meta itemprop="applicationCategory" content="数据质量软件">
<meta itemprop="operatingSystem" content="跨平台">
<meta itemprop="programmingLanguage" content="Python">
<meta itemprop="url" content="https://github.com/MigoXLab/dingo">
<meta itemprop="downloadUrl" content="https://pypi.org/project/dingo-python/">
<meta itemprop="softwareVersion" content="latest">
<meta itemprop="license" content="Apache-2.0">

<!-- logo -->
<p align="center">
<img src="docs/assets/dingo-logo.png" width="300px" style="vertical-align:middle;">
<img src="docs/assets/dingo-logo.png" width="300px" style="vertical-align:middle;" alt="Dingo AI 数据质量评估工具 Logo">
</p>

<!-- badges -->
@@ -40,7 +51,7 @@
</div>


# 介绍
# Dingo 介绍

Dingo是一款数据质量评估工具,帮助你自动化检测数据集中的数据质量问题。Dingo提供了多种内置的规则和模型评估方法,同时也支持自定义评估方法。Dingo支持常用的文本数据集和多模态数据集,包括预训练数据集、微调数据集和评测数据集。此外,Dingo支持多种使用方式,包括本地CLI和SDK,便于集成到各种评测平台,如[OpenCompass](https://github.com/open-compass/opencompass)等。

@@ -57,7 +68,7 @@ Dingo是一款数据质量评估工具,帮助你自动化检测数据集中的
pip install dingo-python
```

## 2. 使用示例
## 2. Dingo 使用示例

### 2.1 评估LLM对话数据

2 changes: 1 addition & 1 deletion app_gradio/app.py
@@ -438,4 +438,4 @@ def get_data_column_mapping():
)

# Launch the interface
demo.launch()
demo.launch(server_port=7861, share=True)
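
A side note on this launch change: pinning server_port avoids Gradio's default port 7860, and share=True requests a temporary public URL. A minimal sketch of the same idea with an environment override — the GRADIO_PORT variable and the stand-in interface are assumptions, not part of this PR:

```python
import os

import gradio as gr

# Trivial stand-in for the app's real interface, only to keep this runnable.
demo = gr.Interface(fn=lambda s: s, inputs="text", outputs="text")

# 7861 matches the value hard-coded in the diff; the env override is hypothetical.
port = int(os.environ.get("GRADIO_PORT", "7861"))
demo.launch(server_port=port, share=True)  # share=True prints a public link
```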
43 changes: 37 additions & 6 deletions dingo/data/datasource/local.py
@@ -30,20 +30,51 @@ def load_local_file(path: str, by_line: bool = True) -> Generator[str, None, Non
Returns:
str: The contents of the file.
"""
import gzip

if not os.path.exists(path):
raise RuntimeError(f'"{path}" is not a valid path')
f_list = []
if os.path.exists(path) and os.path.isfile(path):
f_list = [path]
elif os.path.exists(path) and os.path.isdir(path):
find_all_files(path, f_list)

for f in f_list:
with open(f, "r", encoding="utf-8") as _f:
if by_line:
for line in _f.readlines():
yield line
else:
yield _f.read()
# Check if file is gzipped
if f.endswith('.gz'):
try:
with gzip.open(f, 'rt', encoding='utf-8') as _f:
if by_line:
for line in _f.readlines():
yield line
else:
yield _f.read()
except Exception as gz_error:
raise RuntimeError(
f'Failed to read gzipped file "{f}": {str(gz_error)}. '
f'Please ensure the file is a valid gzip-compressed text file.'
)
else:
# For regular files, try UTF-8 encoding
try:
with open(f, "r", encoding="utf-8") as _f:
if by_line:
for line in _f.readlines():
yield line
else:
yield _f.read()
except UnicodeDecodeError as decode_error:
raise RuntimeError(
f'Failed to read file "{f}": Unsupported file format or encoding. '
f'Dingo only supports UTF-8 text files (.jsonl, .json, .txt) and .gz compressed text files. '
f'Original error: {str(decode_error)}'
)
except Exception as e:
raise RuntimeError(
f'Unexpected error reading file "{f}": {str(e)}. '
f'Please check if the file exists and is readable.'
)


@DataSource.register()
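A minimal usage sketch of the new gzip path, assuming load_local_file remains a module-level function as shown above; the file name is illustrative:

```python
from dingo.data.datasource.local import load_local_file

# A path ending in .gz is now routed through gzip.open(..., 'rt', encoding='utf-8'),
# so compressed and plain-text JSONL files iterate identically.
for line in load_local_file("data/sample.jsonl.gz", by_line=True):
    print(line.rstrip())
```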
54 changes: 46 additions & 8 deletions dingo/exec/local.py
@@ -238,12 +238,30 @@ def evaluate_rule(self, group: List[BaseRule], d: Data) -> ResultInfo:
# analyze result
if tmp.error_status:
result_info.error_status = True
bad_type_list.append(tmp.type)
bad_name_list.append(tmp.type + "-" + tmp.name)
if isinstance(tmp.type, str) and isinstance(tmp.name, str):
bad_type_list.append(tmp.type)
bad_name_list.append(tmp.type + "-" + tmp.name)
elif isinstance(tmp.type, List) and isinstance(tmp.name, List):
if len(tmp.type) != len(tmp.name):
raise Exception(f'ModelRes.type is not the same length to ModelRes.name.\n type: {tmp.type} \n name: {tmp.name}')
for i in range(len(tmp.type)):
bad_type_list.append(tmp.type[i])
bad_name_list.append(tmp.type[i] + "-" + tmp.name[i])
else:
raise Exception('ModelRes.type and ModelRes.name are not str or List at the same time.')
bad_reason_list.extend(tmp.reason)
else:
good_type_list.append(tmp.type)
good_name_list.append(tmp.type + "-" + tmp.name)
if isinstance(tmp.type, str) and isinstance(tmp.name, str):
good_type_list.append(tmp.type)
good_name_list.append(tmp.type + "-" + tmp.name)
elif isinstance(tmp.type, List) and isinstance(tmp.name, List):
if len(tmp.type) != len(tmp.name):
raise Exception(f'ModelRes.type is not the same length to ModelRes.name.\n type: {tmp.type} \n name: {tmp.name}')
for i in range(len(tmp.type)):
good_type_list.append(tmp.type[i])
good_name_list.append(tmp.type[i] + "-" + tmp.name[i])
else:
raise Exception('ModelRes.type and ModelRes.name are not str or List at the same time.')
good_reason_list.extend(tmp.reason)
if result_info.error_status:
result_info.type_list = list(set(bad_type_list))
@@ -271,12 +289,32 @@ def evaluate_prompt(self, group: List[BasePrompt], d: Data) -> ResultInfo:
# analyze result
if tmp.error_status:
result_info.error_status = True
bad_type_list.append(tmp.type)
bad_name_list.append(tmp.type + "-" + tmp.name)
if isinstance(tmp.type, str) and isinstance(tmp.name, str):
bad_type_list.append(tmp.type)
bad_name_list.append(tmp.type + "-" + tmp.name)
elif isinstance(tmp.type, List) and isinstance(tmp.name, List):
if len(tmp.type) != len(tmp.name):
raise Exception(
f'ModelRes.type is not the same length to ModelRes.name.\n type: {tmp.type} \n name: {tmp.name}')
for i in range(len(tmp.type)):
bad_type_list.append(tmp.type[i])
bad_name_list.append(tmp.type[i] + "-" + tmp.name[i])
else:
raise Exception('ModelRes.type and ModelRes.name are not str or List at the same time.')
bad_reason_list.extend(tmp.reason)
else:
good_type_list.append(tmp.type)
good_name_list.append(tmp.type + "-" + tmp.name)
if isinstance(tmp.type, str) and isinstance(tmp.name, str):
good_type_list.append(tmp.type)
good_name_list.append(tmp.type + "-" + tmp.name)
elif isinstance(tmp.type, List) and isinstance(tmp.name, List):
if len(tmp.type) != len(tmp.name):
raise Exception(
f'ModelRes.type is not the same length to ModelRes.name.\n type: {tmp.type} \n name: {tmp.name}')
for i in range(len(tmp.type)):
good_type_list.append(tmp.type[i])
good_name_list.append(tmp.type[i] + "-" + tmp.name[i])
else:
raise Exception('ModelRes.type and ModelRes.name are not str or List at the same time.')
good_reason_list.extend(tmp.reason)
if result_info.error_status:
result_info.type_list = list(set(bad_type_list))
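The new branches in evaluate_rule and evaluate_prompt are identical; isolated from the executor, the fan-out logic looks like the sketch below. The helper name and sample values are illustrative (the rule names are taken from this PR's commits, the type string is an assumption):

```python
from typing import List, Tuple, Union

def expand_type_name(t: Union[str, List[str]],
                     n: Union[str, List[str]]) -> Tuple[List[str], List[str]]:
    """Fan ModelRes.type/name out into parallel type and 'type-name' lists."""
    if isinstance(t, str) and isinstance(n, str):
        return [t], [t + "-" + n]
    if isinstance(t, list) and isinstance(n, list):
        if len(t) != len(n):
            raise ValueError(f"type/name length mismatch: {t} vs {n}")
        return list(t), [t[i] + "-" + n[i] for i in range(len(t))]
    raise TypeError("type and name must both be str or both be List")

types, names = expand_type_name(
    ["QUALITY_BAD_IMG", "QUALITY_BAD_IMG"],
    ["RuleImageLabelOverlap", "RuleImageLabelVisualization"],
)
# names -> ['QUALITY_BAD_IMG-RuleImageLabelOverlap',
#           'QUALITY_BAD_IMG-RuleImageLabelVisualization']
```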
49 changes: 17 additions & 32 deletions dingo/model/llm/llm_factcheck_public.py
@@ -99,6 +99,9 @@ def eval(cls, input_data: Data) -> ModelRes:

except Exception as e:
return ModelRes(
error_status=True,
type="QUALITY_BAD_FACTUALITY",
name="FACTUALITY_CHECK_ERROR",
score=0.0,
threshold=cls.threshold,
reason=[f"Evaluation failed: {str(e)}"],
@@ -210,41 +213,23 @@ def _parse_check_results(cls, text: str) -> List[FactCheckResult]:

results = []
for item in data:
evidence_list = [
Evidence(**e) for e in item["supporting_evidence"]
]
# Process evidence, ensuring every required field is present
evidence_list = []
for e in item.get("supporting_evidence", []):
# Supply defaults so that all required fields exist
evidence = Evidence(
url=e.get("url", ""),
snippet=e.get("snippet", ""),  # default guards against a missing field
summary=e.get("summary", "")
)
evidence_list.append(evidence)

results.append(FactCheckResult(
claim=item["claim"],
answer=item["answer"],
reasoning=item["reasoning"],
claim=item.get("claim", ""),
answer=item.get("answer", "unsure"),  # defaults to "unsure"
reasoning=item.get("reasoning", ""),
supporting_evidence=evidence_list
))
return results
except Exception as e:
raise ValueError(f"Invalid results format: {str(e)}")

@classmethod
def send_messages(cls, messages: List) -> str:
"""重写发送消息方法,避免使用 models.list()"""
if not cls.dynamic_config.model:
raise ValueError("model name must be specified")

params = cls.dynamic_config.parameters or {}
cls.validate_config(params)

completions = cls.client.chat.completions.create(
model=cls.dynamic_config.model,
messages=messages,
temperature=params.get("temperature", 0.3),
top_p=params.get("top_p", 1),
max_tokens=params.get("max_tokens", 4000),
presence_penalty=params.get("presence_penalty", 0),
frequency_penalty=params.get("frequency_penalty", 0),
)

if completions.choices[0].finish_reason == "length":
raise ExceedMaxTokens(
f"Exceed max tokens: {params.get('max_tokens', 4000)}"
)

return str(completions.choices[0].message.content)
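
The _parse_check_results change above swaps direct key access for .get() with defaults, so a result item missing optional fields degrades gracefully instead of raising KeyError. A self-contained sketch of the same pattern on plain dicts (the sample payload is invented):

```python
item = {
    "claim": "Water boils at 100 C at sea level",
    "supporting_evidence": [{"url": "https://example.com"}],  # no snippet/summary
}

evidence = [
    {"url": e.get("url", ""),
     "snippet": e.get("snippet", ""),   # default guards against missing keys
     "summary": e.get("summary", "")}
    for e in item.get("supporting_evidence", [])
]
result = {
    "claim": item.get("claim", ""),
    "answer": item.get("answer", "unsure"),  # absent verdicts fall back to "unsure"
    "reasoning": item.get("reasoning", ""),
    "supporting_evidence": evidence,
}
print(result["answer"])  # -> "unsure"
```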
2 changes: 1 addition & 1 deletion dingo/model/llm/llm_hallucination.py
@@ -58,7 +58,7 @@ def build_messages(cls, input_data: Data) -> List:
# Format contexts for display
contexts_str = json.dumps(contexts, ensure_ascii=False, indent=2)

prompt_content = cls.prompt.content % (question, response, contexts_str)
prompt_content = cls.prompt.content.format(question, response, contexts_str)

messages = [{"role": "user", "content": prompt_content}]
return messages
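Why this one-line change matters: %-interpolation raises ValueError whenever the prompt template itself contains a literal % (e.g. "95% confidence"), while str.format() only reserves curly braces. A sketch with an invented template; note the converse tradeoff that a template holding literal JSON braces would need {{ }} escaping under str.format():

```python
template_pct = "Answer with 95% confidence. Q: %s A: %s Contexts: %s"
template_fmt = "Answer with 95% confidence. Q: {} A: {} Contexts: {}"

try:
    template_pct % ("q", "a", "[]")
except ValueError as e:
    print(f"%-formatting failed: {e}")  # unsupported format character ' '

print(template_fmt.format("q", "a", "[]"))  # a literal % needs no escaping
```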
12 changes: 6 additions & 6 deletions dingo/model/llm/llm_html_abtract.py
@@ -6,15 +6,15 @@
from dingo.model import Model
from dingo.model.llm.base_openai import BaseOpenAI
from dingo.model.modelres import ModelRes
from dingo.model.prompt.prompt_html_abstract import PromptHtmlAbstract
from dingo.model.prompt.prompt_html_extract_compare import PromptHtmlExtractCompare
from dingo.model.response.response_class import ResponseScoreTypeNameReason
from dingo.utils import log
from dingo.utils.exception import ConvertJsonError


@Model.llm_register("LLMHtmlAbstract")
class LLMHtmlAbstract(BaseOpenAI):
prompt = PromptHtmlAbstract
@Model.llm_register("LLMHtmlExtractCompare")
class LLMHtmlExtractCompare(BaseOpenAI):
prompt = PromptHtmlExtractCompare

@classmethod
def build_messages(cls, input_data: Data) -> List:
@@ -23,8 +23,8 @@ def build_messages(cls, input_data: Data) -> List:
"role": "user",
"content": cls.prompt.content.format(
input_data.content,
input_data.raw_data["markdown_ours"],
input_data.raw_data["markdown_m10"],
input_data.raw_data["magic_md"],
input_data.raw_data["content"],
),
}
]
4 changes: 2 additions & 2 deletions dingo/model/llm/llm_text_3h.py
@@ -41,12 +41,12 @@ def process_response(cls, response: str) -> ModelRes:

# error_status
if response_model.score == 1:
result.reason = [response_model.reason]
result.reason = [response_model.reason] if response_model.reason else ["Response meets quality criteria"]
result.name = cls.prompt.__name__[8:].upper()
else:
result.error_status = True
result.type = "QUALITY_BAD"
result.reason = [response_model.reason]
result.reason = [response_model.reason] if response_model.reason else ["Response fails quality criteria"]
result.name = "NOT_" + cls.prompt.__name__[8:].upper()

return result