
Conversation

@e06084 (Collaborator) commented Dec 15, 2025

No description provided.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @e06084, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on a significant refactoring of metric definitions and their corresponding documentation. The primary goal is to standardize metric naming to reflect their underlying LLM-based nature and to improve the overall organization and clarity of the metric catalog. Additionally, new evaluation metrics specifically designed for Retrieval Augmented Generation (RAG) systems have been integrated, enhancing the system's capability to assess advanced language model applications. These changes are supported by updates to the automated documentation generation process, ensuring consistency and maintainability.

Highlights

  • Metric Naming Convention Update: Many existing metrics have been renamed from a Prompt* prefix to an LLM* prefix, standardizing the naming convention for Large Language Model-based metrics across various categories like Pretrain Text Quality Assessment, SFT Data Assessment, Classification, and Meta Rater Evaluation.
  • Refined Metric Categorization: Metric categories have been updated for clarity and better organization. For instance, 'Resume ATS Matching Metrics' and 'Resume ATS Optimization Metrics' are now consolidated under 'Resume Quality Assessment Metrics', and 'Xinghe Data Quality Metrics' are now 'Rule-Based TEXT Quality Metrics'.
  • Introduction of RAG Evaluation Metrics: A new dedicated section for 'RAG Evaluation Metrics' has been added to the documentation, introducing metrics such as LLMRAGAnswerRelevancy, LLMRAGContextPrecision, and LLMRAGFaithfulness to assess Retrieval Augmented Generation systems.
  • Documentation Generation Script Enhancement: The scripts/generate_metrics.py script has been refactored to align with the new metric naming and categorization, ensuring that the metrics.md documentation is automatically generated accurately and includes the new RAG metrics in the correct order.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a significant refactoring of the metrics system, primarily renaming 'Prompt-based' metrics to 'LLM-based' metrics and reorganizing the metric categories for better clarity. The changes are consistently applied across the metric definition files, the auto-generation script, and the resulting metrics.md documentation. The scripts/generate_metrics.py script has been updated to reflect the new structure; it now uses llm_name_map and standardizes on class names as the metric names in the documentation.

My review focuses on the correctness of the generation script and the consistency of the generated documentation. I've identified a potential data issue in the documentation and have a suggestion to improve the maintainability of the generation script. Overall, the changes are well executed and improve the structure of the metrics.
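
To make the renaming pattern concrete, here is a minimal sketch of a name map plus a class-name lookup, assuming a plain dict keyed by legacy class names; the actual contents, mapping direction, and helper used around llm_name_map in scripts/generate_metrics.py may differ, and the example pair below is hypothetical.

    # Minimal sketch, assuming a dict-based rename map; the real llm_name_map in
    # scripts/generate_metrics.py may have a different shape and contents.
    llm_name_map = {
        # Hypothetical legacy -> new pair illustrating the Prompt* -> LLM* rename.
        "PromptTextQualityAssessment": "LLMTextQualityAssessment",
    }

    def display_name(metric_cls: type) -> str:
        """Return the documentation name for a metric, standardizing on the class name."""
        name = metric_cls.__name__
        return llm_name_map.get(name, name)

Standardizing on class names this way keeps the generated metrics.md in lockstep with the code: renaming a class is enough to change the documented name on the next generation run.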

Comment on lines +95 to +96
| `LLMMinerURecognizeQuality` | LLMMinerURecognizeQuality | Evaluate the quality of mineru recognize | Internal Implementation | [📊 See Results](error_category and error_label) |
| `VLMDocumentParsingOCRTrain` | VLMDocumentParsingOCRTrain | Evaluate the quality of mineru recognize | Internal Implementation | [📊 See Results](error_category and error_label) |

Severity: medium

The metrics LLMMinerURecognizeQuality and VLMDocumentParsingOCRTrain appear to have identical descriptions ("Evaluate the quality of mineru recognize"). This might be a copy-paste error in the source _metric_info of one of these metrics. Please verify if VLMDocumentParsingOCRTrain should have a more specific description to differentiate it from LLMMinerURecognizeQuality.
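
One lightweight way to catch this kind of copy-paste issue across the whole catalog is to scan for repeated descriptions. The sketch below assumes each metric class carries a _metric_info dict with a description field, as referenced in this comment; the way metric classes are collected and the exact attribute layout are assumptions, not the project's actual API.

    from collections import defaultdict

    def find_duplicate_descriptions(metric_classes):
        """Group metric classes by description so copy-paste duplicates stand out."""
        by_description = defaultdict(list)
        for cls in metric_classes:
            # Assumes a _metric_info dict with a "description" entry on each class.
            info = getattr(cls, "_metric_info", None) or {}
            description = str(info.get("description", "")).strip()
            if description:
                by_description[description].append(cls.__name__)
        return {desc: names for desc, names in by_description.items() if len(names) > 1}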

Comment on lines +217 to +225
category_order = [
"RAG Evaluation Metrics",
"Pretrain Text Quality Assessment Metrics",
"SFT Data Assessment Metrics",
"Classification Metrics",
"Multimodality Assessment Metrics",
"Rule-Based TEXT Quality Metrics",
"Rule-Based IMG Quality Metrics"
]

Severity: medium

The category_order list only defines the order for a subset of the metric categories. The remaining categories are appended alphabetically, which might lead to an unintended document structure if new categories are added in the future. To ensure a stable and explicit document structure, it's better to include all categories in this list in the desired order.

    category_order = [
        "RAG Evaluation Metrics",
        "Pretrain Text Quality Assessment Metrics",
        "SFT Data Assessment Metrics",
        "Classification Metrics",
        "Multimodality Assessment Metrics",
        "Rule-Based TEXT Quality Metrics",
        "Rule-Based IMG Quality Metrics",
        "Audio Quality Metrics",
        "Meta Rater Evaluation Metrics",
        "OCR Eval Metric",
        "Resume Quality Assessment Metrics",
        "Rule-Based RESUME Quality Metrics",
        "Text Generation",
    ]
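
For context, the "appended alphabetically" behavior described above typically comes from a sort key like the one sketched here; this is an illustration of the pattern under that assumption, not the actual implementation in scripts/generate_metrics.py.

    def ordered_categories(categories, category_order):
        """Order categories by their index in category_order; anything not listed
        falls back to alphabetical order after the listed ones."""
        explicit = {name: i for i, name in enumerate(category_order)}
        return sorted(categories, key=lambda c: (c not in explicit, explicit.get(c, 0), c))

With the fully enumerated category_order suggested above, every category gets an explicit position and the alphabetical fallback never applies, so adding a new category forces a deliberate decision about where it appears in metrics.md.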

@e06084 merged commit f2ffe4f into MigoXLab:dev on Dec 15, 2025
tenwanft pushed a commit to tenwanft/dingo that referenced this pull request Dec 24, 2025
* feat: update auto-gen-metric

* 📚 Auto-update metrics documentation

---------

Co-authored-by: GitHub Action <[email protected]>