doc(l10n): add AI agent instructions to review translations

jiangxin · jiangxin · commit 724b92905e49 · 2026-03-03T14:20:29.000+08:00
Add a new "Task 4: Review translation quality" section to po/AGENTS.md
that guides AI agents through reviewing translation files.

The review workflow leverages git-po-helper subcommands:

- git-po-helper compare: Extract new or changed entries between two PO
  file versions into a valid PO file for review. Supports multiple modes:
  * Compare HEAD with working tree (local changes)
  * Compare parent of commit with the commit (--commit)
  * Compare commit with working tree (--since)
  * Compare two arbitrary revisions (-r)

- git-po-helper msg-select: Split large review files into smaller batches
  by entry index range for manageable review sessions. Supports range
  formats like "-50" (first 50), "51-100", "101-" (from 101 to end).
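
As a rough sketch of how those ranges line up with batches, the shell
function below prints one range string per batch. batch_ranges is a
hypothetical helper for illustration only, not a git-po-helper command,
and it assumes the entry count is larger than the batch size:

```shell
# Hypothetical helper (not part of git-po-helper): print one msg-select
# range string per batch for TOTAL entries split into batches of SIZE.
# Assumes TOTAL > SIZE; smaller inputs need no splitting.
batch_ranges () {
    total=$1
    size=$2
    # Number of batches, rounded up.
    count=$(( (total + size - 1) / size ))
    i=1
    while test "$i" -le "$count"
    do
        start=$(( (i - 1) * size + 1 ))
        end=$(( i * size ))
        if test "$i" -eq 1
        then
            # First batch: open start, e.g. "-50".
            echo "-$size"
        elif test "$end" -ge "$total"
        then
            # Last batch: open end, e.g. "101-".
            echo "$start-"
        else
            # Middle batch: closed range, e.g. "51-100".
            echo "$start-$end"
        fi
        i=$(( i + 1 ))
    done
}

batch_ranges 120 50
```

For 120 entries in batches of 50 this prints the three ranges quoted above.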

The review procedure covers:
1. Extracting entries using git-po-helper compare
2. Handling large files with batch processing via git-po-helper msg-select
3. Step-by-step review against glossary and PO format rules
4. Outputting review reports in JSON format with issues and suggestions
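
The batch size in step 2 is not fixed. A minimal sketch of the sizing rule
used by the script this commit adds to po/AGENTS.md follows;
pick_batch_size is a hypothetical name chosen for illustration, and the
thresholds are copied from that script:

```shell
# Hypothetical helper: choose a batch size from the entry count and a
# minimum batch size (default 50), mirroring the thresholds in the
# review_split_batches script from the patch.
pick_batch_size () {
    entries=$1
    min=${2:-50}
    if test "$entries" -gt $((min * 8))
    then
        # Very large input: double the minimum.
        echo $((min * 2))
    elif test "$entries" -gt $((min * 4))
    then
        # Large input: one and a half times the minimum.
        echo $((min + min / 2))
    else
        # Otherwise keep the minimum; inputs at or below it are
        # handled as a single batch by the script.
        echo "$min"
    fi
}

pick_batch_size 500 50   # prints 100
pick_batch_size 250 50   # prints 75
pick_batch_size 120 50   # prints 50
```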

This enables AI agents to conduct structured translation reviews for:
- Full file reviews
- Changes in a specific commit
- Changes since a specific commit

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
diff --git a/po/AGENTS.md b/po/AGENTS.md
@@ -10,6 +10,7 @@ most commonly used housekeeping tasks:
 1. Generating or updating po/git.pot
 2. Updating po/XX.po
 3. Translating po/XX.po
+4. Reviewing translation quality
 
 
 ## Background knowledge for localization workflows
@@ -728,6 +729,176 @@ and fuzzy entry; do not stop before the loop completes.
    ```
 
 
+### Task 4: Review translation quality
+
+Review may target the full `po/XX.po`, a specific commit, or changes since a
+commit. When asked to review, follow the steps below. **Note**: This task uses
+`git-po-helper compare`; if `git-po-helper` is not available, the task
+cannot be performed.
+
+1. **Check for existing review**: Evaluate in order:
+
+   - If `po/review-input.po` does **not** exist, proceed to step 2 regardless
+     of any other files (e.g., batch or JSON files).
+   - If both `po/review-input.po` and `po/review-result.json` exist, go
+     directly to the final step (Merge and summary) and display the report.
+     Do **not** check for batch or other temporary files; no further review
+     steps are needed.
+   - If `po/review-input.po` exists but `po/review-result.json` does **not**
+     exist, go to step 4 (Check batch files and select current batch) to
+     continue the previous unfinished review.
+
+2. **Extract entries**: Run `git-po-helper compare` with the desired range and
+   redirect the output to `po/review-input.po`. Do not use `git show` or
+   `git diff`—they can fragment or lose PO context (see "Comparing PO files
+   for translation and review" under git-po-helper).
+
+3. **Prepare review batches**: Run the script below to clean up any leftover
+   files from previous reviews and split `po/review-input.po` into one or
+   more `po/review-input-<N>.json` files (dynamic batch sizing). Run as a
+   single script (define the function, then call it):
+
+   ```shell
+   review_split_batches () {
+       min_batch_size=${1:-50}
+       rm -f po/review-input-*.json
+       rm -f po/review-result-*.json
+       rm -f po/review-result.json
+       rm -f po/review-output.po
+
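+       # The stale output file was removed above, so copy unconditionally:
+       # po/review-output.po starts as a fresh copy of the review input and
+       # is the file the final merge step may apply results to.
+       cp po/review-input.po po/review-output.po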
+
+       ENTRY_COUNT=$(grep -c '^msgid ' po/review-input.po 2>/dev/null || true)
+       ENTRY_COUNT=$((ENTRY_COUNT > 0 ? ENTRY_COUNT - 1 : 0))
+
+       if test "$ENTRY_COUNT" -gt $min_batch_size
+       then
+           if test "$ENTRY_COUNT" -gt $((min_batch_size * 8))
+           then
+               NUM=$((min_batch_size * 2))
+           elif test "$ENTRY_COUNT" -gt $((min_batch_size * 4))
+           then
+               NUM=$((min_batch_size + min_batch_size / 2))
+           else
+               NUM=$min_batch_size
+           fi
+           BATCH_COUNT=$(( (ENTRY_COUNT + NUM - 1) / NUM ))
+           for i in $(seq 1 "$BATCH_COUNT")
+           do
+               START=$(((i - 1) * NUM + 1))
+               END=$((i * NUM))
+               if test "$END" -gt "$ENTRY_COUNT"
+               then
+                   END=$ENTRY_COUNT
+               fi
+               if test "$i" -eq 1
+               then
+                   git-po-helper msg-select --json --range "-$NUM" \
+                       po/review-input.po -o "po/review-input-$i.json"
+               elif test "$END" -ge "$ENTRY_COUNT"
+               then
+                   git-po-helper msg-select --json --range "$START-" \
+                       po/review-input.po -o "po/review-input-$i.json"
+               else
+                   git-po-helper msg-select --json --range "$START-$END" \
+                       po/review-input.po -o "po/review-input-$i.json"
+               fi
+           done
+       else
+           git-po-helper msg-cat --json \
+               -o po/review-input-1.json po/review-input.po
+       fi
+   }
+   review_split_batches 50
+   ```
+
+4. **Check batch files and select current batch**: If no batch files
+   (`po/review-input-*.json`) exist, proceed to step 9. Otherwise, select the
+   **first** remaining file (smallest batch index N) as the current batch. In
+   steps 5–8, "current batch file" means `po/review-input-<N>.json`. This
+   enables resuming after an unexpected stop.
+
+5. **Read context**: Consult the "Background knowledge for localization
+   workflows" section for PO format, JSON format, placeholder rules, and
+   terminology. If the current batch file has a glossary in the
+   `header_comment` field, add it to your context for consistent terminology.
+
+6. **Review entries**:
+   - Read the current batch file (`po/review-input-<N>.json`).
+   - Do not review or modify the header entry (in PO format: empty `msgid`
+     with metadata in `msgstr`; in JSON format: `header_comment` and
+     `header_meta`).
+   - For all other entries, check `msgstr` and `msgstr_plural` against the
+     "Quality checklist" above.
+   - After reviewing all entries in the current batch, save the report as
+     described in the next step.
+
+7. **Generate review report**:
+   - Save the report for the current batch to `po/review-result-<N>.json`.
+     See the "Review result JSON format" section below.
+   - For each entry with issues, create an issue object: copy the original
+     `msgid` to the `msgid` field; put the correct translation in
+     `suggest_msgstr` (singular) or `suggest_msgstr_plural` (plural); write a
+     summary of the issue in `description`; set `score` from 0 to 3 (0 =
+     critical, 1 = major, 2 = minor, 3 = perfect, no issues found).
+   - Include only entries with issues (score less than 3). Do **not** include
+     entries with no issues (score 3).
+   - Optionally provide inline suggestions or a human-readable report.
+
+8. **Repeat review process**: After saving the report to
+   `po/review-result-<N>.json`, delete `po/review-input-<N>.json`. Return to
+   step 4 to review the next batch.
+
+9. **Merge and summary**: Run the command below to merge all
+   `po/review-result-*.json` files into `po/review-result.json`, optionally
+   apply the result to `po/review-output.po`, and display the report. The
+   command shows its output to the user. Do **not** open or read the result
+   files; the user will refer to them as needed.
+
+   ```shell
+   git-po-helper agent-run --apply report
+   ```
+
+**Review result JSON format**:
+
+- `issues`: Array of issue objects. Each issue has:
+  - `msgid` (and `msgid_plural` for plural): Original source text for reference.
+  - `suggest_msgstr`: Correct translation for the singular form.
+  - `suggest_msgstr_plural`: Array of correct translations for plural forms;
+    `suggest_msgstr` is empty for plural entries.
+  - `score`: 0–3 (see scale below).
+  - `description`: Brief summary of the issue.
+- Score scale: 0 = critical (must fix before release), 1 = major (should fix),
+  2 = minor (improve later), 3 = perfect.
+
+
+```json
+{
+  "issues": [
+    {
+      "msgid": "commit",
+      "msgid_plural": "",
+      "suggest_msgstr": "提交",
+      "suggest_msgstr_plural": [],
+      "score": 0,
+      "description": "Terminology error: 'commit' should be translated as '提交'"
+    },
+    {
+      "msgid": "repository",
+      "msgid_plural": "repositories",
+      "suggest_msgstr": "",
+      "suggest_msgstr_plural": ["仓库", "仓库"],
+      "score": 2,
+      "description": "Consistency issue: '版本库' and '仓库' are used interchangeably; suggest using '仓库' consistently"
+    }
+  ]
+}
+```
+
+
 ## Human translators remain in control
 
 Git translation is human-driven; language team leaders and contributors are
@@ -740,7 +911,13 @@ responsible for:
 - Building and maintaining language glossaries
 - Reviewing and approving all changes before submission
 
-AI tools, if used, only accelerate routine tasks.
+AI tools, if used, only accelerate routine tasks:
+
+- First-draft translations for new or updated messages
+- Finding untranslated or fuzzy entries
+- Checking consistency with glossary and existing translations
+- Detecting technical errors (placeholders, formatting)
+- Reviewing against quality criteria
 
 AI-generated output should always be treated as rough drafts requiring human
 review, editing, and approval by someone who understands both the technical