
feat: add reference verification pipeline skill #4

Open
javiertoledo wants to merge 1 commit into `main` from `feat/verify-refs-skill`

Conversation

@javiertoledo (Member)

Summary

  • Adds a complete reference verification system for markdown documents with inline citations
  • The pipeline catches both accuracy errors (cited claims that don't match sources) and completeness gaps (unsourced assertions and overclaims)
  • Battle-tested on 7 whitepapers with 192+ cited claims and 75 completeness flags across 5 verification passes

Components

| Type | Files | Purpose |
| --- | --- | --- |
| Skill | `skills/verify-refs/SKILL.md`, `agents/openai.yaml` | Installable skill metadata + Codex interface |
| Scripts | `scripts/ref_{extract,triage,verify,reconcile,audit}.py` | 5 stdlib-only pipeline scripts (no external deps) |
| Agents | `.claude/agents/{fact-checker,claim-reviewer,claims-auditor}.md` | 3 specialized agent definitions |
| Command | `.claude/commands/verify-refs.md` | `/verify-refs` slash command with a 6-phase workflow |

How it works

Two complementary passes run in parallel, followed by reconciliation and fixes:

Phase 0 — Completeness audit: LLM agents read each document and flag unsourced assertions and overclaims (e.g., specific stats without citations, claims that overstate evidence).

Phases 1-3 — Citation verification: Regex extraction → LLM triage → isolated fact-checking against original sources. Fact-checker agents have zero project context (only a URL and claims), preventing circular verification.

Phases 4-5: Register results in kb/ and apply fixes (with user approval).
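As a rough illustration of the Phase 1 extraction step, a minimal regex pass might look like the sketch below. The citation syntax, pattern, and field names are assumptions for illustration only; the actual `ref_extract.py` interface is not shown in this PR description.

```python
import re

# Hypothetical footnote-style citation marker, e.g. "…at high rates.[^src-1]"
CITATION_RE = re.compile(r"\[\^(?P<source_id>[\w-]+)\]")

def extract_claims(markdown_text):
    """Return one pending claim record per inline citation found."""
    claims = []
    for line_no, line in enumerate(markdown_text.splitlines(), start=1):
        for match in CITATION_RE.finditer(line):
            claims.append({
                "line": line_no,
                "text": line.strip(),
                "source_id": match.group("source_id"),
                "verification_status": "PENDING",
            })
    return claims

doc = "LLMs can fabricate citations.[^src-1]\nAn unsourced sentence."
print(extract_claims(doc))
```

Records like these would then flow into triage and batched fact-checking; the unsourced second line is what Phase 0 exists to catch.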

Test plan

  • Run each script with `--help` (e.g. `python3 scripts/ref_extract.py --help`) — all 5 should print usage
  • Run `python3 -c "import py_compile; py_compile.compile('scripts/ref_extract.py', doraise=True)"` for each script — all 5 should compile without errors
  • Verify the `/verify-refs` command appears in Claude Code's slash commands
  • Run Phase 0 on a sample markdown doc with references to validate end-to-end
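The per-script compile check above can be looped in one shell pass. The `scripts/ref_*.py` paths follow the Components table; adjust if your checkout differs.

```shell
# Compile-check all five pipeline scripts; report any that are missing.
for s in extract triage verify reconcile audit; do
  f="scripts/ref_${s}.py"
  if [ -f "$f" ]; then
    python3 -c "import py_compile, sys; py_compile.compile(sys.argv[1], doraise=True)" "$f" \
      && echo "$f compiles"
  else
    echo "missing: $f"
  fi
done
```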

🤖 Generated with Claude Code

Add a complete reference verification system for markdown documents with
inline citations. The pipeline catches both accuracy errors (cited claims
that don't match sources) and completeness gaps (unsourced assertions
and overclaims).

Components:
- 5 scripts: ref_extract, ref_triage, ref_verify, ref_reconcile, ref_audit
- 3 agents: fact-checker (isolated), claim-reviewer, claims-auditor
- /verify-refs slash command with 6-phase workflow
- Installable skill metadata in skills/verify-refs/

Key design: fact-checker agents have zero project context (only URL + claims),
preventing circular verification. Phase 0 (completeness audit) runs in
parallel with Phase 1 (extraction) for efficiency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8001f5030


Comment on lines +146 to +149
```python
new_source = r.get("correct_source_id", "")
if new_source:
    claim["source_id"] = new_source
    # Keep as PENDING for verification with corrected source
```

P1: Validate reattributed source IDs before updating claims

`apply_triage` writes `correct_source_id` directly into `claim["source_id"]`, but the claim-reviewer contract allows returning a source name as well as an ID. If a name is returned, `ref_verify.generate_all_prompts` cannot find it in `sources.csv` and skips the claim, so the claim remains unverified (still pending) and silently drops out of the verification batches, leaving only a warning. Please map/validate reattributions against known IDs before overwriting `source_id`.
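A minimal sketch of that guard might look like this. The function and variable names are hypothetical; in the real script the name-to-ID mapping would be built from `sources.csv`.

```python
def resolve_source_id(value, known_ids, name_to_id):
    """Map a reviewer-returned source reference (ID or name) to a known ID.

    Returns None when the value matches neither, so the caller can flag
    the claim for manual review instead of silently orphaning it.
    """
    if value in known_ids:
        return value
    return name_to_id.get(value.strip().lower())

known_ids = {"S1", "S2"}
name_to_id = {"who 2023 report": "S2"}

print(resolve_source_id("S1", known_ids, name_to_id))               # ID passes through
print(resolve_source_id("WHO 2023 Report", known_ids, name_to_id))  # name mapped to ID
print(resolve_source_id("Unknown Source", known_ids, name_to_id))   # unresolved -> None
```

`apply_triage` would then overwrite `claim["source_id"]` only when the resolver returns a non-None ID.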


Comment on lines +137 to +144
```python
if classification == "EDITORIAL":
    claim["verification_status"] = "NOT_FOUND"
    claim["source_evidence"] = f"Triage: EDITORIAL — {r.get('reason', '')}"
    claim["discrepancy_detail"] = "Filtered by claim-reviewer: not a sourced claim."
elif classification == "SYNTHESIS":
    claim["verification_status"] = "NOT_FOUND"
    claim["source_evidence"] = f"Triage: SYNTHESIS — {r.get('reason', '')}"
    claim["discrepancy_detail"] = "Filtered by claim-reviewer: original synthesis, not verifiable against single source."
```

P2: Keep filtered triage outcomes out of NOT_FOUND status

Setting EDITORIAL and SYNTHESIS claims to `verification_status = "NOT_FOUND"` makes reconciliation misreport them as source-verification misses. `ref_reconcile.py` interprets NOT_FOUND as "source accessible but claim absent," so filtered non-source claims inflate failure counts and appear in the Not Found warning section as if fact-checking failed. Use a distinct triage-filtered status (or exclude these rows from reconcile) to preserve report accuracy.
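One way to sketch the suggested fix (the `FILTERED_*` status values are hypothetical; the field names follow the diff above):

```python
# Triage classifications that mean "not a verifiable sourced claim" get
# their own statuses, distinct from NOT_FOUND (which means the source
# was checked and the claim was absent).
TRIAGE_FILTERED = {
    "EDITORIAL": "FILTERED_EDITORIAL",
    "SYNTHESIS": "FILTERED_SYNTHESIS",
}

def apply_classification(claim, classification, reason):
    if classification in TRIAGE_FILTERED:
        claim["verification_status"] = TRIAGE_FILTERED[classification]
        claim["source_evidence"] = f"Triage: {classification} — {reason}"
    return claim

claim = apply_classification({}, "EDITORIAL", "tone, not a sourced claim")
print(claim["verification_status"])  # FILTERED_EDITORIAL
```

`ref_reconcile.py` could then skip any status starting with `FILTERED_` when tallying verification failures.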


Comment on lines +485 to +487
```python
dedup_key = f"{draft_name}:{ratio}"
if dedup_key in seen_claims:
    continue
```

P2: Deduplicate ratio claims with contextual keys

The ratio dedup key only uses `{draft_name}:{ratio}`, so repeated ratios in the same draft collapse to one claim even when they are different assertions on different lines or tied to different citations. That drops valid claims from verification and can hide real contradictions/misattributions whenever multiple statements share the same numeric pattern (for example, two separate 50:1 claims). Include line/context/source in the dedup key instead of just the ratio value.
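A contextual key might be sketched like this. That `line_no` and `source_id` are available at this point is an assumption about `ref_extract`'s data model, not something shown in the diff.

```python
seen_claims = set()

def dedup_key(draft_name, ratio, line_no, source_id):
    # Keying on where the ratio appears and which source it cites keeps
    # two separate "50:1" claims in one draft as distinct claims.
    return (draft_name, ratio, line_no, source_id)

k1 = dedup_key("draft.md", "50:1", 12, "S1")
k2 = dedup_key("draft.md", "50:1", 87, "S2")
seen_claims.add(k1)
print(k2 in seen_claims)  # False: the second 50:1 claim survives dedup
```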

