Commit ce3bc67

Zac and claude committed
Rewrite README opening to be human-first
Lead with the scenario and problem the tool solves, not a dry technical description. Replace feature table with narrative differentiators. Technical details preserved in the back half.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 7516bbb commit ce3bc67

1 file changed: +8 -16 lines changed

README.md

Lines changed: 8 additions & 16 deletions
@@ -4,9 +4,7 @@

 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

-**Resolve messy identifiers to canonical IDs using versioned registries.**
-
-Know what matched, what didn't, and why.
+**The same entity has five names across three vendors. canon makes them one.**

 ```bash
 brew install cmdrvl/tap/canon
@@ -16,22 +14,16 @@ brew install cmdrvl/tap/canon

 ---

-## TL;DR
-
-**The Problem**: The same entity has 5 names across 3 vendors. CUSIPs map to ISINs map to tickers — but which mapping version? Counterparty names drift. The resolution lives in a VLOOKUP chain, an unmaintained script, or someone's head.
+The same loan appears as CUSIP `037833100` in one system, ISIN `US0378331005` in another, and ticker `AAPL` in a third. Three vendors, three identifiers, one entity. Your reconciliation pipeline needs them to be the same row. Right now, the mapping lives in a VLOOKUP chain, an unmaintained Python script, or someone's head.

-**The Solution**: One command, one mapping. `canon` resolves input identifiers against versioned registries and records everything — what matched, what didn't, and which rule produced the match. Deterministic. Inspectable. Reproducible.
+**canon resolves identifiers against versioned registries — deterministic, traceable, reproducible.** Every resolution records which registry version was used, which rule produced the match, and what didn't match. Same input plus same registry version equals same output, every time. No fuzzy matching, no silent normalization, no guessing.

-### Why Use canon?
+### What makes this different

-| Feature | What It Does |
-|---------|--------------|
-| **Versioned registries** | Every resolution is pinned to a registry version — same input + same version = same output |
-| **Four clear outcomes** | RESOLVED, PARTIAL, UNRESOLVED, or REFUSAL — every input is classified |
-| **Two output modes** | JSON mapping artifact for audit, or CSV with canonical column appended for pipelines |
-| **Pipeline stage** | `canon --emit csv` feeds directly into `rvl`, `shape`, or any CSV tool |
-| **Full traceability** | Every mapping includes the rule ID, canonical type, and confidence level |
-| **Deterministic** | Exact byte match after ASCII-trim — no fuzzy heuristics, no silent normalization |
+- **Versioned registries** — every resolution is pinned to a registry version with semver. When the registry updates, you know exactly what changed. Registries are plain JSON directories — inspectable in git, diffable, no database required.
+- **Pipeline composable**`canon --emit csv` appends a `<column>__canon` column to your CSV. Pipe the output directly into `rvl` or `shape`: `canon nov.csv --column cusip --emit csv | rvl - dec.canon.csv --key cusip__canon`.
+- **Full traceability** — every mapping includes `rule_id`, `canonical_type`, and `confidence`. Every unresolved entry includes the reason. Every result is auditable.
+- **Deduplication built in** — input values are deduplicated before lookup. 500 unique CUSIPs produce 500 mapping entries whether your file has 500 rows or 500,000.

 ---

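For readers reviewing this diff, the behavior the new README text describes (deduplicated input, exact lookup against a versioned registry, and a `<column>__canon` column appended to the CSV) can be sketched roughly as below. This is an illustration only, not canon's implementation: the in-memory registry shape, the canonical ID `ent-0001`, and the function names `resolve_column` and `append_canon_column` are all hypothetical. The ASCII-trim step follows the wording of the removed feature table; whether canon also trims when writing the appended column is an assumption.

```python
import csv
import io

# ASCII whitespace only: the old README text in this diff describes an exact
# byte match after ASCII-trim, so Unicode spaces are deliberately left alone.
ASCII_WS = " \t\n\r\x0b\x0c"

# Hypothetical registry. The in-memory shape and the canonical ID "ent-0001"
# are illustrative assumptions, not canon's on-disk registry format.
REGISTRY = {
    "version": "1.0.0",
    "entries": {
        "037833100": "ent-0001",     # CUSIP
        "US0378331005": "ent-0001",  # ISIN
        "AAPL": "ent-0001",          # ticker
    },
}


def resolve_column(rows, column, registry):
    """Build one mapping entry per unique trimmed value found in `column`."""
    # dict.fromkeys deduplicates while preserving first-seen order.
    unique_values = dict.fromkeys(row[column].strip(ASCII_WS) for row in rows)
    mapping = {}
    for value in unique_values:
        canonical = registry["entries"].get(value)  # exact match, no fuzziness
        mapping[value] = {
            "canonical_id": canonical,
            "status": "RESOLVED" if canonical is not None else "UNRESOLVED",
            "registry_version": registry["version"],
        }
    return mapping


def append_canon_column(rows, column, mapping):
    """Return copies of the rows with a `<column>__canon` field added."""
    out = []
    for row in rows:
        entry = mapping[row[column].strip(ASCII_WS)]
        out.append({**row, f"{column}__canon": entry["canonical_id"] or ""})
    return out


if __name__ == "__main__":
    source = io.StringIO("cusip,qty\n037833100,10\n 037833100 ,5\n999999999,1\n")
    rows = list(csv.DictReader(source))
    mapping = resolve_column(rows, "cusip", REGISTRY)
    for row in append_canon_column(rows, "cusip", mapping):
        print(row)
```

Because the mapping is keyed on unique trimmed values, its size tracks the number of distinct identifiers rather than the number of rows, which is the property the "Deduplication built in" bullet claims.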