Last Updated: 2025-10-22
- Design a font for Arabic → Use Character Set Files
- Add romanization to my font → Use Romanization Files + Adobe Latin 3
- Look up how a character is romanized → See How to Look Up Romanization
- Parse data programmatically → See Programmatic Access
- Verify against official standards → See arabic-roman-standards.md
| File | Use For | Details |
|---|---|---|
| adobe-arabic-1.txt | Basic Arabic | See README |
| adobe-arabic-2.txt | Urdu, Persian, Punjabi | See README |
| adobe-arabic-3.txt | Uyghur, Kazakh, Kyrgyz | See README |
| adobe-arabic-4.txt | Kashmiri, Saraiki, Balti | See README |
| adobe-arabic-5.txt | Pashto, Sindhi, Kurdish, Balochi | See README |

Important: All romanization modules require Adobe Latin 3.
| File | Use For | Details |
|---|---|---|
| adobe-arabic-1-roman.txt | Romanizing Arabic | See README |
| adobe-arabic-2-roman.txt | Romanizing Urdu, Persian, Punjabi | See README |
| adobe-arabic-3-roman.txt | Romanizing Uyghur, Kazakh, Kyrgyz | See README |
| adobe-arabic-4-roman.txt | Romanizing Kashmiri, Saraiki, Balti | See README |
| adobe-arabic-5-roman.txt | Romanizing extended languages | See README |
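
The file names in the two tables follow a parallel pattern, so the romanization file for a module can be derived from its character set file name. The following is a minimal sketch, assuming that pattern holds for all five modules; `roman_file_for` is a hypothetical helper and not part of the repository.

```python
import re

# Hypothetical helper: derive the romanization file name from a character set
# file name, assuming the adobe-arabic-N.txt / adobe-arabic-N-roman.txt
# pattern shown in the tables.
_CHARSET_NAME = re.compile(r"^adobe-arabic-([1-5])\.txt$")

def roman_file_for(charset_file: str) -> str:
    match = _CHARSET_NAME.match(charset_file)
    if match is None:
        raise ValueError(f"not a known character set file: {charset_file!r}")
    return f"adobe-arabic-{match.group(1)}-roman.txt"

print(roman_file_for("adobe-arabic-2.txt"))  # adobe-arabic-2-roman.txt
```
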
Go to: documentation/arabic-roman-source.md
This file contains tables showing how each character is romanized under each standard (BGN/PCGN, UNGEGN, ALA-LC, ISO, and others).
Example:
- Looking for how ع (Ayn) is romanized in BGN/PCGN?
- Find the Arabic or Urdu table
- Look up Unicode 0639
- Check the BGN/PCGN column

Cell notation:
- `b` = single romanization
- `k/g` = multiple options (context-dependent)
- `-` = not used in this standard
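
The lookup steps and cell notation above can be sketched in code. This is a minimal illustration, assuming table cells take exactly the three forms listed (a single value, slash-separated alternatives, or `-`); `codepoint_hex` and `parse_cell` are hypothetical helpers and not part of the repository.

```python
def codepoint_hex(char: str) -> str:
    """Return the uppercase hex code point (at least four digits) for a character."""
    return f"{ord(char):04X}"

def parse_cell(cell: str) -> list[str]:
    """Interpret one romanization cell: an empty list if unused, else the options."""
    cell = cell.strip()
    if cell == "-":
        return []  # character is not used in this standard
    # A single romanization yields one option; "k/g" yields two.
    return [option.strip() for option in cell.split("/")]

print(codepoint_hex("ع"))  # 0639
print(parse_cell("b"))     # ['b']
print(parse_cell("k/g"))   # ['k', 'g']
print(parse_cell("-"))     # []
```

Returning a list in every case lets calling code treat single, multiple, and unused entries uniformly.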
```python
# Read character set files (tab-delimited)
with open('adobe-arabic-1.txt', 'r', encoding='utf-8') as f:
    lines = f.readlines()[1:]  # Skip header
    for line in lines:
        # Skip blank lines and comment lines
        if line.strip() and not line.startswith('#'):
            # Strip the trailing newline so it does not end up in the notes field
            unicode_code, char, glyph, name, notes = line.rstrip('\n').split('\t')
            print(f"{unicode_code}: {char} ({glyph})")
```

```python
# Load romanization mappings from standards_mappings/
import sys
sys.path.insert(0, '.')
from standards_mappings import get_standard_mappings, get_metadata

# Get mappings for a specific standard
mappings = get_standard_mappings('BGN/PCGN Urdu')
# Returns: {'0628': 'b', '067E': 'p', '062A': 't', ...}

# Get metadata
metadata = get_metadata('BGN/PCGN Urdu')
# Returns: {'standard_name': 'BGN/PCGN Urdu', 'table_type': 'urdu', ...}
```

- BGN/PCGN - Geographic names (US/UK)
- UNGEGN - Geographic names (UN)
- ALA-LC - Library cataloging
- ISO 233 - International standard
- IPA - Phonetic transcription
See documentation/arabic-roman-standards.md for complete list with official source URLs.
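
Once a mapping dictionary for one of these standards is loaded, it can be applied to text character by character. The sketch below assumes the dictionary shape shown earlier for `get_standard_mappings` (uppercase hex code points as keys, Latin strings as values) and uses a three-entry stand-in dictionary so that it runs without the repository. `romanize` is a hypothetical helper, and this simple lookup ignores the context-dependent cases that many standards define.

```python
# Stand-in for get_standard_mappings('BGN/PCGN Urdu'); only the three entries
# shown earlier in this document are included.
sample_mappings = {'0628': 'b', '067E': 'p', '062A': 't'}

def romanize(text: str, mappings: dict[str, str]) -> str:
    """Replace each mapped character; pass unmapped characters through unchanged."""
    out = []
    for char in text:
        key = f"{ord(char):04X}"  # same key format as the mapping dictionary
        out.append(mappings.get(key, char))
    return "".join(out)

print(romanize("\u0628\u067E\u062A", sample_mappings))  # bpt
```
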
- Interactive Browser - Visual reference
- README.md - Module descriptions and language support
- arabic-roman-source.md - Romanization lookup tables
- arabic-roman-standards.md - Official standard documentation