Adobe Arabic Character Sets - User Guide

Last Updated: 2025-10-22

Quick Start

I want to...

Design a font for Arabic → Use Character Set Files
Add romanization to my font → Use Romanization Files + Adobe Latin 3
Look up how a character is romanized → See How to Look Up Romanization
Parse data programmatically → See Programmatic Access
Verify against official standards → See documentation/arabic-roman-standards.md

Quick Reference

Character Set Files (Arabic Script)

| File | Use For | Details |
| --- | --- | --- |
| `adobe-arabic-1.txt` | Basic Arabic | See README |
| `adobe-arabic-2.txt` | Urdu, Persian, Punjabi | See README |
| `adobe-arabic-3.txt` | Uyghur, Kazakh, Kyrgyz | See README |
| `adobe-arabic-4.txt` | Kashmiri, Saraiki, Balti | See README |
| `adobe-arabic-5.txt` | Pashto, Sindhi, Kurdish, Balochi | See README |

Romanization Files (Latin Script)

Important: All romanization modules require Adobe Latin 3

| File | Use For | Details |
| --- | --- | --- |
| `adobe-arabic-1-roman.txt` | Romanizing Arabic | See README |
| `adobe-arabic-2-roman.txt` | Romanizing Urdu, Persian, Punjabi | See README |
| `adobe-arabic-3-roman.txt` | Romanizing Uyghur, Kazakh, Kyrgyz | See README |
| `adobe-arabic-4-roman.txt` | Romanizing Kashmiri, Saraiki, Balti | See README |
| `adobe-arabic-5-roman.txt` | Romanizing extended languages | See README |
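
The character set files and romanization files listed above share a naming pattern, so the romanization file for a module can be derived from its character set file name. This is a minimal sketch that relies only on that pattern; `roman_file_for` is a hypothetical helper and not part of the repository.

```python
from pathlib import PurePosixPath


def roman_file_for(charset_file: str) -> str:
    """Derive a romanization file name from a character set file name.

    Hypothetical helper: assumes the pattern
    adobe-arabic-N.txt -> adobe-arabic-N-roman.txt seen in the file lists.
    """
    path = PurePosixPath(charset_file)
    # Insert "-roman" between the stem and the extension, keeping any directory.
    return str(path.with_name(f"{path.stem}-roman{path.suffix}"))


print(roman_file_for("adobe-arabic-2.txt"))
```

Remember that the romanization file alone is not enough: each romanization module also requires Adobe Latin 3.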

How to Look Up Romanization

"How is this Arabic character romanized?"

Go to: documentation/arabic-roman-source.md

This file contains tables showing how each character is romanized under different standards (BGN/PCGN, UNGEGN, ALA-LC, ISO, and others).

Example:

Looking for how ع (Ayn) is romanized in BGN/PCGN?
Find the Arabic or Urdu table
Look up Unicode 0639
Check the BGN/PCGN column
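
If you have the character but not its code point, the value to look up can be computed from the character itself. The sketch below uses only the Python standard library. `codepoint_key` is an illustrative name, and the four-digit uppercase hexadecimal format is an assumption based on the `0639` example above.

```python
import unicodedata


def codepoint_key(char: str) -> str:
    """Return a character's code point as uppercase hex, zero-padded to 4 digits."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    return format(ord(char), "04X")


ayn = "\u0639"  # the letter Ayn from the example above
# Print the lookup key together with the official Unicode character name.
print(codepoint_key(ayn), unicodedata.name(ayn))
```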

Notation

b = single romanization
k/g = multiple options (context-dependent)
- = not used in this standard
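
When reading table cells in a script, the notation above can be interpreted with a small helper. This is only a sketch: `parse_notation` is a hypothetical name, and treating an empty cell the same as `-` is an assumption, not something the tables document.

```python
def parse_notation(cell: str) -> list[str]:
    """Interpret one lookup-table cell.

    'b'   -> ['b']       single romanization
    'k/g' -> ['k', 'g']  multiple context-dependent options
    '-'   -> []          not used in this standard
    """
    cell = cell.strip()
    if cell in ("", "-"):  # empty cell handled like '-' (assumption)
        return []
    return [option.strip() for option in cell.split("/")]


for cell in ("b", "k/g", "-"):
    print(repr(cell), parse_notation(cell))
```

The helper returns the options in the order they are written and does not decide which one applies, since that depends on context.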

Programmatic Access

Parse Character Sets (Python)

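# Read a character set file (tab-delimited, UTF-8)
with open('adobe-arabic-1.txt', 'r', encoding='utf-8') as f:
    next(f, None)  # Skip the header row
    for line in f:
        line = line.rstrip('\n')  # Drop the newline so it does not end up in notes
        if not line.strip() or line.startswith('#'):
            continue  # Skip blank lines and comments
        # Expects exactly five tab-separated fields per row
        unicode_code, char, glyph, name, notes = line.split('\t')
        print(f"{unicode_code}: {char} ({glyph})")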
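
For use beyond printing, the same parsing can be wrapped in a function that returns one dictionary per row and tolerates rows with a missing trailing field. This is a sketch under assumptions: the field names mirror the variables in the snippet above, and the sample text (header labels and the `uni0628` glyph value) is a made-up placeholder rather than real file content.

```python
# Field names mirror the variables used in the snippet above (assumed column order).
FIELDS = ["unicode_code", "char", "glyph", "name", "notes"]


def parse_charset(text: str) -> list[dict[str, str]]:
    """Parse tab-delimited character set text into a list of row dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header row
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        # Pad short rows (for example, a missing notes field) with empty strings;
        # zip() below silently ignores any fields beyond the five expected.
        fields += [""] * (len(FIELDS) - len(fields))
        rows.append(dict(zip(FIELDS, fields)))
    return rows


# Placeholder sample for illustration only, not copied from the real files.
sample = (
    "Unicode\tCharacter\tGlyph\tName\tNotes\n"
    "# comment line\n"
    "0628\t\u0628\tuni0628\tARABIC LETTER BEH\n"
)
for row in parse_charset(sample):
    print(row["unicode_code"], row["name"])
```

To parse a real file, read it with `encoding='utf-8'` and pass the resulting string to `parse_charset`.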

Access Romanization Mappings (Python)

# Load romanization mappings from standards_mappings/
# (run from the directory that contains standards_mappings/ so the import resolves)
import sys
sys.path.insert(0, '.')
from standards_mappings import get_standard_mappings, get_metadata

# Get mappings for a specific standard
mappings = get_standard_mappings('BGN/PCGN Urdu')
# Returns: {'0628': 'b', '067E': 'p', '062A': 't', ...}

# Get metadata
metadata = get_metadata('BGN/PCGN Urdu')
# Returns: {'standard_name': 'BGN/PCGN Urdu', 'table_type': 'urdu', ...}
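
Once a mapping dictionary is loaded, applying it to text is a per-character lookup. The sketch below is self-contained and does not import `standards_mappings`: `sample_mappings` copies only the three entries shown in the comment above, and `romanize` is a hypothetical helper. It assumes keys are four-digit uppercase hex code points, as in that comment.

```python
def romanize(text: str, mappings: dict[str, str]) -> str:
    """Replace each character that has a mapping; leave all others unchanged."""
    out = []
    for ch in text:
        key = format(ord(ch), "04X")  # e.g. "0628" for U+0628
        out.append(mappings.get(key, ch))
    return "".join(out)


# Subset taken from the example return value above, for illustration only.
sample_mappings = {"0628": "b", "067E": "p", "062A": "t"}

print(romanize("\u0628\u067E\u062A", sample_mappings))
```

This naive lookup ignores context, so it cannot choose between alternatives such as `k/g` or handle unwritten vowels. Treat it as a starting point rather than a complete romanizer.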

Common Standards

BGN/PCGN - Geographic names (US/UK)
UNGEGN - Geographic names (UN)
ALA-LC - Library cataloging
ISO 233 - International standard
IPA - Phonetic transcription

See documentation/arabic-roman-standards.md for complete list with official source URLs.

Resources

Interactive Browser - Visual reference
README.md - Module descriptions and language support
arabic-roman-source.md - Romanization lookup tables
arabic-roman-standards.md - Official standard documentation
