xvi-xv-xii-ix-xxii-ix-xiv/unicode2utf8

Unicode to UTF-8 Converter

A Python script that converts Unicode code points (decimal, U+hex, or ranges) to UTF-8 escape sequences, and UTF-8 escape sequences back to code points.

Features

  • Convert Unicode code points (e.g., 61572, U+F084, U+F000-U+F0FF) to UTF-8 escape sequences (e.g., \xEF\x82\x84).
  • Reverse mode: Convert UTF-8 escape sequences to Unicode code points.
  • Output formats: Plain text, JSON, CSV, or C string array.
  • Customizable output fields (e.g., codepoint, hex, unicode, escape, c_string, char).
  • Save output to a file with .txt, .json, .csv, or .c extension.
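
The forward conversion listed above can be sketched in a few lines of Python. This is a minimal illustration, not the script's actual code: the function names `parse_codes` and `to_escape` are hypothetical, and the sketch assumes a range is written as two code points joined by a single hyphen.

```python
def parse_codes(spec):
    """Expand '61572', 'U+F084', or 'U+F000-U+F002' into a list of code points.

    Hypothetical helper: the real script's parsing rules may differ.
    """
    def one(token):
        token = token.strip()
        if token.upper().startswith("U+"):
            return int(token[2:], 16)   # U+hex form
        return int(token, 10)           # plain decimal form

    if "-" in spec:
        start, end = (one(part) for part in spec.split("-", 1))
        return list(range(start, end + 1))  # ranges are inclusive
    return [one(spec)]


def to_escape(codepoint):
    """Return the UTF-8 bytes of a code point as a \\xNN escape sequence."""
    # chr() + encode() raises for surrogates (U+D800..U+DFFF), which have
    # no UTF-8 encoding; this sketch lets that error propagate.
    return "".join(f"\\x{b:02X}" for b in chr(codepoint).encode("utf-8"))


for cp in parse_codes("U+F084"):
    print(f"U+{cp:04X} {to_escape(cp)}")
```

For U+F084 (decimal 61572) the three UTF-8 bytes are EF 82 84, so the loop prints `U+F084 \xEF\x82\x84`.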

Usage

./unicode2utf8.py [codes] [options]

Examples

  • Convert a single code point:

    ./unicode2utf8.py U+F084

    Output: U+F084 \xEF\x82\x84

  • Convert a range:

    ./unicode2utf8.py U+F000-U+F002

  • Reverse mode:

    ./unicode2utf8.py --reverse '\xEF\x82\x84'

  • Output as JSON:

    ./unicode2utf8.py --json U+F084

  • Save to file:

    ./unicode2utf8.py --out output.csv --csv U+F084
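
Reverse mode amounts to turning the `\xNN` escapes back into bytes and decoding them as UTF-8. The sketch below shows one way to do that; `from_escape` is a hypothetical name, and the assumption that input consists only of two-digit `\xNN` escapes is ours, not something the script documents.

```python
import re


def from_escape(text):
    """Decode a string of \\xNN escapes into a list of Unicode code points.

    Hypothetical helper: anything that is not a \\xNN escape is ignored here.
    """
    hex_pairs = re.findall(r"\\x([0-9A-Fa-f]{2})", text)
    raw = bytes(int(pair, 16) for pair in hex_pairs)
    # bytes.decode raises UnicodeDecodeError if the bytes are not valid UTF-8.
    return [ord(ch) for ch in raw.decode("utf-8")]


for cp in from_escape(r"\xEF\x82\x84"):
    print(f"U+{cp:04X}")
```

Because decoding is strict, a truncated or malformed sequence fails loudly instead of producing a wrong code point.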

Options

  • --reverse, -r: Convert UTF-8 escape sequences to Unicode code points.
  • --c-string: Output as C-style string (e.g., "\xEF\x82\x84").
  • --table: Output as a C string array.
  • --table-name NAME: Set C array name (default: icons).
  • --hex: Include hex code (e.g., 0xF084).
  • --char: Include the actual character (if displayable).
  • --json: Output in JSON format.
  • --csv: Output in CSV format.
  • --fields FIELDS: Comma-separated fields to output (e.g., unicode,escape,char).
  • --out FILE: Save output to a file.
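
To give a feel for what --table and --table-name might produce, here is a rough sketch that builds a C string array from a list of code points. The exact layout the script emits is not shown in this README, so the `const char *` declaration, the trailing comments, and the function name `c_table` are all assumptions made for illustration.

```python
def c_table(codepoints, name="icons"):
    """Build a C string array declaration, one UTF-8 literal per code point.

    Hypothetical output format: the real script may lay the table out differently.
    """
    lines = [f"const char *{name}[] = {{"]
    for cp in codepoints:
        escape = "".join(f"\\x{b:02X}" for b in chr(cp).encode("utf-8"))
        lines.append(f'    "{escape}",  /* U+{cp:04X} */')
    lines.append("};")
    return "\n".join(lines)


print(c_table(range(0xF000, 0xF003)))
```

The default name `icons` mirrors the documented default of --table-name.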

Requirements

  • Python 3.x
  • No external dependencies

License

MIT License
