| 1 | +<h1 align="center">simple-wcswidth</h1> |
| 2 | +<h3 align="center"> 🖥️ 💬 Simplified JS/TS port of wcswidth(), originally written in C by Markus Kuhn</h3> |
| 3 | + |
| 4 | +<p align="center"> |
| 5 | + <a href="https://codecov.io/gh/ayonious/simple-wcswidth"> |
| 6 | + <img alt="codecov" src="https://codecov.io/gh/ayonious/simple-wcswidth/branch/master/graph/badge.svg"> |
| 7 | + </a> |
| 8 | + <a href="https://badge.fury.io/js/simple-wcswidth"> |
| 9 | + <img alt="npm version" src="https://badge.fury.io/js/simple-wcswidth.svg"> |
| 10 | + </a> |
| 11 | + <a href="https://packagephobia.now.sh/result?p=simple-wcswidth"> |
| 12 | + <img alt="install size" src="https://packagephobia.now.sh/badge?p=simple-wcswidth@latest"> |
| 13 | + </a> |
| 14 | +</p> |
| 15 | +<p align="center"> |
| 16 | + <a href="https://github.com/semantic-release/semantic-release"> |
| 17 | + <img alt="semantic-release" src="https://img.shields.io/badge/%20%20%F0%9F%93%A6%F0%9F%9A%80-semantic--release-e10079.svg"> |
| 18 | + </a> |
| 19 | +</p> |
| 20 | + |
| 21 | +# Why another wcswidth? |
| 22 | + |
| 23 | +1. 💙 Types included |
| 24 | +2. 🤏 Install size kept as small as possible |
| 25 | +3. 🐒 No unnecessary dependencies |
| 26 | +4. 🤖 Tested automatically and regularly on different versions of Node, including current LTS and stable |
| 27 | + |
| 28 | +# Example Usage |
| 29 | + |
| 30 | +```js |
| 31 | +const { wcswidth } = require('simple-wcswidth'); |
| 32 | + |
| 33 | +console.log(wcswidth('Yes 重要')); // 8 |
| 34 | +console.log(wcswidth('请你')); // 4 |
| 35 | +console.log(wcswidth('Hi')); // 2 |
| 36 | +``` |
| 37 | + |
| 38 | +# What is simplified here? |
| 39 | + |
| 40 | +In the original [C code](https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c) there are two versions of `wcswidth()`. I have included only the first one here, which is suitable for general use. |
| 41 | + |
| 42 | +The second one (which I did not include here) is useful for users of CJK legacy encodings who want to migrate to UCS without changing the traditional terminal character-width behaviour. It is not otherwise recommended for general use. |
| 43 | + |
| 44 | +# Info Taken from Markus Kuhn's [C code](https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c) |
| 45 | + |
| 46 | +This is an implementation of [wcwidth()](http://www.opengroup.org/onlinepubs/007904975/functions/wcwidth.html) and [wcswidth()](http://www.opengroup.org/onlinepubs/007904975/functions/wcswidth.html) (defined in |
| 47 | +IEEE Std 1003.1-2001) for Unicode. |
| 48 | + |
| 49 | +In fixed-width output devices, Latin characters all occupy a single |
| 50 | +"cell" position of equal width, whereas ideographic CJK characters |
| 51 | +occupy two such cells. Interoperability between terminal-using |
| 52 | +applications and (teletype-style) character terminals using the |
| 53 | +UTF-8 encoding requires agreement on which character should advance |
| 54 | +the cursor by how many cell positions. No established formal |
| 55 | +standards exist at present on which Unicode character shall occupy |
| 56 | +how many cell positions on character terminals. These routines are |
| 57 | +a first attempt at defining such behavior based on simple rules |
| 58 | +applied to data provided by the Unicode Consortium. |
| 59 | + |
| 60 | +For some graphical characters, the Unicode standard explicitly |
| 61 | +defines a character-cell width via the definition of the East Asian |
| 62 | +FullWidth (F), Wide (W), Half-width (H), and Narrow (Na) classes. |
| 63 | +In all these cases, there is no ambiguity about which width a |
| 64 | +terminal shall use. For characters in the East Asian Ambiguous (A) |
| 65 | +class, the width choice depends purely on a preference of backward |
| 66 | +compatibility with either historic CJK or Western practice. |
| 67 | +Choosing single-width for these characters is easy to justify as |
| 68 | +the appropriate long-term solution, as the CJK practice of |
| 69 | +displaying these characters as double-width comes from historic |
| 70 | +implementation simplicity (8-bit encoded characters were displayed |
| 71 | +single-width and 16-bit ones double-width, even for Greek, |
| 72 | +Cyrillic, etc.) and not any typographic considerations. |
| 73 | + |
| 74 | +Much less clear is the choice of width for the Not East Asian |
| 75 | +(Neutral) class. Existing practice does not dictate a width for any |
| 76 | +of these characters. It would nevertheless make sense |
| 77 | +typographically to allocate two character cells to characters such |
| 78 | +as for instance EM SPACE or VOLUME INTEGRAL, which cannot be |
| 79 | +represented adequately with a single-width glyph. The following |
| 80 | +routines at present merely assign a single-cell width to all |
| 81 | +neutral characters, in the interest of simplicity. This is not |
| 82 | +entirely satisfactory and should be reconsidered before |
| 83 | +establishing a formal standard in this area. At the moment, the |
| 84 | +decision which Not East Asian (Neutral) characters should be |
| 85 | +represented by double-width glyphs cannot yet be answered by |
| 86 | +applying a simple rule from the Unicode database content. Setting |
| 87 | +up a proper standard for the behavior of UTF-8 character terminals |
| 88 | +will require a careful analysis not only of each Unicode character, |
| 89 | +but also of each presentation form, something the author of these |
| 90 | +routines has so far avoided doing. |
| 91 | + |
| 92 | +Reference: [Unicode Standard Annex #11: East Asian Width](http://www.unicode.org/unicode/reports/tr11/) |
| 93 | + |
| 94 | +# LICENSE |
| 95 | + |
| 96 | +MIT |
| 97 | + |
| 98 | +The original C code was released under very permissive terms. You can find it [here](http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c). |