- # unicode-width
+ # `unicode-width`

- Determine displayed width of `char` and `str` types according to
- [Unicode Standard Annex #11][UAX11] rules.
+ [![Build status](https://github.com/unicode-rs/unicode-width/actions/workflows/rust.yml/badge.svg)](https://github.com/unicode-rs/unicode-width/actions/workflows/rust.yml)
+ [![crates.io version](https://img.shields.io/crates/v/unicode-width)](https://crates.io/crates/unicode-width)
+ [![Docs status](https://img.shields.io/docsrs/unicode-width)](https://docs.rs/unicode-width/)

- [UAX11]: http://www.unicode.org/reports/tr11/
+ Determine displayed width of `char` and `str` types according to [Unicode Standard Annex #11][UAX11]
+ and other portions of the Unicode standard.

- [![Build Status](https://travis-ci.org/unicode-rs/unicode-width.svg)](https://travis-ci.org/unicode-rs/unicode-width)
+ This crate is `#![no_std]`.

- [Documentation](https://unicode-rs.github.io/unicode-width/unicode_width/index.html)
+ [UAX11]: http://www.unicode.org/reports/tr11/

```rust
- extern crate unicode_width;
-
use unicode_width::UnicodeWidthStr;

fn main() {
    let teststr = "Hello, world!";
-     let width = UnicodeWidthStr::width(teststr);
+     let width = teststr.width();
    println!("{}", teststr);
    println!("The above string is {} columns wide.", width);
    let width = teststr.width_cjk();
@@ -25,27 +25,26 @@ fn main() {
```

**NOTE:** The computed width values may not match the actual rendered column
- width. For example, the woman scientist emoji comprises of a woman emoji, a
- zero-width joiner and a microscope emoji.
+ width. For example, many Brahmic scripts like Devanagari have complex rendering rules
+ which this crate does not currently handle (and will never fully handle, because
+ the exact rendering depends on the font):

```rust
extern crate unicode_width;
use unicode_width::UnicodeWidthStr;

fn main() {
-     assert_eq!(UnicodeWidthStr::width("👩🔬"), 2); // Woman
-     assert_eq!(UnicodeWidthStr::width("🔬"), 2); // Microscope
-     assert_eq!(UnicodeWidthStr::width("👩🔬"), 4); // Woman scientist
+     assert_eq!("क".width(), 1); // Devanagari letter Ka
+     assert_eq!("ष".width(), 1); // Devanagari letter Ssa
+     assert_eq!("क्ष".width(), 2); // Ka + Virama + Ssa
}
```

- See [Unicode Standard Annex #11][UAX11] for precise details on what is and isn't
- covered by this crate.
-
- ## features
-
- unicode-width does not depend on libstd, so it can be used in crates
- with the `#![no_std]` attribute.
+ Additionally, [defective combining character sequences](https://unicode.org/glossary/#defective_combining_character_sequence)
+ and nonstandard [Korean jamo](https://unicode.org/glossary/#jamo) sequences may
+ be rendered with a different width than what this crate says. (This is not an
+ exhaustive list.) For a list of what this crate *does* handle, see
+ [docs.rs](https://docs.rs/unicode-width/latest/unicode_width/#rules-for-determining-width).

## crates.io

@@ -54,5 +53,18 @@ to your `Cargo.toml`:

```toml
[dependencies]
- unicode-width = "0.1.7"
+ unicode-width = "0.1.11"
```
+
+
+ ## Changelog
+
+
+ ### 0.2.0
+
+ - Treat `\n` as width 1 (#60)
+ - Treat ambiguous `Modifier_Letter`s as narrow (#63)
+ - Support `Grapheme_Cluster_Break=Prepend` (#62)
+ - Support lots of ligatures (#53)
+
+ Note: If you are using `unicode-width` for linebreaking, the change treating `\n` as width 1 _may cause behavior changes_. It is recommended that in such cases you feed already-line segmented text to `unicode-width`. In other words, please apply higher level control character based line breaking protocols before feeding text to `unicode-width`. Relying on any character producing a stable width in this crate is likely the sign of a bug.
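
One way to follow the recommendation in the note above is to split text into lines first and measure each line on its own, so that `\n` never reaches the width computation. The sketch below is only an illustration and is not part of this crate: `max_line_width` and its `width_of` parameter are hypothetical names, and the measuring function is passed in so the snippet compiles without dependencies. With `unicode-width` in scope you would presumably pass `|line| line.width()`.

```rust
/// Returns the width of the widest line in `text`, or 0 for empty input.
///
/// `width_of` is a caller-supplied measuring function (a hypothetical
/// parameter for this sketch). It only ever sees text that has already
/// been segmented into lines, so it never receives a line terminator.
fn max_line_width(text: &str, width_of: impl Fn(&str) -> usize) -> usize {
    text.lines() // splits on "\n" and "\r\n" and drops the terminators
        .map(width_of)
        .max()
        .unwrap_or(0) // no lines at all means nothing to display
}
```

Because line segmentation happens before measurement, the result does not depend on how any particular version of the measuring function treats control characters.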