| 1 | +--- |
| 2 | +title: Introduction to ICU4X for Rust |
| 3 | +--- |
| 4 | + |
| 5 | + |
| 6 | + |
| 7 | +`ICU4X` is an implementation of [International Components for Unicode](http://site.icu-project.org/) (ICU) intended to be modular, performant, and flexible. |
| 8 | + |
| 9 | +The library provides a layer of APIs that any software can use to add internationalization capabilities. |
| 10 | + |
| 11 | +To use `ICU4X` in the Rust ecosystem, one can either depend on selected components or depend on the meta-crate `icu`, which brings in the full selection of components in the most user-friendly configuration of features. |
| 12 | + |
| 13 | +In this tutorial we are going to build up to writing an app that uses the `icu::datetime` component to format a date and time, covering various topics in the process. |
| 14 | + |
| 15 | +## 1. Requirements |
| 16 | + |
| 17 | +For this tutorial we assume the user has basic Rust knowledge. If acquiring it is necessary, the [Rust Book](https://doc.rust-lang.org/book/) provides an excellent introduction. |
| 18 | +We also assume that the user is familiar with a terminal and has `rust` and `cargo` installed. |
| 19 | + |
| 20 | +To verify that, open a terminal and check that the results are similar to: |
| 21 | + |
| 22 | +```shell |
| 23 | +$ cargo --version |
| 24 | +cargo 1.81.0 (2dbb1af80 2024-08-20) |
| 25 | +``` |
| 26 | + |
| 27 | +## 2. Creating an app with ICU4X as a dependency |
| 28 | + |
| 29 | +Use `cargo` to initialize a binary application: |
| 30 | + |
| 31 | +```shell |
| 32 | +cargo new --bin myapp |
| 33 | +cd myapp |
| 34 | +``` |
| 35 | + |
| 36 | +Then add a dependency on `ICU4X`'s main crate, `icu`: |
| 37 | + |
| 38 | +```shell |
| 39 | +cargo add icu |
| 40 | +``` |
| 41 | + |
| 42 | +## 3. Locales |
| 43 | + |
| 44 | +`ICU4X` comes with a variety of components for managing various facets of software internationalization. |
| 45 | + |
| 46 | +Most of those features depend on the selection of a `Locale`, which is a particular combination of language, script, and region, with optional variants. Examples of such locales are `en-US` (American English), `sr-Cyrl` (Serbian with Cyrillic script), and `ar-EG-u-nu-latn` (Egyptian Arabic with ASCII numerals). |
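To make the structure of a locale identifier concrete, here is a minimal, std-only sketch that splits a tag into its main subtags by their shape (a four-letter script, a two-letter or three-digit region, everything else left unparsed). The `LocaleParts` struct and `split_locale` function are hypothetical names used only for illustration; this is not how `ICU4X` represents or parses locales, and it performs no validation.

```rust
/// Illustrative only: a simplified view of a locale's main subtags.
/// This is NOT ICU4X's `Locale` type; all names here are hypothetical.
#[derive(Debug, PartialEq)]
struct LocaleParts {
    language: String,
    script: Option<String>,
    region: Option<String>,
    /// Variants and extensions, left unparsed.
    rest: Vec<String>,
}

fn split_locale(tag: &str) -> LocaleParts {
    let mut subtags = tag.split('-');
    // The first subtag is always the language.
    let language = subtags.next().unwrap_or("").to_string();
    let mut script = None;
    let mut region = None;
    let mut rest = Vec::new();

    for subtag in subtags {
        let all_alpha = subtag.chars().all(|c| c.is_ascii_alphabetic());
        let all_digit = subtag.chars().all(|c| c.is_ascii_digit());
        if !rest.is_empty() {
            // Once variants or extensions start, stop classifying.
            rest.push(subtag.to_string());
        } else if script.is_none() && region.is_none() && subtag.len() == 4 && all_alpha {
            script = Some(subtag.to_string());
        } else if region.is_none()
            && ((subtag.len() == 2 && all_alpha) || (subtag.len() == 3 && all_digit))
        {
            region = Some(subtag.to_string());
        } else {
            rest.push(subtag.to_string());
        }
    }

    LocaleParts { language, script, region, rest }
}

fn main() {
    for tag in ["en-US", "sr-Cyrl", "ar-EG-u-nu-latn"] {
        println!("{tag} -> {:?}", split_locale(tag));
    }
}
```

In a real application, prefer the `Locale` type described next, which validates and canonicalizes its input.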
| 47 | + |
| 48 | +In `ICU4X`, `Locale` is part of the `locale_core` component. If you need just this one feature, you can depend on the `icu_locale_core` crate directly, but since we already added a dependency on `icu`, we can refer to it via `icu::locale`. |
| 49 | + |
| 50 | +Let's use this in our application. |
| 51 | + |
| 52 | +Open `src/main.rs` and edit it to: |
| 53 | + |
| 54 | +```rust |
| 55 | +use icu::locale::Locale; |
| 56 | + |
| 57 | +fn main() { |
| 58 | +    let loc: Locale = "ES-AR".parse() |
| 59 | +        .expect("should be a valid locale"); |
| 60 | + |
| 61 | +    if loc.id.language.as_str() == "es" { |
| 62 | +        println!("¡Hola!"); |
| 63 | +    } |
| 64 | + |
| 65 | +    println!("You are using: {}", loc); |
| 66 | +} |
| 67 | +``` |
| 68 | + |
| 69 | +After saving it, call `cargo run` and it should display: |
| 70 | + |
| 71 | +```text |
| 72 | +¡Hola! |
| 73 | +You are using: es-AR |
| 74 | +``` |
| 75 | + |
| 76 | +*Notice:* Here, `ICU4X` canonicalized the locale's syntax, which uses lowercase letters for the language portion (`ES` became `es`). |
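The casing convention behind that canonicalization can be sketched without any dependencies: the language is lowercase, a four-letter script is title case, a two-letter region is uppercase, and everything after a single-letter extension marker such as `u` is lowercase. The `canonical_casing` helper below is a hypothetical illustration under those assumptions; it is not part of `ICU4X` and, unlike `Locale`, it does not validate its input.

```rust
/// Illustrative only: applies the conventional casing to each subtag.
/// Hypothetical helper, not part of ICU4X and not a validating parser.
fn canonical_casing(tag: &str) -> String {
    // Becomes true once a single-letter subtag (for example `u`) is seen.
    let mut in_extension = false;
    tag.split('-')
        .enumerate()
        .map(|(i, subtag)| {
            let alpha = subtag.chars().all(|c| c.is_ascii_alphabetic());
            if subtag.len() == 1 {
                in_extension = true;
            }
            if i > 0 && !in_extension && alpha && subtag.len() == 2 {
                // Region: uppercase.
                subtag.to_ascii_uppercase()
            } else if i > 0 && !in_extension && alpha && subtag.len() == 4 {
                // Script: title case.
                let (head, tail) = subtag.split_at(1);
                format!("{}{}", head.to_ascii_uppercase(), tail.to_ascii_lowercase())
            } else {
                // Language, variants, and extension subtags: lowercase.
                subtag.to_ascii_lowercase()
            }
        })
        .collect::<Vec<_>>()
        .join("-")
}

fn main() {
    for tag in ["ES-AR", "SR-cyrl", "AR-eg-U-NU-LATN"] {
        println!("{} -> {}", tag, canonical_casing(tag));
    }
}
```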
| 77 | + |
| 78 | +Congratulations! `ICU4X` has been used to semantically operate on a locale! |
| 79 | + |
| 80 | +### Convenience macro |
| 81 | + |
| 82 | +The scenario of working with statically declared `Locale`s is common. |
| 83 | + |
| 84 | +It's a bit unergonomic to have to parse them at runtime and handle a parser error in such cases. |
| 85 | + |
| 86 | +For that purpose, `ICU4X` provides a macro one can use to parse them at compile time: |
| 87 | + |
| 88 | +```rust |
| 89 | +use icu::locale::{Locale, locale}; |
| 90 | + |
| 91 | +const LOCALE: Locale = locale!("ES-AR"); |
| 92 | + |
| 93 | +fn main() { |
| 94 | +    if LOCALE.id.language.as_str() == "es" { |
| 95 | +        println!("¡Hola amigo!"); |
| 96 | +    } |
| 97 | + |
| 98 | +    println!("You are using: {}", LOCALE); |
| 99 | +} |
| 100 | +``` |
| 101 | + |
| 102 | +In this case, the parsing is performed at compile time, so we don't need to handle an error case. Try passing a malformed identifier, such as `foo-bar`, and run `cargo check` to see the error reported at compile time. |
| 103 | + |
| 104 | +Next, let's add some more complex functionality. |
| 105 | + |
| 106 | +## 4. Using an ICU4X component |
| 107 | + |
| 108 | +We're going to extend our app to use the `icu::datetime` component to format a date and time. This component requires data; we will look at custom data generation later and for now use the default included data, |
| 109 | +which is exposed through constructors such as `try_new`. |
| 110 | + |
| 111 | +```rust |
| 112 | +use icu::locale::{Locale, locale}; |
| 113 | +use icu::calendar::Date; |
| 114 | +use icu::datetime::{DateTimeFormatter, fieldsets::YMD}; |
| 115 | + |
| 116 | +const LOCALE: Locale = locale!("ja"); // let's try some other language |
| 117 | + |
| 118 | +fn main() { |
| 119 | + |
| 120 | +    let dtf = DateTimeFormatter::try_new( |
| 121 | +        LOCALE.into(), |
| 122 | +        YMD::medium(), |
| 123 | +    ) |
| 124 | +    .expect("ja data should be available"); |
| 125 | + |
| 126 | +    let date = Date::try_new_iso(2020, 10, 14) |
| 127 | +        .expect("date should be valid"); |
| 128 | + |
| 129 | +    // DateTimeFormatter supports the ISO and native calendars as input via Date<AnyCalendar>. |
| 130 | +    // For smaller code size you can use FixedCalendarDateTimeFormatter<Gregorian> with a Date<Gregorian>. |
| 131 | +    let date = date.to_any(); |
| 132 | + |
| 133 | +    let formatted_date = dtf.format(&date).to_string(); |
| 134 | + |
| 135 | +    println!("📅: {}", formatted_date); |
| 136 | +} |
| 137 | +``` |
| 138 | + |
| 139 | +If all went well, running the app with `cargo run` should display: |
| 140 | + |
| 141 | +```text |
| 142 | +📅: 2020年10月14日 |
| 143 | +``` |
| 144 | + |
| 145 | +Here's an internationalized date! |
| 146 | + |
| 147 | +*Notice:* By default, `cargo run` builds and runs the binary in `debug` mode. If you want to evaluate the performance, memory usage, or size of this example, use `cargo run --release`. |
| 148 | + |
| 149 | + |
| 150 | +## 5. Data Management |
| 151 | + |
| 152 | +While the locale API is purely algorithmic, many internationalization APIs like the date formatting API require more complex data to work. You've seen this in the previous example where we had to call `.expect("ja data should be available")` after the constructor. |
| 153 | + |
| 154 | +Data management is a complex area that often requires customization for particular environments and integration into a project's ecosystem. |
| 155 | + |
| 156 | +The way `ICU4X` handles data is one of its novelties, aimed at making the data management more flexible and enabling better integration in asynchronous environments. |
| 157 | + |
| 158 | +By default, `ICU4X` contains data for a wide range of CLDR locales[^1], meaning that for most languages, the constructors can be considered infallible and you can `expect` or `unwrap` them, as we did above. |
| 159 | + |
| 160 | +However, shipping the library with all locales will have a size impact on your binary. It also requires you to update your binary whenever CLDR data changes, which happens twice a year. To learn how to solve these problems, see our [data management](/2_0_beta/tutorials/data-management) tutorial. |
| 161 | + |
| 162 | +[^1]: All locales with coverage level `basic`, `moderate`, or `modern` in [`CLDR`](https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-core/coverageLevels.json) |
| 163 | + |
| 164 | +## 6. Summary |
| 165 | + |
| 166 | +This concludes the introductory tutorial. With the help of `Locale`, `DateTimeFormatter`, and the default included data, we formatted a date in Japanese, but that's just the start. |
| 167 | + |
| 168 | +Internationalization is a broad domain and there are many more components in `ICU4X`. |
| 169 | + |
| 170 | +Next, learn how to [generate optimized data for your binary](/2_0_beta/tutorials/data-management), [configure your Cargo.toml file](/2_0_beta/tutorials/cargo), or continue exploring by reading [the docs](https://docs.rs/icu/2.0.0-beta2/). |
| 171 | + |
| 172 | + |
| 173 | + |