Commit 8496d7d

Add 2.0.0-beta2 content, update config, workaround upstream broken links
1 parent ec4ec8e commit 8496d7d

9 files changed: +1586 -5 lines changed

astro.config.mjs

Lines changed: 10 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -6,8 +6,8 @@ import starlightLinksValidator from 'starlight-links-validator'
66

77
// !! UPDATE LATEST VERSION HERE !!
88
// format semver with underscores to make it look like a semver would in a URL
9-
var latest_semver_str = '1.5';
10-
var latest_ver_display_str = '1_5';
9+
var latest_display_name = 'Version 2.0 Beta';
10+
var latest_dir_name = '2_0_beta';
1111

1212
// https://astro.build/config
1313
export default defineConfig({
@@ -32,7 +32,7 @@ export default defineConfig({
3232
label: "leadingNavLinks",
3333
items: [
3434
{ label: "Overview", link: "/overview" },
35-
{ label: "Quickstart", link: "/" + latest_ver_display_str + "/quickstart" }
35+
{ label: "Quickstart", link: "/" + latest_dir_name + "/quickstart" }
3636
]
3737
},
3838
{
@@ -42,13 +42,18 @@ export default defineConfig({
4242
],
4343
},
4444
{
45-
label: 'Version ' + latest_semver_str,
46-
autogenerate: { directory: latest_ver_display_str},
45+
label: latest_display_name,
46+
autogenerate: { directory: latest_dir_name},
4747
},
4848
{
4949
label: 'Previous Versions',
5050
collapsed: false,
5151
items: [
52+
{
53+
label: 'Version 1.5',
54+
autogenerate: { directory: '1_5'},
55+
collapsed: true,
56+
},
5257
{
5358
label: 'Version 1.2',
5459
autogenerate: { directory: '1_2'},
Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
1+
---
2+
title: Introduction to ICU4X for Rust
3+
---
4+
5+
6+
7+
`ICU4X` is an implementation of [International Components for Unicode](http://site.icu-project.org/) (ICU) intended to be modular, performant and flexible.
8+
9+
The library provides a layer of APIs for all software to enable internationalization capabilities.
10+
11+
To use `ICU4X` in the Rust ecosystem, one can either add dependencies on selected components or add a dependency on the meta-crate `icu`, which brings in the full selection of components in the most user-friendly configuration of features.
12+
13+
In this tutorial we are going to build up to writing an app that uses the `icu::datetime` component to format a date and time, covering various topics in the process.
14+
15+
## 1. Requirements
16+
17+
For this tutorial we assume the user has basic Rust knowledge. If acquiring it is necessary, the [Rust Book](https://doc.rust-lang.org/book/) provides an excellent introduction.
18+
We also assume that the user is familiar with a terminal and has `rust` and `cargo` installed.
19+
20+
To verify that, open a terminal and check that the results are similar to:
21+
22+
```shell
23+
$ cargo --version
24+
cargo 1.81.0 (2dbb1af80 2024-08-20)
25+
```
26+
27+
## 2. Creating an app with ICU4X as a dependency
28+
29+
Use `cargo` to initialize a binary application:
30+
31+
```shell
32+
cargo new --bin myapp
33+
cd myapp
34+
```
35+
36+
Then add a dependency on `ICU4X`'s main crate, `icu`:
37+
38+
```shell
39+
$ cargo add icu
40+
```
41+
42+
## 3. Locales
43+
44+
`ICU4X` comes with a variety of components that manage various facets of software internationalization.
45+
46+
Most of those features depend on the selection of a `Locale`, which is a particular combination of language, script and region with optional variants. Examples of such locales are `en-US` (American English), `sr-Cyrl` (Serbian with Cyrillic script) and `ar-EG-u-nu-latn` (Egyptian Arabic with ASCII numerals).
47+
48+
In `ICU4X`, `Locale` is part of the `locale_core` component. If the user needs just this one feature, they can use the `icu_locale_core` crate as a dependency, but since we have already added a dependency on `icu`, we can refer to it via `icu::locale`.
49+
50+
Let's use this in our application.
51+
52+
Open `src/main.rs` and edit it to:
53+
54+
```rust
55+
use icu::locale::Locale;
56+
57+
fn main() {
58+
let loc: Locale = "ES-AR".parse()
59+
.expect("should be a valid locale");
60+
61+
if loc.id.language.as_str() == "es" {
62+
println!("¡Hola!");
63+
}
64+
65+
println!("You are using: {}", loc);
66+
}
67+
```
68+
69+
After saving it, call `cargo run` and it should display:
70+
71+
```text
72+
¡Hola!
73+
You are using: es-AR
74+
```
75+
76+
*Notice:* Here, `ICU4X` canonicalized the locale's syntax, which uses lowercase letters for the language portion.
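
To make the casing convention concrete, here is a minimal standard-library sketch. The `canonicalize_case` helper is hypothetical and is not how `ICU4X` implements canonicalization; it handles only simple `language[-script][-region]` identifiers and performs no validation, whereas `Locale` parsing also checks subtags and supports variants and extensions.

```rust
/// Hypothetical helper, not part of ICU4X: applies only the conventional
/// casing to a simple `language[-script][-region]` identifier.
fn canonicalize_case(tag: &str) -> String {
    tag.split('-')
        .enumerate()
        .map(|(index, subtag)| match (index, subtag.len()) {
            // The first subtag is the language: lowercase.
            (0, _) => subtag.to_ascii_lowercase(),
            // A four-letter subtag is treated as a script: title case.
            (_, 4) => {
                let lower = subtag.to_ascii_lowercase();
                let mut chars = lower.chars();
                match chars.next() {
                    Some(first) => first.to_ascii_uppercase().to_string() + chars.as_str(),
                    None => String::new(),
                }
            }
            // A two-letter subtag is treated as a region: uppercase.
            (_, 2) => subtag.to_ascii_uppercase(),
            // Anything else is simply lowercased in this sketch.
            _ => subtag.to_ascii_lowercase(),
        })
        .collect::<Vec<_>>()
        .join("-")
}

fn main() {
    println!("{}", canonicalize_case("ES-AR")); // prints "es-AR"
    println!("{}", canonicalize_case("SR-cyrl")); // prints "sr-Cyrl"
}
```

In real code, prefer parsing into a `Locale` as shown above, since that also rejects malformed input.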
77+
78+
Congratulations! `ICU4X` has been used to semantically operate on a locale!
79+
80+
### Convenience macro
81+
82+
The scenario of working with statically declared `Locale`s is common.
83+
84+
It's a bit unergonomic to have to parse them at runtime and handle a parser error in such a case.
85+
86+
For that purpose, ICU4X provides a macro one can use to parse them at compile time:
87+
88+
```rust
89+
use icu::locale::{Locale, locale};
90+
91+
const LOCALE: Locale = locale!("ES-AR");
92+
93+
fn main() {
94+
if LOCALE.id.language.as_str() == "es" {
95+
println!("¡Hola amigo!");
96+
}
97+
98+
println!("You are using: {}", LOCALE);
99+
}
100+
```
101+
102+
In this case, the parsing is performed at compile time, so we don't need to handle an error case. Try passing a malformed identifier, like `foo-bar`, and call `cargo check` to see the error reported at build time.
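
The general idea of rejecting a malformed literal during the build can be sketched with the standard library alone. The `is_language_subtag` function and the `language!` macro below are hypothetical and far simpler than `locale!`; they only illustrate the technique of evaluating a check in a `const` context, so that bad input fails compilation instead of producing a runtime error.

```rust
/// Hypothetical validator, not ICU4X code: accepts two or three ASCII letters.
const fn is_language_subtag(s: &str) -> bool {
    let bytes = s.as_bytes();
    if bytes.len() < 2 || bytes.len() > 3 {
        return false;
    }
    // `for` loops are not allowed in `const fn`, so iterate with `while`.
    let mut i = 0;
    while i < bytes.len() {
        if !bytes[i].is_ascii_alphabetic() {
            return false;
        }
        i += 1;
    }
    true
}

/// Hypothetical macro: the `const` item forces the check to run at compile
/// time, so an invalid literal becomes a build error.
macro_rules! language {
    ($s:literal) => {{
        const _: () = assert!(is_language_subtag($s), "invalid language subtag");
        $s
    }};
}

fn main() {
    let lang: &str = language!("es");
    println!("{lang}");

    // Uncommenting the next line makes the build fail in this sketch:
    // let bad = language!("foo-bar");
}
```

Because the check runs inside a `const` item, there is no `Result` to unwrap at the call site.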
103+
104+
Next, let's add some more complex functionality.
105+
106+
## 4. Using an ICU4X component
107+
108+
We're going to extend our app to use the `icu::datetime` component to format a date and time. This component requires data; we will look at custom data generation later and for now use the default included data,
109+
which is exposed through constructors such as `try_new`.
110+
111+
```rust
112+
use icu::locale::{Locale, locale};
113+
use icu::calendar::Date;
114+
use icu::datetime::{DateTimeFormatter, Length, fieldsets::YMD};
115+
116+
const LOCALE: Locale = locale!("ja"); // let's try some other language
117+
118+
fn main() {
119+
120+
let dtf = DateTimeFormatter::try_new(
121+
LOCALE.into(),
122+
YMD::medium(),
123+
)
124+
.expect("ja data should be available");
125+
126+
let date = Date::try_new_iso(2020, 10, 14)
127+
.expect("date should be valid");
128+
129+
// DateTimeFormatter supports the ISO and native calendars as input via DateTime<AnyCalendar>.
130+
// For smaller codesize you can use FixedCalendarDateTimeFormatter<Gregorian> with a DateTime<Gregorian>
131+
let date = date.to_any();
132+
133+
let formatted_date = dtf.format(&date).to_string();
134+
135+
println!("📅: {}", formatted_date);
136+
}
137+
```
138+
139+
If all went well, running the app with `cargo run` should display:
140+
141+
```text
142+
📅: 2020年10月14日
143+
```
144+
145+
Here's an internationalized date!
146+
147+
*Notice:* By default, `cargo run` builds and runs a `debug` mode of the binary. If you want to evaluate performance, memory or size of this example, use `cargo run --release`.
148+
149+
150+
## 5. Data Management
151+
152+
While the locale API is purely algorithmic, many internationalization APIs like the date formatting API require more complex data to work. You've seen this in the previous example where we had to call `.expect("ja data should be available")` after the constructor.
153+
154+
Data management is a complex and non-trivial area which often requires customizations for particular environments and integrations into a project's ecosystem.
155+
156+
The way `ICU4X` handles data is one of its novelties, aimed at making the data management more flexible and enabling better integration in asynchronous environments.
157+
158+
`ICU4X` by default contains data for a wide range of CLDR locales[^1], meaning that for most languages, the constructors can be considered infallible and you can `expect` or `unwrap` them, as we did above.
159+
160+
However, shipping the library with all locales will have a size impact on your binary. It also requires you to update your binary whenever CLDR data changes, which happens twice a year. To learn how to solve these problems, see our [data management](/2_0_beta/tutorials/data-management) tutorial.
161+
162+
[^1]: All locales with coverage level `basic`, `moderate`, or `modern` in [`CLDR`](https://github.com/unicode-org/cldr-json/blob/main/cldr-json/cldr-core/coverageLevels.json)
163+
164+
## 6. Summary
165+
166+
This concludes this introductory tutorial. With the help of `DateTimeFormatter`, `Locale` and `DataProvider` we formatted a date in Japanese, but that's just the start.
167+
168+
Internationalization is a broad domain and there are many more components in `ICU4X`.
169+
170+
Next, learn how to [generate optimized data for your binary](/2_0_beta/tutorials/data-management), [configure your Cargo.toml file](/2_0_beta/tutorials/cargo), or continue exploring by reading [the docs](https://docs.rs/icu/2.0.0-beta2/).
171+
172+
173+
Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
1+
---
2+
title: Configuring Cargo.toml
3+
---
4+
5+
6+
7+
ICU4X makes heavy use of small crates and Cargo features in order to be highly modular. This tutorial is intended to help you set up a Cargo.toml file to download what you need for ICU4X.
8+
9+
## Basic Cargo.toml with compiled data
10+
11+
The most basic Cargo.toml to get you off the ground is the following:
12+
13+
```toml
14+
[dependencies]
15+
icu = "2.0.0-dev"
16+
```
17+
18+
In your main.rs, you can use all stable ICU4X components for the recommended set of locales, which get compiled into the library.
19+
20+
[« Fully Working Example »](https://github.com/unicode-org/icu4x/tree/main/tutorials/./crates/default)
21+
22+
## Cargo.toml with custom compiled data
23+
24+
If you wish to use custom compiled data for ICU4X, no changes to Cargo.toml are required. Instead, set the `ICU4X_DATA_DIR` environment variable to the
25+
datagen output during your build:
26+
27+
```shell
28+
icu4x-datagen --format baked --markers all --locales ru --out baked_data
29+
ICU4X_DATA_DIR=$(pwd)/baked_data cargo build --release
30+
```
31+
32+
[« Fully Working Example »](https://github.com/unicode-org/icu4x/tree/main/tutorials/./crates/custom_compiled)
33+
34+
## Cargo.toml with experimental modules
35+
36+
Experimental modules are published in a separate `icu_experimental` crate:
37+
38+
```toml
39+
[dependencies]
40+
icu = { version = "2.0.0-dev", features = ["experimental"] }
41+
```
42+
43+
In your main.rs, you can now use e.g. the `icu_experimental::displaynames` module.
44+
45+
[« Fully Working Example »](https://github.com/unicode-org/icu4x/tree/main/tutorials/./crates/experimental)
46+
47+
## Cargo.toml with Buffer Provider
48+
49+
If you wish to generate your own data in blob format and pass it into ICU4X, enable the "serde" Cargo feature as follows:
50+
51+
```toml
52+
[dependencies]
53+
icu = { version = "2.0.0-dev", features = ["serde"] }
54+
icu_provider_blob = {version = "2.0.0-dev", features = ["alloc"] }
55+
```
56+
57+
To learn about building ICU4X data, including whether to check in the data blob file to your repository, see [data-management.md](/2_0_beta/tutorials/data-management).
58+
59+
[« Fully Working Example »](https://github.com/unicode-org/icu4x/tree/main/tutorials/./crates/buffer)
60+
61+
## Cargo.toml with `Sync`
62+
63+
If you wish to share ICU4X objects between threads, you must enable the `"sync"` Cargo feature:
64+
65+
```toml
66+
[dependencies]
67+
icu = { version = "2.0.0-dev", features = ["sync"] }
68+
```
69+
70+
You can now use most ICU4X types when `Send + Sync` are required, such as when sharing across threads.
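
As a rough illustration of where `Send + Sync` bounds come from, the sketch below shares one value across threads using only the standard library. `Formatter` is a hypothetical stand-in, not a real ICU4X type; with the `"sync"` feature enabled, an ICU4X object would take its place inside the `Arc`.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for an ICU4X formatter; not a real ICU4X type.
struct Formatter {
    prefix: String,
}

impl Formatter {
    fn format(&self, day: u32) -> String {
        format!("{}{}", self.prefix, day)
    }
}

/// Compiles only if `T` can be sent to and shared between threads.
fn assert_send_sync<T: Send + Sync>() {}

/// Formats each day on its own thread, sharing one formatter via `Arc`.
/// `thread::spawn` requires the closure to be `Send`, and `Arc<T>` is
/// `Send` only when `T: Send + Sync`.
fn format_on_threads(formatter: Arc<Formatter>, days: &[u32]) -> Vec<String> {
    let handles: Vec<_> = days
        .iter()
        .map(|&day| {
            let shared = Arc::clone(&formatter);
            thread::spawn(move || shared.format(day))
        })
        .collect();

    // Join in spawn order so the results keep the order of `days`.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("thread should not panic"))
        .collect()
}

fn main() {
    assert_send_sync::<Formatter>();

    let formatter = Arc::new(Formatter { prefix: "day ".to_string() });
    for line in format_on_threads(formatter, &[1, 2, 3]) {
        println!("{line}");
    }
}
```

If the shared type were not `Send + Sync`, the call to `thread::spawn` would be rejected by the compiler, which is the situation the `"sync"` feature is meant to avoid.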
71+
72+
[« Fully Working Example »](https://github.com/unicode-org/icu4x/tree/main/tutorials/./crates/sync)
73+
74+
## Cargo.toml with `build.rs` data generation
75+
76+
If you wish to use data generation in a `build.rs` script, you need to manually include the data and any dependencies (the `ICU4X_DATA_DIR` variable won't work as ICU4X cannot access your build script output).
77+
78+
```toml
79+
[dependencies]
80+
icu = { version = "2.0.0-dev", default-features = false } # turn off compiled_data
81+
icu_provider = "2.0.0-dev" # for databake
82+
icu_provider_baked = "2.0.0-dev" # for databake
83+
zerovec = "0.9" # for databake
84+
85+
# for build.rs:
86+
[build-dependencies]
87+
icu = "2.0.0-dev"
88+
icu_provider_export = "2.0.0-dev"
89+
icu_provider_source = "2.0.0-dev"
90+
```
91+
92+
This example has an additional section for auto-generating the data in build.rs. In your build.rs, invoke the ICU4X Datagen API with the set of markers you require. Don't worry; if using databake, you will get a compiler error if you don't specify enough markers.
93+
94+
The build.rs approach has several downsides and should only be used if Cargo is the only build system you can use, and you cannot check in your data:
95+
* The build script with the whole of `icu_provider_source` in it is slow to build
96+
* If you're using networking features of `icu_provider_source` (behind the `networking` Cargo feature), the build script will access the network
97+
* Using the data requires ICU4X's [`_unstable`](https://docs.rs/icu_provider/2.0.0-beta2/icu_provider/constructors/index.html) APIs with a custom data provider, and it requires that `icu_provider_source` be the same *minor* version as the `icu` crate
98+
* `build.rs` output is not written to the console, so the build can appear to hang
99+
100+
[« Fully Working Example »](https://github.com/unicode-org/icu4x/tree/main/tutorials/./crates/baked)
