@@ -16,14 +16,14 @@ License & terms of use: http://www.unicode.org/copyright.html
1616ICU is the [ premier library for software internationalization] ( https://icu.unicode.org/#h.i33fakvpjb7o ) ,
1717used by a [ wide array of companies and organizations] ( https://icu.unicode.org/#h.f9qwubthqabj ) .
1818
19- ## Draft page
19+ ## Release Candidate
2020
21- ** This is an early draft of the ICU 78 download page. Nothing is published yet .**
21+ **This is a release candidate. Please use it for testing, but do not use it in production.**
2222
2323## Release Overview
2424
25- * Download: TODO [ releases/tag/release-78-1 ] ( https://github.com/unicode-org/icu/releases/tag/release-78-1 )
26- * TODO [ Maven: com.ibm.icu / icu4j / version 78.1] ( https://mvnrepository.com/artifact/com.ibm.icu/icu4j/78.1 )
25+ * Download: [ releases/tag/release-78.1rc ] ( https://github.com/unicode-org/icu/releases/tag/release-78.1rc ) <!-- TODO final version -->
26+ <!-- * TODO [Maven: com.ibm.icu / icu4j / version 78.1](https://mvnrepository.com/artifact/com.ibm.icu/icu4j/78.1) -->
2727
2828ICU 78 updates to
2929[ Unicode 17] ( https://www.unicode.org/versions/Unicode17.0.0/ )
@@ -35,7 +35,7 @@ It also updates to
3535([ beta blog] ( https://blog.unicode.org/2025/10/unicode-cldr-48-beta-available-for.html ) )
3636locale data with new locales, and various additions and corrections.
3737
38- In Java, there is a draft new Segmenter API which is easier and safer to use than BreakIterator.
38+ In Java, there is a new Segmenter API which is easier and safer to use than BreakIterator.\
3939In C++, there is a new set of APIs for Unicode string (UTF-8/16/32) code point iteration
4040that works seamlessly with modern C++ iterators and ranges.
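
For context on the BreakIterator comparison above, here is a minimal sketch of the stateful iteration pattern that the new Segmenter API is described as improving on. It uses the JDK's `java.text.BreakIterator` as a stand-in (an assumption made so the example runs without ICU4J on the classpath) and does not show the new Segmenter API itself. The class name `BreakIteratorSketch` and the helper `wordSegments` are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class BreakIteratorSketch {
    // Hypothetical helper: collects the text between consecutive word boundaries.
    static List<String> wordSegments(String text) {
        // A BreakIterator is a mutable object: it holds the text being analyzed
        // and a current position, so an instance must not be shared between
        // concurrent callers.
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);

        List<String> segments = new ArrayList<>();
        int start = it.first();
        // The caller drives the iteration and must check for the DONE sentinel.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            segments.add(text.substring(start, end));
        }
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(wordSegments("Hello world"));
    }
}
```

The caller has to manage the iterator state, the boundary indexes, and the `DONE` sentinel by hand, which is the kind of bookkeeping an API that returns immutable objects and streams can avoid.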
4141
@@ -53,6 +53,15 @@ For more details, including migration issues, see below.
5353Please use the [ icu-support mailing list] ( https://icu.unicode.org/contacts ) and/or
5454[ find/submit error reports] ( https://icu.unicode.org/bugs ) .
5555
56+ ### Attention: Future Changes
57+
58+ Beginning with CLDR 49 / ICU 79 (2026-mar), CLDR and ICU are planning to make changes in
59+ time formatting options for the hour cycle (details of 12/24 hour formats),
60+ make the week-of-year numbering always follow ISO rules,
61+ and remove the pre-Meiji Japanese eras.
62+
63+ See the [ CLDR V49 advance warnings] ( https://cldr.unicode.org/downloads/cldr-48#v49-advance-warnings ) .
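
As an illustration of what ISO week-of-year numbering means in practice, the following sketch compares ISO 8601 week numbers with a Sunday-start convention around a year boundary. It uses only the JDK's `java.time` classes, not ICU, and makes no claim about how ICU 79 will expose this. The class name `IsoWeekSketch` and its helper methods are hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;
import java.time.temporal.WeekFields;

final class IsoWeekSketch {
    // ISO 8601: weeks start on Monday, and week 1 is the week containing
    // the first Thursday of the year.
    static int isoWeek(LocalDate date) {
        return date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    }

    // The year that the ISO week belongs to; it can differ from the calendar year.
    static int isoWeekYear(LocalDate date) {
        return date.get(IsoFields.WEEK_BASED_YEAR);
    }

    // A non-ISO convention for comparison: weeks start on Sunday, and the week
    // containing January 1 is always week 1.
    static int sundayStartWeek(LocalDate date) {
        return date.get(WeekFields.SUNDAY_START.weekOfWeekBasedYear());
    }

    public static void main(String[] args) {
        // 2027-01-01 is a Friday, so under ISO rules it still belongs to
        // the last week of 2026.
        LocalDate newYear = LocalDate.of(2027, 1, 1);
        System.out.println("ISO week:          " + isoWeek(newYear)
                + " of " + isoWeekYear(newYear));
        System.out.println("Sunday-start week: " + sundayStartWeek(newYear));
    }
}
```

Dates near the start or end of a year are where the two conventions disagree, so those are the cases worth testing before the change lands.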
64+
5665### Version Number
5766
5867The initial release has library version number 78.1.
@@ -83,38 +92,42 @@ Note: There may be additional commits on the [maint/maint-78](https://github.com
8392 remain recommended for default identifier use.
8493* [ CLDR 48] ( https://cldr.unicode.org/downloads/cldr-48 )
8594 ([ beta blog] ( https://blog.unicode.org/2025/10/unicode-cldr-48-beta-available-for.html ) ):
86- * TODO: old news here
87- * No major data collection for existing locales; focus on bug fixes and structural improvements
88- * New regional variants: English in several European countries, and Cantonese in Macau (` yue_Hant_MO ` )
89- * Improved RBNF (number spellout) and transliteration data
90- * TODO: old news here
91- * Subtle segmentation changes to make ICU fully conform to Unicode 16
92- * Word break: Root tailoring of colon reverted, Swedish & Finnish tailorings removed
93- ([ ICU-22941] ( https://unicode-org.atlassian.net/browse/ICU-22941 ) )
94- * These tailorings were introduced in ICU 72, but feedback has been negative,
95- and the UTC declined to adopt these changes.
96- * Line break: Fixed a bug in the line breaking of obscure sequences
97- ⟨no-break space, combining mark, hyphen, alphabetic character⟩
98- ([ ICU-22986] ( https://unicode-org.atlassian.net/browse/ICU-22986 ) ).
99- * Updated Indic grapheme clusters to use the latest ` Indic_Conjunct_Break ` data
100- ([ ICU-22956] ( https://unicode-org.atlassian.net/browse/ICU-22956 ) )
95+ * Significant data updates across all locales
96+ * Locales which are now at modern coverage level: Akan, Bashkir, Chuvash, Kazakh (Arabic), Romansh, Shan, Quechua
97+ * Locales which are now at moderate coverage level: Anii, Esperanto
98+ * Many new measurement units, for scientific contexts (coulombs, farads, teslas, etc.)
99+ and for English systems (fortnights, imperial pints, etc.)
100+ * Some measurement unit identifiers changed, see [ CLDR 48 Migration] ( https://cldr.unicode.org/downloads/cldr-48#migration )
101101* Time zone data (tzdata) version 2025b (2025-mar).
102102
103103## ICU4C Specific Changes
104104
105- * [ API changes since ICU4C 77 (Markdown)] ( https://github.com/unicode-org/icu/blob/maint/maint-78/icu4c/APIChangeReport.md ) / [ (HTML)] ( https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-78/icu4c/APIChangeReport.html )
106- * TODO: In C++, there is a new set of APIs for Unicode string (UTF-8/16/32) code point iteration
105+ * [ API changes since ICU4C 77 (Markdown)] ( https://github.com/unicode-org/icu/blob/maint/maint-78/icu4c/APIChangeReport.md ) /
106+ [ (HTML)] ( https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-78/icu4c/APIChangeReport.html )
107+ * The widely used class Locale has been optimized. Locale objects for most common locale IDs now only use 40 bytes
108+ (down from at least 224 bytes).
109+ ([ ICU-20392] ( https://unicode-org.atlassian.net/browse/ICU-20392 ) )
110+ * New set of “C++ Header-Only APIs” for Unicode string (UTF-8/16/32) code point iteration
107111 that works seamlessly with modern C++ iterators and ranges.
108- * TODO: old news here
109- * New APIs for colloquial C++ use of C USet ([ ICU-22876] ( https://unicode-org.atlassian.net/browse/ICU-22876 ) )
110- and C UCollator ([ ICU-22879] ( https://unicode-org.atlassian.net/browse/ICU-22879 ) )
111- * These were added in ICU 76, but some of the new APIs did not actually compile with ` U_SHOW_CPLUSPLUS_API=0 ` .
112- They have been fixed in ICU 77 and thoroughly tested.
113- USetElementIterator now returns std::u16string instead of icu::UnicodeString,
114- and therefore it and related APIs have been changed to ` @draft ICU 77 ` .
115- ([ ICU-22954] ( https://unicode-org.atlassian.net/browse/ICU-22954 ) )
116- * For details about these APIs and an example see the
117- “C++ Header-Only APIs” section of the [ ICU 76 Migration Issues] ( 76.md#migration-issues ) .
112+ As with the existing C macros, there are versions which validate the code unit sequences on the fly,
113+ as well as fast but “unsafe” versions which assume & require well-formed strings.
114+ ([ ICU-23004] ( https://unicode-org.atlassian.net/browse/ICU-23004 ) ):\
115+ [ unicode/utfiterator.h] ( https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/utfiterator_8h.html ) \
116+ (For an introduction to “C++ Header-Only APIs” see
117+ this section of the [ ICU 76 Migration Issues] ( 76.md#migration-issues ) .)
118+ * Additional Unicode helper APIs
119+ ([ ICU-23152] ( https://unicode-org.atlassian.net/browse/ICU-23152 ) ):
120+ * ([ unicode/utf.h] ( https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/utf_8h.html ) ,
121+ [ unicode/utf8.h] ( https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/utf8_8h.html ) ):
122+ U_IS_CODE_POINT(cp), U_IS_SCALAR_VALUE(cp), U8_LENGTH_FROM_LEAD_BYTE\[_UNSAFE\]
123+ * ([ icu::UnicodeString] ( https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/classicu_1_1UnicodeString.html ) ):
124+ begin(), end(), rbegin(), rend(), push_back(c)
125+ (a UnicodeString is now a C++ “range” of char16_t code units),
126+ toUTF8String() without output string *parameter*
127+ * (new [ unicode/utfiterator.h] ( https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/utfiterator_8h.html ) ):
128+ “Range” classes AllCodePoints & AllScalarValues
129+ * (new [ unicode/utfstring.h] ( https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/utfstring_8h.html ) ):
130+ Functions that write a code point to a string object
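
The helper APIs above distinguish "code points" from "scalar values". The following sketch spells out that distinction as defined by the Unicode Standard. It is written in plain Java without ICU, so it does not show the ICU macros or range classes themselves. The class name `CodePointKindsSketch` and its methods are hypothetical.

```java
final class CodePointKindsSketch {
    // A code point is any integer in the Unicode code space U+0000..U+10FFFF.
    static boolean isCodePoint(int c) {
        return 0 <= c && c <= 0x10FFFF;
    }

    // Surrogate code points U+D800..U+DFFF are reserved for UTF-16 encoding.
    static boolean isSurrogate(int c) {
        return 0xD800 <= c && c <= 0xDFFF;
    }

    // A scalar value is a code point that is not a surrogate. Only scalar
    // values can appear in well-formed UTF-8, UTF-16, or UTF-32 text.
    static boolean isScalarValue(int c) {
        return isCodePoint(c) && !isSurrogate(c);
    }

    // Counts scalar values by brute force over the whole code space.
    static int countScalarValues() {
        int count = 0;
        for (int c = 0; c <= 0x10FFFF; c++) {
            if (isScalarValue(c)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("U+D800 is a code point:   " + isCodePoint(0xD800));
        System.out.println("U+D800 is a scalar value: " + isScalarValue(0xD800));
        System.out.println("Number of scalar values:  " + countScalarValues());
    }
}
```

Iterating over all code points therefore visits the 2048 surrogates as well, while iterating over all scalar values skips them.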
118131
119132## ICU4J Specific Changes
120133
@@ -125,11 +138,23 @@ Note: There may be additional commits on the [maint/maint-78](https://github.com
125138 Note that [ Android desugaring] ( https://developer.android.com/studio/write/java11-default-support-table )
126139 supports at least Java 11 since late 2023.
127140* [ API Changes since ICU4J 77] ( https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-78/icu4j/APIChangeReport.html )
128- * TODO: In Java, there is a draft new Segmenter API which is easier and safer to use than BreakIterator.
141+ * New Segmenter API which is easier and safer to use than BreakIterator.
142+ The new API builds immutable objects and returns Java Streams of boundaries and segments.
143+ ([ ICU-22789] ( https://unicode-org.atlassian.net/browse/ICU-22789 ) )\
144+ See [ package com.ibm.icu.segmenter] ( https://unicode-org.github.io/icu-docs/apidoc/dev/icu4j/com/ibm/icu/segmenter/package-summary.html ) ,
145+ [ interface Segmenter] ( https://unicode-org.github.io/icu-docs/apidoc/dev/icu4j/com/ibm/icu/segmenter/Segmenter.html )
146+ and its implementing classes, etc.
147+ * All clone() functions now explicitly return their class types, rather than Object (“covariant return types”),
148+ so that call sites no longer need to downcast.
149+ ([ ICU-23140] ( https://unicode-org.atlassian.net/browse/ICU-23140 ) )
150+ * Additional Unicode helper functions
151+ ([ class UCharacter] ( https://unicode-org.github.io/icu-docs/apidoc/dev/icu4j/com/ibm/icu/lang/UCharacter.html ) ):
152+ isNoncharacter(cp), isScalarValue(cp), allCodePoints(), allScalarValuesStream(), etc.
153+ ([ ICU-23152] ( https://unicode-org.atlassian.net/browse/ICU-23152 ) )
154+ <!-- TODO: after release, change API docs links from dev to released -->
129155* We have removed the
130- [ ICU4J Locale Service Provider] ( ../userguide/icu4j/locale-service-provider.md )
156+ [ ICU4J Locale Service Provider] ( ../userguide/icu4j/locale-service-provider.md ) .
131157 ([ ICU-23071] ( https://unicode-org.atlassian.net/browse/ICU-23071 ) )\
132- ([ Maven: com.ibm.icu / icu4j-localespi / version 77.1] ( https://mvnrepository.com/artifact/com.ibm.icu/icu4j-localespi/77.1 ) ).
133158 It had become much less useful than when we added it and had very low usage.
134159 Projects that used it should call ICU4J directly instead.
135160* The Java implementation of the
@@ -169,8 +194,8 @@ ICU4J should work on Android API level 21 and later but may require “[library
169194## Download
170195
171196### GitHub
172- TODO rc tag:
173- Source and binary downloads are available on the git/GitHub tag page: < https://github.com/unicode-org/icu/releases/tag/release-78.1rc >
197+ Source and binary downloads are available on the git/GitHub tag page:
198+ < https://github.com/unicode-org/icu/releases/tag/release-78.1rc >
174199
175200See the [ Source Code Setup] ( ../devsetup/source/ ) page for how to download the ICU file tree directly from GitHub.
176201
@@ -183,4 +208,3 @@ ICU locale data was generated from CLDR data equivalent to:
183208* TODO: not published yet
184209* https://mvnrepository.com/artifact/com.ibm.icu/icu4j/78.1
185210* https://mvnrepository.com/artifact/com.ibm.icu/icu4j-charset/78.1
186- * https://mvnrepository.com/artifact/com.ibm.icu/icu4j-localespi/78.1