Skip to content

Conversation

@roubert
Copy link
Member

@roubert roubert commented Feb 2, 2026

Checklist

  • Required: Issue filed: ICU-23310
  • Required: The PR title must be prefixed with a JIRA Issue number. Example: "ICU-NNNNN Fix xyz"
  • Required: Each commit message must be prefixed with a JIRA Issue number. Example: "ICU-NNNNN Fix xyz"
  • Issue accepted (done by Technical Committee after discussion)
  • Tests included, if applicable
  • API docs and/or User Guide docs changed or added, if applicable
  • Approver: Feel free to merge on my behalf

ALLOW_MANY_COMMITS=true


/** For case closure on a large set, look only at code points with relevant properties. */
const UnicodeSet &maybeOnlyCaseSensitive(const UnicodeSet &src, UnicodeSet &subset) {
const UnicodeSet &maybeOnlyCaseSensitive(const UnicodeSet &src U_LIFETIME_BOUND, UnicodeSet &subset) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const UnicodeSet &maybeOnlyCaseSensitive(const UnicodeSet &src U_LIFETIME_BOUND, UnicodeSet &subset) {
const UnicodeSet &maybeOnlyCaseSensitive(const UnicodeSet &src U_LIFETIME_BOUND, UnicodeSet &subset U_LIFETIME_BOUND) {

and please wrap the line so it is no longer than 100

Copy link
Member Author

@roubert roubert Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your suggestion doesn't look right to me, subset is a non-const reference so it couldn't be bound to a temporary (and therefore doesn't need the lifetimebound annotation for the compiler to prevent that).

Isn't that so? Is there some case where it really would need to be annotated?

Also, the current column limit in ICU4C is 105 not 100 (see .clang-format). I don't have anything against changing the current default, 105 always seemed like a really odd number to me, but I don't think it makes any sense to reformat one particular commit to a column limit other than the currently configured default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your suggestion doesn't look right to me, subset is a non-const reference so it couldn't be bound to a temporary (and therefore doesn't need the lifetimebound annotation for the compiler to prevent that).

It's totally possible that I don't yet understand lifetimebound.

I added the suggestions where I saw in the code that the object can be returned, like return subset in this function. Does that not need to be annotated because it can't be a temporary value?

Also, the current column limit in ICU4C is 105 not 100 (see .clang-format). I don't have anything against changing the current default, 105 always seemed like a really odd number to me, but I don't think it makes any sense to reformat one particular commit to a column limit other than the currently configured default.

105 seems weird. I also don't think many people use the .clang-format file in ICU4C.
If you keep to 105 for now, that's ok.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that not need to be annotated because it can't be a temporary value?

Let's put it like this: Can you come up with some example code which will compile just fine despite returning an invalid reference through subset here, unless it's annotated with lifetimebound? (I can't.)

I also don't think many people use the .clang-format file in ICU4C.

I wasn't the one who added that file so we are at least two humans using it, plus Clang-Tidy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the Clang & MS docs again. I don't see a restriction or recommendation to use this annotation only for const reference parameters.

Clang docs: “The lifetimebound attribute on a function parameter or implicit object parameter indicates that objects that are referred to by that parameter may also be referred to by the return value of the annotated function ...”

What about the following which will usually return subset?

const UnicodeSet &helper(const UnicodeSet &src U_LIFETIME_BOUND) {
    UnicodeSet subset(0, 0x10ffff);
    return maybeOnlyCaseSensitive(src, subset);
}

namespace {

const UnicodeString &fixLabel(const UnicodeString &current, UnicodeString &temp) {
const UnicodeString &fixLabel(const UnicodeString &current U_LIFETIME_BOUND, UnicodeString &temp) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const UnicodeString &fixLabel(const UnicodeString &current U_LIFETIME_BOUND, UnicodeString &temp) {
const UnicodeString &fixLabel(const UnicodeString &current U_LIFETIME_BOUND, UnicodeString &temp U_LIFETIME_BOUND) {

and please wrap the line <=100

* Returns s if no replacement was made, otherwise buffer.
*/
const UnicodeString &surrogatesToFFFD(const UnicodeString &s, UnicodeString &buffer) {
const UnicodeString &surrogatesToFFFD(const UnicodeString &s U_LIFETIME_BOUND, UnicodeString &buffer) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const UnicodeString &surrogatesToFFFD(const UnicodeString &s U_LIFETIME_BOUND, UnicodeString &buffer) {
const UnicodeString &surrogatesToFFFD(const UnicodeString &s U_LIFETIME_BOUND, UnicodeString &buffer U_LIFETIME_BOUND) {

and wrap <=100

Comment on lines +271 to 272
const UnicodeSet &abbreviateSet(const UnicodeSet &set U_LIFETIME_BOUND, bool abbreviated, int density,
UnicodeSet &copy) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const UnicodeSet &abbreviateSet(const UnicodeSet &set U_LIFETIME_BOUND, bool abbreviated, int density,
UnicodeSet &copy) {
const UnicodeSet &abbreviateSet(const UnicodeSet &set U_LIFETIME_BOUND, bool abbreviated, int density,
UnicodeSet &copy U_LIFETIME_BOUND) {

Comment on lines 454 to 457
// Get and return the new Modifier
const Modifier* mod = parameters.obj->getModifier(parameters.signum, resultPlural);
U_ASSERT(mod != nullptr);
return *mod;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sffc could this ever return second?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this borrows either first or self but not second.

@grhoten
Copy link
Member

grhoten commented Feb 2, 2026

I highly recommend reviewing my branch on this topic. Though I do admit that my experience with this annotation is next to nothing. I also understand that the U_LIFETIME_CAPTURE_BY annotation is explicitly excluded from the design proposal at this time.

I only reviewed the public API in my branch. These changes seem to be more focused on internal API.

Example areas that should be looked at in more detail include:

  • The UnicodeString aliasing, and access to internal buffers.
  • The raw pointers returned from ULocale.
  • UText API. While it's C API, it can benefit from C++ annotation when it's used from C++ code.
  • Lots of the object adoption usage in the calendar, number formatter, and transliterator. Some API return an object containing a reference to the parameter, and that object manages when to delete the parameter.
  • Other various API that should be investigated for U_LIFETIME_BOUND usage mentioned in that branch.

I want to make sure that my branch is at least looked at. If it's not worth reviewing, at least provide a reason.

@grhoten
Copy link
Member

grhoten commented Feb 2, 2026

Here's a bad example that I think should be warned about. The setTo is aliasing the getTerminatedBuffer value. It's deleted upon return, and then the UnicodeString is asked to return the terminated buffer from the deleted object.

UnicodeString str;
str.setTo(true, UnicodeString(u"This is a long string that should be allocated and copied.").getTerminatedBuffer(), -1).getTerminatedBuffer();

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants