Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -1641,6 +1641,17 @@ def warn_implicitly_retains_self : Warning <
"block implicitly retains 'self'; explicitly mention 'self' to indicate "
"this is intended behavior">,
InGroup<DiagGroup<"implicit-retain-self">>, DefaultIgnore;
def warn_blocks_capturing_this : Warning<"block implicitly captures 'this'">,
InGroup<DiagGroup<"blocks-capturing-this">>,
DefaultIgnore;
def warn_blocks_capturing_reference
: Warning<"block implicitly captures a C++ reference">,
InGroup<DiagGroup<"blocks-capturing-reference">>,
DefaultIgnore;
def warn_blocks_capturing_raw_pointer
: Warning<"block implicitly captures a raw pointer">,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think "raw pointer" is a term we use anywhere else in Clang diagnostics. You might say a "C pointer" or "non-Objective-C pointer".

The more I think about it, though, the more I find the idea of warning about pointer captures in all escaping blocks problematic. For references, sure, I think there's a very reasonable argument that they're specifically likely to be temporary. Even for this, though, that really depends on the kind of programming you're doing, although I suppose there's an argument that an escaping block might need to capture a smart pointer to this. (That's definitely not reliably true, though, e.g. if someone was doing manual refcounting.) For pointers in general... unfortunately, we just can't know the ownership semantics that the user intends; I think this would be very noisy. Maybe that's okay in an opt-in warning, but it's unfortunate, because I think there's potential for the reference-capture warning to be turned on by default eventually.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially developed this change because I was concerned about the number of occurrences of blocks capturing C pointers (or this or references) I was seeing during code reviews (or when investigating crashes). At this point this was just a feeling that there was a problem in our code base, but I had no hard data.

Having developed this change, I was able to build the code base I'm familiar with (Chrome on iOS) with the three warnings enabled, and spent the last few days looking at all the warnings reported by the current change, trying to get an idea of the volume of true and false positive they would find.

I'm still doing the analysis, but I can share some intermediate results. I saved all the build output, then collected all unique warnings triplet "warning-name, filename, line" and looks at each of them. This gives me a lower bound of the number of errors (as the same warning can happen multiple times per line if multiple pointers are captured).

Numbers of occurrences per warning in production code (additionally in test code):

  • -Wblocks-capturing-this: 28 (571)
  • -Wblocks-capturing-reference: 7 (37)
  • -Wblocks-capturing-raw-pointer: 53 (174)

Trying to categorize the 88 warnings in production

  • 53 are totally true positive, and would need to be fixed in our code base (60%),
  • 13 are captured blocks that are non-escaping blocks not marked as such (15%),
  • 7 are capturing pointers to static variables (8%),
  • 2 are capturing function pointers (2%) -- likely due to a bug in my implementation,
  • 2 are pointers to pointers and should have used __block variables anyway (2%).

The last 11 warnings are in third-party code I'm not really familiar with. If we ignore those, this gives 55 true positive (I'm counting the 53 above and the 2 pointers to pointers here) out of 77 warnings, or 73% of true positive.

Among the warnings I left in the false positive, it could be argued that some of them are not false positive but could be counted as true positive. For exemple, among the 7 pointers to static variables, only 4 of them could be deduced statically. The other 3 are blocks capturing const char* parameters where the function are only ever called with pointer to static string (this is something I know because I know the code base, but that the compiler cannot prove).

I didn't have time to look at all the warnings in test code as there are so many warnings reported there (but I think many could be fixed by marking a few helper function as taking a non-escaping block).

Given those numbers, I think those warning are valuable for the Chrome on iOS codebase. I am less familiar with other codebase, though the Chromium codebase is considered of a relatively good quality, so maybe the warnings could also be interesting in other code bases.

If you think that despite those numbers the warnings are not interesting in other code bases, then I will consider implementing them as Chrome specific clang plugins (the Chromium documentation recommends first trying to get warning adopted by clang, and this is why I started this effort).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's really interesting information, thanks. Can I also ask for the cross-table of true/false positive vs. the kind of capture?

One question I think we need to consider is what the suppression mechanism for this warning is. If someone wanted to live with all three of these warnings enabled in their builds, how would you expect them to turn the warning off for the false positives? Just #pragma clang diagnostic?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The table with the data with true/false positive vs kind of capture.

kind count true positive no escape block __block static pointer function pointer other ratio
this 28 20 0 0 0 0 8 71.43%
reference 7 3 3 0 1 0 0 42.86%
pointer 53 30 10 2 6 2 3 56.60%
total 88 53 13 2 7 2 11 60.23%

Regarding the suppression mechanism, besides #pragma clang diagnostic, the developer can be explicit about the capture either by declaring a local struct to store the captured values, or using variables marked with __block.

So for example, one of the "false positive" is this code in dawn

    Ref<DeviceBase> deviceRef = GetDevice();
    wgpu::Callback callback = hostMappedDesc->disposeCallback;
    void* userdata = hostMappedDesc->userdata;
    auto dispose = ^(void*, NSUInteger) {
        deviceRef->GetCallbackTaskManager()->AddCallbackTask(
            [callback, userdata] { callback(userdata); });
    };

I think this can be fixed by writing either

    Ref<DeviceBase> deviceRef = GetDevice();
    __block wgpu::Callback callback = hostMappedDesc->disposeCallback;
    __block void* userdata = hostMappedDesc->userdata;
    auto dispose = ^(void*, NSUInteger) {
        deviceRef->GetCallbackTaskManager()->AddCallbackTask(
            [callback, userdata] { callback(userdata); });
    };

or

    Ref<DeviceBase> deviceRef = GetDevice();
    struct { wgpu::Callback callback; void* userdata; } captured = {
        hostMappedDesc->disposeCallback,
        hostMappedDesc->userdata,
    };
    auto dispose = ^(void*, NSUInteger) {
        deviceRef->GetCallbackTaskManager()->AddCallbackTask(
            [captured] { captured.callback(captured.userdata); });
    };

Since the warning does not inspect the captured objects, and only warn about pointers and reference, using a structure hides the capture from the warning. In the same way by using __block the pointer are copied in the block explicitly, and thus the warning should not fire (I have not tested, if it does, I would consider this a bug in my implementation and will fix it). The same pattern can also work for references.

Basically, to disable the warning, you would have to be explicit that the capture is intended.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like __block as a suppression mechanism, both because it adds real overhead and because it doesn't meaningfully change the danger. In fact, I don't think __block should be suppressing this warning at all.

I think you just need a new attribute to suppress the warning.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I'll investigate defining and using a new attributes.

Do you think this should be an attributes on the captured variables e.g. __attribute__((capturable)) void* userData or on the block e.g. __attribute__((can_capture_pointers)) auto dispose = ^{ ... }; or both?

On one hand an attribute on the variable would make it difficult to silence the warnings about capturing this (since this is implicitly available, it is not easy to mark it with the attribute), and would probably require defining a local variable _this and mark it with the attribute.

On the other hand an attribute on the block would probably make it too easy to silence all the warnings.

Finally, I have a question about process. Should I summarize all the discussion in the issue #143924? Should I write a formal design document for the warning or is the github issue enough?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing an RFC on the forums is a good idea, yeah.

I agree that, as a "this is fine to capture because the lifetime is longer than you think" suppression mechanism, an attribute on the block seems too broad. It's unfortunate that putting attributes on the captured variables makes this really awkward, though. I don't have a good suggestion for that, only the not-great suggestion of allowing the declaration-specific attribute to also be written on the block with a list of captures to not warn about. If blocks could have explicit capture lists, putting an attribute on a capture list item would be a very appealing solution, but alas.

Regardless, we should probably allow __attribute__((noescape)) on the block as an "all these captures are fine because I promise this block doesn't outlive its scope" suppression mechanism.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

I'll work on writing an RFC.

Regarding __attribute__((noescape)), my current patch already respect this and do not report captures from blocks marked with this attribute.

InGroup<DiagGroup<"blocks-capturing-raw-pointer">>,
DefaultIgnore;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an existing block-capture-autoreleasing warning group that's probably the right pattern for these group names.

We can also create a diagnostic group that includes all of these warnings. I'm not sure if that's a good idea because of my concerns above, but I state it for completeness.

def warn_arc_possible_repeated_use_of_weak : Warning <
"weak %select{variable|property|implicit property|instance variable}0 %1 may "
"be accessed multiple times in this %select{function|method|block|lambda}2 "
Expand Down
19 changes: 17 additions & 2 deletions clang/include/clang/Sema/Sema.h
Original file line number Diff line number Diff line change
Expand Up @@ -8227,6 +8227,22 @@ class Sema final : public SemaBase {
}
};

/// Store information about a diagnosable block catpure.
struct BlockCapture {
/// Enumeration representing types of block captures that may be
/// diagnosable because they could be problematic.
enum CaptureType {
Self,
This,
Reference,
RawPointer,
};

SourceLocation Loc;
const BlockDecl *BD;
CaptureType Type;
};

/// Check an argument list for placeholders that we won't try to
/// handle later.
bool CheckArgsForPlaceholders(MultiExprArg args);
Expand All @@ -8243,8 +8259,7 @@ class Sema final : public SemaBase {

/// List of SourceLocations where 'self' is implicitly retained inside a
/// block.
llvm::SmallVector<std::pair<SourceLocation, const BlockDecl *>, 1>
ImplicitlyRetainedSelfLocs;
llvm::SmallVector<BlockCapture, 1> DiagnosableBlockCaptures;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I am concerned about this being global on Sema rather than tied to something like a specific function scope info. We might end up diagnosing the same thing in multiple places when we have nested functions after the block capture, e.g. because of lambdas or local classes — anything that causes us to end a function body while the outer function body is still being processed. Maybe we just never saw this with the narrow case of implicit self captures that this was used with before.


/// Do an explicit extend of the given block pointer if we're in ARC.
void maybeExtendBlockObject(ExprResult &E);
Expand Down
44 changes: 35 additions & 9 deletions clang/lib/Sema/SemaDecl.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -16149,7 +16149,7 @@ class ExitFunctionBodyRAII {
bool IsLambda = false;
};

static void diagnoseImplicitlyRetainedSelf(Sema &S) {
static void diagnoseBlockCaptures(Sema &S) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe diagnoseEscapingBlockCaptures?

llvm::DenseMap<const BlockDecl *, bool> EscapeInfo;

auto IsOrNestedInEscapingBlock = [&](const BlockDecl *BD) {
Expand All @@ -16170,13 +16170,35 @@ static void diagnoseImplicitlyRetainedSelf(Sema &S) {
return It->second = R;
};

// If the location where 'self' is implicitly retained is inside a escaping
// block, emit a diagnostic.
for (const std::pair<SourceLocation, const BlockDecl *> &P :
S.ImplicitlyRetainedSelfLocs)
if (IsOrNestedInEscapingBlock(P.second))
S.Diag(P.first, diag::warn_implicitly_retains_self)
<< FixItHint::CreateInsertion(P.first, "self->");
for (const auto &C : S.DiagnosableBlockCaptures) {
// Do not emit a diagnostic if the location is invalid.
if (!C.Loc.isValid())
continue;

// Do not emit a diagnostic if the block is not escaping.
if (!IsOrNestedInEscapingBlock(C.BD))
continue;

// Emit the correct warning depending on the type of capture.
switch (C.Type) {
case Sema::BlockCapture::Self:
S.Diag(C.Loc, diag::warn_implicitly_retains_self)
<< FixItHint::CreateInsertion(C.Loc, "self->");
break;

case Sema::BlockCapture::This:
S.Diag(C.Loc, diag::warn_blocks_capturing_this);
break;

case Sema::BlockCapture::Reference:
S.Diag(C.Loc, diag::warn_blocks_capturing_reference);
break;

case Sema::BlockCapture::RawPointer:
S.Diag(C.Loc, diag::warn_blocks_capturing_raw_pointer);
break;
}
}
}

static bool methodHasName(const FunctionDecl *FD, StringRef Name) {
Expand Down Expand Up @@ -16511,6 +16533,10 @@ Decl *Sema::ActOnFinishFunctionBody(Decl *dcl, Stmt *Body,
}
}

// Warn about block captures, unless this is a lambda function.
if (!isLambdaCallOperator(FD))
diagnoseBlockCaptures(*this);

assert((FD == getCurFunctionDecl(/*AllowLambdas=*/true)) &&
"Function parsing confused");
} else if (ObjCMethodDecl *MD = dyn_cast_or_null<ObjCMethodDecl>(dcl)) {
Expand Down Expand Up @@ -16563,7 +16589,7 @@ Decl *Sema::ActOnFinishFunctionBody(Decl *dcl, Stmt *Body,
FSI->ObjCWarnForNoInitDelegation = false;
}

diagnoseImplicitlyRetainedSelf(*this);
diagnoseBlockCaptures(*this);
} else {
// Parsing the function declaration failed in some way. Pop the fake scope
// we pushed on.
Expand Down
2 changes: 1 addition & 1 deletion clang/lib/Sema/SemaDeclObjC.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -367,7 +367,7 @@ HasExplicitOwnershipAttr(Sema &S, ParmVarDecl *Param) {
/// and user declared, in the method definition's AST.
void SemaObjC::ActOnStartOfObjCMethodDef(Scope *FnBodyScope, Decl *D) {
ASTContext &Context = getASTContext();
SemaRef.ImplicitlyRetainedSelfLocs.clear();
SemaRef.DiagnosableBlockCaptures.clear();
assert((SemaRef.getCurMethodDecl() == nullptr) && "Methodparsing confused");
ObjCMethodDecl *MDecl = dyn_cast_or_null<ObjCMethodDecl>(D);

Expand Down
13 changes: 13 additions & 0 deletions clang/lib/Sema/SemaExpr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -16536,6 +16536,11 @@ ExprResult Sema::ActOnBlockStmtExpr(SourceLocation CaretLoc,
// Set the captured variables on the block.
SmallVector<BlockDecl::Capture, 4> Captures;
for (Capture &Cap : BSI->Captures) {
if (Cap.isThisCapture()) {
DiagnosableBlockCaptures.push_back(
Sema::BlockCapture{Cap.getLocation(), BD, Sema::BlockCapture::This});
}

if (Cap.isInvalid() || Cap.isThisCapture())
continue;
// Cap.getVariable() is always a VarDecl because
Expand Down Expand Up @@ -16595,6 +16600,14 @@ ExprResult Sema::ActOnBlockStmtExpr(SourceLocation CaretLoc,
}
}

QualType Type = Cap.getCaptureType();
if (Type->isPointerOrReferenceType()) {
DiagnosableBlockCaptures.push_back(Sema::BlockCapture{
Cap.getLocation(), BD,
Type->isReferenceType() ? Sema::BlockCapture::Reference
: Sema::BlockCapture::RawPointer});
}

BlockDecl::Capture NewCap(Var, Cap.isBlockCapture(), Cap.isNested(),
CopyExpr);
Captures.push_back(NewCap);
Expand Down
6 changes: 4 additions & 2 deletions clang/lib/Sema/SemaExprObjC.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4913,8 +4913,10 @@ ExprResult SemaObjC::BuildIvarRefExpr(Scope *S, SourceLocation Loc,
SemaRef.getCurFunction()->recordUseOfWeak(Result);
}
if (getLangOpts().ObjCAutoRefCount && !SemaRef.isUnevaluatedContext())
if (const BlockDecl *BD = SemaRef.CurContext->getInnermostBlockDecl())
SemaRef.ImplicitlyRetainedSelfLocs.push_back({Loc, BD});
if (const BlockDecl *BD = SemaRef.CurContext->getInnermostBlockDecl()) {
SemaRef.DiagnosableBlockCaptures.push_back(
Sema::BlockCapture{Loc, BD, Sema::BlockCapture::Self});
}

return Result;
}
Expand Down
103 changes: 103 additions & 0 deletions clang/test/SemaObjCXX/warn-blocks-capturing-pointers.mm
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
// RUN: %clang_cc1 -x objective-c++ -std=c++11 -fobjc-arc -fblocks -Wblocks-capturing-this -Wblocks-capturing-reference -Wblocks-capturing-raw-pointer -Wimplicit-retain-self -verify %s

typedef void (^BlockTy)();

void noescapeFunc(__attribute__((noescape)) BlockTy);
void escapeFunc(BlockTy);

@interface Root @end

@interface I : Root
- (void)foo;
@end

class F {
public:
void Foo() const;
void FooAsync(F* p, F& r, I* i) {
escapeFunc(
^{
Foo(); // expected-warning {{block implicitly captures 'this'}}
p->Foo(); // expected-warning {{block implicitly captures a raw pointer}}
r.Foo(); // expected-warning {{block implicitly captures a C++ reference}}
[i foo];
});

([=, &r]() {
escapeFunc(
^{
Foo(); // expected-warning {{block implicitly captures 'this'}}
p->Foo(); // expected-warning {{block implicitly captures a raw pointer}}
r.Foo(); // expected-warning {{block implicitly captures a C++ reference}}
[i foo];
});
})();

escapeFunc(
^{
([=]() {
Foo(); // expected-warning {{block implicitly captures 'this'}}
p->Foo(); // expected-warning {{block implicitly captures a raw pointer}}
r.Foo(); // expected-warning {{block implicitly captures a C++ reference}}
[i foo];
})();
});

noescapeFunc(
^{
Foo();
p->Foo();
r.Foo();
[i foo];
});
}
};

@implementation I {
int _bar;
}

- (void)doSomethingWithPtr:(F*)p
ref:(F&)r
obj:(I*)i {
escapeFunc(
^{
(void)_bar; // expected-warning {{block implicitly retains 'self'; explicitly mention 'self' to indicate this is intended behavior}}
p->Foo(); // expected-warning {{block implicitly captures a raw pointer}}
r.Foo(); // expected-warning {{block implicitly captures a C++ reference}}
[i foo];
});

([=, &r]() {
escapeFunc(
^{
(void)_bar; // expected-warning {{block implicitly retains 'self'; explicitly mention 'self' to indicate this is intended behavior}}
p->Foo(); // expected-warning {{block implicitly captures a raw pointer}}
r.Foo(); // expected-warning {{block implicitly captures a C++ reference}}
[i foo];
});
})();

escapeFunc(
^{
([=]() {
(void)_bar; // expected-warning {{block implicitly retains 'self'; explicitly mention 'self' to indicate this is intended behavior}}
p->Foo(); // expected-warning {{block implicitly captures a raw pointer}}
r.Foo(); // expected-warning {{block implicitly captures a C++ reference}}
[i foo];
})();
});

noescapeFunc(
^{
(void)_bar;
p->Foo();
r.Foo();
[i foo];
});
}

- (void)foo {
}

@end
Loading