-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[Clang] Introduce OverflowBehaviorType for fine-grained overflow control #148914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-lldb @llvm/pr-subscribers-clang-driver Author: Justin Stitt (JustinStitt) ChangesIntroduce The new
A key aspect of this feature is its interaction with existing mechanisms. This implementation also includes specific promotion rules to preserve the intended overflow behavior within expressions and new diagnostics to warn against accidentally discarding the attribute. See the docs and tests for examples and more info. RFCsCCs (because of your participation in RFCs or previous implementations)@kees @AaronBallman @vitalybuka @melver @efriedma-quic Patch is 138.97 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/148914.diff 66 Files Affected:
diff --git a/clang/docs/OverflowBehaviorTypes.rst b/clang/docs/OverflowBehaviorTypes.rst
new file mode 100644
index 0000000000000..18e337e181e8a
--- /dev/null
+++ b/clang/docs/OverflowBehaviorTypes.rst
@@ -0,0 +1,269 @@
+=====================
+OverflowBehaviorTypes
+=====================
+
+.. contents::
+ :local:
+
+Introduction
+============
+
+Clang provides a type attribute that allows developers to have fine-grained control
+over the overflow behavior of integer types. The ``overflow_behavior``
+attribute can be used to specify how arithmetic operations on a given integer
+type should behave upon overflow. This is particularly useful for projects that
+need to balance performance and safety, allowing developers to enable or
+disable overflow checks for specific types.
+
+The attribute can be enabled using the compiler option
+``-foverflow-behavior-types``.
+
+The attribute syntax is as follows:
+
+.. code-block:: c++
+
+ __attribute__((overflow_behavior(behavior)))
+
+Where ``behavior`` can be one of the following:
+
+* ``wrap``: Specifies that arithmetic operations on the integer type should
+ wrap on overflow. This is equivalent to the behavior of ``-fwrapv``, but it
+ applies only to the attributed type and may be used with both signed and
+ unsigned types. When this is enabled, UBSan's integer overflow and integer
+ truncation checks (``signed-integer-overflow``,
+ ``unsigned-integer-overflow``, ``implicit-signed-integer-truncation``, and
+ ``implicit-unsigned-integer-truncation``) are suppressed for the attributed
+ type.
+
+* ``no_wrap``: Specifies that arithmetic operations on the integer type should
+ be checked for overflow. When using the ``signed-integer-overflow`` sanitizer
+ or when using ``-ftrapv`` alongside a signed type, this is the default
+ behavior. Using this, one may enforce overflow checks for a type even when
+ ``-fwrapv`` is enabled globally.
+
+This attribute can be applied to ``typedef`` declarations and to integer types directly.
+
+Examples
+========
+
+Here is an example of how to use the ``overflow_behavior`` attribute with a ``typedef``:
+
+.. code-block:: c++
+
+ typedef unsigned int __attribute__((overflow_behavior(no_wrap))) non_wrapping_uint;
+
+ non_wrapping_uint add_one(non_wrapping_uint a) {
+ return a + 1; // Overflow is checked for this operation.
+ }
+
+Here is an example of how to use the ``overflow_behavior`` attribute with a type directly:
+
+.. code-block:: c++
+
+ int mul_alot(int n) {
+ int __attribute__((overflow_behavior(wrap))) a = n;
+ return a * 1337; // Potential overflow is not checked and is well-defined
+ }
+
+"Well-defined" overflow is consistent with two's complement wrap-around
+semantics and won't be removed via eager compiler optimizations (like some
+undefined behavior might).
+
+Overflow behavior types are implicitly convertible to and from built-in
+integral types.
+
+Note that C++ overload set formation rules treat promotions to and from
+overflow behavior types the same as normal integral promotions and conversions.
+
+Interaction with Command-Line Flags and Sanitizer Special Case Lists
+====================================================================
+
+The ``overflow_behavior`` attribute interacts with sanitizers, ``-ftrapv``,
+``-fwrapv``, and Sanitizer Special Case Lists (SSCL) by wholly overriding these
+global flags. The following table summarizes the interactions:
+
+.. list-table:: Overflow Behavior Precedence
+ :widths: 15 15 15 15 20 15
+ :header-rows: 1
+
+ * - Behavior
+ - Default(No Flags)
+ - -ftrapv
+ - -fwrapv
+ - Sanitizers
+ - SSCL
+ * - ``overflow_behavior(wrap)``
+ - Wraps
+ - No trap
+ - Wraps
+ - No report
+ - Overrides SSCL
+ * - ``overflow_behavior(no_wrap)``
+ - Traps
+ - Traps
+ - Traps
+ - Reports
+ - Overrides SSCL
+
+It is important to note the distinction between signed and unsigned types. For
+unsigned integers, which wrap on overflow by default, ``overflow_behavior(no_wrap)``
+is particularly useful for enabling overflow checks. For signed integers, whose
+overflow behavior is undefined by default, ``overflow_behavior(wrap)`` provides
+a guaranteed wrapping behavior.
+
+The ``overflow_behavior`` attribute can be used to override the behavior of
+entries from a :doc:`SanitizerSpecialCaseList`. This is useful for allowlisting
+specific types into overflow instrumentation.
+
+Promotion Rules
+===============
+
+The promotion rules for overflow behavior types are designed to preserve the
+specified overflow behavior throughout an arithmetic expression. They differ
+from standard C/C++ integer promotions but in a predictable way, similar to
+how ``_Complex`` and ``_BitInt`` have their own promotion rules.
+
+* **OBT and Standard Integer Type**: In an operation involving an overflow
+ behavior type (OBT) and a standard integer type, the result will have the
+ type of the OBT, including its overflow behavior, sign, and bit-width. The
+ standard integer type is implicitly converted to match the OBT.
+
+ .. code-block:: c++
+
+ typedef char __attribute__((overflow_behavior(no_wrap))) no_wrap_char;
+ // The result of this expression is no_wrap_char.
+ no_wrap_char c;
+ unsigned long ul;
+ auto result = c + ul;
+
+* **Two OBTs of the Same Kind**: When an operation involves two OBTs of the
+ same kind (e.g., both ``wrap``), the result will have the larger of the two
+ bit-widths. If the bit-widths are the same, an unsigned type is favored over
+ a signed one.
+
+ .. code-block:: c++
+
+ typedef unsigned char __attribute__((overflow_behavior(wrap))) u8_wrap;
+ typedef unsigned short __attribute__((overflow_behavior(wrap))) u16_wrap;
+ // The result of this expression is u16_wrap.
+ u8_wrap a;
+ u16_wrap b;
+ auto result = a + b;
+
+* **Two OBTs of Different Kinds**: In an operation between a ``wrap`` and a
+ ``no_wrap`` type, a ``no_wrap`` is produced. It is recommended to avoid such
+ operations, as Clang may emit a warning for such cases in the future.
+ Regardless, the resulting type matches the bit-width, sign and behavior of
+ the ``no_wrap`` type.
+
+Diagnostics
+===========
+
+Clang provides diagnostics to help developers manage overflow behavior types.
+
+-Wimplicitly-discarded-overflow-behavior
+----------------------------------------
+
+This warning is issued when an overflow behavior type is implicitly converted
+to a standard integer type, which may lead to the loss of the specified
+overflow behavior.
+
+.. code-block:: c++
+
+ typedef int __attribute__((overflow_behavior(wrap))) wrapping_int;
+
+ void some_function(int);
+
+ void another_function(wrapping_int w) {
+ some_function(w); // warning: implicit conversion from 'wrapping_int' to
+ // 'int' discards overflow behavior
+ }
+
+To fix this, you can explicitly cast the overflow behavior type to a standard
+integer type.
+
+.. code-block:: c++
+
+ typedef int __attribute__((overflow_behavior(wrap))) wrapping_int;
+
+ void some_function(int);
+
+ void another_function(wrapping_int w) {
+ some_function(static_cast<int>(w)); // OK
+ }
+
+This warning acts as a group that includes
+``-Wimplicitly-discarded-overflow-behavior-pedantic`` and
+``-Wimplicitly-discarded-overflow-behavior-assignment``.
+
+-Wimplicitly-discarded-overflow-behavior-pedantic
+-------------------------------------------------
+
+A less severe version of the warning, ``-Wimplicitly-discarded-overflow-behavior-pedantic``,
+is issued for implicit conversions from an unsigned wrapping type to a standard
+unsigned integer type. This is considered less problematic because both types
+have well-defined wrapping behavior, but the conversion still discards the
+explicit ``overflow_behavior`` attribute.
+
+.. code-block:: c++
+
+ typedef unsigned int __attribute__((overflow_behavior(wrap))) wrapping_uint;
+
+ void some_function(unsigned int);
+
+ void another_function(wrapping_uint w) {
+ some_function(w); // warning: implicit conversion from 'wrapping_uint' to
+ // 'unsigned int' discards overflow behavior
+ // [-Wimplicitly-discarded-overflow-behavior-pedantic]
+ }
+
+-Wimplicitly-discarded-overflow-behavior-assignment
+---------------------------------------------------
+
+This warning is issued when an overflow behavior type is implicitly converted
+to a standard integer type as part of an assignment, which may lead to the
+loss of the specified overflow behavior. This is a more specific version of
+the ``-Wimplicitly-discarded-overflow-behavior`` warning, and it is off by
+default.
+
+.. code-block:: c++
+
+ typedef int __attribute__((overflow_behavior(wrap))) wrapping_int;
+
+ void some_function() {
+ wrapping_int w = 1;
+ int i = w; // warning: implicit conversion from 'wrapping_int' to 'int'
+ // discards overflow behavior
+ // [-Wimplicitly-discarded-overflow-behavior-assignment]
+ }
+
+To fix this, you can explicitly cast the overflow behavior type to a standard
+integer type.
+
+.. code-block:: c++
+
+ typedef int __attribute__((overflow_behavior(wrap))) wrapping_int;
+
+ void some_function() {
+ wrapping_int w = 1;
+ int i = static_cast<int>(w); // OK
+ int j = (int)w; // C-style OK
+ }
+
+
+-Woverflow-behavior-attribute-ignored
+-------------------------------------
+
+This warning is issued when the ``overflow_behavior`` attribute is applied to
+a type that is not an integer type.
+
+.. code-block:: c++
+
+ typedef float __attribute__((overflow_behavior(wrap))) wrapping_float;
+ // warning: 'overflow_behavior' attribute only applies to integer types;
+ // attribute is ignored [-Woverflow-behavior-attribute-ignored]
+
+ typedef struct S { int i; } __attribute__((overflow_behavior(wrap))) S_t;
+ // warning: 'overflow_behavior' attribute only applies to integer types;
+ // attribute is ignored [-Woverflow-behavior-attribute-ignored]
+
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 970825c98fec1..0c9139cb252d2 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -378,6 +378,8 @@ New Compiler Flags
- New options ``-g[no-]key-instructions`` added, disabled by default. Reduces jumpiness of debug stepping for optimized code in some debuggers (not LLDB at this time). Not recommended for use without optimizations. DWARF only. Note both the positive and negative flags imply ``-g``.
+- New option ``-foverflow-behavior-types`` added to enable parsing of the ``overflow_behavior`` type attribute.
+
Deprecated Compiler Flags
-------------------------
@@ -478,6 +480,10 @@ related warnings within the method body.
- Clang will print the "reason" string argument passed on to
``[[clang::warn_unused_result("reason")]]`` as part of the warning diagnostic.
+- Introduced a new type attribute ``__attribute__((overflow_behavior))`` which
+ currently accepts either ``wrap`` or ``no_wrap`` as an argument, enabling
+ type-level control over overflow behavior.
+
Improvements to Clang's diagnostics
-----------------------------------
diff --git a/clang/docs/SanitizerSpecialCaseList.rst b/clang/docs/SanitizerSpecialCaseList.rst
index 2c50778d0f491..f4c9274b89b68 100644
--- a/clang/docs/SanitizerSpecialCaseList.rst
+++ b/clang/docs/SanitizerSpecialCaseList.rst
@@ -133,6 +133,40 @@ precedence. Here are a few examples.
fun:*bar
fun:bad_bar=sanitize
+Interaction with Overflow Behavior Types
+----------------------------------------
+
+The ``overflow_behavior`` attribute provides a more granular, source-level
+control that takes precedence over the Sanitizer Special Case List. If a type
+is given an ``overflow_behavior`` attribute, it will override any matching
+``type:`` entry in a special case list.
+
+This allows developers to enforce a specific overflow behavior for a critical
+type, even if a broader rule in the special case list would otherwise disable
+instrumentation for it.
+
+.. code-block:: bash
+
+ $ cat ignorelist.txt
+ # Disable signed overflow checks for all types by default.
+ [signed-integer-overflow]
+ type:*
+
+ $ cat foo.c
+ // Force 'critical_type' to always have overflow checks,
+ // overriding the ignorelist.
+ typedef int __attribute__((overflow_behavior(no_wrap))) critical_type;
+
+ void foo(int x) {
+ critical_type a = x;
+ a++; // Overflow is checked here due to the 'no_wrap' attribute.
+
+ int b = x;
+ b++; // Overflow is NOT checked here due to the ignorelist.
+ }
+
+For more details on overflow behavior types, see :doc:`OverflowBehaviorTypes`.
+
Format
======
diff --git a/clang/docs/UndefinedBehaviorSanitizer.rst b/clang/docs/UndefinedBehaviorSanitizer.rst
index 0a2d833783e57..f98eda1e9399c 100644
--- a/clang/docs/UndefinedBehaviorSanitizer.rst
+++ b/clang/docs/UndefinedBehaviorSanitizer.rst
@@ -380,6 +380,28 @@ This attribute may not be
supported by other compilers, so consider using it together with
``#if defined(__clang__)``.
+Disabling Overflow Instrumentation with ``__attribute__((overflow_behavior(wrap)))``
+------------------------------------------------------------------------------------
+
+For more fine-grained control over how integer overflow is handled, you can use
+the ``__attribute__((overflow_behavior(wrap)))`` attribute. This attribute can
+be applied to ``typedef`` declarations and integer types to specify that
+arithmetic operations on that type should wrap on overflow. This can be used to
+disable overflow sanitization for specific types, while leaving it enabled for
+all other types.
+
+For more information, see :doc:`OverflowBehaviorTypes`.
+
+Enforcing Overflow Instrumentation with ``__attribute__((overflow_behavior(no_wrap)))``
+---------------------------------------------------------------------------------------
+
+Conversely, you can use ``__attribute__((overflow_behavior(no_wrap)))`` to
+enforce overflow checks for a specific type, even when ``-fwrapv`` is enabled
+globally. This is useful for ensuring that critical calculations are always
+checked for overflow, regardless of the global compiler settings.
+
+For more information, see :doc:`OverflowBehaviorTypes`.
+
Suppressing Errors in Recompiled Code (Ignorelist)
--------------------------------------------------
diff --git a/clang/docs/index.rst b/clang/docs/index.rst
index 6c792af66a62c..3205709aa395a 100644
--- a/clang/docs/index.rst
+++ b/clang/docs/index.rst
@@ -40,6 +40,7 @@ Using Clang as a Compiler
SanitizerCoverage
SanitizerStats
SanitizerSpecialCaseList
+ OverflowBehaviorTypes
BoundsSafety
BoundsSafetyAdoptionGuide
BoundsSafetyImplPlans
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 2b9cd035623cc..dbfc8df6fba5e 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -14,6 +14,7 @@
#ifndef LLVM_CLANG_AST_ASTCONTEXT_H
#define LLVM_CLANG_AST_ASTCONTEXT_H
+#include "Type.h"
#include "clang/AST/ASTFwd.h"
#include "clang/AST/CanonicalType.h"
#include "clang/AST/CommentCommandTraits.h"
@@ -259,6 +260,7 @@ class ASTContext : public RefCountedBase<ASTContext> {
mutable llvm::ContextualFoldingSet<DependentBitIntType, ASTContext &>
DependentBitIntTypes;
mutable llvm::FoldingSet<BTFTagAttributedType> BTFTagAttributedTypes;
+ mutable llvm::FoldingSet<OverflowBehaviorType> OverflowBehaviorTypes;
llvm::FoldingSet<HLSLAttributedResourceType> HLSLAttributedResourceTypes;
llvm::FoldingSet<HLSLInlineSpirvType> HLSLInlineSpirvTypes;
@@ -899,6 +901,8 @@ class ASTContext : public RefCountedBase<ASTContext> {
bool isTypeIgnoredBySanitizer(const SanitizerMask &Mask,
const QualType &Ty) const;
+ bool isUnaryOverflowPatternExcluded(const UnaryOperator *UO);
+
const XRayFunctionFilter &getXRayFilter() const {
return *XRayFilter;
}
@@ -999,6 +1003,15 @@ class ASTContext : public RefCountedBase<ASTContext> {
comments::FullComment *getCommentForDecl(const Decl *D,
const Preprocessor *PP) const;
+ /// Attempts to merge two types that may be OverflowBehaviorTypes.
+ ///
+ /// \returns A QualType if the types were handled, std::nullopt otherwise.
+ /// A null QualType indicates an incompatible merge.
+ std::optional<QualType>
+ tryMergeOverflowBehaviorTypes(QualType LHS, QualType RHS, bool OfBlockPointer,
+ bool Unqualified, bool BlockReturnType,
+ bool IsConditionalOperator);
+
/// Return parsed documentation comment attached to a given declaration.
/// Returns nullptr if no comment is attached. Does not look at any
/// redeclarations of the declaration.
@@ -1843,6 +1856,13 @@ class ASTContext : public RefCountedBase<ASTContext> {
QualType getBTFTagAttributedType(const BTFTypeTagAttr *BTFAttr,
QualType Wrapped) const;
+ QualType getOverflowBehaviorType(const OverflowBehaviorAttr *Attr,
+ QualType Wrapped) const;
+
+ QualType
+ getOverflowBehaviorType(OverflowBehaviorType::OverflowBehaviorKind Kind,
+ QualType Wrapped) const;
+
QualType getHLSLAttributedResourceType(
QualType Wrapped, QualType Contained,
const HLSLAttributedResourceType::Attributes &Attrs);
diff --git a/clang/include/clang/AST/ASTNodeTraverser.h b/clang/include/clang/AST/ASTNodeTraverser.h
index 8ebabb2bde10d..e684dcd5c27e8 100644
--- a/clang/include/clang/AST/ASTNodeTraverser.h
+++ b/clang/include/clang/AST/ASTNodeTraverser.h
@@ -445,6 +445,9 @@ class ASTNodeTraverser
void VisitBTFTagAttributedType(const BTFTagAttributedType *T) {
Visit(T->getWrappedType());
}
+ void VisitOverflowBehaviorType(const OverflowBehaviorType *T) {
+ Visit(T->getUnderlyingType());
+ }
void VisitHLSLAttributedResourceType(const HLSLAttributedResourceType *T) {
QualType Contained = T->getContainedType();
if (!Contained.isNull())
diff --git a/clang/include/clang/AST/Expr.h b/clang/include/clang/AST/Expr.h
index 523c0326d47ef..4fbea7272d004 100644
--- a/clang/include/clang/AST/Expr.h
+++ b/clang/include/clang/AST/Expr.h
@@ -1477,6 +1477,14 @@ class DeclRefExpr final
return DeclRefExprBits.IsImmediateEscalating;
}
+ bool isOverflowBehaviorDiscarded() const {
+ return DeclRefExprBits.IsOverflwBehaviorDiscarded;
+ }
+
+ void setOverflowBehaviorDiscarded(bool Set) {
+ DeclRefExprBits.IsOverflwBehaviorDiscarded = Set;
+ }
+
void setIsImmediateEscalating(bool Set) {
DeclRefExprBits.IsImmediateEscalating = Set;
}
diff --git a/clang/include/clang/AST/PropertiesBase.td b/clang/include/clang/AST/PropertiesBase.td
index 1215056ffde1b..26dd53760c3a2 100644
--- a/clang/include/clang/AST/PropertiesBase.td
+++ b/clang/include/clang/AST/PropertiesBase.td
@@ -81,6 +81,8 @@ def AutoTypeKeyword : EnumPropertyType;
def Bool : PropertyType<"bool">;
def BuiltinTypeKind : EnumPropertyType<"BuiltinType::Kind">;
def BTFTypeTagAttr : PropertyType<"const BTFTypeTagAttr *">;
+def OverflowBehaviorKind
+ : EnumPropertyType<"OverflowBehaviorType::OverflowBehaviorKind">;
def CallingConv : EnumPropertyType;
def DeclarationName : PropertyType;
def DeclarationNameKind : EnumPropertyType<"DeclarationName::NameKind">;
diff --git a/clang/include/clang/AST/RecursiveASTVisitor.h b/clang/include/clang/AST/RecursiveASTVisitor.h
index 5cb2f57edffe4..4a681a58b1cc9 100644
--- a/clang/include/clang/AST/RecursiveASTVisitor.h
+++ b/clang/include/clang/AST/RecursiveASTVisitor.h
@@ -1151,6 +1151,9 @@ DEF_TRAVERSE_TYPE(CountAttributedType, {
DEF_TRAVERSE_TYPE(BTFTagAttributedType,
{ TRY_TO(TraverseType(T->getWrappedType())); })
+DEF_TRAVERSE_TYPE(OverflowBehaviorType,
+ { TRY_TO(TraverseType(T->getUnderlyingType())); })
+
DEF_TRAVERSE_TYPE(HLSLAttributedResourceType,
{ TRY_TO(TraverseType(T->getWrappedType())); })
@@ -1462,6 +1465,9 @@ DEF_TRAVERSE_TYPELOC(CountAttributedType,
DEF_TRAVERSE_TYPELOC(BTFTagAttributedType,
{ TRY_TO(TraverseTypeLoc(TL.getWrappedLoc())); })
+DEF_TRAVERSE_TYPELOC(OverflowBehaviorType,
+ { TRY_TO(TraverseTypeLoc(TL.getWrappedLoc())); })
+
DEF_TRAVERSE_TYPELOC(HLSLAttributedResourceType,
...
[truncated]
|
fb45ed3
to
c301e29
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple comments from a quick glance.
645069d
to
a5d8ad5
Compare
gentle ping. |
fad8d73
to
579a796
Compare
ping. I rebased due to some small conflicts. question to reviewers: Should I squash these ~10 commits into one for ease of review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments, I will try a more through review later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about inner qualifiers? The patch doesn't address the situation.
For a test like:
using ConstInt = const int;
using WrapConstInt1 = ConstInt __attribute__((overflow_behavior(wrap)));
using WrapConstInt2 = const int __attribute__((overflow_behavior(wrap)));
WrapConstInt1 and WrapConstInt2 should be the same type, as this is what makes most sense, I think.
This would be the same rules as qualifiers on array types, the inner and outer qualifiers mean the same thing and are added together for canonicalization.
thanks @mizvekov for the initial review. I've tried to address all your concerns with commit As for your comment:
Do you think maybe that should be another PR that I can send later on? Should it come before this PR? Where do you think a better home is for that overflow pattern elision stuff? And for your inner qualifiers question:
How should we address these? FWIW, currently, the two types in your example do not compare equal: std::cout << std::boolalpha
<< std::is_same<WrapConstInt1, WrapConstInt2>::value << "\n"; // false I am not sure what you mean by "inner qualifiers" from your example. My understanding of inner qualifiers involves pointer types like |
I think this is a simple enough change, it's just about moving where the function is implemented. I think Type.h would be a better fit, but I'd wait a little bit for other opinions before changing anything.
So there are two positions qualifiers can appear in this type.
It doesn't seem plausible to me that these qualifiers have different meanings, that is to say, that the two types in my example should be different types, instead of two spellings of the same type. One solution here would be to adopt the same rules as ArrayTypes for this situation. You can implement this by changing your canonicalization for |
I'm getting conflicting ideas from this sentence. Do you mean to say that the two types from your example should be the same type? |
All good, your review is vital. I really have no personal attachments to any particular design decisions made -- I want to make sure this feature is useful to as many developers as possible. Quick question before I go meet with Kees: How do you propose we handle overflow semantics with less-than-int stuff. The core reason all of this narrowing and promotion magic exists is to solve the problem of: u8 __ob_wrap a = 255;
if (a+1 == 0){...} // 1 is `int`, therefore whole expression promoted to `int`.
// Whoops! We no longer wrap-around at the 8-bit boundary because we are in `int` type boundaries now. Not truncating things back down to the wrap type renders the So, @ojhunt how do you ideally want all these narrowing semantics to work? |
You would need to give me some indication that this is a behavior developers want or expect. I think it's useful to think about how this attribute will be used in practice, including the linux kernel: #if __has_attribute(...) // or has keyword or whatever
#define __wrap_attr __obt_wrap
#else
#define __wrap_attr
#endif
Given that usage - which is how it would have to be set up - how would your example of uint8_t __wrap_attr a = 255;
if (a + 1 == 0) { ... } To work in any real use case? It will not work on any compiler that does not have this attribute. But this code is also fundamentally not realistic - a real version of this code would look like: uint8_t __wrap_attr a = 255;
if (a + 1 > 0xff) { ... } Except with the __wrap_attr this will no longer ever be true. In fact any type bounds check like this does not work: if (some_int + some_intN > __INTN_MAX__) The only case that the truncation of large values does is your above
Why do you think you need to do this? The problem with overflow is not "the value is larger than I expected" it is "overflow leads to UB allowing compiler introduced vulnerabilities", and "ftrapv and fwrapv are global, but we want different behavior in different contexts". This change to narrowing does not have anything to do with that.
The exact same way promotion works today: there should not be any narrowing. There is no reason to change this behavior to narrow: it makes the feature confusing, introduces security vulnerabilities, makes adoption unusable, and literally discards data in ways that produce warnings in many default configuration. Implicit narrowing is widely regarded as a mistake nowadays, and this not only adds a new version of implicit narrowing, it removes the warnings about that narrowing. Returning to your example: if (a + 1 == 0) { .. } this is not something I believe anyone would expect, and more to the point is not something anyone could actually use. Removing narrowing does not make the wrapping behavior meaningless: int f(bool b) {
int8_t __obt_trap a = 1;
int32_t i = __INT_MAX__ ;
if (b)
i = i + a; // int32_t + uint8_t __obt_trap => int32_t + int32_t __obt_trap => trap due to the overflow
return i;
} But more to the point, consider this: int64_t a = ...;
int32_t __obt_wrap /* or __obt_trap */ b = ...;
int64_t result = a + b; Would any developer would expect a truncated 32bit result? or an overflow trap? That doesn't match We can also consider an unsigned version of the above: uint64_t a = ...;
uint32_t __obt_wrap /* or __obt_trap */ b = ...;
uint64_t result = a + b; Overflow of Basically, rather that asking "how would template <class A, class B> void f(A a, B b) {
static_assert(same_type_v<decltype((A)0 + (B __wrap)0), decltype((A)0 + (B)0) __obt_wrap>);
static_assert(same_type_v<decltype((A __wrap)0 + (B)0), decltype((A)0 + (A)0) __obt_wrap>);
static_assert(same_type_v<decltype((A)0 + (B __obt_trap)0), decltype((A)0 + (B)0) __obt_trap>);
static_assert(same_type_v<decltype((A __obt_trap)0 + (B)0), decltype((A)0 + (A)0) __obt_trap>);
} Because these qualifiers introduce implicit exposure that is not directly exposed in local source, e.g that types below could be uint64_t u64 = 1ull<<32;
uint32_t __wrap u32 = 1;
u64 = u64 + u32; Which from what I can tell results in Let's imagine a warning about this then all of the existing correct code now needs to have casts everywhere, and those casts can easily drop the qualifier, ie. the above would need to be "corrected" with uint64_t u64 = 1ull<<32;
uint32_t __obt_wrap u32 = 1;
u64 = u64 + (some_type)u32; But what should |
I apologize I haven't carefully read @ojhunt 's self-admitted wall of text, but it seems to me like a possible next step would be to set up a call to come up with some acceptable compromise. My high-level input is, I wonder if Oliver's concerns about
If you think about the canonical use case for intentional unsigned integer overflow, it's probably cryptographic algorithms, where the algorithms are usually defined in terms of modular arithmetic. If we call it that, something math-y programmers will be less likely to reach for the feature when they are simply trying to implement an overflow check. As for |
As I said previously, this behavior would be unusable in any codebase that needs to share code with compilers that don't have this behavior, and special casing literals would be even worse. Consider real world existing code: // the existing *correct* version of the above:
if (x + 1 == 256) { .. } // now broken
// or more realistically
if (x + 1 > 255) { .. } // now broken For special casing literals: uint8_t __wrap x = 255;
if (x + 1 == 0) { ... } // "works"
constexpr uint8_t some_const = 1;
if (x + some_const == 0) { ... } // does not work Of course neither of these cases are compatible with code that needs to support non-__obt_wrap providing compilers.
my understanding is that this is: u64 = (u64 % __UINT_MAX__) + u32 I simply cannot believe any developer would ever expect this behavior.
There is no There's also some basic consistency questions: uint32_t __wrap u32 = 32;
uint64_t u64 = 1ull<<u32; The same logic that says
The question is: What is the use case for this feature? It sounds like the intent is to remove UB from overflow, and the presented use case is the linux kernel, but I cannot see how the kernel could possibly use this feature as currently defined.
It does not. The narrowing behavior means it is not remotely equivalent. If we insist on using this narrowing semantic, it should not be pretending that an unqualified type is "compatible" with the other sized type. There is absolutely no reason to produce an error message for uint32_t __wrap u32 = ..;
uint64_t __wrap u64 = ..;
return u32 + u64; but not produce an error for uint32_t __wrap u32 = ..;
uint64_t u64 = ..;
return u32 + u64; The correct way to approach to thinking about these qualifiers is that they are not a boolean: This has to be an error because as previously stated the alternative is literally introducing security vulnerabilities, and introducing them non locally. Consider: // SomeHeaderSomeWhere.h
struct S {
uint32_t count;
...
};
// SomeUnrelatedFile.cpp
...
SomeType *alloc(struct S* s, ) {
return (SomeType *)malloc(sizeof(SomeThing) * s->count);
} Later someone removes the UB: // SomeHeaderSomeWhere.h
struct S {
- uint32_t count;
+ uint32_t __wrap count;
...
}; The Code that is doing the correct bounds checks also now fails: uint64_t a = ...; // Generally comes from something like sizeof, so it would be size_t
uint32_t __wrap b = ...; // May come from a field, intermediate computation, etc
if ((a + b) < b) { .. } // unsigned overflow is well defined, so this test is correct, but now it is not Again, I cannot think of any case where anyone would expect @JustinStitt supporting the linux Kernel was presented explicitly as part of the justification, so I would really like you to check with the people in the linux kernel who are wanting this feature to see if they believe that this feature is usable, and in those discussions I would want their feedback on the examples of what I consider unsafe or incompatible behavioural changes directly, and the behavioral differences if they use the obvious adoption path of #if some_check_for_the_feature
#define __wrap_qualifier_name __obt_wrap
#define __trap_qualifier_name __obt_trap
#else
#define __wrap_qualifier_name
#define __trap_qualifier_name
#endif
struct S {
uint32_t __wrap_qualifier_name field;
};
...
alloc(sizeof(Thing) * someS.field); // under-allocates if __obt_wrap is available, works correctly if it is not
if (some_uint64 + someS.field < __UINT64_MAX__) ... // always succeeds if __obt_wrap is available I'm not sure what the exact ordering of narrowing, but that only matters for uint32_t __obt_trap a = ...;
uint64_t b = ...;
uint64_t c = a * b; if the order of operations is From a usability point of view, the narrowing behavior is the reason you are diagnosing an error on If the narrowing behavior is removed the result is:
Basically, the only real justification you have given for this behavior that you have given is the The only thing I have ever heard developers ask for is:
Again, I'd like to you to ask kernel folk if this is a behavior that they actually want (if they do, it would be extremely surprising to me), but certainly in every codebase I'm aware of, the feature would not only not be usable, it would need to be explicitly banned (as in: static assertions that is not enabled, linters to prevent it being introduced, etc). |
Note: I haven't read your entire response yet. I met with @kees (kernel hardening fellow) off-thread and I believe he is crafting a comment regarding kernel use to post here. We read your review comments (except this most recent one) and discussed the path forward. I don't want to preempt his reply, though. At any rate, we should get some clang folks together and have some more live discussions. ...Now, back to reading another wall of text 😉 |
It's a skill.... whether it's a good skill is an entirely different question :D |
Hello! Time for my wall of text. ;) But no joke, @ojhunt thank you for your walls of text. I found your examples and thought processes really helpful in trying to find a way to conceptually navigate this space. I'm also glad to see some themes in your more recent reply mirror some of the discussion @JustinStitt and I had today. But let me dive in: Firstly, I see much of the discussion focuses on I took a look through a random handful of recent Linux integer overflow flaws that got fixed, and 9 out of 10 had no dependency on the alternative narrowing, so using standard narrowing still shows a strong benefit to Linux. So I'm not opposed to using standard narrowing, but I do want to explore the goals around why we wanted it originally. First, to give an example of where alternative narrowing is helpful, which I'll discuss below: u16 calculate(...) {
u16 __ob_trap result = 0;
int something;
// ...
return result - something;
} If we wanted anything involving a Observations:
So, the goal of narrowing the mismatched Kind, I realize now, was mainly around C's ambiguity of arithmetic operation's width given commutative operation. If operations were directional we might infer bit width from the "value being operated on": "subtract Looking at Rust's arithmetic, there is just simply no mismatched bit widths allowed. Things need to be explicitly cast, so there is no ambiguity about bit width. Going all the way to this requirement feels like the furthest swing away from usability in C, especially with the goal of slowly adding So, I think the spectrum of solution is: "require matching kinds and widths" to "use standard narrowing". @JustinStitt and I considered two possible compromises:
The case that I find most troublesome (and has the largest security impact) is that of wrapping positive during int promotion, which I think is handled by making sure that promotions always promote toward int a, b, c;
a = INT_MAX;
b = INT_MAX - 99;
c = a * b; // wraps to 100 If I'm not sure my wall of text ended up as organized as I want it, but I think the problem we saw as unsolved was dealing with commutativity for catching overflows meaningfully: we either need directionality or homogeneity. Directionality is hard to articulate in the language, and homogeneity requires assumptions/ambiguity. |
I'm sorry for not having previously contributed to this discussion, but I was wondering if we had considered using a pragma which could be attached to a lexical scope (like |
To be honest, not having to worry all the time about that in Rust is actually quite nice, and while skimming this discussion I felt it would be the simplest approach here too. But, yeah, some kernel developers may not be thrilled about it being an error -- it is very not-C-like. It would also require building the kernel often with a newer Clang, to catch all missing casts. It also feels a bit orthogonal to the overflow behavior (but since they do not behave the same anyway, it is a fair opportunity to introduce integers with less surprises...). |
I'm really sorry, I tried to trim this down. So here's a couple of the new problems I've added to this one:
** actual wall **
The focus on But narrowing with uint32_t __obt_trap u32 = ...;
uint64_t u64 = ...;
uint64_t result = u32 + u64; Lets assume
And of course the same virality issues applies, where a single such attributed type will cause narrowing throughout an entire expression evaluation, which is either truncated incorrectly, or traps incorrectly.
Right - my entire problem is the narrowing, as described previously I believe it renders it unadoptable in the kernel due to the completely different semantics, and even ignoring the compatibility problems, I believe that the semantics will be completely unexpected. The question I'm wanting kernel folk to answer is whether they would be ok with something like uint8_t __obt_trap_wrapper u;
if (u + some_other_value > 0xff) {
} Either always returning false (wrapping to uint8_t), or trapping in the overflow (?) check due to early narrowing, or returning false incorrectly but only if obt is enabled, and if obt is not enabled the only overflow exists on the promoted, but that can be prevented with: uint8_t __obt_trap_wrapper u;
if (__UINT32_T_MAX__ - u < some_other_value)
return;
if (u + some_other_value > 0xff) {
} But now even that overflow test is incorrect, no matter how or when the truncation happens: the overflow test now traps in some cases and produces incorrect false results in others.
I don't believe this statement to be true. The model I believe is correct is that the above is equivalent to: u16 calculate(...) {
u16 __ob_trap result = 0;
int something;
i32 __ob_trap widened_result = result; // I _think_ this is the correct widening rule :D
i32 __ob_trap temp1 = widened_result - (i32 __ob_trap)something;
u16 __ob_trap temp2 = (u16 __ob_trap)temp1; // traps if this overflows
return temp2;
} The issue is not trapping/wrapping on narrowing, it's that the narrowing is happening at a point where all existing mismatched width integer arithmetic either widens, or is an error.
I think this is the misunderstanding - the overflow mode would propagate: if there's an operation where only one input has an explicit overflow behavior, that is propagated, and I would really want removal of those tags to require an explicit cast, e.g. u16 calculate(...) {
u16 __ob_trap result = 0;
int something;
return (u16)(result - something);
} To address unintentional truncation, my belief is that this would be semantically equivalent to One thing I think would be interesting would be whether such explicit casts would be required to explicitly drop the obt mode in such casts, something like
The thing is that I believe developers have no difficulty understanding the widening behavior, but this narrowing behaviour is extremely unintuitive: why would adding a narrower value to a wider one ever be expected to produce a narrowed result? Thing I just realised All this discussion has been in the context of differing width, what happens with same width but mismatched signedness? e.g u32 __obt_X a = ...;
i32 b = ...;
type1 r1 = a + b;
u32 c = ...;
i32 __obt_X d = ...;
type2 r2 = c + d; I think my understanding of the expected semantics would mean I think we would also want to warn if we see at least a warning for things like: (target_type)differently_signed_value + target_type_value Because the unqualified target_type cast could unintentionally truncate/overflow (dropping negative, wrapping to negative, etc).
Honestly I think that would not be terrible, it would certainly be preferable to unexpected narrowing, but you get into ergonomic issues with literals u16 __obt_mode x = ...;
x + 1 // a literal
x + (1 << 10) // not a literal but generally treated as such
const int refactored = 1 << 10;
x + refactored // no longer a literal but what should be a semantic no-op
Right, I think widening should be permitted, though the widening would propagate obt qualifiers, dropping obt qualifiers should require a cast. It's probably worth emitting a warning in compound expressions that include arithmetic on unqualified values leading into arithmetic with qualified ones.
I still don't understand what hazard you are trying to prevent here, narrowing only applies if the obt type is the narrower/different type, so:
These semantics also mean you get sensible and consistent behaviour for obt qualified bitfields as well (i'm not sure if the current proposal permits application to bitfields, but these semantics result in this being reasonable): struct S {
uint32_t __obt_trap field: 20;
}
// with narrowing consistency would imply that this actually should be
// uint20_t __obt_trap. With standard widening that weirdness goes away
// and the result is uint32_t __obt_trap
auto u32 = 100 + someS.field;
S.field = u32; // this then traps if the value is >= 1<<20
See i don't understand why this would be expected behavior - I (and i would expect most devs) would expect the result to be
As stated above that safety depends on the point at which the narrowing truncation happens. If such operations are performed by first narrowing the wider type, if that loses information without trapping/wrapping, you truncate from the correct behavior, resulting in incorrect result and that result may be invalid, and you trap on the narrowing operation you can end up with operations that would have produced an in bounds result would trap. I've given examples of where the cast point may impact out of bounds result, but the in bounds result case is also worth considering: uint64_t u64 = 1ull<<32;
uint32_t __obt_trap u32 = 1;
// converting u64 to u32 at the start traps incorrectly
uint32_t __obt_trap result = u64 - u32;
Yup, that's why I wondered if we want to issue warnings if part of an integer expression is qualified and part is not but performs operations that may overflow
I do not believe any approach that involves implicit narrowing is safe - it makes conditional adoption unsound due to the extremely different semantics across compilers, it changes deduced types, and it makes code that would reasonably be expected to produce a widened result (both due to that being the semantics everywhere else, and because it does not make sense to narrow one of the inputs). I think that the different types that this implicit narrowing produces can possibly result in abi/odr problems, that introduce security bugs: struct APIStruct {
u32 __conditional_obt_mode_wrapper a;
template <integral intype> auto f(std::span<intype> in) -> std::vector<decltype(a + in[0])> {
std::vector<decltype(a + *out)> v;
for (auto x : in)
v.push_back(x + a);
return v;
}
} These now have different sizes, and if the output of This is technically an ODR violation, but if your code is compiled with different compilers, or different modules, this seemingly innocuous "remove overflow UB" flag results in an ABI mismatch. The fix for this is to not ever use the qualifier in any non-local context - e.g. local variables only. The ODR violation is only due to the narrowing behaviour, but the viral nature of that narrowing means that the qualifier essentially becomes unsafe in any ABI facing position, or any environment where the Firstly, I really believe it would be entirely warranted to reject expressions with mismatching signedness if any are obt qualified. Beyond that change I really believe that the safe semantics are one of:
I think (1) is still problematic due to the smaller than int narrowing behaviour, e.g template<class T> struct Buffer {
unsigned size; // type and qualifier not relevant here
using data_type_t = T;
data_type_t *data;
std::span<data_type_t> span() { return std::span(data, size); }
};
// Again these types may not be specified explicitly, but aquired via macros,
// typedefs, template deduction, etc
template <class In> auto zip(auto f, std::span<In> a, std::span<In> b) -> std::vector<decltype(f(a[0], b[0]))> {
std::vector<decltype(f(a[0], b[0]))> result;
for (size_t i = 0; i < b.size(); i++) // Sssshhhh, assume a and b are the same size :D
result.push_back(f(a[i], b[i]));
return result;
}
void f(Buffer<uint8_t> buffer1, Buffer<uint8_t> buffer2) {
...
auto v = zip([](auto a, auto b){ return a + b; }, buffer1.span(), buffer2.span());
...
} At this point this produces a
It's unclear if even a warning for passing/storing values of type |
@ojhunt For what it's worth, check out #100272 and #104889 and #105709 (all related to Overflow Pattern Exclusions) Anyways, it is clear we cannot move forward with any of the current non-traditional narrowing rules. The question is where do we go from here? @rjmccall suggested some models in the RFC recently and @ojhunt has suggested more traditional and more strict options as well. My estimate from reading everyone's feedback is that there are maybe 3 or 4 different approaches. Of the approaches, most suggest that OBTs should behave more like regular arithmetic. Certain models mentioned by @kees and @ojhunt (and endorsed by @ojeda) introduce strict typing -- involving manual casts to strip or add behavior; a rust-style approach. Everything must be the same bitwidth, signed ness and OBT kind. This particular strictness seems hard to adopt in the Linux kernel but removes 99% of mathematical ambiguity. Other models (like wraps attribute feature I tried to land 1.5 years ago) involve persisting OBT kind through the usual arithmetic conversions and using them to trap on truncation. From what I've read, most agree that We should schedule a meeting, when does the Clang Area Team meet next? Can we hijack part of that meeting? @AaronBallman |
Yeah, many of these idioms are "UB" despite clearly having well defined behavior on every real platform, but I see the general idea of this feature being to aid these cases, but due to other browsers we may want to warn on those patterns regardless :-/
I think honestly both solutions are worthwhile independently: there are numerous cases where you want specific types to behavior deterministically in ways that are not UB, don't trigger sanitizers, etc. But there are also places where you have a sequence of code where you know overflows are not expected anywhere, and overflow is bad - e.g scoped specification of overflow semantics is separately useful. If both mechanisms are present at the same time with different semantics (say a scoped "trap on overflow", and a "wrap on overflow" variable might require a warning/error?)
I think that this is the only way to make this feature safe and adoptable - you can't have different semantics (beyond the UB removal) of unrelated behavior due to differing compilers (different compilers entirely, or different versions, that don't have this feature).
I think propagating the obt semantics is what would be what developers expect, but that might simply be projection of what I would expect. Basically I don't think anyone would expect an operation on an obt qualified type and a non-qualified type to discard that obt qualifier, and requiring explicit casts would be very verbose. Similarly I suspect mismatched size warnings/errors would be too noisy to be useful. That said I think the signedness mismatch is reasonably worth making an error - I think there are already warning modes that complain about such behavior without obt available, so this seems reasonable.
I disagree (edit: disagree with #if /* debug or fuzzing build */
#define overflow_qualifier obt_strict_trap
#else
#define overflow_qualifier
#129159 Which would change behavior between the modes in security relevant ways - e.g.
Good idea - there's also the dev meeting soon if people are present (though holy heck that's expensive O_o) (I think I successfully avoided wall of text!!!! \o/) |
As the clang area team, we generally prefer to schedule specific meetings for RFC discussions, so we can pick a timeslot that works for the people we expect to attend. If you think you need a meeting, please send an email to me and @rnk with the topic of discussion, who's leading the discussion (that person should prepare a few slides in advance), and a list of a few high-priority attendees. (The clang area team does have biweekly meetings, but we don't have time for deep technical discussions in that timeslot.) That said, looking over the discussion, I'm not sure anyone is advocating for a specific solution at the moment. You might want to just reach out directly to a couple people for a brainstorming session. |
With all the review it is clear that we have to choose some new semantics for OBTs. It is also important that OBTs are useful in their design and purpose for many projects. OBTs should provide type-level overflow behavior handling. With this goal in mind, there's two customers: 1) large existing codebases and 2) new projects. To help all kinds of projects I met with @kees again and we talked about separating some behaviors into separate modes. We identified two modes that would give the best path forward for the Linux kernel, other existing projects, and new projects. We were thinking of a "compliant" mode and a "strict" mode. Name bike-shedding welcome. :) @ojhunt We see your concerns with a strict mode and want to address them by ensuring code compatibility between all modes. Further below are some examples and design principles for the modes. First, let's make the distinction clear: The "compliant" mode would use traditional C promotion rules with the exception that the OBT qualifier is persisted through implicit casts. This allows us to get truncation signal during storage of less-than-int arithmetic results and overflow signal on other results. This mode would be the introductory mode for large projects that cannot make the direct jump to strict mode. @kees has shown that this compliant mode would still provide useful signal in the Linux kernel, where truncation accounts for a large percentage of the integer overflow flaws. An example of compliant mode: typedef unsigned short __ob_trap tu16;
tu16 a = 65535; // USHORT_MAX
int b = 1;
tu16 c = a + b; // a and b promoted to "__ob_trap int", traps on truncated assignment
// a+b is implicitly promoted to __ob_trap int
// result of a=b is 65536
// 65536 doesn't fit within tu16's storage space, so we can trap on the assignment The "strict" mode would require matching bitwidths and obt kinds which results in no ambiguity and provides homogeneity of arithmetic results to gain full visibility into potential overflows. For example, this matches the semantics of Rust. Nothing implicit happens in this mode. To make this mode most useful, it would need that casts to strict obt kinds get instrumented for overflow (more on this below). Same example with strict mode: typedef unsigned short __ob_strict_trap tu16;
tu16 a = 65535; // USHORT_MAX
int b = some();
tu16 c = a + b; // error: a and b have different types
solution 1:
// Make sure everything has the same type
tu16 a = 65535;
tu16 b = some(); // some() must also return tu16.
tu16 c = a + b; // OK, everything is the same type.
solution 2:
// add explicit casts
tu16 a = 65535;
int b = some();
tu16 c = a + (tu16)b; // OK, everything is the same type...
// ... but we must avoid potential silent dataloss during c-style cast. To make the "strict" mode usable, it would also necessitate the need for c-style casts to be instrumentable, otherwise we risk silent truncations. This cast instrumentation would be used to get full signal across arithmetic expressions being converted from compliant to strict mode. Take a look at this toy example case which shows the usefulness of strict obts, which catches the other major class of integer overflow the Linux kernel wants to catch: // vanilla C with -fwrapv (i.e. Linux kernel today)
int x = INT_MAX;
int y = (INT_MAX-99);
u8 sz = x * y; // sz is 100 due to wrap-around, no truncation
... = malloc(sz); // buggy malloc of 100 bytes
// compliant trap mode (would not catch this kind of overflow)
typedef unsigned char __ob_trap tu8;
int x = INT_MAX;
int y = (INT_MAX-99);
tu8 sz = x * y; // sz is 100 due to wrap-around, no truncation so no trap
... = malloc(sz); // buggy malloc 100 bytes
// strict trap mode
typedef unsigned char __ob_strict_trap tu8;
int x = INT_MAX;
int y = (INT_MAX-99);
tu8 sz = (tu8)x * (tu8)y; // strict mode forces same types, so add casts
// The casts will add instrumentation to catch data loss at runtime.
... = malloc(sz);
// When using strict obts the ultimate goal is to have code changed to all
// matching types instead of littering casts everywhere
// So, imagine a more opaque example with some() and other()
tu8 x = some(); // refactor these apis to use tu8
tu8 y = other();
tu8 sz = x * y; // now all types are the same with no casts locally.
... = malloc(sz); Refactoring to use __ob_strict_trap is still safe/stable when obts are unsupported, because the casts don't make anything worse. If, instead, we added optional bit-width/kind mismatch warnings to the "compliant" mode, we run the risk of bad casts (e.g. just "u8" above) being added, which would silently hide the overflow and silence the bit-width warning. It's also possible that we just need a stand-alone "strict" type qualifier that requires annotated types cannot participate in any implicit promotions. And this could then just be applied to the existing obts (or any other types). So the process for an existing project would be to migrate some types (e.g. size_t) via the compliant obts, and in other places (e.g. new types, new code, APIs, etc), use the strict obts. This should provide the greatest flexibility without compromising on coverage. For example: #if __has_attribute(overflow_behavior)
// "compliant" obt for an "existing" type
typedef unsigned long __ob_trap size_t;
// "strict" obt for a "new" type
typedef unsigned char __strict __ob_trap tu8;
#else
typedef unsinged long size_t;
typedef unsigned char tu8;
#endif It is important that strict mode doesn't carry different conversion semantics. It may only enforce stricter type rules requring explicit casts or type changes. typedef unsigned char __strict __ob_trap tu8;
extern void some(int);
void foo(int x, int y) {
// must get 'x' and 'y' to be of type 'tu8'
// old compilers will build this code just fine (and trap on truncation)
// obt-enabled compilers will fail to build as there are mismatching types
tu8 a = x * y;
some(a);
} ... Now convert the code to build with 'strict' mode. void foo(int x, int y) {
// old compiler will still build this just fine
// obt-enabled compilers will now build (and instrument explicit casts for data loss)
// no difference in result between compilers
tu8 a = (tu8)x * (tu8)y;
some(a);
} |
I certainly understand that "strict mode" is a model that works well in other languages — we also use this model in Swift — but I don't think it makes any sense to introduce it as a global mode in C, because it's simply too different from the normal semantics of the language. Having a You linked me to this post promising a detailed discussion of the model, but I can't quite piece out what overflow model you're actually proposing for the compliant mode. :) |
Sorry, I tried to fit all the information in my post without writing a book. To summarize, the model I propose for "compliant" mode best matches Model 2 from your comment on the RFC. The usual arithmetic conversions take place normally and we bubble up any obt qualifiers through implicit casts. Most of the time we can catch trivial overflow with int-or-greater types like int __ob_trap a = INT_MAX;
(a+1); // trap! but for less-than-int cases we can instrument the assignment or storage to trap on truncation: u8 __ob_trap a = 255;
u8 __ob_trap b = a + 1; // trap! (on assignment truncation)
// (a+1) is brought up to an `__ob_trap int` as per usual arithmetic conversions.
// that doesn't fit into u8 without data loss. so trap. |
Thanks. So IIUC, the rule is:
If that's right, that seems pretty reasonable. I still favor a scope-based approach, but this is a substantially improved type-based approach from how it was before. |
Would it be reasonable to split this into two phases?
|
I think two phases is a good idea. This lets us get OBTs sooner and design __strict separately. Does anyone know WG14s consensus over strong typedefs? __strict is similar enough that I'm curious to know what folks thought about them. |
There's a proposal which the committee has discussed briefly. It's not at a point where I'd feel comfortable predicting what the committee will ultimately decide about it. |
Agreed, some people in the committee wanted/want a feature like that, but it is hard to say what form it would take, if any. |
OK, I'll work on scaling back my custom promotion/conversion rules. Might take a couple days because every test needs to be rewritten too. I'll also work on a |
Signed-off-by: Justin Stitt <[email protected]>
373d622
to
cdc7493
Compare
I rewrote OBT conversion rules to adhere to the spec, with the exception that obt behaviors persist through implicit conversions. I've updated the docs and the tests then squashed everything back to a single commit. If you've reviewed this PR before, I highly recommend re-reading the docs. Hopefully this implementation is going more in the right direction, looking forward to feedback :) CC'ing folks who sent a post in the past week or so: @kees @ojeda @rjmccall @ojhunt @efriedma-quic |
Introduce
OverflowBehaviorType
(OBT), a new type attribute in Clang that provides developers with fine-grained control over the overflow behavior of integer types. This feature allows for a more nuanced approach to integer safety, achieving better granularity than global compiler flags like-fwrapv
and-ftrapv
. Type specifiers are also available via__ob_wrap
and__ob_trap
.These can be applied to integer types (both signed and unsigned) as well as typedef declarations, where the behavior is one of the following:
wrap
: Guarantees that arithmetic operations on the type will wrap on overflow, similar to-fwrapv
. This suppresses UBSan's integer overflow checks for the attributed type and prevents eager compiler optimizations.trap
: Enforces overflow checking for the type, even when global flags like-fwrapv
would otherwise suppress it.A key aspect of this feature is its interaction with existing mechanisms.
OverflowBehaviorType
takes precedence over global flags and, notably, over entries in the Sanitizer Special Case List (SSCL). This allows developers to "allowlist" critical types for overflow instrumentation, even if they are disabled by a broad rule in an SSCL.See the docs and tests for examples and more info.
RFCs
RFC v2
RFC v1
CCs (because of your participation in RFCs or previous implementations)
@kees @AaronBallman @vitalybuka @melver @efriedma-quic