Skip to content

Conversation

turran
Copy link

@turran turran commented Aug 12, 2025

This overcomes the limitation of WebAssemblyFixFunctionBitcasts LLVM module after the introduction of Opaque Pointers. The function pointers no longer have a prototype; therefore, the solution has to be done at clang. This solves many open issues, like:

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@turran turran force-pushed the wasm-function-pointers branch from b51657b to 3c2b6dc Compare August 12, 2025 14:06
@turran turran changed the title [wasm] Handle casted function pointers with different number of arguments [clang][WebAssembly] Handle casted function pointers with different number of arguments Aug 13, 2025
@turran turran force-pushed the wasm-function-pointers branch from 3c2b6dc to 1960925 Compare August 13, 2025 05:39
@turran turran force-pushed the wasm-function-pointers branch from 1960925 to c17adfa Compare August 13, 2025 05:41
@turran turran marked this pull request as ready for review August 13, 2025 05:41
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:WebAssembly clang:codegen IR generation bugs: mangling, exceptions, etc. labels Aug 13, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 13, 2025

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-backend-webassembly

Author: Jorge Zapata (turran)

Changes

This overcomes the limitation of WebAssemblyFixFunctionBitcasts LLVM module after the introduction of Opaque Pointers. The function pointers no longer have a prototype; therefore, the solution has to be done at clang. This solves many open issues, like:


Full diff: https://github.com/llvm/llvm-project/pull/153168.diff

6 Files Affected:

  • (modified) clang/lib/CodeGen/CGCall.cpp (+20)
  • (modified) clang/lib/CodeGen/CGExprConstant.cpp (+13)
  • (modified) clang/lib/CodeGen/TargetInfo.h (+11)
  • (modified) clang/lib/CodeGen/Targets/WebAssembly.cpp (+244)
  • (added) clang/test/CodeGenWebAssembly/function-pointer-arg.c (+25)
  • (added) clang/test/CodeGenWebAssembly/function-pointer-field.c (+30)
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index b959982809911..8eb3afee08ff5 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -4932,6 +4932,26 @@ void CodeGenFunction::EmitCallArg(CallArgList &args, const Expr *E,
     return;
   }
 
+  // For WebAssembly target we need to create thunk functions
+  // to properly handle function pointers args with a different signature.
+  // Due to opaque pointers, this can not be handled in LLVM
+  // (WebAssemblyFixFunctionBitcast) anymore
+  if (CGM.getTriple().isWasm() && type->isFunctionPointerType()) {
+    if (const DeclRefExpr *DRE =
+            CGM.getTargetCodeGenInfo().getWasmFunctionDeclRefExpr(
+                E, CGM.getContext())) {
+      llvm::Value *V = EmitLValue(DRE).getPointer(*this);
+      llvm::Function *Thunk =
+          CGM.getTargetCodeGenInfo().getOrCreateWasmFunctionPointerThunk(
+              CGM, V, DRE->getDecl()->getType(), type);
+      if (Thunk) {
+        RValue R = RValue::get(Thunk);
+        args.add(R, type);
+        return;
+      }
+    }
+  }
+
   args.add(EmitAnyExprToTemp(E), type);
 }
 
diff --git a/clang/lib/CodeGen/CGExprConstant.cpp b/clang/lib/CodeGen/CGExprConstant.cpp
index a96c1518d2a1d..956f780ad71e3 100644
--- a/clang/lib/CodeGen/CGExprConstant.cpp
+++ b/clang/lib/CodeGen/CGExprConstant.cpp
@@ -2243,6 +2243,19 @@ ConstantLValueEmitter::tryEmitBase(const APValue::LValueBase &base) {
 
     if (const auto *FD = dyn_cast<FunctionDecl>(D)) {
       llvm::Constant *C = CGM.getRawFunctionPointer(FD);
+      // ForWebAssembly target we need to create thunk functions
+      // to properly handle function pointers args with a different signature
+      // Due to opaque pointers, this can not be handled in LLVM
+      // (WebAssemblyFixFunctionBitcast) anymore
+      if (CGM.getTriple().isWasm() && DestType->isFunctionPointerType()) {
+        llvm::Function *Thunk =
+            CGM.getTargetCodeGenInfo().getOrCreateWasmFunctionPointerThunk(
+                CGM, C, D->getType(), DestType);
+        if (Thunk) {
+          C = Thunk;
+        }
+      }
+
       if (FD->getType()->isCFIUncheckedCalleeFunctionType())
         C = llvm::NoCFIValue::get(cast<llvm::GlobalValue>(C));
       return PtrAuthSign(C);
diff --git a/clang/lib/CodeGen/TargetInfo.h b/clang/lib/CodeGen/TargetInfo.h
index d0edae1295094..962178b2f549c 100644
--- a/clang/lib/CodeGen/TargetInfo.h
+++ b/clang/lib/CodeGen/TargetInfo.h
@@ -421,6 +421,17 @@ class TargetCodeGenInfo {
   /// Return the WebAssembly funcref reference type.
   virtual llvm::Type *getWasmFuncrefReferenceType() const { return nullptr; }
 
+  virtual const DeclRefExpr *getWasmFunctionDeclRefExpr(const Expr *E,
+                                                        ASTContext &Ctx) const {
+    return nullptr;
+  }
+
+  virtual llvm::Function *getOrCreateWasmFunctionPointerThunk(
+      CodeGenModule &CGM, llvm::Value *OriginalFnPtr, QualType SrcType,
+      QualType DstType) const {
+    return nullptr;
+  }
+
   /// Emit the device-side copy of the builtin surface type.
   virtual bool emitCUDADeviceBuiltinSurfaceDeviceCopy(CodeGenFunction &CGF,
                                                       LValue Dst,
diff --git a/clang/lib/CodeGen/Targets/WebAssembly.cpp b/clang/lib/CodeGen/Targets/WebAssembly.cpp
index ac8dcd2a0540a..4aa62948c4685 100644
--- a/clang/lib/CodeGen/Targets/WebAssembly.cpp
+++ b/clang/lib/CodeGen/Targets/WebAssembly.cpp
@@ -8,10 +8,15 @@
 
 #include "ABIInfoImpl.h"
 #include "TargetInfo.h"
+#include "llvm/ADT/StringMap.h"
+
+#include "clang/AST/ParentMapContext.h"
 
 using namespace clang;
 using namespace clang::CodeGen;
 
+#define DEBUG_TYPE "clang-target-wasm"
+
 //===----------------------------------------------------------------------===//
 // WebAssembly ABI Implementation
 //
@@ -52,6 +57,7 @@ class WebAssemblyTargetCodeGenInfo final : public TargetCodeGenInfo {
       : TargetCodeGenInfo(std::make_unique<WebAssemblyABIInfo>(CGT, K)) {
     SwiftInfo =
         std::make_unique<SwiftABIInfo>(CGT, /*SwiftErrorInRegister=*/false);
+    ThunkCache = llvm::StringMap<llvm::Function *>();
   }
 
   void setTargetAttributes(const Decl *D, llvm::GlobalValue *GV,
@@ -93,6 +99,127 @@ class WebAssemblyTargetCodeGenInfo final : public TargetCodeGenInfo {
   virtual llvm::Type *getWasmFuncrefReferenceType() const override {
     return llvm::Type::getWasm_FuncrefTy(getABIInfo().getVMContext());
   }
+
+  virtual const DeclRefExpr *
+  getWasmFunctionDeclRefExpr(const Expr *E, ASTContext &Ctx) const override {
+    // Go down in the tree until finding the DeclRefExpr
+    const DeclRefExpr *DRE = findDeclRefExpr(E);
+    if (!DRE)
+      return nullptr;
+
+    // Final case. The argument is a declared function
+    if (isa<FunctionDecl>(DRE->getDecl())) {
+      return DRE;
+    }
+
+    // Complex case. The argument is a variable, we need to check
+    // every assignment of the variable and see if we are bitcasting
+    // or not.
+    if (const auto *VD = dyn_cast<VarDecl>(DRE->getDecl())) {
+      DRE = findDeclRefExprForVarUp(E, VD, Ctx);
+      if (DRE)
+        return DRE;
+
+      // If no assignment exists on every parent scope, check for the
+      // initialization
+      if (!DRE && VD->hasInit()) {
+        return getWasmFunctionDeclRefExpr(VD->getInit(), Ctx);
+      }
+    }
+
+    return nullptr;
+  }
+
+  virtual llvm::Function *getOrCreateWasmFunctionPointerThunk(
+      CodeGenModule &CGM, llvm::Value *OriginalFnPtr, QualType SrcType,
+      QualType DstType) const override {
+
+    // Get the signatures
+    const FunctionProtoType *SrcProtoType = SrcType->getAs<FunctionProtoType>();
+    const FunctionProtoType *DstProtoType = DstType->getAs<PointerType>()
+                                                ->getPointeeType()
+                                                ->getAs<FunctionProtoType>();
+
+    // This should only work for different number of arguments
+    if (DstProtoType->getNumParams() <= SrcProtoType->getNumParams())
+      return nullptr;
+
+    // Get the llvm function types
+    llvm::FunctionType *DstFunctionType = llvm::cast<llvm::FunctionType>(
+        CGM.getTypes().ConvertType(QualType(DstProtoType, 0)));
+    llvm::FunctionType *SrcFunctionType = llvm::cast<llvm::FunctionType>(
+        CGM.getTypes().ConvertType(QualType(SrcProtoType, 0)));
+
+    // Construct the Thunk function with the Target (destination) signature
+    std::string ThunkName = getThunkName(OriginalFnPtr->getName().str(),
+                                         DstProtoType, CGM.getContext());
+    // Check if we already have a thunk for this function
+    if (auto It = ThunkCache.find(ThunkName); It != ThunkCache.end()) {
+      LLVM_DEBUG(llvm::dbgs() << "getOrCreateWasmFunctionPointerThunk: "
+                              << "found existing thunk for "
+                              << OriginalFnPtr->getName().str() << " as "
+                              << ThunkName << "\n");
+      return It->second;
+    }
+
+    // Create the thunk function
+    llvm::Module &M = CGM.getModule();
+    llvm::Function *Thunk = llvm::Function::Create(
+        DstFunctionType, llvm::Function::InternalLinkage, ThunkName, M);
+
+    // Build the thunk body
+    llvm::IRBuilder<> Builder(
+        llvm::BasicBlock::Create(M.getContext(), "entry", Thunk));
+
+    // Gather the arguments for calling the original function
+    std::vector<llvm::Value *> CallArgs;
+    unsigned CallN = SrcProtoType->getNumParams();
+
+    auto ArgIt = Thunk->arg_begin();
+    for (unsigned i = 0; i < CallN && ArgIt != Thunk->arg_end(); ++i, ++ArgIt) {
+      llvm::Value *A = &*ArgIt;
+      CallArgs.push_back(A);
+    }
+
+    // Create the call to the original function pointer
+    llvm::CallInst *Call =
+        Builder.CreateCall(SrcFunctionType, OriginalFnPtr, CallArgs);
+
+    // Handle return type
+    llvm::Type *ThunkRetTy = DstFunctionType->getReturnType();
+
+    if (ThunkRetTy->isVoidTy()) {
+      Builder.CreateRetVoid();
+    } else {
+      llvm::Value *Ret = Call;
+      if (Ret->getType() != ThunkRetTy)
+        Ret = Builder.CreateBitCast(Ret, ThunkRetTy);
+      Builder.CreateRet(Ret);
+    }
+    LLVM_DEBUG(llvm::dbgs() << "getOrCreateWasmFunctionPointerThunk:"
+                            << " from " << OriginalFnPtr->getName().str()
+                            << " to " << ThunkName << "\n");
+    // Cache the thunk
+    ThunkCache[ThunkName] = Thunk;
+    return Thunk;
+  }
+
+private:
+  // The thunk cache
+  mutable llvm::StringMap<llvm::Function *> ThunkCache;
+  // Build the thunk name: "%s_{OrigName}_{WasmSig}"
+  std::string getThunkName(std::string OrigName,
+                           const FunctionProtoType *DstProto,
+                           const ASTContext &Ctx) const;
+  char getTypeSig(const QualType &Ty, const ASTContext &Ctx) const;
+  std::string sanitizeTypeString(const std::string &typeStr) const;
+  std::string getTypeName(const QualType &qt, const ASTContext &Ctx) const;
+  const DeclRefExpr *findDeclRefExpr(const Expr *E) const;
+  const DeclRefExpr *findDeclRefExprForVarDown(const Stmt *Parent,
+                                               const VarDecl *V,
+                                               ASTContext &Ctx) const;
+  const DeclRefExpr *findDeclRefExprForVarUp(const Expr *E, const VarDecl *V,
+                                             ASTContext &Ctx) const;
 };
 
 /// Classify argument of given type \p Ty.
@@ -173,3 +300,120 @@ CodeGen::createWebAssemblyTargetCodeGenInfo(CodeGenModule &CGM,
                                             WebAssemblyABIKind K) {
   return std::make_unique<WebAssemblyTargetCodeGenInfo>(CGM.getTypes(), K);
 }
+
+// Helper to get the type signature character for a given QualType
+// Returns a character that represents the given QualType in a wasm signature.
+// See getInvokeSig() in WebAssemblyAsmPrinter for related logic.
+char WebAssemblyTargetCodeGenInfo::getTypeSig(const QualType &Ty,
+                                              const ASTContext &Ctx) const {
+  if (Ty->isAnyPointerType()) {
+    return Ctx.getTypeSize(Ctx.VoidPtrTy) == 32 ? 'i' : 'j';
+  }
+  if (Ty->isIntegerType()) {
+    return Ctx.getTypeSize(Ty) <= 32 ? 'i' : 'j';
+  }
+  if (Ty->isFloatingType()) {
+    return Ctx.getTypeSize(Ty) <= 32 ? 'f' : 'd';
+  }
+  if (Ty->isVectorType()) {
+    return 'V';
+  }
+  if (Ty->isWebAssemblyTableType()) {
+    return 'F';
+  }
+  if (Ty->isWebAssemblyExternrefType()) {
+    return 'X';
+  }
+
+  llvm_unreachable("Unhandled QualType");
+}
+
+std::string
+WebAssemblyTargetCodeGenInfo::getThunkName(std::string OrigName,
+                                           const FunctionProtoType *DstProto,
+                                           const ASTContext &Ctx) const {
+
+  std::string ThunkName = "__" + OrigName + "_";
+  QualType RetTy = DstProto->getReturnType();
+  if (RetTy->isVoidType()) {
+    ThunkName += 'v';
+  } else {
+    ThunkName += getTypeSig(RetTy, Ctx);
+  }
+  for (unsigned i = 0; i < DstProto->getNumParams(); ++i) {
+    ThunkName += getTypeSig(DstProto->getParamType(i), Ctx);
+  }
+  return ThunkName;
+}
+
+/// Recursively find the first DeclRefExpr in an Expr subtree.
+/// Returns nullptr if not found.
+const DeclRefExpr *
+WebAssemblyTargetCodeGenInfo::findDeclRefExpr(const Expr *E) const {
+  if (!E)
+    return nullptr;
+
+  // In case it is a function call, abort
+  if (isa<CallExpr>(E))
+    return nullptr;
+
+  // If this node is a DeclRefExpr, return it.
+  if (const auto *DRE = dyn_cast<DeclRefExpr>(E))
+    return DRE;
+
+  // Otherwise, recurse into children.
+  for (const Stmt *Child : E->children()) {
+    if (const auto *ChildExpr = dyn_cast_or_null<Expr>(Child)) {
+      if (const DeclRefExpr *Found = findDeclRefExpr(ChildExpr))
+        return Found;
+    }
+  }
+  return nullptr;
+}
+
+const DeclRefExpr *WebAssemblyTargetCodeGenInfo::findDeclRefExprForVarDown(
+    const Stmt *Parent, const VarDecl *V, ASTContext &Ctx) const {
+  if (!Parent)
+    return nullptr;
+
+  // Find down every assignment of V
+  // FIXME we need to stop before the expression where V is used
+  const BinaryOperator *A = nullptr;
+  for (const Stmt *Child : Parent->children()) {
+    if (const auto *BO = dyn_cast_or_null<BinaryOperator>(Child)) {
+      if (!BO->isAssignmentOp())
+        continue;
+      auto *LHS = llvm::dyn_cast<DeclRefExpr>(BO->getLHS());
+      if (LHS && LHS->getDecl() == V) {
+        A = BO;
+      }
+    }
+  }
+
+  // We have an assignment of the Var, recurse in it
+  if (A) {
+    return getWasmFunctionDeclRefExpr(A->getRHS(), Ctx);
+  }
+
+  return nullptr;
+}
+
+const DeclRefExpr *WebAssemblyTargetCodeGenInfo::findDeclRefExprForVarUp(
+    const Expr *E, const VarDecl *V, ASTContext &Ctx) const {
+  const clang::Stmt *cur = E;
+  while (cur) {
+    auto parents = Ctx.getParentMapContext().getParents(*cur);
+    if (parents.empty())
+      break;
+    const clang::Stmt *parentStmt = parents[0].get<clang::Stmt>();
+    if (!parentStmt)
+      break;
+    if (const auto *CS = dyn_cast<clang::CompoundStmt>(parentStmt)) {
+      const DeclRefExpr *DRE = findDeclRefExprForVarDown(CS, V, Ctx);
+      if (DRE)
+        return DRE;
+    }
+    cur = parentStmt;
+  }
+  return nullptr;
+}
diff --git a/clang/test/CodeGenWebAssembly/function-pointer-arg.c b/clang/test/CodeGenWebAssembly/function-pointer-arg.c
new file mode 100644
index 0000000000000..ff7b4186bbf7b
--- /dev/null
+++ b/clang/test/CodeGenWebAssembly/function-pointer-arg.c
@@ -0,0 +1,25 @@
+// REQUIRES: webassembly-registered-target
+// RUN: %clang_cc1 -triple wasm32-unknown-unknown -emit-llvm -O0 -o - %s | FileCheck %s
+
+// Test of function pointer bitcast in a function argument with different argument number in wasm32
+
+#define FUNCTION_POINTER(f) ((FunctionPointer)(f))
+typedef int (*FunctionPointer)(int a, int b);
+
+int fp_as_arg(FunctionPointer fp, int a, int b) {
+  return fp(a, b);
+}
+
+int fp_less(int a) {
+  return a;
+}
+
+// CHECK-LABEL: @test
+// CHECK: call i32 @fp_as_arg(ptr noundef @__fp_less_iii, i32 noundef 10, i32 noundef 20)
+void test() {
+  fp_as_arg(FUNCTION_POINTER(fp_less), 10, 20);
+}
+
+// CHECK: define internal i32 @__fp_less_iii(i32 %0, i32 %1)
+// CHECK: %2 = call i32 @fp_less(i32 %0)
+// CHECK: ret i32 %2
\ No newline at end of file
diff --git a/clang/test/CodeGenWebAssembly/function-pointer-field.c b/clang/test/CodeGenWebAssembly/function-pointer-field.c
new file mode 100644
index 0000000000000..103a265ebf5fb
--- /dev/null
+++ b/clang/test/CodeGenWebAssembly/function-pointer-field.c
@@ -0,0 +1,30 @@
+// REQUIRES: webassembly-registered-target
+// RUN: %clang_cc1 -triple wasm32-unknown-unknown -emit-llvm -O0 -o - %s | FileCheck %s
+
+// Test of function pointer bitcast in a struct field with different argument number in wasm32
+
+#define FUNCTION_POINTER(f) ((FunctionPointer)(f))
+typedef int (*FunctionPointer)(int a, int b); 
+
+// CHECK: @__const.test.sfp = private unnamed_addr constant %struct._StructWithFunctionPointer { ptr @__fp_less_iii }, align 4
+
+typedef struct _StructWithFunctionPointer {
+  FunctionPointer fp;
+} StructWithFunctionPointer;
+
+int fp_less(int a) {
+  return a;
+}
+                                                  
+// CHECK-LABEL: @test
+void test() {
+  StructWithFunctionPointer sfp = {
+    FUNCTION_POINTER(fp_less)
+  };
+
+  int a1 = sfp.fp(10, 20);
+}
+
+// CHECK: define internal i32 @__fp_less_iii(i32 %0, i32 %1)
+// CHECK: %2 = call i32 @fp_less(i32 %0)
+// CHECK: ret i32 %2

@juj
Copy link

juj commented Aug 13, 2025

Thanks for working on this.

If I understand correctly, this will silently paper over function arg count mismatches in signature? That might help some code patterns, though there is a potential issue of silent code size bloat, if the developer did not intend to write a mismatching signature in the first place(?) (and if they didn't see it, Clang will then silently generate these thunks, which were not desired?)

In our use case of #64526, I still wish there was a diagnostics feature that could be used like -Wbad-wasm-function-cast and -Werror=bad-wasm-function-cast where Clang would diagnose those casts from->to a bad function signature. So this PR does not quite address #64526 it seems.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C++ standard has exactly one rule for the result of function pointer casts: "converting a prvalue of type 'pointer to T1' to the type 'pointer to T2' (where T1 and T2 are function types) and back to its original type yields the original pointer value". This trivially violates that rule. (C has a similar rule.)

So we can't do this. At least, not by default.

@turran
Copy link
Author

turran commented Aug 13, 2025

Thanks @juj

Thanks for working on this.

If I understand correctly, this will silently paper over function arg count mismatches in signature? That might help some code patterns, though there is a potential issue of silent code size bloat, if the developer did not intend to write a mismatching signature in the first place(?) (and if they didn't see it, Clang will then silently generate these thunks, which were not desired?)

In our use case of #64526, I still wish there was a diagnostics feature that could be used like -Wbad-wasm-function-cast and -Werror=bad-wasm-function-cast where Clang would diagnose those casts from->to a bad function signature. So this PR does not quite address #64526 it seems.

I see. IIRC, clang already has a flag, not for wasm, but in general to detect bad signature casts, I saw that while investigating the code. But maybe it is missing in some specific use case?

@turran
Copy link
Author

turran commented Aug 13, 2025

Thanks @efriedma-quic

The C++ standard has exactly one rule for the result of function pointer casts: "converting a prvalue of type 'pointer to T1' to the type 'pointer to T2' (where T1 and T2 are function types) and back to its original type yields the original pointer value". This trivially violates that rule. (C has a similar rule.)

So we can't do this. At least, not by default.

I see. I wasn't aware of this rule. What would be the preferred way to handle this? With a specific flag, I guess?

@efriedma-quic
Copy link
Collaborator

Maybe a flag. But I'd prefer to explore the design space first. What patterns do we care about? Can we detect all the relevant cases at compile-time? Can we generalize this by performing some checks at runtime?

@juj
Copy link

juj commented Aug 13, 2025

What I would like is a new flag exactly like the current -Wcast-function-type, but one that was just narrower in its emitted warnings.

I.e. currently -Wcast-function-type emits warnings for all sorts of casts that are safe with Wasm ABI (like casting between unsigned int to int in a function arg).. but I'd like to not get warnings for those. And only get warnings for a narrower set of casts that will cause a trap at Wasm runtime. That would help audit through the millions of lines of code that e.g. Unity3D engine compiles through. (-Wcast-function-type is too much, and -Wbad-function-cast is not enough, as per illustrated in #64526)

Like was pointed out, of course detecting all scenarios that result to Wasm VM trapping at runtime is not possible (e.g. casting void(int,int) -> void* -> void(int) and then attempting to call), but that is besides the point. Because we couldn't detect all such scenarios doesn't mean it wouldn't be extremely useful to diagnose all the scenarios that can reasonably be compile-time diagnosed.

@juj
Copy link

juj commented Aug 13, 2025

In our case, I am interested in getting warnings emitted. So the usual -Wfoo, -Werror=foo mechanism would be useful.

In this PR, the idea of transparently supporting the cast via a thunk was brought up, that might be something to consider under a -fwasm-... flag kind of behavior, though would be good to be alternative to the emit warnings/emit errors mode.

@turran
Copy link
Author

turran commented Aug 13, 2025

Maybe a flag. But I'd prefer to explore the design space first. What patterns do we care about? Can we detect all the relevant cases at compile-time? Can we generalize this by performing some checks at runtime?

I added a couple of tests that highly represent the main set of use cases that this PR solves. Using a cast for a function pointer struct field declaration, and when using a cast for call arguments.

About other potential relevant cases, I guess there might be more, but for my own scenario with the PR, it is enough.
Let me know if the design is reasonable for these cases

@turran
Copy link
Author

turran commented Aug 14, 2025

What I would like is a new flag exactly like the current -Wcast-function-type, but one that was just narrower in its emitted warnings.

I.e. currently -Wcast-function-type emits warnings for all sorts of casts that are safe with Wasm ABI (like casting between unsigned int to int in a function arg).. but I'd like to not get warnings for those. And only get warnings for a narrower set of casts that will cause a trap at Wasm runtime. That would help audit through the millions of lines of code that e.g. Unity3D engine compiles through. (-Wcast-function-type is too much, and -Wbad-function-cast is not enough, as per illustrated in #64526)

Like was pointed out, of course detecting all scenarios that result to Wasm VM trapping at runtime is not possible (e.g. casting void(int,int) -> void* -> void(int) and then attempting to call), but that is besides the point. Because we couldn't detect all such scenarios doesn't mean it wouldn't be extremely useful to diagnose all the scenarios that can reasonably be compile-time diagnosed.

Okay, I've re-read the issue you mentioned. Indeed, it makes sense to provide a diagnosis of such use cases. Maybe as part of another PR? This was more in the direction of a different number of arguments instead of a different type of arguments.

@juj
Copy link

juj commented Aug 14, 2025

This was more in the direction of a different number of arguments instead of a different type of arguments.

I see these would be good to be handled as the same thing: i.e. a signature mismatch is a mismatch, be it in types or counts of args? That is how the existing -Wcast-function-type and -Wbad-function-cast also work like.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really, I think the biggest problem with this approach is that it special-cases compile-time casts, and treats them differently from runtime casts. That might be just barely sufficient for specific cases from glib, but it's not comprehensive, and it's very hard for users to understand why specific cases are/are not handled.

The semantics would be much simpler if the code for performing indirect calls was changed. Instead of trying to fix function pointers when they're cast, fix them when something tries to call them. Some kind of table lookup, or something with ref.test. Not sure exactly what this looks like; I'm not that familiar with wasm.

@turran
Copy link
Author

turran commented Aug 21, 2025

Really, I think the biggest problem with this approach is that it special-cases compile-time casts, and treats them differently from runtime casts. That might be just barely sufficient for specific cases from glib, but it's not comprehensive, and it's very hard for users to understand why specific cases are/are not handled.

I understand your point, but it is difficult to find a compromise in this situation. In this PR, specific -handable cases- are transformed at compile time to avoid a SEGV, which looks reasonable. Developers have pushed this function pointer casting (un)defined behavior because it has always worked in other targets, but not in WebAssembly. Maybe other cases are missing and need to be added?

The semantics would be much simpler if the code for performing indirect calls was changed. Instead of trying to fix function pointers when they're cast, fix them when something tries to call them. Some kind of table lookup, or something with ref.test. Not sure exactly what this looks like; I'm not that familiar with wasm.

Doing a runtime fix like using ref.test would mean that every function pointer called must be first checked, for every combination of fewer arguments (calling a function pointer of three args would require a runtime check for two args, one arg and no args), transferring the complexity to runtime and adding this penalty when you know beforehand at compilation time how you are casting the function pointer. This is a considerable overhead.

A LUT approach would mean that for every ptr called as a function, it is needed to traverse back such a pointer to know what value it has been set to, confirm that it is callable, and, if not, change the value set to a thunk. I'm not sure if it is actually doable at the LLVM IR level. I'm not an expert on that.

Initially, the code at https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/WebAssembly/WebAssemblyFixFunctionBitcasts.cpp was doing a similar approach as the one found on this PR, but after the Opaque Pointer change, the function pointer information found on LLVM's IR is only a generic pointer and such an approach doesn't scale anymore.

@efriedma-quic
Copy link
Collaborator

Doing a runtime fix like using ref.test would mean that every function pointer called must be first checked, for every combination of fewer arguments (calling a function pointer of three args would require a runtime check for two args, one arg and no args), transferring the complexity to runtime and adding this penalty when you know beforehand at compilation time how you are casting the function pointer. This is a considerable overhead.

Yes.

I understand your point, but it is difficult to find a compromise in this situation. In this PR, specific -handable cases- are transformed at compile time to avoid a SEGV, which looks reasonable. Developers have pushed this function pointer casting (un)defined behavior because it has always worked in other targets, but not in WebAssembly. Maybe other cases are missing and need to be added?

You can theoretically ensure function identity by perform some checks when you cast a function pointer, instead of when you call the function pointer. But then not only are casts expensive, but the resulting code isn't ABI-compatible with normal wasm code.

We could add a flag that forbids casting function pointers at runtime, I guess?

Or I guess we can go with the current design, behind a flag that indicates you understand what you're doing, with the understanding that it isn't standard-compliant, and only handles specifically the patterns that show up in some subset of glib. Which is not at all pretty, but I guess it doesn't break anyone else.

@llvmbot llvmbot added clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Sep 3, 2025
@turran
Copy link
Author

turran commented Sep 3, 2025

Or I guess we can go with the current design, behind a flag that indicates you understand what you're doing, with the understanding that it isn't standard-compliant, and only handles specifically the patterns that show up in some subset of glib. Which is not at all pretty, but I guess it doesn't break anyone else.

I added a new flag, -fwasm-fix-function-bitcasts, to enable the proposed behavior. I've been struggling with the implementation, and I'm not sure it is well done. The WASM-specific flags differ from other targets, so I preferred to be somewhat compatible with the others. Let me know if the implementation is correct.

@juj
Copy link

juj commented Sep 3, 2025

Re-reading back the original description of the PR, I see there's

This solves many open issues, like:

* https://github.com/llvm/llvm-project/issues/64526

though this PR doesn't quite solve this, since it (silently?) will try to fix up the call sites with mismatching pointers. In issue 64526, I am looking after a way to get a warning or an error at compile time for call sites that perform casts that are unsafe specifically for Wasm.

In other words, in production I am not sure if I would like to use -fwasm-fix-function-bitcasts to "hide the problem under the rug", but I would like to have a more focused warning to guide analysing casts that are problematic for WebAssembly, with the intent of fixing the code of such call sites by hand.

Of course it is a fantastic idea to have a build option that can work on these automatically. I am sure that some use cases will find it very useful.

@turran
Copy link
Author

turran commented Sep 3, 2025

Re-reading back the original description of the PR, I see there's

This solves many open issues, like:

* https://github.com/llvm/llvm-project/issues/64526

though this PR doesn't quite solve this, since it (silently?) will try to fix up the call sites with mismatching pointers. In issue 64526, I am looking after a way to get a warning or an error at compile time for call sites that perform casts that are unsafe specifically for Wasm.

In other words, in production I am not sure if I would like to use -fwasm-fix-function-bitcasts to "hide the problem under the rug", but I would like to have a more focused warning to guide analysing casts that are problematic for WebAssembly, with the intent of fixing the code of such call sites by hand.

Of course it is a fantastic idea to have a build option that can work on these automatically. I am sure that some use cases will find it very useful.

Indeed. I'm aware of your particular case and have some clue on where/how to do it. It will touch other parts as it needs to modify the SemaCast.cpp and improve the WebAssemblyABIInfo. If you prefer, I can remove your issue from the description. I'll need some time to draft something, as compiling/linking/testing, at least for me, is very tedious.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I have any further comments on the code... but I'm not familiar enough with wasm to judge whether this is actually worth adding.

@dschuff @aheejin Thoughts?

// For WebAssembly target we need to create thunk functions
// to properly handle function pointers args with a different signature.
// Due to opaque pointers, this can not be handled in LLVM
// (WebAssemblyFixFunctionBitcast) anymore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talking about opaque pointers isn't really useful for a code comment, because it's not about the current state of the code.

@dschuff
Copy link
Member

dschuff commented Sep 12, 2025

OK, sorry for the delay in looking at this. I do have some general thoughts.
I think this is worth working on; yes, it's a really annoying problem that real-world code takes advantage of laxness in existing calling conventions in so many cases, but it is what it is; that's why WebAssemblyFixFunctionBitcasts exists in the first place.
But I do share @efriedma-quic's concerns, especially about:

  1. Standards compliance outside of the particular bad uses, i.e. round-tripping pointers (This can be mitigated by putting the solution behind a flag, but that just emphasizes the importance of # 2)
  2. Developer comprehensibility/transparency on what the solution actually does (i.e. big +1 to the concern above).

I also think it's worth exploring the design space a little more, to see if we can come up with something that could be more broadly applicable and comprehensible that still has acceptable cost.

One thing we haven't discussed yet, is whether it would be acceptable to have some kind of explicit operation (either a cast or something associated with an indirect call), perhaps in the form of builtin/intrinsic. Obviously it would require a source change, but if the use sites are constrained enough then it might work. Then we could limit the costs to only be paid when needed, and make the effect much more comprehensible.

@juj
Copy link

juj commented Sep 12, 2025

it's a really annoying problem that real-world code takes advantage of laxness

In Unity's case, when in our millions of lines of code some line of code does a bad cast, I find it has not been out of intent to rely on x86 casting laxness, but rather an oversight, or, even more common, a memory corruption when a random memory location is interpreted as a function pointer.

But because the existing diagnostics flags provided by LLVM are absolutely useless to diagnose this issue, and Chrome browser debugger is unable to produce informative error messages in these cast situations, we find fixing these problems to be notoriously difficult to resolve, because we can't begin to tell if it was a random memory corruption-based signature mismatch, or a problem by developer mistake.

In our case these issues would be trivial to fix, if:
a) LLVM had a diagnostic flag to narrow down casts to specifically only the offending ones that don't work in WebAssembly, and
b) Chrome debugger was able to print out "we expected this signature, but we got this signature instead", when attempting to invoke a badly casted function pointer occurs.

I.e. so far as I've seen, in the Unity codebase it has never been a situation of "lazy code intentionally relied on x86 ABI casting behavior." (the majority of devs don't seem to even know what those are)

@dschuff
Copy link
Member

dschuff commented Sep 12, 2025

Yeah, let me be clear here: I see the GLib problem as being a completely separate problem from the one you have. AFAICS this patch doesn't attempt to address your issue at all, and the solutions I'm brainstorming here also don't attempt to address the issue.

I wouldn't necessarily be against a wasm-specific warning in clang that's essentially a subset of -Wcast-function-type. As I mentioned somewhere else, I'm not sure it would be easy to implement given where the lowering logic lives, but we could maybe compromise with some kind of approximation.
also for b), have you filed a bug on Chrome with that as a feature request? It might be possible. It could be that the the cost might be too much for the optimizing tier but maybe the baseline tier.

@dschuff
Copy link
Member

dschuff commented Sep 12, 2025

Or another random idea (for the "semi-transparently call with signature mismatches" problem, not the "fix the bugs in my code" problem): What if we represented function pointers (in the source) as being in a different address space, using the existing address_space(N) clang/LLVM attributes? They could be freely casted, written into memory and have the same integer values as they otherwise would, but when you call through them, we could either use a generic catch-all thunk like Binaryen's fpcast-emu pass does today, or perhaps something more fancy, like a LUT check and a more-specific conversion thunk. That could be feasible if e.g. the FunctionPointer C type and FUNCTION_POINTER macro could be suitably redefined. Then we maybe wouldn't need any changes to the code at callsites.

@juj
Copy link

juj commented Sep 12, 2025

Yeah, let me be clear here: I see the GLib problem as being a completely separate problem from the one you have. AFAICS this patch doesn't attempt to address your issue at all, and the solutions I'm brainstorming here also don't attempt to address the issue.

Thanks, that makes sense, agreed.

also for b), have you filed a bug on Chrome with that as a feature request?

I did, though unfortunately it was closed on the basis of being too costly to implement.

I got started on an Emscripten SAFE_HEAP style codegen pass to wrap each generated function pointer dispatch, but it was quite tricky to implement from Emscripten repository side alone.

@turran
Copy link
Author

turran commented Sep 23, 2025

Hello @dschuff. Thanks for your comments.

Or another random idea (for the "semi-transparently call with signature mismatches" problem, not the "fix the bugs in my code" problem): What if we represented function pointers (in the source) as being in a different address space, using the existing address_space(N) clang/LLVM attributes? They could be freely casted, written into memory and have the same integer values as they otherwise would, but when you call through them, we could either use a generic catch-all thunk like Binaryen's fpcast-emu pass does today, or perhaps something more fancy, like a LUT check and a more-specific conversion thunk. That could be feasible if e.g. the FunctionPointer C type and FUNCTION_POINTER macro could be suitably redefined. Then we maybe wouldn't need any changes to the code at callsites.

I'm not aware of what address_space does, but I understand it is something that would require a change in the original source code, which I think the approach should be transparent to the programmer.

About the LUT, isn't it similar to the approach commented on #153168 (review) and replied on #153168 (comment) ?

@turran
Copy link
Author

turran commented Sep 23, 2025

OK, sorry for the delay in looking at this. I do have some general thoughts. I think this is worth working on; yes, it's a really annoying problem that real-world code takes advantage of laxness in existing calling conventions in so many cases, but it is what it is; that's why WebAssemblyFixFunctionBitcasts exists in the first place.

Indeed. However, AFAIK, it no longer works after the opaque pointer change because the function type information is missing. It only works for the main() function as it is checked explicitly.

@turran
Copy link
Author

turran commented Oct 13, 2025

Hello @dschuff , would you happen to have any other insights on how to proceed with this PR?

@dschuff
Copy link
Member

dschuff commented Oct 15, 2025

Hello @dschuff. Thanks for your comments.

Or another random idea (for the "semi-transparently call with signature mismatches" problem, not the "fix the bugs in my code" problem): What if we represented function pointers (in the source) as being in a different address space, using the existing address_space(N) clang/LLVM attributes? They could be freely casted, written into memory and have the same integer values as they otherwise would, but when you call through them, we could either use a generic catch-all thunk like Binaryen's fpcast-emu pass does today, or perhaps something more fancy, like a LUT check and a more-specific conversion thunk. That could be feasible if e.g. the FunctionPointer C type and FUNCTION_POINTER macro could be suitably redefined. Then we maybe wouldn't need any changes to the code at callsites.

I'm not aware of what address_space does, but I understand it is something that would require a change in the original source code, which I think the approach should be transparent to the programmer.

About the LUT, isn't it similar to the approach commented on #153168 (review) and replied on #153168 (comment) ?

Let me refine what I meant here a little bit. The key problem I have with your proposed approach is that it redefines the semantics of the language in a subtle and hard-to-understand way (i.e. compile-time vs runtime casts, as mentioned above). I think that this isn't going to be very "transparent to the programmer" if you have a flag that just changes this behavior across the program, it could have effects outside of the very-specially-designated intended uses.

I agree with the statement above that changing the indirect call behavior is better than changing the casting behavior because it's much easier to understand and more predictable. It's also why I'm wondering if we can localize the effect to specific callsites which can mitigate both any increased cost for the calls, and any potentially-surprising behavior changes. The example you've given have centralized macros such as FUNCTION_POINTER and typedefs for the various types involved. If we can, for example, define the macro such that the calls can call a compiler builtin to make the dispatch call, then we could localize any extra runtime costs. Or, if we could use the typedef to designate the pointer types in a special way, we could do something similar (this is what I was speculating about with the addrspace thing). The point is that the special calls or casts could be explicit rather than implicit. It's true that this isn't completely transparent in that it would require no source changes at all, but I think it's pretty reasonable, and I would hope that the GLib code is organized in such a way that only the special signal dispatch code needs this behavior, and it might therefore be acceptable to take some macro changes.

@turran
Copy link
Author

turran commented Oct 16, 2025

Let me refine what I meant here a little bit. The key problem I have with your proposed approach is that it redefines the semantics of the language in a subtle and hard-to-understand way (i.e. compile-time vs runtime casts, as mentioned above). I think that this isn't going to be very "transparent to the programmer" if you have a flag that just changes this behavior across the program, it could have effects outside of the very-specially-designated intended uses.

I agree, but as I already commented, finding a middle ground, even if desirable, is sometimes impossible.

I agree with the statement above that changing the indirect call behavior is better than changing the casting behavior because it's much easier to understand and more predictable. It's also why I'm wondering if we can localize the effect to specific callsites which can mitigate both any increased cost for the calls, and any potentially-surprising behavior changes.

The cost is unavoidable if the developer wants the webassembly ABI to work like other ABIs, where you simply pass fewer arguments than the ones expected, and don't change their own code. The proposed approach mimics such behavior; indeed, some collateral potential problems might occur, but that's the purpose of having an optional flag and knowing the implications.

The example you've given have centralized macros such as FUNCTION_POINTER and typedefs for the various types involved. If we can, for example, define the macro such that the calls can call a compiler builtin to make the dispatch call, then we could localize any extra runtime costs. Or, if we could use the typedef to designate the pointer types in a special way, we could do something similar (this is what I was speculating about with the addrspace thing).

Yes, my examples have such a macro to be explicit about usage. But in the end, you can have a common macro like my example or a manual cast of the target function to the desired type in many ways and forms, which makes your approach developer-related, as they will have to know how to cast to make their cast to work. Maybe we can make the flag to be more explicit about the potential consequences?

The point is that the special calls or casts could be explicit rather than implicit. It's true that this isn't completely transparent in that it would require no source changes at all, but I think it's pretty reasonable, and I would hope that the GLib code is organized in such a way that only the special signal dispatch code needs this behavior, and it might therefore be acceptable to take some macro changes.

We already tried some alternatives to avoid this problem (intermediate functions, different casts, adding missing parameters, etc), but those weren't accepted. In fact, there is a hard requirement in their documentation. This is mandatory at the compiler level. Nevertheless, there is a misunderstanding in the documentation about the EMULATE_FUNCTION_POINTER_CASTS as it has nothing to do with the number of arguments, only with the type of arguments.

The signals are not the problem; is the whole GOBject inheritance model that is based on this ABI assumption. It is difficult for me to find a middle ground here ...

@dschuff
Copy link
Member

dschuff commented Oct 16, 2025

The cost is unavoidable if the developer wants the webassembly ABI to work like other ABIs, where you simply pass fewer arguments than the ones expected, and don't change their own code. The proposed approach mimics such behavior; indeed, some collateral potential problems might occur, but that's the purpose of having an optional flag and knowing the implications.

This is part of what I'm trying to understand about the use case. Namely, exactly who has to know the implications? (equivalently, who would need to enable this flag)? is it just the GLib core developers, i.e. the flag just needs to be enable to build GLib code? Or is it everyone who wants to link against GLib?

The signals are not the problem; is the whole GOBject inheritance model that is based on this ABI assumption. It is difficult for me to find a middle ground here ...

This phrasing suggests that everyone who wants to use GLib has calls in their code (or at least their object files) that need to use mismatched signatures. The fact that IIUC you were able to make things work with hacks that were localized to GLib means that your changes just leaked into public headers?

Nevertheless, there is a misunderstanding in the documentation about the EMULATE_FUNCTION_POINTER_CASTS as it has nothing to do with the number of arguments, only with the type of arguments.

EMULATE_FUNCTION_POINTER_CASTS (implemented by Binaryen's fpcast-emu transformation pass) makes all indirect calls use the same number of arguments, and creates thunks that pass those calls through with the right number of arguments. So it should fix both the number and type of arguments.

That was actually one question I had for you about the patch you wrote here vs EMULATE_FUNCTION_POINTER_CASTS, namely whether EMULATE_FUNCTION_POINTER_CASTS actually worked and was sufficient for GLib (it was unclear from the discussion; also a side question was if so, how bad was the performance effect?). It should work for both number and type of arguments. IIRC it only works for indirect calls, but we could probably fix that on the Binaryen side.

edit: for that matter, was the old FixFunctionBitcasts pass from before opaque pointers sufficient? I think not, right?

@turran
Copy link
Author

turran commented Oct 17, 2025

This is part of what I'm trying to understand about the use case. Namely, exactly who has to know the implications? (equivalently, who would need to enable this flag)? is it just the GLib core developers, i.e. the flag just needs to be enable to build GLib code? Or is it everyone who wants to link against GLib?

This flag needs to be enabled whenever your code does function casting to a different kind of function. In the particular case of GLib (but not only GLIb, but for example GStreamer, or many others), this is needed only when building GLib, so yes, that's for the core developers. Linking to GLib doesn't require the flag, unless your code is also doing the same function casting ABI assumption.

The signals are not the problem; is the whole GOBject inheritance model that is based on this ABI assumption. It is difficult for me to find a middle ground here ...

This phrasing suggests that everyone who wants to use GLib has calls in their code (or at least their object files) that need to use mismatched signatures. The fact that IIUC you were able to make things work with hacks that were localized to GLib means that your changes just leaked into public headers?

The changes we made to make GLib work implied internal code only, not at the headers or public API.

EMULATE_FUNCTION_POINTER_CASTS (implemented by Binaryen's fpcast-emu transformation pass) makes all indirect calls use the same number of arguments, and creates thunks that pass those calls through with the right number of arguments. So it should fix both the number and type of arguments.

That was actually one question I had for you about the patch you wrote here vs EMULATE_FUNCTION_POINTER_CASTS, namely whether EMULATE_FUNCTION_POINTER_CASTS actually worked and was sufficient for GLib (it was unclear from the discussion; also a side question was if so, how bad was the performance effect?). It should work for both number and type of arguments. IIRC it only works for indirect calls, but we could probably fix that on the Binaryen side.

I'm not sure, to be honest. In different posts, it seems that EMULATE_FUNCTION_POINTER_CASTS used to work, but since three years ago, when the gst.wasm project started, it has never worked, and thus our own fork was needed. I see that the code in binaryen is very small and has received recent changes. I'll need to check if, at least, the PR's examples work properly, but I doubt. Especially for a callback-field kind of cast:

void test() {
  StructWithFunctionPointer sfp = {
    FUNCTION_POINTER(fp_less)
  };

  int a1 = sfp.fp(10, 20);
}

Here fp needs to be traced back to fp_less which I doubt the FuncCastEmulation will handle. In a real-life case, they belong to two different modules, being sfp and the call to sfp.fp in GLib, for example, and the casted function in GStreamer. The definition of sfp.fp is a pointer (int) without type information, at least on the IR generated. If the logic has to be done on the call (sfp.fp) and not the cast on sfp's field, I don't know how it is going to be done properly with FuncCastEmulation.

edit: for that matter, was the old FixFunctionBitcasts pass from before opaque pointers sufficient? I think not, right?

Correct, it wasn't working before either. The first set of changes I made to FixFunctionBitcasts was to handle the particularities of the casts found in the examples. Those were missing. After that, due to the opaque pointer thing, it was impossible to know the source and target function types. That's why I guessed that the EMULATE_FUNCTION_POINTER_CASTS was suffering from the same problem.

@dschuff
Copy link
Member

dschuff commented Oct 17, 2025

I still have concerns about what exactly the semantics of this change is, and it's a problem that it's not really documented, there are only 2 tests

Here fp needs to be traced back to fp_less which I doubt the FuncCastEmulation will handle. In a real-life case, they belong to two different modules, being sfp and the call to sfp.fp in GLib, for example, and the casted function in GStreamer. The definition of sfp.fp is a pointer (int) without type information, at least on the IR generated. If the logic has to be done on the call (sfp.fp) and not the cast on sfp's field, I don't know how it is going to be done properly with FuncCastEmulation.

FuncCastEmulation doesn't actually operate on casts, it actually operates only at callsites. Function pointers always have their usual values no matter how they are generated, casted, or passed around; the transformation only happens at call time. This is why it works across modules despite being so simple. I just want to reiterate again how much simpler a solution that operates at the callsite is compared to trying to trace the origin of all the pointers and casts. Currently there are only 2 tests, but there would need to be a lot more to cover the various ways function pointers can be materialized and casted, including ones that have nothing to do with the GLib use case, because we need to make sure we can adequately describe the behavior of the language change. This is another simplicity advantage of solutions that work on the wasm level; the language is simpler, so the transformation is simpler and easier to understand, describe, and test, and there's less risk of issues when e.g. new frontend or C++ features (or LLVM features like ptr, for that matter) are added or changed.

@dschuff
Copy link
Member

dschuff commented Oct 17, 2025

(BTW I'm going to be out for the next week, and then at the Wasm CG meeting the following week, so I probably won't get back to this until after that. I will get back to it though, despite my pushback here, I still want to come up with something here that works.).

@turran
Copy link
Author

turran commented Oct 18, 2025

I still have concerns about what exactly the semantics of this change is, and it's a problem that it's not really documented, there are only 2 tests

I added the most common patterns found in GLib in those two examples. At least with that, GLib properly works. I guess there might be other use cases, but I haven't found them yet.

FuncCastEmulation doesn't actually operate on casts, it actually operates only at callsites. Function pointers always have their usual values no matter how they are generated, casted, or passed around; the transformation only happens at call time. This is why it works across modules despite being so simple. I just want to reiterate again how much simpler a solution that operates at the callsite is compared to trying to trace the origin of all the pointers and casts. Currently there are only 2 tests, but there would need to be a lot more to cover the various ways function pointers can be materialized and casted, including ones that have nothing to do with the GLib use case, because we need to make sure we can adequately describe the behavior of the language change. This is another simplicity advantage of solutions that work on the wasm level; the language is simpler, so the transformation is simpler and easier to understand, describe, and test, and there's less risk of issues when e.g. new frontend or C++ features (or LLVM features like ptr, for that matter) are added or changed.

I took a look again to EMULATE_FUNCTION_POINTER_CASTS and both examples work correctly. So something else is missing to be able to make GLib run properly. The examples are very narrow cases and don't expose the multi. c file problem, for example, but might be another thing

Checking the generated wasm/wat, IIUC their approach, it wraps every indirect call with another call with a maximum number of arguments of type i64 for later pushing them into the stack and calling the real indirect call afterwards. The real indirect call will just pop the desired arguments, and everything will simply work. It is fixing the problem with a hammer IMHO, making all indirect calls slower when they are not really needed. I understand the benefit, but the cost is very high.

@dschuff
Copy link
Member

dschuff commented Oct 18, 2025

Indeed, this is why I'm hoping we can make something explicit and targeted like a compiler builtin, rather than having to blindly transform the whole program.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:WebAssembly clang:codegen IR generation bugs: mangling, exceptions, etc. clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants