Skip to content

[WebAssembly] Don't remove function signatures in ThinLTOBitcodeWriter #126552

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

aheejin
Copy link
Member

@aheejin aheejin commented Feb 10, 2025

In the split LTO unit mode in ThinLTO, functions that meet the a specific criteria are moved to the split module, but also the declarations for the other functions that are still in the oritinal module are written to the split module.

In simplifyExternals in ThinLTOBitcodeWriter.cpp, the function signatures are removed from those declarations in the split module, presumably in order to reduce code size or they are not necessary for other targets. But Wasm needs them, because if function signatures don't match from two modules, function signature mismatch warnings like this occur in the linker:

wasm-ld: warning: function signature mismatch:
_ZN12_GLOBAL__N_116itanium_demangle19PointerToMemberTypeD0Ev.cb45941bcc851b6faf99a8b8c1ca5f22
>>> defined as () -> void in
/usr/local/google/home/aheejin/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++abi-wasmexcept.a(cxa_demangle.o)
  >>> defined as (i32) -> void in lto.tmp

And this can result in an error at runtime.

This skips the part that removes function signatures from simplifyExternals when the architecture is Wasm.

In the split LTO unit mode in ThinLTO, functions that meet the a
specific criteria are moved to the split module, but also the
declarations for the other functions that are still in the oritinal
module are written to the split module.

In `simplifyExternals` in `ThinLTOBitcodeWriter.cpp`, the function
signatures are removed from those declarations in the split module,
presumably in order to reduce code size or they are not necessary for
other targets. But Wasm needs them, because if function signatures don't
match from two modules, function signature mismatch warnings like this
occur in the linker:
```
wasm-ld: warning: function signature mismatch:
_ZN12_GLOBAL__N_116itanium_demangle19PointerToMemberTypeD0Ev.cb45941bcc851b6faf99a8b8c1ca5f22
>>> defined as () -> void in
/usr/local/google/home/aheejin/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++abi-wasmexcept.a(cxa_demangle.o)
  >>> defined as (i32) -> void in lto.tmp
```
And this can result in an error at runtime.

This skips the part that removes function signatures from
`simplifyExternals` when the architecture is Wasm.
@aheejin aheejin requested review from pcc and sbc100 February 10, 2025 17:29
@llvmbot llvmbot added LTO Link time optimization (regular/full LTO or ThinLTO) llvm:transforms labels Feb 10, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 10, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Heejin Ahn (aheejin)

Changes

In the split LTO unit mode in ThinLTO, functions that meet the a specific criteria are moved to the split module, but also the declarations for the other functions that are still in the oritinal module are written to the split module.

In simplifyExternals in ThinLTOBitcodeWriter.cpp, the function signatures are removed from those declarations in the split module, presumably in order to reduce code size or they are not necessary for other targets. But Wasm needs them, because if function signatures don't match from two modules, function signature mismatch warnings like this occur in the linker:

wasm-ld: warning: function signature mismatch:
_ZN12_GLOBAL__N_116itanium_demangle19PointerToMemberTypeD0Ev.cb45941bcc851b6faf99a8b8c1ca5f22
>>> defined as () -> void in
/usr/local/google/home/aheejin/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++abi-wasmexcept.a(cxa_demangle.o)
  >>> defined as (i32) -> void in lto.tmp

And this can result in an error at runtime.

This skips the part that removes function signatures from simplifyExternals when the architecture is Wasm.


Full diff: https://github.com/llvm/llvm-project/pull/126552.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp (+20-19)
  • (added) llvm/test/ThinLTO/WebAssembly/lit.local.cfg (+2)
  • (added) llvm/test/ThinLTO/WebAssembly/thinlto-split-lto-unit.ll (+29)
diff --git a/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp b/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
index cd0e412bdf353b2..2dc60e8f168e650 100644
--- a/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
+++ b/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
@@ -178,28 +178,29 @@ void simplifyExternals(Module &M) {
   FunctionType *EmptyFT =
       FunctionType::get(Type::getVoidTy(M.getContext()), false);
 
-  for (Function &F : llvm::make_early_inc_range(M)) {
-    if (F.isDeclaration() && F.use_empty()) {
+  for (Function &F : llvm::make_early_inc_range(M))
+    if (F.isDeclaration() && F.use_empty())
       F.eraseFromParent();
-      continue;
-    }
 
-    if (!F.isDeclaration() || F.getFunctionType() == EmptyFT ||
-        // Changing the type of an intrinsic may invalidate the IR.
-        F.getName().starts_with("llvm."))
-      continue;
+  // Wasm should preserve function signatures for the linker.
+  if (!Triple(M.getTargetTriple()).isWasm()) {
+    for (Function &F : llvm::make_early_inc_range(M)) {
+      if (!F.isDeclaration() || F.getFunctionType() == EmptyFT ||
+          // Changing the type of an intrinsic may invalidate the IR.
+          F.getName().starts_with("llvm."))
+        continue;
 
-    Function *NewF =
-        Function::Create(EmptyFT, GlobalValue::ExternalLinkage,
-                         F.getAddressSpace(), "", &M);
-    NewF->copyAttributesFrom(&F);
-    // Only copy function attribtues.
-    NewF->setAttributes(AttributeList::get(M.getContext(),
-                                           AttributeList::FunctionIndex,
-                                           F.getAttributes().getFnAttrs()));
-    NewF->takeName(&F);
-    F.replaceAllUsesWith(NewF);
-    F.eraseFromParent();
+      Function *NewF = Function::Create(EmptyFT, GlobalValue::ExternalLinkage,
+                                        F.getAddressSpace(), "", &M);
+      NewF->copyAttributesFrom(&F);
+      // Only copy function attribtues.
+      NewF->setAttributes(AttributeList::get(M.getContext(),
+                                             AttributeList::FunctionIndex,
+                                             F.getAttributes().getFnAttrs()));
+      NewF->takeName(&F);
+      F.replaceAllUsesWith(NewF);
+      F.eraseFromParent();
+    }
   }
 
   for (GlobalIFunc &I : llvm::make_early_inc_range(M.ifuncs())) {
diff --git a/llvm/test/ThinLTO/WebAssembly/lit.local.cfg b/llvm/test/ThinLTO/WebAssembly/lit.local.cfg
new file mode 100644
index 000000000000000..d5f39ab4dbc8ca2
--- /dev/null
+++ b/llvm/test/ThinLTO/WebAssembly/lit.local.cfg
@@ -0,0 +1,2 @@
+if not "WebAssembly" in config.root.targets:
+    config.unsupported = True
diff --git a/llvm/test/ThinLTO/WebAssembly/thinlto-split-lto-unit.ll b/llvm/test/ThinLTO/WebAssembly/thinlto-split-lto-unit.ll
new file mode 100644
index 000000000000000..201aa81c3007371
--- /dev/null
+++ b/llvm/test/ThinLTO/WebAssembly/thinlto-split-lto-unit.ll
@@ -0,0 +1,29 @@
+; RUN: opt -thinlto-bc -thinlto-split-lto-unit %s -o %t.o
+; RUN: llvm-modextract -b -n 0 %t.o -o - | llvm-dis | FileCheck %s --check-prefix=UNIT0
+; RUN: llvm-modextract -b -n 1 %t.o -o - | llvm-dis | FileCheck %s --check-prefix=UNIT1
+
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128-ni:1:10:20"
+target triple = "wasm32-unknown-unknown"
+
+; After split LTO unit generation, @test_gv will move to UNIT1 (because it has a
+; type metadata), and because it refers to @test, @test will be promoted by
+; adding its module ID hash to its name, and .lto_set_conditional directive will
+; be written in Unit 0.
+
+; UNIT0: module asm ".lto_set_conditional test,test.{{[0-9a-f]+}}"
+; UNIT0: define hidden i32 @test.{{[0-9a-f]+}}()
+
+; Unit 1 will contain @test's declaration. The normal ThinLTO split bitcode
+; writing removes the signatures from the split unit's declaration, but in Wasm
+; you should not do this in order to avoid function signature mismatch in
+; linker.
+
+; UNIT1: declare hidden i32 @test.{{[0-9a-f]+}}()
+
+@test_gv = constant ptr @test, align 4, !type !0
+
+define internal i32 @test() {
+  ret i32 0
+}
+
+!0 = !{i64 0, !{}}

@llvmbot
Copy link
Member

llvmbot commented Feb 10, 2025

@llvm/pr-subscribers-lto

Author: Heejin Ahn (aheejin)

Changes

In the split LTO unit mode in ThinLTO, functions that meet the a specific criteria are moved to the split module, but also the declarations for the other functions that are still in the oritinal module are written to the split module.

In simplifyExternals in ThinLTOBitcodeWriter.cpp, the function signatures are removed from those declarations in the split module, presumably in order to reduce code size or they are not necessary for other targets. But Wasm needs them, because if function signatures don't match from two modules, function signature mismatch warnings like this occur in the linker:

wasm-ld: warning: function signature mismatch:
_ZN12_GLOBAL__N_116itanium_demangle19PointerToMemberTypeD0Ev.cb45941bcc851b6faf99a8b8c1ca5f22
>>> defined as () -> void in
/usr/local/google/home/aheejin/emscripten/cache/sysroot/lib/wasm32-emscripten/thinlto/libc++abi-wasmexcept.a(cxa_demangle.o)
  >>> defined as (i32) -> void in lto.tmp

And this can result in an error at runtime.

This skips the part that removes function signatures from simplifyExternals when the architecture is Wasm.


Full diff: https://github.com/llvm/llvm-project/pull/126552.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp (+20-19)
  • (added) llvm/test/ThinLTO/WebAssembly/lit.local.cfg (+2)
  • (added) llvm/test/ThinLTO/WebAssembly/thinlto-split-lto-unit.ll (+29)
diff --git a/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp b/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
index cd0e412bdf353b2..2dc60e8f168e650 100644
--- a/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
+++ b/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
@@ -178,28 +178,29 @@ void simplifyExternals(Module &M) {
   FunctionType *EmptyFT =
       FunctionType::get(Type::getVoidTy(M.getContext()), false);
 
-  for (Function &F : llvm::make_early_inc_range(M)) {
-    if (F.isDeclaration() && F.use_empty()) {
+  for (Function &F : llvm::make_early_inc_range(M))
+    if (F.isDeclaration() && F.use_empty())
       F.eraseFromParent();
-      continue;
-    }
 
-    if (!F.isDeclaration() || F.getFunctionType() == EmptyFT ||
-        // Changing the type of an intrinsic may invalidate the IR.
-        F.getName().starts_with("llvm."))
-      continue;
+  // Wasm should preserve function signatures for the linker.
+  if (!Triple(M.getTargetTriple()).isWasm()) {
+    for (Function &F : llvm::make_early_inc_range(M)) {
+      if (!F.isDeclaration() || F.getFunctionType() == EmptyFT ||
+          // Changing the type of an intrinsic may invalidate the IR.
+          F.getName().starts_with("llvm."))
+        continue;
 
-    Function *NewF =
-        Function::Create(EmptyFT, GlobalValue::ExternalLinkage,
-                         F.getAddressSpace(), "", &M);
-    NewF->copyAttributesFrom(&F);
-    // Only copy function attribtues.
-    NewF->setAttributes(AttributeList::get(M.getContext(),
-                                           AttributeList::FunctionIndex,
-                                           F.getAttributes().getFnAttrs()));
-    NewF->takeName(&F);
-    F.replaceAllUsesWith(NewF);
-    F.eraseFromParent();
+      Function *NewF = Function::Create(EmptyFT, GlobalValue::ExternalLinkage,
+                                        F.getAddressSpace(), "", &M);
+      NewF->copyAttributesFrom(&F);
+      // Only copy function attribtues.
+      NewF->setAttributes(AttributeList::get(M.getContext(),
+                                             AttributeList::FunctionIndex,
+                                             F.getAttributes().getFnAttrs()));
+      NewF->takeName(&F);
+      F.replaceAllUsesWith(NewF);
+      F.eraseFromParent();
+    }
   }
 
   for (GlobalIFunc &I : llvm::make_early_inc_range(M.ifuncs())) {
diff --git a/llvm/test/ThinLTO/WebAssembly/lit.local.cfg b/llvm/test/ThinLTO/WebAssembly/lit.local.cfg
new file mode 100644
index 000000000000000..d5f39ab4dbc8ca2
--- /dev/null
+++ b/llvm/test/ThinLTO/WebAssembly/lit.local.cfg
@@ -0,0 +1,2 @@
+if not "WebAssembly" in config.root.targets:
+    config.unsupported = True
diff --git a/llvm/test/ThinLTO/WebAssembly/thinlto-split-lto-unit.ll b/llvm/test/ThinLTO/WebAssembly/thinlto-split-lto-unit.ll
new file mode 100644
index 000000000000000..201aa81c3007371
--- /dev/null
+++ b/llvm/test/ThinLTO/WebAssembly/thinlto-split-lto-unit.ll
@@ -0,0 +1,29 @@
+; RUN: opt -thinlto-bc -thinlto-split-lto-unit %s -o %t.o
+; RUN: llvm-modextract -b -n 0 %t.o -o - | llvm-dis | FileCheck %s --check-prefix=UNIT0
+; RUN: llvm-modextract -b -n 1 %t.o -o - | llvm-dis | FileCheck %s --check-prefix=UNIT1
+
+target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128-ni:1:10:20"
+target triple = "wasm32-unknown-unknown"
+
+; After split LTO unit generation, @test_gv will move to UNIT1 (because it has a
+; type metadata), and because it refers to @test, @test will be promoted by
+; adding its module ID hash to its name, and .lto_set_conditional directive will
+; be written in Unit 0.
+
+; UNIT0: module asm ".lto_set_conditional test,test.{{[0-9a-f]+}}"
+; UNIT0: define hidden i32 @test.{{[0-9a-f]+}}()
+
+; Unit 1 will contain @test's declaration. The normal ThinLTO split bitcode
+; writing removes the signatures from the split unit's declaration, but in Wasm
+; you should not do this in order to avoid function signature mismatch in
+; linker.
+
+; UNIT1: declare hidden i32 @test.{{[0-9a-f]+}}()
+
+@test_gv = constant ptr @test, align 4, !type !0
+
+define internal i32 @test() {
+  ret i32 0
+}
+
+!0 = !{i64 0, !{}}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants