Skip to content

[IRMover] Don't consider opaque types isomorphic to other types #138241

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nikic
Copy link
Contributor

@nikic nikic commented May 2, 2025

The type mapping in IRMover currently has a decent amount of complexity related to establishing isomorphism between opaque struct types and non-opaque struct types. I believe that this is both largely useless at this point (after some recent clarifications, essentially the only place where opaque types can still appear are external gobals) and has never been entirely correct in the first place (because it does this in part purely based on name rather than use, which means that we effectively end up assigning semantics to the type name, which is illegal).

As such, I'd like to remove this functionality entirely.

The type mapping in IRMover currently has a decent amount of
complexity related to establishing isomorphism between opaque
struct types and non-opaque struct types. I believe that this is
both largely useless at this point (after some recent clarifications,
essentially the only place where opaque types can still appear are
external gobals) and has never been entirely correct in the first
place (because it does this in part purely based on name rather
than use, which means that we effectively end up assigning
semantics to the type name, which is illegal).

As such, I'd like to remove this functionality entirely.
@nikic nikic requested review from jayfoad and teresajohnson May 2, 2025 08:52
@llvmbot llvmbot added the LTO Link time optimization (regular/full LTO or ThinLTO) label May 2, 2025
@llvmbot
Copy link
Member

llvmbot commented May 2, 2025

@llvm/pr-subscribers-lto

Author: Nikita Popov (nikic)

Changes

The type mapping in IRMover currently has a decent amount of complexity related to establishing isomorphism between opaque struct types and non-opaque struct types. I believe that this is both largely useless at this point (after some recent clarifications, essentially the only place where opaque types can still appear are external gobals) and has never been entirely correct in the first place (because it does this in part purely based on name rather than use, which means that we effectively end up assigning semantics to the type name, which is illegal).

As such, I'd like to remove this functionality entirely.


Full diff: https://github.com/llvm/llvm-project/pull/138241.diff

4 Files Affected:

  • (modified) llvm/lib/Linker/IRMover.cpp (+18-98)
  • (modified) llvm/test/Linker/opaque.ll (+4-3)
  • (modified) llvm/test/Linker/pr22807.ll (+4-2)
  • (modified) llvm/test/Linker/type-unique-dst-types.ll (+4-8)
diff --git a/llvm/lib/Linker/IRMover.cpp b/llvm/lib/Linker/IRMover.cpp
index 3df3ee84ce214..c07a538561a01 100644
--- a/llvm/lib/Linker/IRMover.cpp
+++ b/llvm/lib/Linker/IRMover.cpp
@@ -49,21 +49,6 @@ class TypeMapTy : public ValueMapTypeRemapper {
   /// This is a mapping from a source type to a destination type to use.
   DenseMap<Type *, Type *> MappedTypes;
 
-  /// When checking to see if two subgraphs are isomorphic, we speculatively
-  /// add types to MappedTypes, but keep track of them here in case we need to
-  /// roll back.
-  SmallVector<Type *, 16> SpeculativeTypes;
-
-  SmallVector<StructType *, 16> SpeculativeDstOpaqueTypes;
-
-  /// This is a list of non-opaque structs in the source module that are mapped
-  /// to an opaque struct in the destination module.
-  SmallVector<StructType *, 16> SrcDefinitionsToResolve;
-
-  /// This is the set of opaque types in the destination modules who are
-  /// getting a body from the source module.
-  SmallPtrSet<StructType *, 16> DstResolvedOpaqueTypes;
-
 public:
   TypeMapTy(IRMover::IdentifiedStructTypeSet &DstStructTypesSet)
       : DstStructTypesSet(DstStructTypesSet) {}
@@ -73,10 +58,6 @@ class TypeMapTy : public ValueMapTypeRemapper {
   /// equivalent to the specified type in the source module.
   void addTypeMapping(Type *DstTy, Type *SrcTy);
 
-  /// Produce a body for an opaque type in the dest module from a type
-  /// definition in the source module.
-  Error linkDefinedTypeBodies();
-
   /// Return the mapped type to use for the specified input type from the
   /// source module.
   Type *get(Type *SrcTy);
@@ -93,35 +74,8 @@ class TypeMapTy : public ValueMapTypeRemapper {
 }
 
 void TypeMapTy::addTypeMapping(Type *DstTy, Type *SrcTy) {
-  assert(SpeculativeTypes.empty());
-  assert(SpeculativeDstOpaqueTypes.empty());
-
-  // Check to see if these types are recursively isomorphic and establish a
-  // mapping between them if so.
-  if (!areTypesIsomorphic(DstTy, SrcTy)) {
-    // Oops, they aren't isomorphic.  Just discard this request by rolling out
-    // any speculative mappings we've established.
-    for (Type *Ty : SpeculativeTypes)
-      MappedTypes.erase(Ty);
-
-    SrcDefinitionsToResolve.resize(SrcDefinitionsToResolve.size() -
-                                   SpeculativeDstOpaqueTypes.size());
-    for (StructType *Ty : SpeculativeDstOpaqueTypes)
-      DstResolvedOpaqueTypes.erase(Ty);
-  } else {
-    // SrcTy and DstTy are recursively ismorphic. We clear names of SrcTy
-    // and all its descendants to lower amount of renaming in LLVM context
-    // Renaming occurs because we load all source modules to the same context
-    // and declaration with existing name gets renamed (i.e Foo -> Foo.42).
-    // As a result we may get several different types in the destination
-    // module, which are in fact the same.
-    for (Type *Ty : SpeculativeTypes)
-      if (auto *STy = dyn_cast<StructType>(Ty))
-        if (STy->hasName())
-          STy->setName("");
-  }
-  SpeculativeTypes.clear();
-  SpeculativeDstOpaqueTypes.clear();
+  // areTypesIsomorphic() will also establish the type mapping as a side effect.
+  areTypesIsomorphic(DstTy, SrcTy);
 }
 
 /// Recursively walk this pair of types, returning true if they are isomorphic,
@@ -145,29 +99,10 @@ bool TypeMapTy::areTypesIsomorphic(Type *DstTy, Type *SrcTy) {
 
   // Okay, we have two types with identical kinds that we haven't seen before.
 
-  // If this is an opaque struct type, special case it.
+  // Always consider opaque struct types non-isomorphic.
   if (StructType *SSTy = dyn_cast<StructType>(SrcTy)) {
-    // Mapping an opaque type to any struct, just keep the dest struct.
-    if (SSTy->isOpaque()) {
-      Entry = DstTy;
-      SpeculativeTypes.push_back(SrcTy);
-      return true;
-    }
-
-    // Mapping a non-opaque source type to an opaque dest.  If this is the first
-    // type that we're mapping onto this destination type then we succeed.  Keep
-    // the dest, but fill it in later. If this is the second (different) type
-    // that we're trying to map onto the same opaque type then we fail.
-    if (cast<StructType>(DstTy)->isOpaque()) {
-      // We can only map one source type onto the opaque destination type.
-      if (!DstResolvedOpaqueTypes.insert(cast<StructType>(DstTy)).second)
-        return false;
-      SrcDefinitionsToResolve.push_back(SSTy);
-      SpeculativeTypes.push_back(SrcTy);
-      SpeculativeDstOpaqueTypes.push_back(cast<StructType>(DstTy));
-      Entry = DstTy;
-      return true;
-    }
+    if (SSTy->isOpaque() || cast<StructType>(DstTy)->isOpaque())
+      return false;
   }
 
   // If the number of subtypes disagree between the two types, then we fail.
@@ -196,38 +131,27 @@ bool TypeMapTy::areTypesIsomorphic(Type *DstTy, Type *SrcTy) {
       return false;
   }
 
-  // Otherwise, we speculate that these two types will line up and recursively
-  // check the subelements.
-  Entry = DstTy;
-  SpeculativeTypes.push_back(SrcTy);
-
+  // Recursively check the subelements.
   for (unsigned I = 0, E = SrcTy->getNumContainedTypes(); I != E; ++I)
     if (!areTypesIsomorphic(DstTy->getContainedType(I),
                             SrcTy->getContainedType(I)))
       return false;
 
   // If everything seems to have lined up, then everything is great.
-  return true;
-}
+  [[maybe_unused]] auto Res = MappedTypes.insert({SrcTy, DstTy});
+  assert(!Res.second && "Recursive type?");
 
-Error TypeMapTy::linkDefinedTypeBodies() {
-  SmallVector<Type *, 16> Elements;
-  for (StructType *SrcSTy : SrcDefinitionsToResolve) {
-    StructType *DstSTy = cast<StructType>(MappedTypes[SrcSTy]);
-    assert(DstSTy->isOpaque());
-
-    // Map the body of the source type over to a new body for the dest type.
-    Elements.resize(SrcSTy->getNumElements());
-    for (unsigned I = 0, E = Elements.size(); I != E; ++I)
-      Elements[I] = get(SrcSTy->getElementType(I));
-
-    if (auto E = DstSTy->setBodyOrError(Elements, SrcSTy->isPacked()))
-      return E;
-    DstStructTypesSet.switchToNonOpaque(DstSTy);
+  if (auto *STy = dyn_cast<StructType>(SrcTy)) {
+    // We clear name of SrcTy to lower amount of renaming in LLVM context.
+    // Renaming occurs because we load all source modules to the same context
+    // and declaration with existing name gets renamed (i.e Foo -> Foo.42).
+    // As a result we may get several different types in the destination
+    // module, which are in fact the same.
+    if (STy->hasName())
+      STy->setName("");
   }
-  SrcDefinitionsToResolve.clear();
-  DstResolvedOpaqueTypes.clear();
-  return Error::success();
+
+  return true;
 }
 
 Type *TypeMapTy::get(Type *Ty) {
@@ -845,10 +769,6 @@ void IRLinker::computeTypeMapping() {
     if (TypeMap.DstStructTypesSet.hasType(DST))
       TypeMap.addTypeMapping(DST, ST);
   }
-
-  // Now that we have discovered all of the type equivalences, get a body for
-  // any 'opaque' types in the dest module that are now resolved.
-  setError(TypeMap.linkDefinedTypeBodies());
 }
 
 static void getArrayElements(const Constant *C,
diff --git a/llvm/test/Linker/opaque.ll b/llvm/test/Linker/opaque.ll
index 7d2c49af7797f..2b66e3d8c7dd1 100644
--- a/llvm/test/Linker/opaque.ll
+++ b/llvm/test/Linker/opaque.ll
@@ -1,6 +1,7 @@
 ; RUN: llvm-link %p/opaque.ll %p/Inputs/opaque.ll -S -o - | FileCheck %s
 
-; CHECK-DAG: %A =   type {}
+; CHECK-DAG: %A =   type opaque
+; CHECK-DAG: %A.0 = type {}
 ; CHECK-DAG: %B =   type { %C, %C, ptr }
 ; CHECK-DAG: %B.1 = type { %D, %E, ptr }
 ; CHECK-DAG: %C =   type { %A }
@@ -8,10 +9,10 @@
 ; CHECK-DAG: %E =   type opaque
 
 ; CHECK-DAG: @g1 = external global %B
-; CHECK-DAG: @g2 = external global %A
+; CHECK-DAG: @g2 = external global %A.0
 ; CHECK-DAG: @g3 = external global %B.1
 
-; CHECK-DAG: getelementptr %A, ptr null, i32 0
+; CHECK-DAG: getelementptr %A.0, ptr null, i32 0
 
 %A = type opaque
 %B = type { %C, %C, ptr }
diff --git a/llvm/test/Linker/pr22807.ll b/llvm/test/Linker/pr22807.ll
index 24bcf1732112b..69944b84cd50a 100644
--- a/llvm/test/Linker/pr22807.ll
+++ b/llvm/test/Linker/pr22807.ll
@@ -1,6 +1,8 @@
-; RUN: not llvm-link -S -o - %p/pr22807.ll %p/Inputs/pr22807.ll 2>&1 | FileCheck %s
+; RUN: llvm-link -S -o - %p/pr22807.ll %p/Inputs/pr22807.ll 2>&1 | FileCheck %s
 
-; CHECK: error: identified structure type 'struct.A' is recursive
+; CHECK: %struct.B = type { %struct.A }
+; CHECK: %struct.A = type opaque
+; CHECK: @g = external global %struct.B
 
 %struct.B = type { %struct.A }
 %struct.A = type opaque
diff --git a/llvm/test/Linker/type-unique-dst-types.ll b/llvm/test/Linker/type-unique-dst-types.ll
index c5fcbd3f51fdb..54f0ef40333b2 100644
--- a/llvm/test/Linker/type-unique-dst-types.ll
+++ b/llvm/test/Linker/type-unique-dst-types.ll
@@ -2,21 +2,17 @@
 ; RUN:           %p/Inputs/type-unique-dst-types2.ll \
 ; RUN:           %p/Inputs/type-unique-dst-types3.ll -S -o %t1.ll
 ; RUN: cat %t1.ll | FileCheck %s
-; RUN: cat %t1.ll | FileCheck --check-prefix=RENAMED %s
 
-; This tests the importance of keeping track of which types are part of the
-; destination module.
-; When the second input is merged in, the context gets an unused A.11. When
-; the third module is then merged, we should pretend it doesn't exist.
+; The types of @g1 and @g3 can be deduplicated, but @g2 should retain its
+; opaque type, even if it has the same name as a type from a different module.
 
 ; CHECK: %A = type { %B }
 ; CHECK-NEXT: %B = type { i8 }
+; CHECK-NEXT: %A.11 = type opaque
 
 ; CHECK: @g3 = external global %A
 ; CHECK: @g1 = external global %A
-; CHECK: @g2 = external global %A
-
-; RENAMED-NOT: A.11
+; CHECK: @g2 = external global %A.11
 
 %A = type { %B }
 %B = type { i8 }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test illustrates why the existing behavior is wrong. Inputs/type-unique-dst-types3.ll is:

%A.11 = type opaque
@g2 = external global %A.11

define ptr @use_g2() {
  ret ptr @g2
}

And @g2 is not mentioned by the other linked files. As such, LLVM IR semantics do not make any statement about the type (and thus size) of @g2. The fact that a %A.11 type exists in a different module should be completely irrelevant in this context.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants