Skip to content

[LTO] Fix a crash with thin LTO caching and asm output #138203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 14, 2025

Conversation

quic-akaryaki
Copy link
Contributor

The commit function of CacheFile deletes the underlying raw stream. Some output streamers may hold a pointer to it, which in this case will outlive the stream object. In particular, MCAsmStreamer keeps the pointer to the raw stream though formatted_raw_stream, which calls flush() in its destructor. If commit is called before MCAsmStreamer destruction, it will access freed memory.

Fixes: #138194.

@quic-akaryaki quic-akaryaki added the LTO Link time optimization (regular/full LTO or ThinLTO) label May 1, 2025
@llvmbot
Copy link
Member

llvmbot commented May 1, 2025

@llvm/pr-subscribers-lto

Author: Alexey Karyakin (quic-akaryaki)

Changes

The commit function of CacheFile deletes the underlying raw stream. Some output streamers may hold a pointer to it, which in this case will outlive the stream object. In particular, MCAsmStreamer keeps the pointer to the raw stream though formatted_raw_stream, which calls flush() in its destructor. If commit is called before MCAsmStreamer destruction, it will access freed memory.

Fixes: #138194.


Full diff: https://github.com/llvm/llvm-project/pull/138203.diff

1 Files Affected:

  • (modified) llvm/lib/LTO/LTOBackend.cpp (+27-21)
diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp
index dd9197efa7718..905e841291cf8 100644
--- a/llvm/lib/LTO/LTOBackend.cpp
+++ b/llvm/lib/LTO/LTOBackend.cpp
@@ -439,27 +439,33 @@ static void codegen(const Config &Conf, TargetMachine *TM,
   std::unique_ptr<CachedFileStream> &Stream = *StreamOrErr;
   TM->Options.ObjectFilenameForDebug = Stream->ObjectPathName;
 
-  legacy::PassManager CodeGenPasses;
-  TargetLibraryInfoImpl TLII(Mod.getTargetTriple());
-  CodeGenPasses.add(new TargetLibraryInfoWrapperPass(TLII));
-  // No need to make index available if the module is empty.
-  // In theory these passes should not use the index for an empty
-  // module, however, this guards against doing any unnecessary summary-based
-  // analysis in the case of a ThinLTO build where this might be an empty
-  // regular LTO combined module, with a large combined index from ThinLTO.
-  if (!isEmptyModule(Mod))
-    CodeGenPasses.add(
-        createImmutableModuleSummaryIndexWrapperPass(&CombinedIndex));
-  if (Conf.PreCodeGenPassesHook)
-    Conf.PreCodeGenPassesHook(CodeGenPasses);
-  if (TM->addPassesToEmitFile(CodeGenPasses, *Stream->OS,
-                              DwoOut ? &DwoOut->os() : nullptr,
-                              Conf.CGFileType))
-    report_fatal_error("Failed to setup codegen");
-  CodeGenPasses.run(Mod);
-
-  if (DwoOut)
-    DwoOut->keep();
+  // Create the LTO pipeline in its own scope so it gets deleted before
+  // Stream->commit() is called. The commit function of CacheFile deletes
+  // the raw stream, which is too early as streamers (e.g. MCAsmStreamer)
+  // keep the pointer and may use it until their destruction. See #138194.
+  {
+    legacy::PassManager CodeGenPasses;
+    TargetLibraryInfoImpl TLII(Mod.getTargetTriple());
+    CodeGenPasses.add(new TargetLibraryInfoWrapperPass(TLII));
+    // No need to make index available if the module is empty.
+    // In theory these passes should not use the index for an empty
+    // module, however, this guards against doing any unnecessary summary-based
+    // analysis in the case of a ThinLTO build where this might be an empty
+    // regular LTO combined module, with a large combined index from ThinLTO.
+    if (!isEmptyModule(Mod))
+      CodeGenPasses.add(
+          createImmutableModuleSummaryIndexWrapperPass(&CombinedIndex));
+    if (Conf.PreCodeGenPassesHook)
+      Conf.PreCodeGenPassesHook(CodeGenPasses);
+    if (TM->addPassesToEmitFile(CodeGenPasses, *Stream->OS,
+                                DwoOut ? &DwoOut->os() : nullptr,
+                                Conf.CGFileType))
+      report_fatal_error("Failed to setup codegen");
+    CodeGenPasses.run(Mod);
+
+    if (DwoOut)
+      DwoOut->keep();
+  }
 
   if (Error Err = Stream->commit())
     report_fatal_error(std::move(Err));

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add a test for this? (At least one that would fail in an asan build?)

@teresajohnson
Copy link
Contributor

Is this fall out of #136121? If so, I wonder if that PR should be reverted instead - it turns out that went in without a full review / approval, and if it is causing subtle bugs I would prefer to revert that one.

@quic-akaryaki
Copy link
Contributor Author

Is it possible to add a test for this? (At least one that would fail in an asan build?)

I have added a test.

@quic-akaryaki
Copy link
Contributor Author

Is this fall out of #136121? If so, I wonder if that PR should be reverted instead - it turns out that went in without a full review / approval, and if it is causing subtle bugs I would prefer to revert that one.

I don't have a strong opinion, someone has to make a decision. If #136121 is reverted, I would like to keep the test.

@teresajohnson
Copy link
Contributor

Is this fall out of #136121? If so, I wonder if that PR should be reverted instead - it turns out that went in without a full review / approval, and if it is causing subtle bugs I would prefer to revert that one.

I don't have a strong opinion, someone has to make a decision. If #136121 is reverted, I would like to keep the test.

Keeping the test is good. What is the failure mode of the test without your fix? I've asked for a revert of PR136121 as it was supposed to make error handling easier, but here it sounds like there was a much more subtle issue produced by that change.

@quic-akaryaki
Copy link
Contributor Author

Is this fall out of #136121? If so, I wonder if that PR should be reverted instead - it turns out that went in without a full review / approval, and if it is causing subtle bugs I would prefer to revert that one.

I don't have a strong opinion, someone has to make a decision. If #136121 is reverted, I would like to keep the test.

Keeping the test is good. What is the failure mode of the test without your fix? I've asked for a revert of PR136121 as it was supposed to make error handling easier, but here it sounds like there was a much more subtle issue produced by that change.

Without the "fix", the test crashes.

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this again and at the commit function, I'm not really sure how they are related. Can you clarify (see questions below)?

if (DwoOut)
DwoOut->keep();
// Create the LTO pipeline in its own scope so it gets deleted before
// Stream->commit() is called. The commit function of CacheFile deletes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/CacheFile/CachedFileStream/

However, can you clarify the sequence of events causing the crash? Looking at PR136121 and CachedFileStream::commit, that function doesn't actually delete anything. I'm also unclear on how deleting the codegen pass manager early prevents use after the call to commit() further below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CachedFileStream::commit calls OS.reset() which deletes the underlying raw stream. A pointer to this stream is kept by the formatted_raw_ostream created by CodeGenTargetMachineImpl::createMCStreamer(). This is then kept by the MCAsmStreamer object. #136121 does change the point at which the deletion of the stream happens since the legacy::PassManager object is destructed before the std::unique_ptr (since the constructor of the former happens later). With #136121 the commit happens before the legacy::PassManager destructor (and hence the MCAsmStreamer destructor, and hence the formatted_raw_stream destructor, and hence the flush() call). This PR causes the legacy::PassManager destructor to be called before the commit operation.

I have checked the other uses of CachedFileStream::commit() for similar lifetime issues, and did not find any.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CachedFileStream::commit calls OS.reset() which deletes the underlying raw stream

I don't see this?

Ah - the OS.reset() call is in derived class CacheStream::commit, so I missed this earlier (not in the base CachedFileStream nor "CacheFile" noted in the comments of this PR).

The ownership issue and fix seem fragile to me - how do other users of CacheStream avoid this lifetime issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of ownership issue is unfortunately a common pitfall of C++ - short of rewriting the entire compiler in Rust, there isn't really a general way for object A to protect itself against object B taking a pointer to one of A's sub-objects C that outlasts the lifetime that A expected. However, it's no more fragile than it was before #136121 - if the legacy::PassManager constructor had been moved to before the std::unique_ptr<CachedFileStream> constructor, this problem would have been seen even without #136121 .

As for the other users of CacheStream, I just checked that there was no object constructed after the std::unique_ptr<CachedFileStream> constructor which could outlast the commit() call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, it's no more fragile than it was before #136121 - if the legacy::PassManager constructor had been moved to before the std::unique_ptr constructor, this problem would have been seen even without #136121 .

This isn't really the case the way the APIs are set up, however, because the MCAsmStreamer constructor takes an existing raw_pwrite_stream passed down via addPassesToEmitFile(). CacheFileStream is the owner of that raw_pwrite_stream, and which is created at the same time as CacheStream in the caching code at

return std::make_unique<CacheStream>(
std::make_unique<raw_fd_ostream>(Temp->FD, /* ShouldClose */ false),
.

With the change to add the destruction in CacheStream::commit, the lifetime of the underlying raw_pwrite_stream has been somewhat artificially shortened, leading to the non-overlapping lifetimes. Ideally MCAsmStreamer would not become the owner of an object that depends on the lifetime of an object it doesn't own, but that would require quite a bit of change to the interfaces of those functions, which have presumably existed that way for a long time.

Could the original issue be addressed by calling flush() on the owned raw_pwrite_stream in CacheStream::commit() instead of destructing it? If you look at the construction of the raw_fd_ostream object in the code I linked above, it's specifying that the file shouldn't even be closed on destruction, so all it is doing is the flush() afaict.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But formatted_raw_ostream doesn't define its own flush(). Its destructor calls flush() which is the base raw_ostream::flush().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But formatted_raw_ostream doesn't define its own flush(). Its destructor calls flush() which is the base raw_ostream::flush().

Yes, buffering is implemented in the base class (raw_ostream) but it's used by the formatted_raw_ostream object (outer stream). What happens when the buffer is not empty is raw_ostream calls virtual write_imp() which is (re?)defined in formatted_raw_ostream to write pending data to the other (inner) stream.
Sorry it's so complicated.

Copy link
Contributor

@teresajohnson teresajohnson May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see the issue, which you described earlier but I didn't quite follow until poking around more. As you note there are 2 totally separate raw_ostream objects. The first which is owned by CacheFileStream is of derived type raw_fd_stream, and constructed when invoking the CacheStream (and base class CacheFileStream) constructor. The second is formatted_raw_ostream, which is also derived from raw_ostream but a different one, and it holds a pointer to the other raw_ostream object (via data member TheStream). The formatted_raw_ostream::write_impl calls write() on TheStream.

Presumably if CacheStream::commit was changed to invoke flush() instead, because formatted_raw_ostream has not yet been flushed (and therefore TheStream->write() hadn't been called yet on CacheStream's raw_ostream), we don't actually get the file written as expected by commit()?

This is unfortunately gnarly, and I agree that the only quick fixes are the one in this PR or reverting PR136121, or longer term changing some APIs so that e.g. CacheStream owns the formatted_raw_ostream. My preference is to revert PR136121, as improved errors are nice but IMO not worth the introduced complexity. Most of the callers of the new commit() function still call report_fatal_error on the result, so I don't know that it has improved the situation significantly anyway?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to strongly advocate for leaving PR136121 in place and for merging this PR. My reasoning is as follows:

  1. There is a general principle in C++ that destructors should never do anything except release resources - if they perform operations that can fail then those errors cannot be handled except out-of-band (tearing down the process). PR136121 improves the localCache API in this respect. While doing so has uncovered other instances of destructors doing non-release operations, that in itself is not a reason to abandon this engineering principle.
  2. While most of the callers of the new commit() function currently still call report_fatal_error on the result (as I wanted to keep PR136121 as unimpactful as possible), they may wish to improve their error-handling in the future and PR136121 is a necessary component of that. The motivating case is an out-of-tree use https://github.com/ROCm/llvm-project/blob/amd-staging/amd/comgr/src/comgr-cache.cpp for which tearing down the process would be highly undesirable - if that was an unavoidable aspect of the localCache API then ROCm would need to have its own fork of the localCache API (that didn't do the commit in the destructor) with the maintenance headaches that would incur. It seems likely to me that other users of localCache will in the future realise that they have similar requirements.
  3. Reverting PR136121 now would also mean that the other pieces of code that have had to be changed in response to PR136121 would have to be changed back - it would be more churn.
  4. I stand by what I said about there being latent fragility in the codegen() function in llvm/lib/LTO/LTOBackend.cpp . The MCAsmStreamer object is destroyed (and hence the formatted_raw_ostream object) when CodeGenPasses is destroyed. If the CodeGenPasses constructor were in the future to be moved above the Stream constructor then it would be destroyed after the CachedFileStream (the order in which these destructors are run is the reverse of the order in which the corresponding constructors are run - the position of the addPassesToEmitFile() call is not relevant as the end of the lifetime of the MCAsmStreamer object is tied to the end of the lifetime of the legacy::PassManager object and hence the time when that object was constructed). So reverting PR136121 wouldn't actually solve the latent fragility that is the purported reason for reverting it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll go ahead and approve this PR to address the crash, because I think the underlying issue lies in the asm printing passes and their setup.

Specifically, createMCStreamer creates the formatted_raw_ostream, which stores a pointer to the provided raw_ostream reference that it doesn't own, and addAsmPrinter which in turn stores that in the pass manager. All clients of addPassesToEmitFile, which takes the raw_ostream and pass manager references and invokes these APIs, currently have to ensure that the lifetime of the provided raw_ostream object outlasts the lifetime of the pass manager.

I stand by what I said about there being latent fragility in the codegen() function in llvm/lib/LTO/LTOBackend.cpp . The MCAsmStreamer object is destroyed (and hence the formatted_raw_ostream object) when CodeGenPasses is destroyed. If the CodeGenPasses constructor were in the future to be moved above the Stream constructor then it would be destroyed after the CachedFileStream
the position of the addPassesToEmitFile() call is not relevant

Yes, I see what you mean. It is relevant to the problem as it takes both the raw_ostream and the pass manager, and it is down in its handling where the latter takes a pointer to the former, but not to the relative ordering of their construction/destruction. As noted earlier, this issue is not specific to the LTOBackend codegen(), the potential for a problem exists in any client of this code gen pass setup.

@quic-akaryaki
Copy link
Contributor Author

The forced push is just to edit the comment as requested by @teresajohnson

The commit function of CacheStream deletes the underlying raw stream.
Some output streamers may hold a pointer to it, which in this case will
outlive the stream object. In particular, MCAsmStreamer keeps the
pointer to the raw stream though formatted_raw_stream, which calls
flush() in its destructor. If commit is called before MCAsmStreamer
destruction, it will access freed memory.

S
@quic-akaryaki quic-akaryaki force-pushed the lto-thin-crash-commit branch from 1ad72ca to c5402c1 Compare May 12, 2025 22:11
@quic-akaryaki quic-akaryaki merged commit eac7466 into llvm:main May 14, 2025
11 checks passed
@quic-akaryaki quic-akaryaki deleted the lto-thin-crash-commit branch May 14, 2025 15:34
@llvm-ci
Copy link
Collaborator

llvm-ci commented May 14, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-devrel-x86-64 running on ml-opt-devrel-x86-64-b2 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/175/builds/18505

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: ThinLTO/X86/cache-emit-asm.ll' FAILED ********************
Exit Code: 127

Command Output (stderr):
--
rm -rf /b/ml-opt-devrel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache # RUN: at line 5
+ rm -rf /b/ml-opt-devrel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache
/b/ml-opt-devrel-x86-64-b1/build/bin/opt -module-hash -module-summary -thinlto-bc /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /b/ml-opt-devrel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 6
+ /b/ml-opt-devrel-x86-64-b1/build/bin/opt -module-hash -module-summary -thinlto-bc /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /b/ml-opt-devrel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
ld.lld --thinlto-cache-dir=/b/ml-opt-devrel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /b/ml-opt-devrel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 7
+ ld.lld --thinlto-cache-dir=/b/ml-opt-devrel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /b/ml-opt-devrel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
/b/ml-opt-devrel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.script: line 3: ld.lld: command not found

--

********************


@llvm-ci
Copy link
Collaborator

llvm-ci commented May 14, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-rel-x86-64 running on ml-opt-rel-x86-64-b2 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/185/builds/18307

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: ThinLTO/X86/cache-emit-asm.ll' FAILED ********************
Exit Code: 127

Command Output (stderr):
--
rm -rf /b/ml-opt-rel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache # RUN: at line 5
+ rm -rf /b/ml-opt-rel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache
/b/ml-opt-rel-x86-64-b1/build/bin/opt -module-hash -module-summary -thinlto-bc /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /b/ml-opt-rel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 6
+ /b/ml-opt-rel-x86-64-b1/build/bin/opt -module-hash -module-summary -thinlto-bc /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /b/ml-opt-rel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
ld.lld --thinlto-cache-dir=/b/ml-opt-rel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /b/ml-opt-rel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 7
+ ld.lld --thinlto-cache-dir=/b/ml-opt-rel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /b/ml-opt-rel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
/b/ml-opt-rel-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.script: line 3: ld.lld: command not found

--

********************


@llvm-ci
Copy link
Collaborator

llvm-ci commented May 14, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-dev-x86-64 running on ml-opt-dev-x86-64-b1 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/137/builds/18541

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: ThinLTO/X86/cache-emit-asm.ll' FAILED ********************
Exit Code: 127

Command Output (stderr):
--
rm -rf /b/ml-opt-dev-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache # RUN: at line 5
+ rm -rf /b/ml-opt-dev-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache
/b/ml-opt-dev-x86-64-b1/build/bin/opt -module-hash -module-summary -thinlto-bc /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /b/ml-opt-dev-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 6
+ /b/ml-opt-dev-x86-64-b1/build/bin/opt -module-hash -module-summary -thinlto-bc /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /b/ml-opt-dev-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
ld.lld --thinlto-cache-dir=/b/ml-opt-dev-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /b/ml-opt-dev-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 7
+ ld.lld --thinlto-cache-dir=/b/ml-opt-dev-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /b/ml-opt-dev-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
/b/ml-opt-dev-x86-64-b1/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.script: line 3: ld.lld: command not found

--

********************


quic-akaryaki added a commit to quic-akaryaki/llvm-project that referenced this pull request May 14, 2025
This is a follow-up to llvm#138203. The added test used lld but lld is
not always available. Switch to using llvm-lto2.
@llvm-ci
Copy link
Collaborator

llvm-ci commented May 14, 2025

LLVM Buildbot has detected a new failure on builder clang-x86_64-debian-fast running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/25867

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: ThinLTO/X86/cache-emit-asm.ll' FAILED ********************
Exit Code: 127

Command Output (stderr):
--
rm -rf /b/1/clang-x86_64-debian-fast/llvm.obj/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache # RUN: at line 5
+ rm -rf /b/1/clang-x86_64-debian-fast/llvm.obj/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache
/b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -module-hash -module-summary -thinlto-bc /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /b/1/clang-x86_64-debian-fast/llvm.obj/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 6
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -module-hash -module-summary -thinlto-bc /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /b/1/clang-x86_64-debian-fast/llvm.obj/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
ld.lld --thinlto-cache-dir=/b/1/clang-x86_64-debian-fast/llvm.obj/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /b/1/clang-x86_64-debian-fast/llvm.obj/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 7
+ ld.lld --thinlto-cache-dir=/b/1/clang-x86_64-debian-fast/llvm.obj/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /b/1/clang-x86_64-debian-fast/llvm.obj/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
/b/1/clang-x86_64-debian-fast/llvm.obj/test/ThinLTO/X86/Output/cache-emit-asm.ll.script: line 3: ld.lld: command not found

--

********************


quic-akaryaki added a commit that referenced this pull request May 14, 2025
This is a follow-up to #138203. The added test used lld but lld is not
always available, which breaks builds. Make the test use llvm-lto2. Also
make the test a bit more similar to other tests in the same directory.
@llvm-ci
Copy link
Collaborator

llvm-ci commented May 14, 2025

LLVM Buildbot has detected a new failure on builder llvm-nvptx64-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/160/builds/17438

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: ThinLTO/X86/cache-emit-asm.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
rm -rf /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache # RUN: at line 5
+ rm -rf /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/opt -module-hash -module-summary -thinlto-bc /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 6
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/bin/opt -module-hash -module-summary -thinlto-bc /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
ld.lld --thinlto-cache-dir=/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 7
+ ld.lld --thinlto-cache-dir=/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx64-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
ld.lld: error: Invalid summary version 12. Version should be in the range [1-9].

--

********************


@llvm-ci
Copy link
Collaborator

llvm-ci commented May 14, 2025

LLVM Buildbot has detected a new failure on builder llvm-nvptx-nvidia-ubuntu running on as-builder-7 while building llvm at step 6 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/180/builds/17581

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: ThinLTO/X86/cache-emit-asm.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
rm -rf /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache # RUN: at line 5
+ rm -rf /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache
/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/opt -module-hash -module-summary -thinlto-bc /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 6
+ /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/bin/opt -module-hash -module-summary -thinlto-bc /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/llvm-project/llvm/test/ThinLTO/X86/cache-emit-asm.ll -o /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
ld.lld --thinlto-cache-dir=/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc # RUN: at line 7
+ ld.lld --thinlto-cache-dir=/home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp.cache --lto-emit-asm /home/buildbot/worker/as-builder-7/ramdisk/llvm-nvptx-nvidia-ubuntu/build/test/ThinLTO/X86/Output/cache-emit-asm.ll.tmp1.bc
ld.lld: error: Invalid summary version 12. Version should be in the range [1-9].

--

********************


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crash in lld with thin LTO cache and assembly output
6 participants