Skip to content

llvm-reduce: Add values to return reduction #132686

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Mar 24, 2025

In void functions, try to replace instruction uses
with a new non-void return. If the return type matches
the instruction, also try to directly return it.

This handles most of the cases, but doesn't try to handle
all of the weird exception related terminators.

Also doesn't try to replace argument uses, although it could. We
could also handle cases where we can insert a simple cast to an
original return value. I didn't think too hard about where to put this
in the default pass order. In many cases it obviates the need for most
of the CFG folds, but I've left it near the end initially.

I also think this is too aggressive about removing dead code, and
should leave existing dead code alone. I'm also not sure why we have
both "removeUnreachableBlocks" and EliminateUnreachableBlocks" in Utils.

Fixes #66039, fixes #107327

@arsenm arsenm marked this pull request as ready for review March 24, 2025 07:36
@llvmbot
Copy link
Member

llvmbot commented Mar 24, 2025

@llvm/pr-subscribers-llvm-ir

Author: Matt Arsenault (arsenm)

Changes

In void functions, try to replace instruction uses
with a new non-void return. If the return type matches
the instruction, also try to directly return it.

This handles most of the cases, but doesn't try to handle
all of the weird exception related terminators.

Also doesn't try to replace argument uses, although it could. We
could also handle cases where we can insert a simple cast to an
original return value. I didn't think too hard about where to put this
in the default pass order. In many cases it obviates the need for most
of the CFG folds, but I've left it near the end initially.

I also think this is too aggressive about removing dead code, and
should leave existing dead code alone. I'm also not sure why we have
both "removeUnreachableBlocks" and EliminateUnreachableBlocks" in Utils.

Fixes #66039, fixes #107327


Patch is 43.88 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/132686.diff

10 Files Affected:

  • (modified) llvm/include/llvm/IR/Function.h (+8)
  • (modified) llvm/lib/IR/Function.cpp (+82)
  • (modified) llvm/test/tools/llvm-reduce/reduce-operands-fp.ll (+1-1)
  • (modified) llvm/test/tools/llvm-reduce/reduce-operands-int.ll (+1-1)
  • (modified) llvm/test/tools/llvm-reduce/reduce-operands-ptr.ll (+1-1)
  • (added) llvm/test/tools/llvm-reduce/reduce-values-to-return.ll (+959)
  • (modified) llvm/tools/llvm-reduce/CMakeLists.txt (+1)
  • (modified) llvm/tools/llvm-reduce/DeltaManager.cpp (+2)
  • (added) llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.cpp (+245)
  • (added) llvm/tools/llvm-reduce/deltas/ReduceValuesToReturn.h (+18)
diff --git a/llvm/include/llvm/IR/Function.h b/llvm/include/llvm/IR/Function.h
index f17b7140ca29c..6d4a53da7ff22 100644
--- a/llvm/include/llvm/IR/Function.h
+++ b/llvm/include/llvm/IR/Function.h
@@ -1048,6 +1048,14 @@ class LLVM_ABI Function : public GlobalObject, public ilist_node<Function> {
   void setValueSubclassDataBit(unsigned Bit, bool On);
 };
 
+namespace CallingConv {
+
+// TODO: Need similar function for support of argument in position. General
+// version on FunctionType + Attributes + CallingConv::ID?
+LLVM_READNONE
+bool supportsNonVoidReturnType(CallingConv::ID CC);
+} // namespace CallingConv
+
 /// Check whether null pointer dereferencing is considered undefined behavior
 /// for a given function or an address space.
 /// Null pointer access in non-zero address space is not considered undefined.
diff --git a/llvm/lib/IR/Function.cpp b/llvm/lib/IR/Function.cpp
index 3644fab913b10..79308e4787b67 100644
--- a/llvm/lib/IR/Function.cpp
+++ b/llvm/lib/IR/Function.cpp
@@ -1177,3 +1177,85 @@ bool llvm::NullPointerIsDefined(const Function *F, unsigned AS) {
 
   return false;
 }
+
+bool llvm::CallingConv::supportsNonVoidReturnType(CallingConv::ID CC) {
+  switch (CC) {
+  case CallingConv::C:
+  case CallingConv::Fast:
+  case CallingConv::Cold:
+  case CallingConv::GHC:
+  case CallingConv::HiPE:
+  case CallingConv::AnyReg:
+  case CallingConv::PreserveMost:
+  case CallingConv::PreserveAll:
+  case CallingConv::Swift:
+  case CallingConv::CXX_FAST_TLS:
+  case CallingConv::Tail:
+  case CallingConv::CFGuard_Check:
+  case CallingConv::SwiftTail:
+  case CallingConv::PreserveNone:
+  case CallingConv::X86_StdCall:
+  case CallingConv::X86_FastCall:
+  case CallingConv::ARM_APCS:
+  case CallingConv::ARM_AAPCS:
+  case CallingConv::ARM_AAPCS_VFP:
+  case CallingConv::MSP430_INTR:
+  case CallingConv::X86_ThisCall:
+  case CallingConv::PTX_Device:
+  case CallingConv::SPIR_FUNC:
+  case CallingConv::Intel_OCL_BI:
+  case CallingConv::X86_64_SysV:
+  case CallingConv::Win64:
+  case CallingConv::X86_VectorCall:
+  case CallingConv::DUMMY_HHVM:
+  case CallingConv::DUMMY_HHVM_C:
+  case CallingConv::X86_INTR:
+  case CallingConv::AVR_INTR:
+  case CallingConv::AVR_SIGNAL:
+  case CallingConv::AVR_BUILTIN:
+    return true;
+  case CallingConv::AMDGPU_KERNEL:
+  case CallingConv::SPIR_KERNEL:
+  case CallingConv::AMDGPU_CS_Chain:
+  case CallingConv::AMDGPU_CS_ChainPreserve:
+    return false;
+  case CallingConv::AMDGPU_VS:
+  case CallingConv::AMDGPU_HS:
+  case CallingConv::AMDGPU_GS:
+  case CallingConv::AMDGPU_PS:
+  case CallingConv::AMDGPU_CS:
+  case CallingConv::AMDGPU_LS:
+  case CallingConv::AMDGPU_ES:
+  case CallingConv::MSP430_BUILTIN:
+  case CallingConv::AArch64_VectorCall:
+  case CallingConv::AArch64_SVE_VectorCall:
+  case CallingConv::WASM_EmscriptenInvoke:
+  case CallingConv::AMDGPU_Gfx:
+  case CallingConv::M68k_INTR:
+  case CallingConv::AArch64_SME_ABI_Support_Routines_PreserveMost_From_X0:
+  case CallingConv::AArch64_SME_ABI_Support_Routines_PreserveMost_From_X2:
+  case CallingConv::M68k_RTD:
+  case CallingConv::GRAAL:
+  case CallingConv::ARM64EC_Thunk_X64:
+  case CallingConv::ARM64EC_Thunk_Native:
+  case CallingConv::RISCV_VectorCall:
+  case CallingConv::AArch64_SME_ABI_Support_Routines_PreserveMost_From_X1:
+  case CallingConv::RISCV_VLSCall_32:
+  case CallingConv::RISCV_VLSCall_64:
+  case CallingConv::RISCV_VLSCall_128:
+  case CallingConv::RISCV_VLSCall_256:
+  case CallingConv::RISCV_VLSCall_512:
+  case CallingConv::RISCV_VLSCall_1024:
+  case CallingConv::RISCV_VLSCall_2048:
+  case CallingConv::RISCV_VLSCall_4096:
+  case CallingConv::RISCV_VLSCall_8192:
+  case CallingConv::RISCV_VLSCall_16384:
+  case CallingConv::RISCV_VLSCall_32768:
+  case CallingConv::RISCV_VLSCall_65536:
+    return true;
+  default:
+    return false;
+  }
+
+  llvm_unreachable("covered callingconv switch");
+}
diff --git a/llvm/test/tools/llvm-reduce/reduce-operands-fp.ll b/llvm/test/tools/llvm-reduce/reduce-operands-fp.ll
index b547c819bf0de..e10a3f8c010ce 100644
--- a/llvm/test/tools/llvm-reduce/reduce-operands-fp.ll
+++ b/llvm/test/tools/llvm-reduce/reduce-operands-fp.ll
@@ -27,7 +27,7 @@
 ; CHECK-INTERESTINGNESS: = fadd <2 x float>
 ; CHECK-INTERESTINGNESS: = fadd <2 x float>
 
-; CHECK-LABEL: define void @foo(
+; CHECK-LABEL: define {{(void|<2 x float>)}} @foo(
 
 
 ; ONE: %fadd0 = fadd float %arg0, 1.000000e+00
diff --git a/llvm/test/tools/llvm-reduce/reduce-operands-int.ll b/llvm/test/tools/llvm-reduce/reduce-operands-int.ll
index 397a1595ca6b2..66ad30832aabc 100644
--- a/llvm/test/tools/llvm-reduce/reduce-operands-int.ll
+++ b/llvm/test/tools/llvm-reduce/reduce-operands-int.ll
@@ -22,7 +22,7 @@
 ; CHECK-INTERESTINGNESS: = add <2 x i32>
 ; CHECK-INTERESTINGNESS: = add <2 x i32>
 
-; CHECK-LABEL: define void @foo(
+; CHECK-LABEL: define {{(void|<2 x i32>)}} @foo(
 
 
 ; ONE: %add0 = add i32 %arg0, 1
diff --git a/llvm/test/tools/llvm-reduce/reduce-operands-ptr.ll b/llvm/test/tools/llvm-reduce/reduce-operands-ptr.ll
index 3e163b30f6b38..4669cf76074d4 100644
--- a/llvm/test/tools/llvm-reduce/reduce-operands-ptr.ll
+++ b/llvm/test/tools/llvm-reduce/reduce-operands-ptr.ll
@@ -8,7 +8,7 @@
 ; RUN: llvm-reduce --abort-on-invalid-reduction --test FileCheck --test-arg --check-prefixes=CHECK-INTERESTINGNESS --test-arg %s --test-arg --input-file %s -o %t
 ; RUN: FileCheck --check-prefixes=CHECK,ZERO %s < %t
 
-; CHECK-LABEL: define void @foo(
+; CHECK-LABEL: define {{(void|ptr)}} @foo(
 
 ; ONE: load i32, ptr %a0
 ; ONE: load i32, ptr @g
diff --git a/llvm/test/tools/llvm-reduce/reduce-values-to-return.ll b/llvm/test/tools/llvm-reduce/reduce-values-to-return.ll
new file mode 100644
index 0000000000000..0c36db8ebc278
--- /dev/null
+++ b/llvm/test/tools/llvm-reduce/reduce-values-to-return.ll
@@ -0,0 +1,959 @@
+; Test that llvm-reduce can move intermediate values by inserting
+; early returns
+;
+; RUN: llvm-reduce --abort-on-invalid-reduction --delta-passes=values-to-return --test FileCheck --test-arg --check-prefixes=CHECK,INTERESTING --test-arg %s --test-arg --input-file %s -o %t
+; RUN: FileCheck --check-prefixes=CHECK,RESULT %s < %t
+
+@gv = global i32 0, align 4
+@gv_struct = global { i32, float } zeroinitializer, align 4
+@gv_array = global [3 x i32] zeroinitializer, align 4
+@gv_empty_struct = global { } zeroinitializer, align 4
+
+; CHECK: @global.func.user = global ptr @store_instruction_to_return_with_uses
+@global.func.user = global ptr @store_instruction_to_return_with_uses
+
+; INTERESTING-LABEL: @store_instruction_to_return_with_uses(
+; INTERESTING-NEXT: = load
+
+; RESULT-LABEL: define i32 @store_instruction_to_return_with_uses(ptr %arg) {
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define void @store_instruction_to_return_with_uses(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: define void @user(
+; INTERESTING: call
+
+; RESULT-LABEL: define void @user(
+; RESULT-NEXT: call i32 @store_instruction_to_return_with_uses(ptr %a, ptr %b)
+; RESULT-NEXT: ret void
+; RESULT-NEXT: }
+define void @user(ptr %a, ptr %b) {
+  call void @store_instruction_to_return_with_uses(ptr %a, ptr %b)
+  ret void
+}
+
+; INTERESTING-LABEL: @store_instruction_to_return_no_uses(
+; INTERESTING: = load i32
+
+; RESULT-LABEL: define i32 @store_instruction_to_return_no_uses(
+; RESULT-NEXT: %load = load i32
+; RESULT-NEXT: ret i32 %load
+define void @store_instruction_to_return_no_uses(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @store_instruction_to_return_preserve_attrs(
+; INTERESTING: = load
+
+; RESULT: ; Function Attrs: nounwind
+; RESULT-NEXT: define weak i32 @store_instruction_to_return_preserve_attrs(ptr byref(i32) %arg) #0 {
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define weak void @store_instruction_to_return_preserve_attrs(ptr byref(i32) %arg) nounwind "some-attr" {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @store_instruction_to_return_preserve_addrspace(
+; INTERESTING: = load
+
+; RESULT-LABEL: define i32 @store_instruction_to_return_preserve_addrspace(ptr %arg) addrspace(1) {
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define void @store_instruction_to_return_preserve_addrspace(ptr %arg) addrspace(1) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @store_instruction_to_return_no_uses_unreachable(
+; INTERESTING: = load
+
+; RESULT-LABEL: define i32 @store_instruction_to_return_no_uses_unreachable(ptr %arg) {
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define void @store_instruction_to_return_no_uses_unreachable(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  unreachable
+}
+
+; INTERESTING-LABEL: @store_instruction_to_return_with_non_callee_use(
+; INTERESTING: = load
+
+; RESULT-LABEL: define i32 @store_instruction_to_return_with_non_callee_use(ptr %arg) {
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define void @store_instruction_to_return_with_non_callee_use(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret void
+}
+
+declare void @takes_fptr(ptr)
+
+; CHECK: @non_callee_user(
+; CHECK: ret void
+define void @non_callee_user(ptr %a, ptr %b) {
+  call void @takes_fptr(ptr @store_instruction_to_return_with_non_callee_use)
+  ret void
+}
+
+declare i32 @convergent_call() convergent
+
+; CHECK-LABEL: @no_return_token_def(
+; CHECK: call token
+; RESULT: ret void
+define void @no_return_token_def(ptr %arg) convergent {
+  %t = call token @llvm.experimental.convergence.entry()
+  ret void
+}
+
+; INTERESTING-LABEL: @no_return_token_def_other(
+; INTERESTING: call token
+
+; RESULT-LABEL: define i32 @no_return_token_def_other(
+; RESULT: call token
+; RESULT: call i32
+; RESULT: ret i32
+define void @no_return_token_def_other(ptr %arg) convergent {
+  %t = call token @llvm.experimental.convergence.entry()
+  %call = call i32 @convergent_call() [ "convergencectrl"(token %t) ]
+  store i32 %call, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @store_instruction_to_return_variadic_func(
+; INTERESTING: = load
+
+; RESULT-LABEL: define i32 @store_instruction_to_return_variadic_func(ptr %arg, ...)
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define void @store_instruction_to_return_variadic_func(ptr %arg, ...) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret void
+}
+
+; Has a callsite use that is invoking the function with a non-void
+; return type, that does not match the new return type.
+
+; INTERESTING-LABEL: @inst_to_return_has_nonvoid_wrong_type_caller(
+
+; RESULT-LABEL: define void @inst_to_return_has_nonvoid_wrong_type_caller(
+; RESULT-NEXT: %load = load i32, ptr %arg
+; RESULT-NEXT: store i32 %load, ptr @gv
+; RESULT-NEXT: ret void
+define void @inst_to_return_has_nonvoid_wrong_type_caller(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @wrong_callsite_return_type(
+
+; RESULT-LABEL: define i64 @wrong_callsite_return_type(
+; RESULT-NEXT: %ret = call i64 @inst_to_return_has_nonvoid_wrong_type_caller(ptr %arg)
+; RESULT-NEXT: ret i64 %ret
+define i64 @wrong_callsite_return_type(ptr %arg) {
+  %ret = call i64 @inst_to_return_has_nonvoid_wrong_type_caller(ptr %arg)
+  ret i64 %ret
+}
+
+; INTERESTING-LABEL: @inst_to_return_already_has_new_type_caller(
+
+; RESULT-LABEL: define i32 @inst_to_return_already_has_new_type_caller(
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define void @inst_to_return_already_has_new_type_caller(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret void
+}
+
+; Callsite has UB signature mismatch, but the return type happens to
+; match the new return type.
+;
+; INTERESTING-LABEL: @callsite_already_new_return_type(
+
+; RESULT-LABEL: define i32 @callsite_already_new_return_type(
+; RESULT-NEXT: %ret = call i32 @inst_to_return_already_has_new_type_caller(ptr %arg)
+; RESULT-NEXT: ret i32 %ret
+define i32 @callsite_already_new_return_type(ptr %arg) {
+  %ret = call i32 @inst_to_return_already_has_new_type_caller(ptr %arg)
+  ret i32 %ret
+}
+
+; INTERESTING-LABEL: @non_void_no_op(
+; INTERESTING: = load
+; INTERESTING: ret
+
+; RESULT-LABEL: define ptr @non_void_no_op(
+; RESULT: ret ptr null
+define ptr @non_void_no_op(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret ptr null
+}
+
+; INTERESTING-LABEL: @non_void_same_type_use(
+; INTERESTING: = load
+; INTERESTING: ret
+
+; RESULT-LABEL: define i32 @non_void_same_type_use(
+; RESULT-NEXT: %load = load i32, ptr %arg
+; RESULT-NEXT: ret i32 %load
+define i32 @non_void_same_type_use(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret i32 0
+}
+
+; INTERESTING-LABEL: @non_void_bitcastable_type_use(
+; INTERESTING: = load
+; INTERESTING: ret
+
+; RESULT-LABEL: define i32 @non_void_bitcastable_type_use(
+; RESULT-NEXT: %load = load float, ptr %arg
+; RESULT-NEXT: store float %load,
+; RESULT-NEXT: ret i32 0
+define i32 @non_void_bitcastable_type_use(ptr %arg) {
+  %load = load float, ptr %arg
+  store float %load, ptr @gv
+  ret i32 0
+}
+
+; INTERESTING-LABEL: @form_return_struct(
+; INTERESTING: = load { i32, float }
+
+; RESULT-LABEL: define { i32, float } @form_return_struct(ptr %arg) {
+; RESULT-NEXT: %load = load { i32, float }, ptr %arg, align 4
+; RESULT-NEXT: ret { i32, float } %load
+define void @form_return_struct(ptr %arg) {
+  %load = load { i32, float }, ptr %arg
+  store { i32, float } %load, ptr @gv_struct
+  ret void
+}
+
+; INTERESTING-LABEL: define void @return_struct_user(
+; INTERESTING-NEXT: call
+; RESULT: call { i32, float } @form_return_struct(ptr %arg)
+define void @return_struct_user(ptr %arg) {
+  call void @form_return_struct(ptr %arg)
+  ret void
+}
+
+; INTERESTING-LABEL: @form_return_array(
+; INTERESTING: = load
+
+; RESULT-LABEL: define [3 x i32] @form_return_array(
+; RESULT-NEXT: %load = load [3 x i32]
+; RESULT-NEXT: ret [3 x i32] %load
+define void @form_return_array(ptr %arg) {
+  %load = load [3 x i32], ptr %arg
+  store [3 x i32] %load, ptr @gv_array
+  ret void
+}
+
+; CHECK-LABEL: @return_array_user(
+; RESULT: call [3 x i32] @form_return_array(ptr %arg)
+define void @return_array_user(ptr %arg) {
+  call void @form_return_array(ptr %arg)
+  ret void
+}
+
+; INTERESTING-LABEL: @form_return_empty_struct(
+; INTERESTING: = load
+
+; RESULT: define {} @form_return_empty_struct(
+; RESULT-NEXT: %load = load {}
+; RESULT-NEXT: ret {} %load
+define void @form_return_empty_struct(ptr %arg) {
+  %load = load { }, ptr %arg
+  store { } %load, ptr @gv_empty_struct
+  ret void
+}
+
+; CHECK-LABEL: define void @return_empty_struct_user(
+; RESULT: call {} @form_return_empty_struct(ptr %arg)
+define void @return_empty_struct_user(ptr %arg) {
+  call void @form_return_empty_struct(ptr %arg)
+  ret void
+}
+
+define target("sometarget.sometype") @target_type_func() {
+  ret target("sometarget.sometype") poison
+}
+
+define void @target_type_user(target("sometarget.sometype") %a) {
+  ret void
+}
+
+; INTERESTING-LABEL: @form_return_target_ty(
+; INTERESTING: call target("sometarget.sometype") @target_type_func()
+
+; RESULT: define target("sometarget.sometype") @form_return_target_ty(
+; RESULT-NEXT: %call = call target("sometarget.sometype") @target_type_func()
+; RESULT-NEXT:  ret target("sometarget.sometype") %call
+define void @form_return_target_ty(ptr %arg) {
+  %call = call target("sometarget.sometype") @target_type_func()
+  call void @target_type_user(target("sometarget.sometype") %call)
+  ret void
+}
+
+; CHECK-LABEL: define void @return_target_ty_user(
+; RESULT-NEXT: %1 = call target("sometarget.sometype") @form_return_target_ty(ptr %arg)
+; RESULT-NEXT: ret void
+define void @return_target_ty_user(ptr %arg) {
+  call void @form_return_target_ty(ptr %arg)
+  ret void
+}
+
+; Make sure an invalid reduction isn't attempted for a function with
+; an sret argument
+
+; CHECK-LABEL: @no_sret_nonvoid_return
+define void @no_sret_nonvoid_return(ptr sret(i32) %out.sret, ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr %out.sret
+  ret void
+}
+
+; Test a calling convention where it's illegal to use a non-void
+; return. No invalid reduction should be introduced.
+
+; INTERESTING-LABEL: @no_void_return_callingconv(
+; INTERESTING: = load i32
+
+; RESULT-LABEL: define amdgpu_kernel void @no_void_return_callingconv(
+; RESULT-NEXT: %load = load i32
+; RESULT-NEXT: store i32 %load
+; RESULT-NEXT: ret void
+define amdgpu_kernel void @no_void_return_callingconv(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @keep_first_of_3(
+; INTERESTING: %load0 = load i32, ptr %arg0
+; INTERESTING: ret
+
+; RESULT-LABEL: define i32 @keep_first_of_3(
+; RESULT-NEXT: %load0 = load i32, ptr %arg0, align 4
+; RESULT-NEXT: ret i32 %load0
+define void @keep_first_of_3(ptr %arg0, ptr %arg1, ptr %arg2) {
+  %load0 = load i32, ptr %arg0
+  %load1 = load i32, ptr %arg1
+  %load2 = load i32, ptr %arg2
+  store i32 %load0, ptr @gv
+  store i32 %load1, ptr @gv
+  store i32 %load2, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @keep_second_of_3(
+; INTERESTING: %load1 = load i32, ptr %arg1
+
+; RESULT-LABEL: define i32 @keep_second_of_3(
+; RESULT-NEXT: %load0 = load i32, ptr %arg0
+; RESULT-NEXT: %load1 = load i32, ptr %arg1
+; RESULT-NEXT: ret i32 %load1
+define void @keep_second_of_3(ptr %arg0, ptr %arg1, ptr %arg2) {
+  %load0 = load i32, ptr %arg0
+  %load1 = load i32, ptr %arg1
+  %load2 = load i32, ptr %arg2
+  store i32 %load0, ptr @gv
+  store i32 %load1, ptr @gv
+  store i32 %load2, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @keep_third_of_3(
+; INTERESTING: %load2 = load i32, ptr %arg2
+
+; RESULT-LABEL: define i32 @keep_third_of_3(
+; RESULT-NEXT: %load0 = load i32, ptr %arg0, align 4
+; RESULT-NEXT: %load1 = load i32, ptr %arg1, align 4
+; RESULT-NEXT: %load2 = load i32, ptr %arg2, align 4
+; RESULT-NEXT: ret i32 %load2
+define void @keep_third_of_3(ptr %arg0, ptr %arg1, ptr %arg2) {
+  %load0 = load i32, ptr %arg0
+  %load1 = load i32, ptr %arg1
+  %load2 = load i32, ptr %arg2
+  store i32 %load0, ptr @gv
+  store i32 %load1, ptr @gv
+  store i32 %load2, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @keep_first_2_of_3(
+; INTERESTING: %load0 = load i32, ptr %arg0
+; INTERESTING: %load1 = load i32, ptr %arg1
+
+; RESULT-LABEL: define i32 @keep_first_2_of_3(
+; RESULT-NEXT: %load0 = load i32, ptr %arg0
+; RESULT-NEXT: %load1 = load i32, ptr %arg1
+; RESULT-NEXT: ret i32 %load1
+define void @keep_first_2_of_3(ptr %arg0, ptr %arg1, ptr %arg2) {
+  %load0 = load i32, ptr %arg0
+  %load1 = load i32, ptr %arg1
+  %load2 = load i32, ptr %arg2
+  store i32 %load0, ptr @gv
+  store i32 %load1, ptr @gv
+  store i32 %load2, ptr @gv
+  ret void
+}
+
+; INTERESTING-LABEL: @keep_second_of_3_already_ret_constexpr(
+; INTERESTING: %load1 = load i32, ptr %arg1
+; INTERESTING: ret
+
+; RESULT-LABEL: define i32 @keep_second_of_3_already_ret_constexpr(
+; RESULT-NEXT: %load0 = load i32, ptr %arg0, align 4
+; RESULT-NEXT: %load1 = load i32, ptr %arg1, align 4
+; RESULT-NEXT: ret i32 %load1
+define i32 @keep_second_of_3_already_ret_constexpr(ptr %arg0, ptr %arg1, ptr %arg2) {
+  %load0 = load i32, ptr %arg0
+  %load1 = load i32, ptr %arg1
+  %load2 = load i32, ptr %arg2
+  store i32 %load0, ptr @gv
+  store i32 %load1, ptr @gv
+  store i32 %load2, ptr @gv
+  ret i32 ptrtoint (ptr @gv to i32)
+}
+
+; INTERESTING-LABEL: @self_recursive(
+; INTERESTING:  %load = load i32, ptr %arg
+
+; RESULT-LABEL: define i32 @self_recursive(
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define void @self_recursive(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @gv
+  call void @self_recursive(ptr %arg)
+  ret void
+}
+
+; INTERESTING-LABEL: @has_invoke_user(
+
+; RESULT-LABEL: define i32 @has_invoke_user(
+; RESULT-NEXT: %load = load i32, ptr %arg, align 4
+; RESULT-NEXT: ret i32 %load
+define void @has_invoke_user(ptr %arg) {
+  %load = load i32, ptr %arg
+  store i32 %load, ptr @g...
[truncated]

@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch from b440a43 to d9602fc Compare March 24, 2025 07:45
@arsenm
Copy link
Contributor Author

arsenm commented Mar 25, 2025

Hit an assert in a real program, will need to fix first

@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch from d9602fc to c658ad3 Compare March 25, 2025 05:55
@regehr
Copy link
Contributor

regehr commented Mar 25, 2025

super happy to see this pass!!!! it's one of the reductions I most often see llvm-reduce missing

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, that's super useful!

@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch 2 times, most recently from 298d6ee to f496891 Compare March 30, 2025 07:41
@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch from d1a9a08 to f66732b Compare April 2, 2025 06:24
@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch from f66732b to 976e948 Compare April 2, 2025 07:51
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like

case CallingConv::RISCV_VLSCall_16384:
case CallingConv::RISCV_VLSCall_32768:
case CallingConv::RISCV_VLSCall_65536:
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are there two separate lists with return true here? What's the difference between them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to keep this sorted by enum value

// that phi.
// TODO: Could avoid this with fancier iterator management.
for (BasicBlock *Succ : successors(NewRetBlock))
Succ->removePredecessor(NewRetBlock, /*KeepOneInputPHIs=*/true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of general approach, maybe it would be simpler to first call splitBasicBlock(), replace the terminator with return, and then leave everything else to EliminateUnreachableBlocks? I think that should also automatically handle the unusual terminators in a sensible way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about doing that, but was going to leave it for a later patch to handle the weird cases

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #134794

for (Instruction &I : BB) {
if (shouldForwardValueToReturn(BB, &I, RetTy) && !O.shouldKeep()) {
FuncsToReplace.emplace_back(&F, &I);
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, does the oracle properly handle this kind of early return?

Basically if O.shouldKeep() return false and now return true, we'll now perform an extra query (for the next instruction in the block), effectively shifting all the indices?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It kind of has to, or none of the reductions that have side effects on other code would work. The chunks get recounted after each step

@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch from 976e948 to 655b6d8 Compare April 8, 2025 06:47
@arsenm
Copy link
Contributor Author

arsenm commented Apr 13, 2025

ping

@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch 2 times, most recently from 256d7cd to dba33af Compare April 25, 2025 14:50
Copy link
Contributor Author

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@regehr
Copy link
Contributor

regehr commented Apr 28, 2025

well, this ended up being more complicated than I thought it would-- thank you for tackling it!

this LGTM, my view is that since llvm-reduce isn't on the critical path for mainstream LLVM development it's fine to try out stuff here and fix it later if needed

@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch from dba33af to c1d5c08 Compare May 2, 2025 10:09
@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch from c1d5c08 to 754dc68 Compare May 2, 2025 11:52
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor Author

arsenm commented May 2, 2025

Merge activity

  • May 2, 10:06 AM EDT: A user started a stack merge that includes this pull request via Graphite.
  • May 2, 10:09 AM EDT: Graphite rebased this pull request as part of a merge.
  • May 2, 10:11 AM EDT: @arsenm merged this pull request with Graphite.

arsenm added 10 commits May 2, 2025 14:08
In void functions, try to replace instruction uses
with a new non-void return. If the return type matches
the instruction, also try to directly return it.

This handles most of the cases, but doesn't try to handle
all of the weird exception related terminators.

Also doesn't try to replace argument uses, although it could. We
could also handle cases where we can insert a simple cast to an
original return value. I didn't think too hard about where to put this
in the default pass order. In many cases it obviates the need for most
of the CFG folds, but I've left it near the end initially.

I also think this is too aggressive about removing dead code, and
should leave existing dead code alone. I'm also not sure why we have
both "removeUnreachableBlocks" and EliminateUnreachableBlocks" in Utils.

Fixes #66039, fixes #107327
@arsenm arsenm force-pushed the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch from 754dc68 to f2584ca Compare May 2, 2025 14:08
@arsenm arsenm merged commit e665925 into main May 2, 2025
6 of 9 checks passed
@arsenm arsenm deleted the users/arsenm/llvm-reduce/add-reduce-values-to-return-reduction branch May 2, 2025 14:11
@llvm-ci
Copy link
Collaborator

llvm-ci commented May 2, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-aarch64-darwin running on doug-worker-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/19376

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'Clang-Unit :: ./AllClangUnitTests/9/48' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/Users/buildbot/buildbot-root/aarch64-darwin/build/tools/clang/unittests/./AllClangUnitTests-Clang-Unit-22564-9-48.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=48 GTEST_SHARD_INDEX=9 /Users/buildbot/buildbot-root/aarch64-darwin/build/tools/clang/unittests/./AllClangUnitTests
--

Script:
--
/Users/buildbot/buildbot-root/aarch64-darwin/build/tools/clang/unittests/./AllClangUnitTests --gtest_filter=TimeProfilerTest.ConstantEvaluationC99
--
/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/unittests/Support/TimeProfilerTest.cpp:349: Failure
Expected equality of these values:
  R"(
Frontend (test.c)
| ParseDeclarationOrFunctionDefinition (test.c:2:1)
| | isIntegerConstantExpr (<test.c:3:18>)
| | EvaluateKnownConstIntCheckOverflow (<test.c:3:18>)
| PerformPendingInstantiations
)"
    Which is: "\nFrontend (test.c)\n| ParseDeclarationOrFunctionDefinition (test.c:2:1)\n| | isIntegerConstantExpr (<test.c:3:18>)\n| | EvaluateKnownConstIntCheckOverflow (<test.c:3:18>)\n| PerformPendingInstantiations\n"
  buildTraceGraph(Json)
    Which is: "\nFrontend (test.c)\n| ParseDeclarationOrFunctionDefinition (test.c:2:1)\n| | isIntegerConstantExpr (<test.c:3:18>)\n| | EvaluateKnownConstIntCheckOverflow (<test.c:3:18>)\n| | PerformPendingInstantiations\n"
With diff:
@@ -4,3 +4,3 @@
 | | isIntegerConstantExpr (<test.c:3:18>)
 | | EvaluateKnownConstIntCheckOverflow (<test.c:3:18>)
-| PerformPendingInstantiations\n
+| | PerformPendingInstantiations\n



/Users/buildbot/buildbot-root/aarch64-darwin/llvm-project/clang/unittests/Support/TimeProfilerTest.cpp:349
Expected equality of these values:
  R"(
Frontend (test.c)
| ParseDeclarationOrFunctionDefinition (test.c:2:1)
| | isIntegerConstantExpr (<test.c:3:18>)
| | EvaluateKnownConstIntCheckOverflow (<test.c:3:18>)
| PerformPendingInstantiations
)"
    Which is: "\nFrontend (test.c)\n| ParseDeclarationOrFunctionDefinition (test.c:2:1)\n| | isIntegerConstantExpr (<test.c:3:18>)\n| | EvaluateKnownConstIntCheckOverflow (<test.c:3:18>)\n| PerformPendingInstantiations\n"
  buildTraceGraph(Json)
    Which is: "\nFrontend (test.c)\n| ParseDeclarationOrFunctionDefinition (test.c:2:1)\n| | isIntegerConstantExpr (<test.c:3:18>)\n| | EvaluateKnownConstIntCheckOverflow (<test.c:3:18>)\n| | PerformPendingInstantiations\n"
With diff:
@@ -4,3 +4,3 @@
 | | isIntegerConstantExpr (<test.c:3:18>)
 | | EvaluateKnownConstIntCheckOverflow (<test.c:3:18>)
-| PerformPendingInstantiations\n
+| | PerformPendingInstantiations\n

...

@llvm-ci
Copy link
Collaborator

llvm-ci commented May 3, 2025

LLVM Buildbot has detected a new failure on builder lld-x86_64-win running on as-worker-93 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/2833

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/90/95' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-19844-90-95.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=95 GTEST_SHARD_INDEX=90 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied




********************


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[llvm-reduce] Return a program value instead of void llvm-reduce should try to return intermediate values
6 participants