[AMDGPU] remove move instruction if there is no user of it #136735

BaoshanPang · 2025-04-22T17:56:33Z

No description provided.

llvmbot · 2025-04-22T17:57:06Z

@llvm/pr-subscribers-backend-amdgpu

Author: Baoshan (BaoshanPang)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/136735.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp (+1-1)
(modified) llvm/test/CodeGen/AMDGPU/v_swap_b32.mir (+1-8)

diff --git a/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp b/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
index 73343e1c80f33..be07e88e87851 100644
--- a/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
+++ b/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
@@ -797,7 +797,7 @@ MachineInstr *SIShrinkInstructions::matchSwap(MachineInstr &MovT) const {
     dropInstructionKeepingImpDefs(*MovY);
     MachineInstr *Next = &*std::next(MovT.getIterator());
 
-    if (T.isVirtual() && MRI->use_nodbg_empty(T)) {
+    if (MRI->use_nodbg_empty(T)) {
       dropInstructionKeepingImpDefs(MovT);
     } else {
       Xop.setIsKill(false);
diff --git a/llvm/test/CodeGen/AMDGPU/v_swap_b32.mir b/llvm/test/CodeGen/AMDGPU/v_swap_b32.mir
index 95aaea6ea8091..5cd395fb18074 100644
--- a/llvm/test/CodeGen/AMDGPU/v_swap_b32.mir
+++ b/llvm/test/CodeGen/AMDGPU/v_swap_b32.mir
@@ -1,11 +1,9 @@
 # RUN: llc -simplify-mir -mtriple=amdgcn -mcpu=gfx900 -run-pass=si-shrink-instructions -verify-machineinstrs %s -o - | FileCheck -check-prefix=GCN %s
-# RUN: llc -simplify-mir -mtriple=amdgcn -mcpu=gfx900 -passes=si-shrink-instructions -verify-machineinstrs %s -o - | FileCheck -check-prefix=GCN %s
 
 # GCN-LABEL: name: swap_phys_condensed
 # GCN: bb.0:
 # GCN-NEXT: liveins:
 # GCN-NEXT: {{^[ ]*$}}
-# GCN-NEXT: $vgpr2 = V_MOV_B32_e32 $vgpr0, implicit $exec
 # GCN-NEXT: $vgpr0, $vgpr1 = V_SWAP_B32 $vgpr1, $vgpr0, implicit $exec
 # GCN-NEXT: S_SETPC_B64_return
 ---
@@ -24,7 +22,6 @@ body:             |
 # GCN: bb.0:
 # GCN-NEXT: liveins:
 # GCN-NEXT: {{^[ ]*$}}
-# GCN-NEXT: $vgpr2 = V_MOV_B32_e32 $vgpr0, implicit $exec
 # GCN-NEXT: $vgpr3 = V_MOV_B32_e32 killed $vgpr4, implicit $exec
 # GCN-NEXT: $vgpr0, $vgpr1 = V_SWAP_B32 $vgpr1, $vgpr0, implicit $exec
 # GCN-NEXT: $vgpr5 = V_MOV_B32_e32 killed $vgpr6, implicit $exec
@@ -47,7 +44,6 @@ body:             |
 # GCN: bb.0:
 # GCN-NEXT: liveins:
 # GCN-NEXT: {{^[ ]*$}}
-# GCN-NEXT: $vgpr2 = V_MOV_B32_e32 $vgpr0, implicit $exec
 # GCN-NEXT: $vgpr0, $vgpr1 = V_SWAP_B32 $vgpr1, $vgpr0, implicit $exec
 # GCN-NEXT: S_SETPC_B64_return
 ---
@@ -66,7 +62,6 @@ body:             |
 # GCN: bb.0:
 # GCN-NEXT: liveins:
 # GCN-NEXT: {{^[ ]*$}}
-# GCN-NEXT: $vgpr2 = V_MOV_B32_e32 $vgpr0, implicit $exec
 # GCN-NEXT: $vgpr0, $vgpr1 = V_SWAP_B32 $vgpr1, $vgpr0, implicit $exec
 # GCN-NEXT: S_SETPC_B64_return
 ---
@@ -85,7 +80,6 @@ body:             |
 # GCN: bb.0:
 # GCN-NEXT: liveins:
 # GCN-NEXT: {{^[ ]*$}}
-# GCN-NEXT: $vgpr4_vgpr5 = COPY $vgpr0_vgpr1
 # GCN-NEXT: $vgpr0, $vgpr2 = V_SWAP_B32 $vgpr2, $vgpr0, implicit $exec
 # GCN-NEXT: $vgpr1, $vgpr3 = V_SWAP_B32 $vgpr3, $vgpr1, implicit $exec
 ---
@@ -936,8 +930,7 @@ body:             |
 ...
 
 # GCN-LABEL: implicit_ops_mov_t_swap_b32
-# GCN:      $vgpr3 = V_MOV_B32_e32 $vgpr0, implicit $exec, implicit $vgpr2, implicit killed $vgpr1_vgpr2, implicit-def $vgpr1
-# GCN-NEXT: $vgpr0, $vgpr1 = V_SWAP_B32 $vgpr1, $vgpr0, implicit $exec
+# GCN: $vgpr0, $vgpr1 = V_SWAP_B32 $vgpr1, $vgpr0, implicit $exec
 
 ---
 name:            implicit_ops_mov_t_swap_b32

arsenm

Do you have an end to end change where this makes an observable difference? That would be a useful testcase.

arsenm

Technically this should avoid touching a reserved physical register. I also have a long term desire to remove physical register use lists

arsenm · 2025-04-22T18:13:22Z

Also should fix the description, it's not obvious what movt is

BaoshanPang · 2025-04-22T18:46:46Z

I don't know how to create a end to end test case yet . Is there an example I can check with?

This one would be difficult. You could try to craft some inline assembly where a swap could form in the middle

BaoshanPang · 2025-04-23T03:38:38Z

I don't know how to create a end to end test case yet . Is there an example I can check with?

This one would be difficult. You could try to craft some inline assembly where a swap could form in the middle

For inline assembly, it seems this pass can't do any optimization to them. also I can't find a way that can generate such swap sequence from C code.

arsenm · 2025-04-23T21:33:14Z

For inline assembly, it seems this pass can't do any optimization to them.

Yes, that's a plus. The point isn't to optimize the asm but the copies to satisfy the constraints. I tried something like

; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -o - < %s

define void @swap() {
  %v0 = call i32 asm "v_mov_b32 $0, 0", "={v0}"()
  %v1 = call i32 asm "v_mov_b32 $0, 1", "={v1}"()
  call void asm "; use $0","{v0},{v1}"(i32 %v1, i32 %v0)
  ret void
}

But we don't end up with a swap and use a temporary register:

swap:                                   ; @swap
; %bb.0:
	s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
	;;#ASMSTART
	v_mov_b32 v0, 0
	;;#ASMEND
	;;#ASMSTART
	v_mov_b32 v1, 1
	;;#ASMEND
	s_nop 0
	v_mov_b32_e32 v2, v0
	v_mov_b32_e32 v0, v1
	v_mov_b32_e32 v1, v2
	;;#ASMSTART
	; use v0
	;;#ASMEND
	s_setpc_b64 s[30:31]

arsenm

Can you fix the description to tag amdgpu and be more specific about what this is doing

For instruction sequence: move t, x move x, y mov y, t Enhance matchSwap so that 'move t, x' can be removed.

BaoshanPang · 2025-04-25T23:05:17Z

Can you fix the description to tag amdgpu and be more specific about what this is doing

Changed it.

llvmbot added the backend:AMDGPU label Apr 22, 2025

arsenm reviewed Apr 22, 2025

View reviewed changes

arsenm approved these changes Apr 22, 2025

View reviewed changes

arsenm reviewed Apr 25, 2025

View reviewed changes

[AMDGPU] remove move instruction if there is no user of it

ede74d5

For instruction sequence: move t, x move x, y mov y, t Enhance matchSwap so that 'move t, x' can be removed.

BaoshanPang force-pushed the remove_movt branch from e83489a to ede74d5 Compare April 25, 2025 23:03

BaoshanPang changed the title ~~remove 'movt' if there is no user of it~~ [AMDGPU] remove move instruction if there is no user of it Apr 25, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AMDGPU] remove move instruction if there is no user of it #136735

[AMDGPU] remove move instruction if there is no user of it #136735

BaoshanPang commented Apr 22, 2025

llvmbot commented Apr 22, 2025

arsenm left a comment

arsenm left a comment

arsenm commented Apr 22, 2025

BaoshanPang commented Apr 22, 2025 •

edited by arsenm

Loading

BaoshanPang commented Apr 23, 2025

arsenm commented Apr 23, 2025

arsenm left a comment

BaoshanPang commented Apr 25, 2025

[AMDGPU] remove move instruction if there is no user of it #136735

Are you sure you want to change the base?

[AMDGPU] remove move instruction if there is no user of it #136735

Conversation

BaoshanPang commented Apr 22, 2025

llvmbot commented Apr 22, 2025

arsenm left a comment

Choose a reason for hiding this comment

arsenm left a comment

Choose a reason for hiding this comment

arsenm commented Apr 22, 2025

BaoshanPang commented Apr 22, 2025 • edited by arsenm Loading

BaoshanPang commented Apr 23, 2025

arsenm commented Apr 23, 2025

arsenm left a comment

Choose a reason for hiding this comment

BaoshanPang commented Apr 25, 2025

BaoshanPang commented Apr 22, 2025 •

edited by arsenm

Loading