Skip to content

[AMDGPU] fix up vop3p gisel errors #136262

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

Shoreshen
Copy link
Contributor

@Shoreshen Shoreshen commented Apr 18, 2025

This is a fix up for patch #130234, which is reverted in #136249

The main reason of building failure are:

  1. /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp: In function ‘llvm::SmallVector<std::pair<const llvm::MachineOperand*, SrcStatus> > getSrcStats(const llvm::MachineOperand*, const llvm::MachineRegisterInfo&, searchOptions, int)’:
    /home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4669: error: could not convert ‘Statlist’ from ‘SmallVector<[...],4>’ to ‘SmallVector<[...],3>’
     4669 |   return Statlist;
    
  2. /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4554:1: error: non-void function does not return a value in all control paths [-Werror,-Wreturn-type]
     4554 | }
         | ^
    /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4644:39: error: overlapping comparisons always evaluate to true [-Werror,-Wtautological-overlap-compare]
     4644 |         (Stat >= SrcStatus::NEG_START || Stat <= SrcStatus::NEG_END)) {
         |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4893:66: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
     4893 |         [=](MachineInstrBuilder &MIB) { MIB.addImm(getAllKindImm(Op)); },
         |                                                                  ^
    /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:9: note: 'Op' declared here
     4890 |   auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
         |         ^
    /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4894:52: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
     4894 |         [=](MachineInstrBuilder &MIB) { MIB.addImm(Mods); } // src_mods
         |                                                    ^
    /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:13: note: 'Mods' declared here
     4890 |   auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
         |             ^
    /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4899:50: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
     4899 |       [=](MachineInstrBuilder &MIB) { MIB.addReg(Op->getReg()); },
         |                                                  ^
    /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:9: note: 'Op' declared here
     4890 |   auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
         |         ^
    /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4900:50: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
     4900 |       [=](MachineInstrBuilder &MIB) { MIB.addImm(Mods); } // src_mods
         |                                                  ^
    /home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:13: note: 'Mods' declared here
     4890 |   auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
         |             ^
    6 errors generated.
    

Both error cannot be reproduced at my local machine, the fix applied are:

  1. In llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp function getSrcStats replace
    SmallVector<std::pair<const MachineOperand *, SrcStatus>, 4> Statlist;
    
    with
    SmallVector<std::pair<const MachineOperand *, SrcStatus>> Statlist;
    
  2. In llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp function AMDGPUInstructionSelector::selectVOP3PRetHelper replace
    auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
    
    with
    auto Results = selectVOP3PModsImpl(&Root, MRI, IsDOT);
    const MachineOperand *Op = Results.first;
    unsigned Mods = Results.second;
    

These change hasn't be testified since both errors cannot be reproduced in local

@llvmbot
Copy link
Member

llvmbot commented Apr 18, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: None (Shoreshen)

Changes

This is a fix up for patch #130234, which is reverted in #136249

The main reason of building failure are:
1.

/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp: In function ‘llvm::SmallVector&lt;std::pair&lt;const llvm::MachineOperand*, SrcStatus&gt; &gt; getSrcStats(const llvm::MachineOperand*, const llvm::MachineRegisterInfo&amp;, searchOptions, int)’:
/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4669: error: could not convert ‘Statlist’ from ‘SmallVector&lt;[...],4&gt;’ to ‘SmallVector&lt;[...],3&gt;’
4669 |   return Statlist;

/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4554:1: error: non-void function does not return a value in all control paths [-Werror,-Wreturn-type]
4554 | }
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4644:39: error: overlapping comparisons always evaluate to true [-Werror,-Wtautological-overlap-compare]
4644 | (Stat >= SrcStatus::NEG_START || Stat <= SrcStatus::NEG_END)) {
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4893:66: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
4893 | [=](MachineInstrBuilder &MIB) { MIB.addImm(getAllKindImm(Op)); },
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:9: note: 'Op' declared here
4890 | auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4894:52: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
4894 | [=](MachineInstrBuilder &MIB) { MIB.addImm(Mods); } // src_mods
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:13: note: 'Mods' declared here
4890 | auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4899:50: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
4899 | [=](MachineInstrBuilder &MIB) { MIB.addReg(Op->getReg()); },
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:9: note: 'Op' declared here
4890 | auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4900:50: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
4900 | [=](MachineInstrBuilder &MIB) { MIB.addImm(Mods); } // src_mods
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:13: note: 'Mods' declared here
4890 | auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
| ^
6 errors generated.


Both error cannot be reproduced at my local machine, the fix applied are:
1. In `llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp` function `getSrcStats` replace

SmallVector<std::pair<const MachineOperand *, SrcStatus>, 4> Statlist;

with

SmallVector<std::pair<const MachineOperand *, SrcStatus>> Statlist;

2. In `llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp` function `AMDGPUInstructionSelector::selectVOP3PRetHelper` replace

auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);

with

auto Results = selectVOP3PModsImpl(&Root, MRI, IsDOT);
const MachineOperand *Op = Results.first;
unsigned Mods = Results.second;


These change hasn't be testified since both errors cannot be reproduced in local

---

Patch is 38.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/136262.diff


7 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+573-33) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+4-2) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/fmul.v2f16.ll (+163) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.sdot2.ll (+4-8) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.udot2.ll (+6-12) 
- (modified) llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll (+5-9) 
- (modified) llvm/test/lit.cfg.py (+1-1) 


``````````diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
index 6ef7505ec6f62..1f9c40ad76d3f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
@@ -4319,60 +4319,600 @@ AMDGPUInstructionSelector::selectVOP3NoMods(MachineOperand &Root) const {
}};
}

-std::pair<Register, unsigned>
-AMDGPUInstructionSelector::selectVOP3PModsImpl(
-  Register Src, const MachineRegisterInfo &MRI, bool IsDOT) const {
+enum class SrcStatus {
+  IS_SAME,
+  IS_UPPER_HALF,
+  IS_LOWER_HALF,
+  IS_UPPER_HALF_NEG,
+  // This means current op = [op_upper, op_lower] and src = -op_lower.
+  IS_LOWER_HALF_NEG,
+  IS_HI_NEG,
+  // This means current op = [op_upper, op_lower] and src = [op_upper,
+  // -op_lower].
+  IS_LO_NEG,
+  IS_BOTH_NEG,
+  INVALID,
+  NEG_START = IS_UPPER_HALF_NEG,
+  NEG_END = IS_BOTH_NEG,
+  HALF_START = IS_UPPER_HALF,
+  HALF_END = IS_LOWER_HALF_NEG
+};
+
+static bool isTruncHalf(const MachineInstr *MI,
+                        const MachineRegisterInfo &MRI) {
+  if (MI->getOpcode() != AMDGPU::G_TRUNC)
+    return false;
+
+  unsigned DstSize = MRI.getType(MI->getOperand(0).getReg()).getSizeInBits();
+  unsigned SrcSize = MRI.getType(MI->getOperand(1).getReg()).getSizeInBits();
+  return DstSize * 2 == SrcSize;
+}
+
+static bool isLshrHalf(const MachineInstr *MI, const MachineRegisterInfo &MRI) {
+  if (MI->getOpcode() != AMDGPU::G_LSHR)
+    return false;
+
+  Register ShiftSrc;
+  std::optional<ValueAndVReg> ShiftAmt;
+  if (mi_match(MI->getOperand(0).getReg(), MRI,
+               m_GLShr(m_Reg(ShiftSrc), m_GCst(ShiftAmt)))) {
+    unsigned SrcSize = MRI.getType(MI->getOperand(1).getReg()).getSizeInBits();
+    unsigned Shift = ShiftAmt->Value.getZExtValue();
+    return Shift * 2 == SrcSize;
+  }
+  return false;
+}
+
+static bool isShlHalf(const MachineInstr *MI, const MachineRegisterInfo &MRI) {
+  if (MI->getOpcode() != AMDGPU::G_SHL)
+    return false;
+
+  Register ShiftSrc;
+  std::optional<ValueAndVReg> ShiftAmt;
+  if (mi_match(MI->getOperand(0).getReg(), MRI,
+               m_GShl(m_Reg(ShiftSrc), m_GCst(ShiftAmt)))) {
+    unsigned SrcSize = MRI.getType(MI->getOperand(1).getReg()).getSizeInBits();
+    unsigned Shift = ShiftAmt->Value.getZExtValue();
+    return Shift * 2 == SrcSize;
+  }
+  return false;
+}
+
+static std::optional<std::pair<const MachineOperand *, SrcStatus>>
+retOpStat(const MachineOperand *Op, SrcStatus Stat,
+          std::pair<const MachineOperand *, SrcStatus> &Curr) {
+  if (Stat != SrcStatus::INVALID &&
+      ((Op->isReg() && !(Op->getReg().isPhysical())) || Op->isImm() ||
+       Op->isCImm() || Op->isFPImm())) {
+    return std::optional<std::pair<const MachineOperand *, SrcStatus>>(
+        {Op, Stat});
+  }
+
+  return std::nullopt;
+}
+
+enum class TypeClass { VECTOR_OF_TWO, SCALAR, NONE_OF_LISTED };
+
+static TypeClass isVectorOfTwoOrScalar(const MachineOperand *Op,
+                                       const MachineRegisterInfo &MRI) {
+  if (!Op->isReg() || Op->getReg().isPhysical())
+    return TypeClass::NONE_OF_LISTED;
+  LLT OpTy = MRI.getType(Op->getReg());
+  if (OpTy.isScalar())
+    return TypeClass::SCALAR;
+  if (OpTy.isVector() && OpTy.getNumElements() == 2)
+    return TypeClass::VECTOR_OF_TWO;
+  return TypeClass::NONE_OF_LISTED;
+}
+
+static SrcStatus getNegStatus(const MachineOperand *Op, SrcStatus S,
+                              const MachineRegisterInfo &MRI) {
+  TypeClass NegType = isVectorOfTwoOrScalar(Op, MRI);
+  if (NegType != TypeClass::VECTOR_OF_TWO && NegType != TypeClass::SCALAR)
+    return SrcStatus::INVALID;
+
+  switch (S) {
+  case SrcStatus::IS_SAME:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // [SrcHi, SrcLo]   = [CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = neg [OpHi, OpLo](2 x Type)
+      // [CurrHi, CurrLo] = [-OpHi, -OpLo](2 x Type)
+      // [SrcHi, SrcLo]   = [-OpHi, -OpLo]
+      return SrcStatus::IS_BOTH_NEG;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // [SrcHi, SrcLo]   = [CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = neg [OpHi, OpLo](Type)
+      // [CurrHi, CurrLo] = [-OpHi, OpLo](Type)
+      // [SrcHi, SrcLo]   = [-OpHi, OpLo]
+      return SrcStatus::IS_HI_NEG;
+    }
+    break;
+  case SrcStatus::IS_HI_NEG:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // [SrcHi, SrcLo]   = [-CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = neg [OpHi, OpLo](2 x Type)
+      // [CurrHi, CurrLo] = [-OpHi, -OpLo](2 x Type)
+      // [SrcHi, SrcLo]   = [-(-OpHi), -OpLo] = [OpHi, -OpLo]
+      return SrcStatus::IS_LO_NEG;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // [SrcHi, SrcLo]   = [-CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = neg [OpHi, OpLo](Type)
+      // [CurrHi, CurrLo] = [-OpHi, OpLo](Type)
+      // [SrcHi, SrcLo]   = [-(-OpHi), OpLo] = [OpHi, OpLo]
+      return SrcStatus::IS_SAME;
+    }
+    break;
+  case SrcStatus::IS_LO_NEG:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // [SrcHi, SrcLo]   = [CurrHi, -CurrLo]
+      // [CurrHi, CurrLo] = fneg [OpHi, OpLo](2 x Type)
+      // [CurrHi, CurrLo] = [-OpHi, -OpLo](2 x Type)
+      // [SrcHi, SrcLo]   = [-OpHi, -(-OpLo)] = [-OpHi, OpLo]
+      return SrcStatus::IS_HI_NEG;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // [SrcHi, SrcLo]   = [CurrHi, -CurrLo]
+      // [CurrHi, CurrLo] = fneg [OpHi, OpLo](Type)
+      // [CurrHi, CurrLo] = [-OpHi, OpLo](Type)
+      // [SrcHi, SrcLo]   = [-OpHi, -OpLo]
+      return SrcStatus::IS_BOTH_NEG;
+    }
+    break;
+  case SrcStatus::IS_BOTH_NEG:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // [SrcHi, SrcLo]   = [-CurrHi, -CurrLo]
+      // [CurrHi, CurrLo] = fneg [OpHi, OpLo](2 x Type)
+      // [CurrHi, CurrLo] = [-OpHi, -OpLo](2 x Type)
+      // [SrcHi, SrcLo]   = [OpHi, OpLo]
+      return SrcStatus::IS_SAME;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // [SrcHi, SrcLo]   = [-CurrHi, -CurrLo]
+      // [CurrHi, CurrLo] = fneg [OpHi, OpLo](Type)
+      // [CurrHi, CurrLo] = [-OpHi, OpLo](Type)
+      // [SrcHi, SrcLo]   = [OpHi, -OpLo]
+      return SrcStatus::IS_LO_NEG;
+    }
+    break;
+  case SrcStatus::IS_UPPER_HALF:
+    // Vector of 2:
+    // Src = CurrUpper
+    // Curr = [CurrUpper, CurrLower]
+    // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](2 x Type)
+    // [CurrUpper, CurrLower] = [-OpUpper, -OpLower](2 x Type)
+    // Src = -OpUpper
+    //
+    // Scalar:
+    // Src = CurrUpper
+    // Curr = [CurrUpper, CurrLower]
+    // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](Type)
+    // [CurrUpper, CurrLower] = [-OpUpper, OpLower](Type)
+    // Src = -OpUpper
+    return SrcStatus::IS_UPPER_HALF_NEG;
+  case SrcStatus::IS_LOWER_HALF:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // Src = CurrLower
+      // Curr = [CurrUpper, CurrLower]
+      // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](2 x Type)
+      // [CurrUpper, CurrLower] = [-OpUpper, -OpLower](2 x Type)
+      // Src = -OpLower
+      return SrcStatus::IS_LOWER_HALF_NEG;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // Src = CurrLower
+      // Curr = [CurrUpper, CurrLower]
+      // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](Type)
+      // [CurrUpper, CurrLower] = [-OpUpper, OpLower](Type)
+      // Src = OpLower
+      return SrcStatus::IS_LOWER_HALF;
+    }
+    break;
+  case SrcStatus::IS_UPPER_HALF_NEG:
+    // Vector of 2:
+    // Src = -CurrUpper
+    // Curr = [CurrUpper, CurrLower]
+    // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](2 x Type)
+    // [CurrUpper, CurrLower] = [-OpUpper, -OpLower](2 x Type)
+    // Src = -(-OpUpper) = OpUpper
+    //
+    // Scalar:
+    // Src = -CurrUpper
+    // Curr = [CurrUpper, CurrLower]
+    // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](Type)
+    // [CurrUpper, CurrLower] = [-OpUpper, OpLower](Type)
+    // Src = -(-OpUpper) = OpUpper
+    return SrcStatus::IS_UPPER_HALF;
+  case SrcStatus::IS_LOWER_HALF_NEG:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // Src = -CurrLower
+      // Curr = [CurrUpper, CurrLower]
+      // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](2 x Type)
+      // [CurrUpper, CurrLower] = [-OpUpper, -OpLower](2 x Type)
+      // Src = -(-OpLower) = OpLower
+      return SrcStatus::IS_LOWER_HALF;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // Src = -CurrLower
+      // Curr = [CurrUpper, CurrLower]
+      // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](Type)
+      // [CurrUpper, CurrLower] = [-OpUpper, OpLower](Type)
+      // Src = -OpLower
+      return SrcStatus::IS_LOWER_HALF_NEG;
+    }
+    break;
+  default:
+    llvm_unreachable("unexpected SrcStatus");
+  }
+}
+
+static std::optional<std::pair<const MachineOperand *, SrcStatus>>
+calcNextStatus(std::pair<const MachineOperand *, SrcStatus> Curr,
+               const MachineRegisterInfo &MRI) {
+  if (!Curr.first->isReg())
+    return std::nullopt;
+
+  const MachineInstr *MI = Curr.first->isDef()
+                               ? Curr.first->getParent()
+                               : MRI.getVRegDef(Curr.first->getReg());
+
+  unsigned Opc = MI->getOpcode();
+
+  // Handle general Opc cases.
+  switch (Opc) {
+  case AMDGPU::G_BITCAST:
+  case AMDGPU::G_CONSTANT:
+  case AMDGPU::G_FCONSTANT:
+  case AMDGPU::COPY:
+    return retOpStat(&MI->getOperand(1), Curr.second, Curr);
+  case AMDGPU::G_FNEG:
+    return retOpStat(&MI->getOperand(1),
+                     getNegStatus(Curr.first, Curr.second, MRI), Curr);
+  default:
+    break;
+  }
+
+  // Calc next Stat from current Stat.
+  switch (Curr.second) {
+  case SrcStatus::IS_SAME:
+    if (isTruncHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_LOWER_HALF, Curr);
+    break;
+  case SrcStatus::IS_HI_NEG:
+    if (isTruncHalf(MI, MRI)) {
+      // [SrcHi, SrcLo]   = [-CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = trunc [OpUpper, OpLower] = OpLower
+      //                  = [OpLowerHi, OpLowerLo]
+      // Src = [SrcHi, SrcLo] = [-CurrHi, CurrLo]
+      //     = [-OpLowerHi, OpLowerLo]
+      //     = -OpLower
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_LOWER_HALF_NEG, Curr);
+    }
+    break;
+  case SrcStatus::IS_UPPER_HALF:
+    if (isShlHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_LOWER_HALF, Curr);
+    break;
+  case SrcStatus::IS_LOWER_HALF:
+    if (isLshrHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_UPPER_HALF, Curr);
+    break;
+  case SrcStatus::IS_UPPER_HALF_NEG:
+    if (isShlHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_LOWER_HALF_NEG, Curr);
+    break;
+  case SrcStatus::IS_LOWER_HALF_NEG:
+    if (isLshrHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_UPPER_HALF_NEG, Curr);
+    break;
+  default:
+    break;
+  }
+  return std::nullopt;
+}
+
+class searchOptions {
+private:
+  bool HasNeg = false;
+  // Assume all complex pattern of VOP3P has opsel.
+  bool HasOpsel = true;
+
+public:
+  searchOptions(const MachineOperand *RootOp, const MachineRegisterInfo &MRI) {
+    const MachineInstr *MI = RootOp->getParent();
+    unsigned Opc = MI->getOpcode();
+
+    if (Opc < TargetOpcode::GENERIC_OP_END) {
+      // Keep same for generic op.
+      HasNeg = true;
+    } else if (Opc == TargetOpcode::G_INTRINSIC) {
+      Intrinsic::ID IntrinsicID = cast<GIntrinsic>(*MI).getIntrinsicID();
+      // Only float point intrinsic has neg & neg_hi bits.
+      if (IntrinsicID == Intrinsic::amdgcn_fdot2)
+        HasNeg = true;
+    }
+  }
+  bool checkOptions(SrcStatus Stat) const {
+    if (!HasNeg &&
+        (Stat >= SrcStatus::NEG_START && Stat <= SrcStatus::NEG_END)) {
+      return false;
+    }
+    if (!HasOpsel &&
+        (Stat >= SrcStatus::HALF_START && Stat <= SrcStatus::HALF_END)) {
+      return false;
+    }
+    return true;
+  }
+};
+
+static SmallVector<std::pair<const MachineOperand *, SrcStatus>>
+getSrcStats(const MachineOperand *Op, const MachineRegisterInfo &MRI,
+            searchOptions SearchOptions, int MaxDepth = 6) {
+  int Depth = 0;
+  auto Curr = calcNextStatus({Op, SrcStatus::IS_SAME}, MRI);
+  SmallVector<std::pair<const MachineOperand *, SrcStatus>> Statlist;
+
+  while (Depth <= MaxDepth && Curr.has_value()) {
+    Depth++;
+    if (SearchOptions.checkOptions(Curr.value().second))
+      Statlist.push_back(Curr.value());
+    Curr = calcNextStatus(Curr.value(), MRI);
+  }
+
+  return Statlist;
+}
+
+static std::pair<const MachineOperand *, SrcStatus>
+getLastSameOrNeg(const MachineOperand *Op, const MachineRegisterInfo &MRI,
+                 searchOptions SearchOptions, int MaxDepth = 6) {
+  int Depth = 0;
+  std::pair<const MachineOperand *, SrcStatus> LastSameOrNeg = {
+      Op, SrcStatus::IS_SAME};
+  auto Curr = calcNextStatus(LastSameOrNeg, MRI);
+
+  while (Depth <= MaxDepth && Curr.has_value()) {
+    Depth++;
+    if (SearchOptions.checkOptions(Curr.value().second)) {
+      if (Curr.value().second == SrcStatus::IS_SAME ||
+          Curr.value().second == SrcStatus::IS_HI_NEG ||
+          Curr.value().second == SrcStatus::IS_LO_NEG ||
+          Curr.value().second == SrcStatus::IS_BOTH_NEG)
+        LastSameOrNeg = Curr.value();
+    }
+    Curr = calcNextStatus(Curr.value(), MRI);
+  }
+
+  return LastSameOrNeg;
+}
+
+static bool isInlinableFPConstant(const MachineOperand &Op,
+                                  const SIInstrInfo &TII) {
+  return Op.isFPImm() && TII.isInlineConstant(Op.getFPImm()->getValueAPF());
+}
+
+static bool isSameBitWidth(const MachineOperand *Op1, const MachineOperand *Op2,
+                           const MachineRegisterInfo &MRI) {
+  unsigned Width1 = MRI.getType(Op1->getReg()).getSizeInBits();
+  unsigned Width2 = MRI.getType(Op2->getReg()).getSizeInBits();
+  return Width1 == Width2;
+}
+
+static bool isSameOperand(const MachineOperand *Op1,
+                          const MachineOperand *Op2) {
+  if (Op1->isReg())
+    return Op2->isReg() && Op1->getReg() == Op2->getReg();
+
+  return Op1->isIdenticalTo(*Op2);
+}
+
+static unsigned updateMods(SrcStatus HiStat, SrcStatus LoStat, unsigned Mods) {
+  // SrcStatus::IS_LOWER_HALF remain 0.
+  if (HiStat == SrcStatus::IS_UPPER_HALF_NEG) {
+    Mods ^= SISrcMods::NEG_HI;
+    Mods |= SISrcMods::OP_SEL_1;
+  } else if (HiStat == SrcStatus::IS_UPPER_HALF)
+    Mods |= SISrcMods::OP_SEL_1;
+  else if (HiStat == SrcStatus::IS_LOWER_HALF_NEG)
+    Mods ^= SISrcMods::NEG_HI;
+  else if (HiStat == SrcStatus::IS_HI_NEG)
+    Mods ^= SISrcMods::NEG_HI;
+
+  if (LoStat == SrcStatus::IS_UPPER_HALF_NEG) {
+    Mods ^= SISrcMods::NEG;
+    Mods |= SISrcMods::OP_SEL_0;
+  } else if (LoStat == SrcStatus::IS_UPPER_HALF)
+    Mods |= SISrcMods::OP_SEL_0;
+  else if (LoStat == SrcStatus::IS_LOWER_HALF_NEG)
+    Mods |= SISrcMods::NEG;
+  else if (LoStat == SrcStatus::IS_HI_NEG)
+    Mods ^= SISrcMods::NEG;
+
+  return Mods;
+}
+
+static bool isValidToPack(SrcStatus HiStat, SrcStatus LoStat,
+                          const MachineOperand *NewOp,
+                          const MachineOperand *RootOp, const SIInstrInfo &TII,
+                          const MachineRegisterInfo &MRI) {
+  if (NewOp->isReg()) {
+    auto IsHalfState = [](SrcStatus S) {
+      return S == SrcStatus::IS_UPPER_HALF ||
+             S == SrcStatus::IS_UPPER_HALF_NEG ||
+             S == SrcStatus::IS_LOWER_HALF || S == SrcStatus::IS_LOWER_HALF_NEG;
+    };
+    return isSameBitWidth(NewOp, RootOp, MRI) && IsHalfState(LoStat) &&
+           IsHalfState(HiStat);
+  } else
+    return ((HiStat == SrcStatus::IS_SAME || HiStat == SrcStatus::IS_HI_NEG) &&
+            (LoStat == SrcStatus::IS_SAME || LoStat == SrcStatus::IS_HI_NEG) &&
+            isInlinableFPConstant(*NewOp, TII));
+
+  return false;
+}
+
+std::pair<const MachineOperand *, unsigned>
+AMDGPUInstructionSelector::selectVOP3PModsImpl(const MachineOperand *RootOp,
+                                               const MachineRegisterInfo &MRI,
+                                               bool IsDOT) const {
unsigned Mods = 0;
-  MachineInstr *MI = MRI.getVRegDef(Src);
+  const MachineOperand *Op = RootOp;
+  // No modification if Root type is not form of <2 x Type>.
+  if (isVectorOfTwoOrScalar(Op, MRI) != TypeClass::VECTOR_OF_TWO) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
+  }
+
+  searchOptions SearchOptions(Op, MRI);

-  if (MI->getOpcode() == AMDGPU::G_FNEG &&
-      // It's possible to see an f32 fneg here, but unlikely.
-      // TODO: Treat f32 fneg as only high bit.
-      MRI.getType(Src) == LLT::fixed_vector(2, 16)) {
+  std::pair<const MachineOperand *, SrcStatus> Stat =
+      getLastSameOrNeg(Op, MRI, SearchOptions);
+  if (!Stat.first->isReg()) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
+  }
+  if (Stat.second == SrcStatus::IS_BOTH_NEG)
  Mods ^= (SISrcMods::NEG | SISrcMods::NEG_HI);
-    Src = MI->getOperand(1).getReg();
-    MI = MRI.getVRegDef(Src);
+  else if (Stat.second == SrcStatus::IS_HI_NEG)
+    Mods ^= SISrcMods::NEG_HI;
+  else if (Stat.second == SrcStatus::IS_LO_NEG)
+    Mods ^= SISrcMods::NEG;
+
+  Op = Stat.first;
+  MachineInstr *MI = MRI.getVRegDef(Op->getReg());
+
+  if (MI->getOpcode() != AMDGPU::G_BUILD_VECTOR || MI->getNumOperands() != 3 ||
+      (IsDOT && Subtarget->hasDOTOpSelHazard())) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
}

-  // TODO: Handle G_FSUB 0 as fneg
+  SmallVector<std::pair<const MachineOperand *, SrcStatus>> StatlistHi =
+      getSrcStats(&MI->getOperand(2), MRI, SearchOptions);

-  // TODO: Match op_sel through g_build_vector_trunc and g_shuffle_vector.
-  (void)IsDOT; // DOTs do not use OPSEL on gfx942+, check ST.hasDOTOpSelHazard()
+  if (StatlistHi.size() == 0) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
+  }
+
+  SmallVector<std::pair<const MachineOperand *, SrcStatus>> StatlistLo =
+      getSrcStats(&MI->getOperand(1), MRI, SearchOptions);

+  if (StatlistLo.size() == 0) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
+  }
+
+  for (int I = StatlistHi.size() - 1; I >= 0; I--) {
+    for (int J = StatlistLo.size() - 1; J >= 0; J--) {
+      if (isSameOperand(StatlistHi[I].first, StatlistLo[J].first) &&
+          isValidToPack(StatlistHi[I].second, StatlistLo[J].second,
+                        StatlistHi[I].first, RootOp, TII, MRI))
+        return {StatlistHi[I].first,
+                updateMods(StatlistHi[I].second, StatlistLo[J].second, Mods)};
+    }
+  }
// Packed instructions do not have abs modifiers.
Mods |= SISrcMods::OP_SEL_1;

-  return std::pair(Src, Mods);
+  return {Op, Mods};
}

-InstructionSelector::ComplexRendererFns
-AMDGPUInstructionSelector::selectVOP3PMods(MachineOperand &Root) const {
-  MachineRegisterInfo &MRI
-    = Root.getParent()->getParent()->getParent()->getRegInfo();
+int64_t getAllKindImm(const MachineOperand *Op) {
+  switch (Op->getType()) {
+  case MachineOperand::MachineOperandType::MO_Immediate:
+    return Op->getImm();
+  case MachineOperand::MachineOperandType::MO_CImmediate:
+    return Op->getCImm()->getSExtValue();
+  case MachineOperand::MachineOperandType::MO_FPImmediate:
+    return Op->getFPImm()->getValueAPF().bitcastToAPInt().getSExtValue();
+  default:
+    llvm_unreachable("not an imm type");
+  }
+}
+
+static bool checkRB(const MachineOperand *Op, unsigned int RBNo,
+                    const AMDGPURegisterBankInfo &RBI,
+                    const MachineRegisterInfo &MRI,
+                    const TargetRegisterInfo &TRI) {
+  const RegisterBank *RB = RBI.getRegBank(Op->getReg(), MRI, TRI);
+  return RB->getID() == RBNo;
+}
+
+// This function is used to get the correct register bank for returned reg.
+// Assume:
+// 1. VOP3P is always legal for VGPR.
+// 2. RootOp's regbank is legal.
+// Thus
+// 1. If RootOp is SGPR, then NewOp can be SGPR or VGPR.
+// 2. If RootOp is VGPR, then NewOp must be VGPR.
+static const MachineOperand *
+getLegalRegBank(const MachineOperand *NewOp, const MachineOperand *RootOp,
+                const AMDGPURegisterBankInfo &RBI, MachineRegisterInfo &MRI,
+         ...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Apr 18, 2025

@llvm/pr-subscribers-llvm-globalisel

Author: None (Shoreshen)

Changes

This is a fix up for patch #130234, which is reverted in #136249

The main reason of building failure are:
1.

/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp: In function ‘llvm::SmallVector&lt;std::pair&lt;const llvm::MachineOperand*, SrcStatus&gt; &gt; getSrcStats(const llvm::MachineOperand*, const llvm::MachineRegisterInfo&amp;, searchOptions, int)’:
/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4669: error: could not convert ‘Statlist’ from ‘SmallVector&lt;[...],4&gt;’ to ‘SmallVector&lt;[...],3&gt;’
4669 |   return Statlist;

/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4554:1: error: non-void function does not return a value in all control paths [-Werror,-Wreturn-type]
4554 | }
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4644:39: error: overlapping comparisons always evaluate to true [-Werror,-Wtautological-overlap-compare]
4644 | (Stat >= SrcStatus::NEG_START || Stat <= SrcStatus::NEG_END)) {
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4893:66: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
4893 | [=](MachineInstrBuilder &MIB) { MIB.addImm(getAllKindImm(Op)); },
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:9: note: 'Op' declared here
4890 | auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4894:52: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
4894 | [=](MachineInstrBuilder &MIB) { MIB.addImm(Mods); } // src_mods
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:13: note: 'Mods' declared here
4890 | auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4899:50: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
4899 | [=](MachineInstrBuilder &MIB) { MIB.addReg(Op->getReg()); },
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:9: note: 'Op' declared here
4890 | auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4900:50: error: captured structured bindings are a C++20 extension [-Werror,-Wc++20-extensions]
4900 | [=](MachineInstrBuilder &MIB) { MIB.addImm(Mods); } // src_mods
| ^
/home/b/sanitizer-aarch64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:4890:13: note: 'Mods' declared here
4890 | auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);
| ^
6 errors generated.


Both error cannot be reproduced at my local machine, the fix applied are:
1. In `llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp` function `getSrcStats` replace

SmallVector<std::pair<const MachineOperand *, SrcStatus>, 4> Statlist;

with

SmallVector<std::pair<const MachineOperand *, SrcStatus>> Statlist;

2. In `llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp` function `AMDGPUInstructionSelector::selectVOP3PRetHelper` replace

auto [Op, Mods] = selectVOP3PModsImpl(&Root, MRI, IsDOT);

with

auto Results = selectVOP3PModsImpl(&Root, MRI, IsDOT);
const MachineOperand *Op = Results.first;
unsigned Mods = Results.second;


These change hasn't be testified since both errors cannot be reproduced in local

---

Patch is 38.83 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/136262.diff


7 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp (+573-33) 
- (modified) llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h (+4-2) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/fmul.v2f16.ll (+163) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.sdot2.ll (+4-8) 
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.udot2.ll (+6-12) 
- (modified) llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll (+5-9) 
- (modified) llvm/test/lit.cfg.py (+1-1) 


``````````diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
index 6ef7505ec6f62..1f9c40ad76d3f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
@@ -4319,60 +4319,600 @@ AMDGPUInstructionSelector::selectVOP3NoMods(MachineOperand &Root) const {
}};
}

-std::pair<Register, unsigned>
-AMDGPUInstructionSelector::selectVOP3PModsImpl(
-  Register Src, const MachineRegisterInfo &MRI, bool IsDOT) const {
+enum class SrcStatus {
+  IS_SAME,
+  IS_UPPER_HALF,
+  IS_LOWER_HALF,
+  IS_UPPER_HALF_NEG,
+  // This means current op = [op_upper, op_lower] and src = -op_lower.
+  IS_LOWER_HALF_NEG,
+  IS_HI_NEG,
+  // This means current op = [op_upper, op_lower] and src = [op_upper,
+  // -op_lower].
+  IS_LO_NEG,
+  IS_BOTH_NEG,
+  INVALID,
+  NEG_START = IS_UPPER_HALF_NEG,
+  NEG_END = IS_BOTH_NEG,
+  HALF_START = IS_UPPER_HALF,
+  HALF_END = IS_LOWER_HALF_NEG
+};
+
+static bool isTruncHalf(const MachineInstr *MI,
+                        const MachineRegisterInfo &MRI) {
+  if (MI->getOpcode() != AMDGPU::G_TRUNC)
+    return false;
+
+  unsigned DstSize = MRI.getType(MI->getOperand(0).getReg()).getSizeInBits();
+  unsigned SrcSize = MRI.getType(MI->getOperand(1).getReg()).getSizeInBits();
+  return DstSize * 2 == SrcSize;
+}
+
+static bool isLshrHalf(const MachineInstr *MI, const MachineRegisterInfo &MRI) {
+  if (MI->getOpcode() != AMDGPU::G_LSHR)
+    return false;
+
+  Register ShiftSrc;
+  std::optional<ValueAndVReg> ShiftAmt;
+  if (mi_match(MI->getOperand(0).getReg(), MRI,
+               m_GLShr(m_Reg(ShiftSrc), m_GCst(ShiftAmt)))) {
+    unsigned SrcSize = MRI.getType(MI->getOperand(1).getReg()).getSizeInBits();
+    unsigned Shift = ShiftAmt->Value.getZExtValue();
+    return Shift * 2 == SrcSize;
+  }
+  return false;
+}
+
+static bool isShlHalf(const MachineInstr *MI, const MachineRegisterInfo &MRI) {
+  if (MI->getOpcode() != AMDGPU::G_SHL)
+    return false;
+
+  Register ShiftSrc;
+  std::optional<ValueAndVReg> ShiftAmt;
+  if (mi_match(MI->getOperand(0).getReg(), MRI,
+               m_GShl(m_Reg(ShiftSrc), m_GCst(ShiftAmt)))) {
+    unsigned SrcSize = MRI.getType(MI->getOperand(1).getReg()).getSizeInBits();
+    unsigned Shift = ShiftAmt->Value.getZExtValue();
+    return Shift * 2 == SrcSize;
+  }
+  return false;
+}
+
+static std::optional<std::pair<const MachineOperand *, SrcStatus>>
+retOpStat(const MachineOperand *Op, SrcStatus Stat,
+          std::pair<const MachineOperand *, SrcStatus> &Curr) {
+  if (Stat != SrcStatus::INVALID &&
+      ((Op->isReg() && !(Op->getReg().isPhysical())) || Op->isImm() ||
+       Op->isCImm() || Op->isFPImm())) {
+    return std::optional<std::pair<const MachineOperand *, SrcStatus>>(
+        {Op, Stat});
+  }
+
+  return std::nullopt;
+}
+
+enum class TypeClass { VECTOR_OF_TWO, SCALAR, NONE_OF_LISTED };
+
+static TypeClass isVectorOfTwoOrScalar(const MachineOperand *Op,
+                                       const MachineRegisterInfo &MRI) {
+  if (!Op->isReg() || Op->getReg().isPhysical())
+    return TypeClass::NONE_OF_LISTED;
+  LLT OpTy = MRI.getType(Op->getReg());
+  if (OpTy.isScalar())
+    return TypeClass::SCALAR;
+  if (OpTy.isVector() && OpTy.getNumElements() == 2)
+    return TypeClass::VECTOR_OF_TWO;
+  return TypeClass::NONE_OF_LISTED;
+}
+
+static SrcStatus getNegStatus(const MachineOperand *Op, SrcStatus S,
+                              const MachineRegisterInfo &MRI) {
+  TypeClass NegType = isVectorOfTwoOrScalar(Op, MRI);
+  if (NegType != TypeClass::VECTOR_OF_TWO && NegType != TypeClass::SCALAR)
+    return SrcStatus::INVALID;
+
+  switch (S) {
+  case SrcStatus::IS_SAME:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // [SrcHi, SrcLo]   = [CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = neg [OpHi, OpLo](2 x Type)
+      // [CurrHi, CurrLo] = [-OpHi, -OpLo](2 x Type)
+      // [SrcHi, SrcLo]   = [-OpHi, -OpLo]
+      return SrcStatus::IS_BOTH_NEG;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // [SrcHi, SrcLo]   = [CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = neg [OpHi, OpLo](Type)
+      // [CurrHi, CurrLo] = [-OpHi, OpLo](Type)
+      // [SrcHi, SrcLo]   = [-OpHi, OpLo]
+      return SrcStatus::IS_HI_NEG;
+    }
+    break;
+  case SrcStatus::IS_HI_NEG:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // [SrcHi, SrcLo]   = [-CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = neg [OpHi, OpLo](2 x Type)
+      // [CurrHi, CurrLo] = [-OpHi, -OpLo](2 x Type)
+      // [SrcHi, SrcLo]   = [-(-OpHi), -OpLo] = [OpHi, -OpLo]
+      return SrcStatus::IS_LO_NEG;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // [SrcHi, SrcLo]   = [-CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = neg [OpHi, OpLo](Type)
+      // [CurrHi, CurrLo] = [-OpHi, OpLo](Type)
+      // [SrcHi, SrcLo]   = [-(-OpHi), OpLo] = [OpHi, OpLo]
+      return SrcStatus::IS_SAME;
+    }
+    break;
+  case SrcStatus::IS_LO_NEG:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // [SrcHi, SrcLo]   = [CurrHi, -CurrLo]
+      // [CurrHi, CurrLo] = fneg [OpHi, OpLo](2 x Type)
+      // [CurrHi, CurrLo] = [-OpHi, -OpLo](2 x Type)
+      // [SrcHi, SrcLo]   = [-OpHi, -(-OpLo)] = [-OpHi, OpLo]
+      return SrcStatus::IS_HI_NEG;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // [SrcHi, SrcLo]   = [CurrHi, -CurrLo]
+      // [CurrHi, CurrLo] = fneg [OpHi, OpLo](Type)
+      // [CurrHi, CurrLo] = [-OpHi, OpLo](Type)
+      // [SrcHi, SrcLo]   = [-OpHi, -OpLo]
+      return SrcStatus::IS_BOTH_NEG;
+    }
+    break;
+  case SrcStatus::IS_BOTH_NEG:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // [SrcHi, SrcLo]   = [-CurrHi, -CurrLo]
+      // [CurrHi, CurrLo] = fneg [OpHi, OpLo](2 x Type)
+      // [CurrHi, CurrLo] = [-OpHi, -OpLo](2 x Type)
+      // [SrcHi, SrcLo]   = [OpHi, OpLo]
+      return SrcStatus::IS_SAME;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // [SrcHi, SrcLo]   = [-CurrHi, -CurrLo]
+      // [CurrHi, CurrLo] = fneg [OpHi, OpLo](Type)
+      // [CurrHi, CurrLo] = [-OpHi, OpLo](Type)
+      // [SrcHi, SrcLo]   = [OpHi, -OpLo]
+      return SrcStatus::IS_LO_NEG;
+    }
+    break;
+  case SrcStatus::IS_UPPER_HALF:
+    // Vector of 2:
+    // Src = CurrUpper
+    // Curr = [CurrUpper, CurrLower]
+    // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](2 x Type)
+    // [CurrUpper, CurrLower] = [-OpUpper, -OpLower](2 x Type)
+    // Src = -OpUpper
+    //
+    // Scalar:
+    // Src = CurrUpper
+    // Curr = [CurrUpper, CurrLower]
+    // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](Type)
+    // [CurrUpper, CurrLower] = [-OpUpper, OpLower](Type)
+    // Src = -OpUpper
+    return SrcStatus::IS_UPPER_HALF_NEG;
+  case SrcStatus::IS_LOWER_HALF:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // Src = CurrLower
+      // Curr = [CurrUpper, CurrLower]
+      // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](2 x Type)
+      // [CurrUpper, CurrLower] = [-OpUpper, -OpLower](2 x Type)
+      // Src = -OpLower
+      return SrcStatus::IS_LOWER_HALF_NEG;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // Src = CurrLower
+      // Curr = [CurrUpper, CurrLower]
+      // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](Type)
+      // [CurrUpper, CurrLower] = [-OpUpper, OpLower](Type)
+      // Src = OpLower
+      return SrcStatus::IS_LOWER_HALF;
+    }
+    break;
+  case SrcStatus::IS_UPPER_HALF_NEG:
+    // Vector of 2:
+    // Src = -CurrUpper
+    // Curr = [CurrUpper, CurrLower]
+    // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](2 x Type)
+    // [CurrUpper, CurrLower] = [-OpUpper, -OpLower](2 x Type)
+    // Src = -(-OpUpper) = OpUpper
+    //
+    // Scalar:
+    // Src = -CurrUpper
+    // Curr = [CurrUpper, CurrLower]
+    // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](Type)
+    // [CurrUpper, CurrLower] = [-OpUpper, OpLower](Type)
+    // Src = -(-OpUpper) = OpUpper
+    return SrcStatus::IS_UPPER_HALF;
+  case SrcStatus::IS_LOWER_HALF_NEG:
+    if (NegType == TypeClass::VECTOR_OF_TWO) {
+      // Vector of 2:
+      // Src = -CurrLower
+      // Curr = [CurrUpper, CurrLower]
+      // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](2 x Type)
+      // [CurrUpper, CurrLower] = [-OpUpper, -OpLower](2 x Type)
+      // Src = -(-OpLower) = OpLower
+      return SrcStatus::IS_LOWER_HALF;
+    } else if (NegType == TypeClass::SCALAR) {
+      // Scalar:
+      // Src = -CurrLower
+      // Curr = [CurrUpper, CurrLower]
+      // [CurrUpper, CurrLower] = fneg [OpUpper, OpLower](Type)
+      // [CurrUpper, CurrLower] = [-OpUpper, OpLower](Type)
+      // Src = -OpLower
+      return SrcStatus::IS_LOWER_HALF_NEG;
+    }
+    break;
+  default:
+    llvm_unreachable("unexpected SrcStatus");
+  }
+}
+
+static std::optional<std::pair<const MachineOperand *, SrcStatus>>
+calcNextStatus(std::pair<const MachineOperand *, SrcStatus> Curr,
+               const MachineRegisterInfo &MRI) {
+  if (!Curr.first->isReg())
+    return std::nullopt;
+
+  const MachineInstr *MI = Curr.first->isDef()
+                               ? Curr.first->getParent()
+                               : MRI.getVRegDef(Curr.first->getReg());
+
+  unsigned Opc = MI->getOpcode();
+
+  // Handle general Opc cases.
+  switch (Opc) {
+  case AMDGPU::G_BITCAST:
+  case AMDGPU::G_CONSTANT:
+  case AMDGPU::G_FCONSTANT:
+  case AMDGPU::COPY:
+    return retOpStat(&MI->getOperand(1), Curr.second, Curr);
+  case AMDGPU::G_FNEG:
+    return retOpStat(&MI->getOperand(1),
+                     getNegStatus(Curr.first, Curr.second, MRI), Curr);
+  default:
+    break;
+  }
+
+  // Calc next Stat from current Stat.
+  switch (Curr.second) {
+  case SrcStatus::IS_SAME:
+    if (isTruncHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_LOWER_HALF, Curr);
+    break;
+  case SrcStatus::IS_HI_NEG:
+    if (isTruncHalf(MI, MRI)) {
+      // [SrcHi, SrcLo]   = [-CurrHi, CurrLo]
+      // [CurrHi, CurrLo] = trunc [OpUpper, OpLower] = OpLower
+      //                  = [OpLowerHi, OpLowerLo]
+      // Src = [SrcHi, SrcLo] = [-CurrHi, CurrLo]
+      //     = [-OpLowerHi, OpLowerLo]
+      //     = -OpLower
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_LOWER_HALF_NEG, Curr);
+    }
+    break;
+  case SrcStatus::IS_UPPER_HALF:
+    if (isShlHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_LOWER_HALF, Curr);
+    break;
+  case SrcStatus::IS_LOWER_HALF:
+    if (isLshrHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_UPPER_HALF, Curr);
+    break;
+  case SrcStatus::IS_UPPER_HALF_NEG:
+    if (isShlHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_LOWER_HALF_NEG, Curr);
+    break;
+  case SrcStatus::IS_LOWER_HALF_NEG:
+    if (isLshrHalf(MI, MRI))
+      return retOpStat(&MI->getOperand(1), SrcStatus::IS_UPPER_HALF_NEG, Curr);
+    break;
+  default:
+    break;
+  }
+  return std::nullopt;
+}
+
+class searchOptions {
+private:
+  bool HasNeg = false;
+  // Assume all complex pattern of VOP3P has opsel.
+  bool HasOpsel = true;
+
+public:
+  searchOptions(const MachineOperand *RootOp, const MachineRegisterInfo &MRI) {
+    const MachineInstr *MI = RootOp->getParent();
+    unsigned Opc = MI->getOpcode();
+
+    if (Opc < TargetOpcode::GENERIC_OP_END) {
+      // Keep same for generic op.
+      HasNeg = true;
+    } else if (Opc == TargetOpcode::G_INTRINSIC) {
+      Intrinsic::ID IntrinsicID = cast<GIntrinsic>(*MI).getIntrinsicID();
+      // Only float point intrinsic has neg & neg_hi bits.
+      if (IntrinsicID == Intrinsic::amdgcn_fdot2)
+        HasNeg = true;
+    }
+  }
+  bool checkOptions(SrcStatus Stat) const {
+    if (!HasNeg &&
+        (Stat >= SrcStatus::NEG_START && Stat <= SrcStatus::NEG_END)) {
+      return false;
+    }
+    if (!HasOpsel &&
+        (Stat >= SrcStatus::HALF_START && Stat <= SrcStatus::HALF_END)) {
+      return false;
+    }
+    return true;
+  }
+};
+
+static SmallVector<std::pair<const MachineOperand *, SrcStatus>>
+getSrcStats(const MachineOperand *Op, const MachineRegisterInfo &MRI,
+            searchOptions SearchOptions, int MaxDepth = 6) {
+  int Depth = 0;
+  auto Curr = calcNextStatus({Op, SrcStatus::IS_SAME}, MRI);
+  SmallVector<std::pair<const MachineOperand *, SrcStatus>> Statlist;
+
+  while (Depth <= MaxDepth && Curr.has_value()) {
+    Depth++;
+    if (SearchOptions.checkOptions(Curr.value().second))
+      Statlist.push_back(Curr.value());
+    Curr = calcNextStatus(Curr.value(), MRI);
+  }
+
+  return Statlist;
+}
+
+static std::pair<const MachineOperand *, SrcStatus>
+getLastSameOrNeg(const MachineOperand *Op, const MachineRegisterInfo &MRI,
+                 searchOptions SearchOptions, int MaxDepth = 6) {
+  int Depth = 0;
+  std::pair<const MachineOperand *, SrcStatus> LastSameOrNeg = {
+      Op, SrcStatus::IS_SAME};
+  auto Curr = calcNextStatus(LastSameOrNeg, MRI);
+
+  while (Depth <= MaxDepth && Curr.has_value()) {
+    Depth++;
+    if (SearchOptions.checkOptions(Curr.value().second)) {
+      if (Curr.value().second == SrcStatus::IS_SAME ||
+          Curr.value().second == SrcStatus::IS_HI_NEG ||
+          Curr.value().second == SrcStatus::IS_LO_NEG ||
+          Curr.value().second == SrcStatus::IS_BOTH_NEG)
+        LastSameOrNeg = Curr.value();
+    }
+    Curr = calcNextStatus(Curr.value(), MRI);
+  }
+
+  return LastSameOrNeg;
+}
+
+static bool isInlinableFPConstant(const MachineOperand &Op,
+                                  const SIInstrInfo &TII) {
+  return Op.isFPImm() && TII.isInlineConstant(Op.getFPImm()->getValueAPF());
+}
+
+static bool isSameBitWidth(const MachineOperand *Op1, const MachineOperand *Op2,
+                           const MachineRegisterInfo &MRI) {
+  unsigned Width1 = MRI.getType(Op1->getReg()).getSizeInBits();
+  unsigned Width2 = MRI.getType(Op2->getReg()).getSizeInBits();
+  return Width1 == Width2;
+}
+
+static bool isSameOperand(const MachineOperand *Op1,
+                          const MachineOperand *Op2) {
+  if (Op1->isReg())
+    return Op2->isReg() && Op1->getReg() == Op2->getReg();
+
+  return Op1->isIdenticalTo(*Op2);
+}
+
+static unsigned updateMods(SrcStatus HiStat, SrcStatus LoStat, unsigned Mods) {
+  // SrcStatus::IS_LOWER_HALF remain 0.
+  if (HiStat == SrcStatus::IS_UPPER_HALF_NEG) {
+    Mods ^= SISrcMods::NEG_HI;
+    Mods |= SISrcMods::OP_SEL_1;
+  } else if (HiStat == SrcStatus::IS_UPPER_HALF)
+    Mods |= SISrcMods::OP_SEL_1;
+  else if (HiStat == SrcStatus::IS_LOWER_HALF_NEG)
+    Mods ^= SISrcMods::NEG_HI;
+  else if (HiStat == SrcStatus::IS_HI_NEG)
+    Mods ^= SISrcMods::NEG_HI;
+
+  if (LoStat == SrcStatus::IS_UPPER_HALF_NEG) {
+    Mods ^= SISrcMods::NEG;
+    Mods |= SISrcMods::OP_SEL_0;
+  } else if (LoStat == SrcStatus::IS_UPPER_HALF)
+    Mods |= SISrcMods::OP_SEL_0;
+  else if (LoStat == SrcStatus::IS_LOWER_HALF_NEG)
+    Mods |= SISrcMods::NEG;
+  else if (LoStat == SrcStatus::IS_HI_NEG)
+    Mods ^= SISrcMods::NEG;
+
+  return Mods;
+}
+
+static bool isValidToPack(SrcStatus HiStat, SrcStatus LoStat,
+                          const MachineOperand *NewOp,
+                          const MachineOperand *RootOp, const SIInstrInfo &TII,
+                          const MachineRegisterInfo &MRI) {
+  if (NewOp->isReg()) {
+    auto IsHalfState = [](SrcStatus S) {
+      return S == SrcStatus::IS_UPPER_HALF ||
+             S == SrcStatus::IS_UPPER_HALF_NEG ||
+             S == SrcStatus::IS_LOWER_HALF || S == SrcStatus::IS_LOWER_HALF_NEG;
+    };
+    return isSameBitWidth(NewOp, RootOp, MRI) && IsHalfState(LoStat) &&
+           IsHalfState(HiStat);
+  } else
+    return ((HiStat == SrcStatus::IS_SAME || HiStat == SrcStatus::IS_HI_NEG) &&
+            (LoStat == SrcStatus::IS_SAME || LoStat == SrcStatus::IS_HI_NEG) &&
+            isInlinableFPConstant(*NewOp, TII));
+
+  return false;
+}
+
+std::pair<const MachineOperand *, unsigned>
+AMDGPUInstructionSelector::selectVOP3PModsImpl(const MachineOperand *RootOp,
+                                               const MachineRegisterInfo &MRI,
+                                               bool IsDOT) const {
unsigned Mods = 0;
-  MachineInstr *MI = MRI.getVRegDef(Src);
+  const MachineOperand *Op = RootOp;
+  // No modification if Root type is not form of <2 x Type>.
+  if (isVectorOfTwoOrScalar(Op, MRI) != TypeClass::VECTOR_OF_TWO) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
+  }
+
+  searchOptions SearchOptions(Op, MRI);

-  if (MI->getOpcode() == AMDGPU::G_FNEG &&
-      // It's possible to see an f32 fneg here, but unlikely.
-      // TODO: Treat f32 fneg as only high bit.
-      MRI.getType(Src) == LLT::fixed_vector(2, 16)) {
+  std::pair<const MachineOperand *, SrcStatus> Stat =
+      getLastSameOrNeg(Op, MRI, SearchOptions);
+  if (!Stat.first->isReg()) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
+  }
+  if (Stat.second == SrcStatus::IS_BOTH_NEG)
  Mods ^= (SISrcMods::NEG | SISrcMods::NEG_HI);
-    Src = MI->getOperand(1).getReg();
-    MI = MRI.getVRegDef(Src);
+  else if (Stat.second == SrcStatus::IS_HI_NEG)
+    Mods ^= SISrcMods::NEG_HI;
+  else if (Stat.second == SrcStatus::IS_LO_NEG)
+    Mods ^= SISrcMods::NEG;
+
+  Op = Stat.first;
+  MachineInstr *MI = MRI.getVRegDef(Op->getReg());
+
+  if (MI->getOpcode() != AMDGPU::G_BUILD_VECTOR || MI->getNumOperands() != 3 ||
+      (IsDOT && Subtarget->hasDOTOpSelHazard())) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
}

-  // TODO: Handle G_FSUB 0 as fneg
+  SmallVector<std::pair<const MachineOperand *, SrcStatus>> StatlistHi =
+      getSrcStats(&MI->getOperand(2), MRI, SearchOptions);

-  // TODO: Match op_sel through g_build_vector_trunc and g_shuffle_vector.
-  (void)IsDOT; // DOTs do not use OPSEL on gfx942+, check ST.hasDOTOpSelHazard()
+  if (StatlistHi.size() == 0) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
+  }
+
+  SmallVector<std::pair<const MachineOperand *, SrcStatus>> StatlistLo =
+      getSrcStats(&MI->getOperand(1), MRI, SearchOptions);

+  if (StatlistLo.size() == 0) {
+    Mods |= SISrcMods::OP_SEL_1;
+    return {Op, Mods};
+  }
+
+  for (int I = StatlistHi.size() - 1; I >= 0; I--) {
+    for (int J = StatlistLo.size() - 1; J >= 0; J--) {
+      if (isSameOperand(StatlistHi[I].first, StatlistLo[J].first) &&
+          isValidToPack(StatlistHi[I].second, StatlistLo[J].second,
+                        StatlistHi[I].first, RootOp, TII, MRI))
+        return {StatlistHi[I].first,
+                updateMods(StatlistHi[I].second, StatlistLo[J].second, Mods)};
+    }
+  }
// Packed instructions do not have abs modifiers.
Mods |= SISrcMods::OP_SEL_1;

-  return std::pair(Src, Mods);
+  return {Op, Mods};
}

-InstructionSelector::ComplexRendererFns
-AMDGPUInstructionSelector::selectVOP3PMods(MachineOperand &Root) const {
-  MachineRegisterInfo &MRI
-    = Root.getParent()->getParent()->getParent()->getRegInfo();
+int64_t getAllKindImm(const MachineOperand *Op) {
+  switch (Op->getType()) {
+  case MachineOperand::MachineOperandType::MO_Immediate:
+    return Op->getImm();
+  case MachineOperand::MachineOperandType::MO_CImmediate:
+    return Op->getCImm()->getSExtValue();
+  case MachineOperand::MachineOperandType::MO_FPImmediate:
+    return Op->getFPImm()->getValueAPF().bitcastToAPInt().getSExtValue();
+  default:
+    llvm_unreachable("not an imm type");
+  }
+}
+
+static bool checkRB(const MachineOperand *Op, unsigned int RBNo,
+                    const AMDGPURegisterBankInfo &RBI,
+                    const MachineRegisterInfo &MRI,
+                    const TargetRegisterInfo &TRI) {
+  const RegisterBank *RB = RBI.getRegBank(Op->getReg(), MRI, TRI);
+  return RB->getID() == RBNo;
+}
+
+// This function is used to get the correct register bank for returned reg.
+// Assume:
+// 1. VOP3P is always legal for VGPR.
+// 2. RootOp's regbank is legal.
+// Thus
+// 1. If RootOp is SGPR, then NewOp can be SGPR or VGPR.
+// 2. If RootOp is VGPR, then NewOp must be VGPR.
+static const MachineOperand *
+getLegalRegBank(const MachineOperand *NewOp, const MachineOperand *RootOp,
+                const AMDGPURegisterBankInfo &RBI, MachineRegisterInfo &MRI,
+         ...
[truncated]

@Shoreshen Shoreshen requested review from Copilot and Artem-B April 18, 2025 06:09
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes build errors related to AMDGPU instruction selection and adjusts the test configuration for llvm-readobj output decoding.

  • Updated the decoding from ASCII to UTF-8 in the lit test configuration.
  • Modified the signature of selectVOP3PModsImpl (and related function) to correct the return type that was causing build conversion issues.

Reviewed Changes

Copilot reviewed 3 out of 7 changed files in this pull request and generated no comments.

File Description
llvm/test/lit.cfg.py Updated stdout decoding to utf-8 for reliable string interpretation.
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h Changed function return types to resolve type mismatch in the build.
Files not reviewed (4)
  • llvm/test/CodeGen/AMDGPU/GlobalISel/fmul.v2f16.ll: Language not supported
  • llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.sdot2.ll: Language not supported
  • llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.udot2.ll: Language not supported
  • llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll: Language not supported
Comments suppressed due to low confidence (2)

llvm/test/lit.cfg.py:469

  • Ensure that switching the decoding from 'ascii' to 'utf-8' is intentional and that the llvm-readobj output consistently uses UTF-8 encoding across all environments.
readobj_out = readobj_cmd.stdout.read().decode("utf-8")

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h:190

  • The change in the return type of selectVOP3PModsImpl to use a 'const MachineOperand *' instead of 'Register' is intended to resolve conversion errors; please verify that all corresponding call sites in the implementation file have been updated accordingly.
std::pair<const MachineOperand *, unsigned>

@Shoreshen Shoreshen requested a review from Copilot April 18, 2025 06:14
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes build failures in the AMDGPUInstructionSelector by updating function signatures to remove structured bindings and adjust parameter/return types.

  • Updated selectVOP3PModsImpl signature from taking a Register to a const MachineOperand* and modified its return type accordingly
  • Replaced structured binding in selectVOP3PRetHelper with explicit variable extraction for clearer, C++20 compatible code
Files not reviewed (4)
  • llvm/test/CodeGen/AMDGPU/GlobalISel/fmul.v2f16.ll: Language not supported
  • llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.sdot2.ll: Language not supported
  • llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.udot2.ll: Language not supported
  • llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll: Language not supported
Comments suppressed due to low confidence (2)

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h:190

  • Please verify that the updated parameter and return types for selectVOP3PModsImpl are consistently reflected in its implementation and all invocations.
std::pair<const MachineOperand *, unsigned> selectVOP3PModsImpl(const MachineOperand *Op, const MachineRegisterInfo &MRI, bool IsDOT = false) const;

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h:193

  • Ensure that the conversion from structured bindings to explicit variable extraction in selectVOP3PRetHelper is properly implemented in the source file.
InstructionSelector::ComplexRendererFns selectVOP3PRetHelper(MachineOperand &Root, bool IsDOT = false) const;

@Shoreshen Shoreshen changed the title fix up vop3p gisel errors [AMDGPU] fix up vop3p gisel errors Apr 18, 2025
}

static std::optional<std::pair<const MachineOperand *, SrcStatus>>
retOpStat(const MachineOperand *Op, SrcStatus Stat,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we pass MachineOperand as a reference instead?

Copy link
Contributor Author

@Shoreshen Shoreshen Apr 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tgymnich, maybe only part of it. There are some places that using reference will cause error, such as:

static SmallVector<std::pair<const MachineOperand *, SrcStatus>>
getSrcStats(const MachineOperand *Op, const MachineRegisterInfo &MRI,
            searchOptions SearchOptions, int MaxDepth = 6) {
  ...
  while (Depth <= MaxDepth && Curr.has_value()) {
    ...
    Curr = calcNextStatus(Curr.value(), MRI);
  }
  ...
}

If I use reference in this place, the = will change the original operand.

Currently I think it is simpler to use pointer in all place.

@Shoreshen Shoreshen requested a review from tgymnich April 21, 2025 07:26
// [CurrHi, CurrLo] = [-OpHi, -OpLo](2 x Type)
// [SrcHi, SrcLo] = [-OpHi, -OpLo]
return SrcStatus::IS_BOTH_NEG;
} else if (NegType == TypeClass::SCALAR) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No else after return

Comment on lines 4398 to 4399
if (!Op->isReg() || Op->getReg().isPhysical())
return TypeClass::NONE_OF_LISTED;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These cases should not reach here. Gisel code should generally operate in terms of Registers, and it never should need to interact with physical registers

Comment on lines +4384 to +4389
if (Stat != SrcStatus::INVALID &&
((Op->isReg() && !(Op->getReg().isPhysical())) || Op->isImm() ||
Op->isCImm() || Op->isFPImm())) {
return std::optional<std::pair<const MachineOperand *, SrcStatus>>(
{Op, Stat});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of these cases should need handling

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @arsenm , if I'm not wrong the input and output of the block may have psychical regs. And some time the input and output can be the same physical regs. this will cause infinite loop...

case AMDGPU::G_CONSTANT:
case AMDGPU::G_FCONSTANT:
case AMDGPU::COPY:
return retOpStat(&MI->getOperand(1), Curr.second, Curr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't try to treat the leaf constants similar to a copy, treat them specially and directly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @arsenm , do you mean like this:

  switch (Opc) {
  case AMDGPU::G_CONSTANT:
  case AMDGPU::G_FCONSTANT:
    return retOpStat(&MI->getOperand(1), Curr.second, Curr);
  case AMDGPU::G_BITCAST:
  case AMDGPU::COPY:
    return retOpStat(&MI->getOperand(1), Curr.second, Curr);
  case AMDGPU::G_FNEG:
    return retOpStat(&MI->getOperand(1),
                     getNegStatus(Curr.first, Curr.second, MRI), Curr);
  default:
    break;
  }

@Artem-B Artem-B removed their request for review April 21, 2025 21:02
@Shoreshen Shoreshen requested a review from Copilot April 22, 2025 03:20
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses build errors encountered in AMDGPU's VOP3P instruction selection by updating function signatures and removing structured bindings.

  • Updated selectVOP3PModsImpl to return a pair containing a pointer instead of a Register.
  • Introduced selectVOP3PRetHelper to eliminate structured bindings and accommodate C++20 extension issues.
Files not reviewed (4)
  • llvm/test/CodeGen/AMDGPU/GlobalISel/fmul.v2f16.ll: Language not supported
  • llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.sdot2.ll: Language not supported
  • llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.udot2.ll: Language not supported
  • llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll: Language not supported
Comments suppressed due to low confidence (2)

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h:189

  • [nitpick] The updated signature of selectVOP3PModsImpl now uses a pointer instead of a Register; consider adding inline documentation to clarify this design decision and ensure consistency with its usage in the implementation.
std::pair<const MachineOperand *, unsigned> selectVOP3PModsImpl(const MachineOperand *Op, const MachineRegisterInfo &MRI,

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h:193

  • [nitpick] The new selectVOP3PRetHelper function is introduced to avoid structured bindings; consider providing a comment or documentation noting its relationship to selectVOP3PModsImpl to guide future maintenance.
InstructionSelector::ComplexRendererFns selectVOP3PRetHelper(MachineOperand &Root, bool IsDOT = false) const;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants