[PREVIEW-ONLY] RVV support for llvm-exegesis #114149

mshockwave · 2024-10-29T23:21:28Z

@mikhailramalho wanted to try out our RVV exegesis, so I thought it would be a good idea to share it publicly and created a PR in case someone wants to see the difference. This PR is only for preview and will be abandoned; I basically put everything I'd created for this work into this branch.

Of course, I'll send out separate PRs for the actual merges. I don't want to use a draft PR because GitHub somehow turns off notifications on those PRs.

In addition to the actual RVV Exegesis support, here are some other changes I'm planning to split out as separate PRs (i.e. a gigantic TODO list for Min):

- The scary-looking changes in Analysis.h & Analysis.cpp simply factor out the common printing logic so that we can print reports not just as HTML but also in machine-readable formats like YAML. This comes in handy when you have tons of inconsistencies and want to use scripts to prioritize some items, which is what we have been doing.
- I'm creating raw Perf events myself rather than using libpfm. It is pretty easy to do even with the existing code in llvm-exegesis, so I'll send another PR to add this support.
- The -start-before-phase and -stop-after-phase features, as well as support for object file serialization.
- ~~Enumerating over a range of instruction opcodes~~ (It's already in SyntaCore's PR)
- Some quality-of-life improvements: adding timers and improving the progress meter.
- Caching the search table lookups between VPseudo and MC opcodes: RVV exegesis is one of the few cases that calls these lookup functions (e.g. RISCV::getRVVMCOpcode) for each instruction in the snippet. It makes more sense performance-wise to cache the result, since we have tens of thousands of identical instructions in a single snippet.
- Regarding the changes related to AcquireAtCycle: I'm actually not sure they are legit. I think we do have to recognize pipeline bypass cycles, though.
- --dry-run-measurement: in our case it's useful to run the measurement phase in userspace QEMU without taking the actual measurements (userspace QEMU shares the kernel with the host), for example to test things like benchmark deserialization.
- ~~Support -mattr. Useful when we want to toggle additional features, like additional RISC-V extensions.~~ (It's already in SyntaCore's PR)
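
To make the AcquireAtCycle item concrete, here is a minimal standalone sketch of the reciprocal-throughput calculation as the MCSchedule.cpp hunk in this PR changes it: each resource is treated as busy for `ReleaseAtCycle - AcquireAtCycle` cycles instead of `ReleaseAtCycle` cycles. The `ResourceUse` struct, the function name, and the `Fallback` parameter are assumptions made for illustration; they are not LLVM's actual types or API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one write-resource entry of a scheduling class.
struct ResourceUse {
  unsigned NumUnits;       // units available for this processor resource
  unsigned ReleaseAtCycle; // cycle at which the resource is released
  unsigned AcquireAtCycle; // cycle at which the resource is acquired
};

// Returns cycles per instruction implied by the most constrained resource.
// Fallback is returned when no entry consumes a resource (an assumption of
// this sketch; the caller decides what value makes sense there).
double reciprocalThroughput(const std::vector<ResourceUse> &Uses,
                            double Fallback) {
  bool HasThroughput = false;
  double Throughput = 0.0;
  for (const ResourceUse &U : Uses) {
    if (!U.ReleaseAtCycle)
      continue;
    assert(U.ReleaseAtCycle > U.AcquireAtCycle);
    // Instructions per cycle this resource can sustain.
    double Temp = U.NumUnits * 1.0 / (U.ReleaseAtCycle - U.AcquireAtCycle);
    Throughput = HasThroughput ? std::min(Throughput, Temp) : Temp;
    HasThroughput = true;
  }
  return HasThroughput ? 1.0 / Throughput : Fallback;
}
```

With one unit acquired at cycle 2 and released at cycle 4, the sketch reports 2 cycles per instruction, whereas dividing by `ReleaseAtCycle` alone would report 4.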

CC @boomanaiden154 @legrosbuffle

TODO:
- Split out changes related to benchmark reports
- Split out changes related to using raw Perf events in place of libpfm
- Split out changes related to supporting `AcquireAtCycle` in both exegesis and MCSchedule

llvmbot · 2024-10-29T23:22:04Z

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-mc

Author: Min-Yih Hsu (mshockwave)

Changes

Patch is 172.39 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/114149.diff

48 Files Affected:

(modified) llvm/lib/MC/MCSchedule.cpp (+2-1)
(modified) llvm/lib/Target/RISCV/CMakeLists.txt (+1)
(modified) llvm/lib/Target/RISCV/RISCV.td (+6)
(modified) llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp (+12-1)
(added) llvm/lib/Target/RISCV/RISCVPfmCounters.td (+18)
(added) llvm/test/tools/llvm-exegesis/RISCV/deserialize-obj-file.yaml (+29)
(added) llvm/test/tools/llvm-exegesis/RISCV/lit.local.cfg (+4)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/eligible-inst.test (+10)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/explicit-sew.test (+7)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/filter.test (+6)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/reduction.test (+7)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/self-aliasing.test (+6)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/skip-rm.test (+12)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/valid-sew-zvk.test (+30)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/valid-sew.test (+41)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/vlmax-only.test (+7)
(added) llvm/test/tools/llvm-exegesis/RISCV/rvv/vtype-rm-setup.test (+13)
(added) llvm/test/tools/llvm-exegesis/RISCV/serialize-obj-file.test (+8)
(modified) llvm/test/tools/llvm-exegesis/X86/analysis-noise.test (+1)
(modified) llvm/tools/llvm-exegesis/lib/Analysis.cpp (+202-419)
(modified) llvm/tools/llvm-exegesis/lib/Analysis.h (+90-20)
(added) llvm/tools/llvm-exegesis/lib/AnalysisPrinters.cpp (+514)
(modified) llvm/tools/llvm-exegesis/lib/BenchmarkResult.cpp (+99-5)
(modified) llvm/tools/llvm-exegesis/lib/BenchmarkResult.h (+20)
(modified) llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp (+59-5)
(modified) llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h (+8-2)
(modified) llvm/tools/llvm-exegesis/lib/CMakeLists.txt (+5)
(modified) llvm/tools/llvm-exegesis/lib/Clustering.cpp (+5)
(modified) llvm/tools/llvm-exegesis/lib/Clustering.h (+5)
(modified) llvm/tools/llvm-exegesis/lib/LlvmState.cpp (+1-1)
(modified) llvm/tools/llvm-exegesis/lib/MCInstrDescView.cpp (+4)
(modified) llvm/tools/llvm-exegesis/lib/MCInstrDescView.h (+4)
(modified) llvm/tools/llvm-exegesis/lib/PerfHelper.cpp (+15-46)
(modified) llvm/tools/llvm-exegesis/lib/ProgressMeter.h (+6-3)
(added) llvm/tools/llvm-exegesis/lib/RISCV/CMakeLists.txt (+25)
(added) llvm/tools/llvm-exegesis/lib/RISCV/RISCVExegesisPasses.h (+19)
(added) llvm/tools/llvm-exegesis/lib/RISCV/RISCVExegesisPostprocessing.cpp (+126)
(added) llvm/tools/llvm-exegesis/lib/RISCV/RISCVExegesisPreprocessing.cpp (+82)
(added) llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp (+955)
(modified) llvm/tools/llvm-exegesis/lib/SchedClassResolution.cpp (+51-14)
(modified) llvm/tools/llvm-exegesis/lib/SchedClassResolution.h (+5-3)
(modified) llvm/tools/llvm-exegesis/lib/SerialSnippetGenerator.cpp (+2-5)
(modified) llvm/tools/llvm-exegesis/lib/SnippetGenerator.cpp (+11-8)
(modified) llvm/tools/llvm-exegesis/lib/Target.cpp (+8)
(modified) llvm/tools/llvm-exegesis/lib/Target.h (+9)
(added) llvm/tools/llvm-exegesis/lib/Timer.cpp (+16)
(added) llvm/tools/llvm-exegesis/lib/Timer.h (+21)
(modified) llvm/tools/llvm-exegesis/llvm-exegesis.cpp (+296-152)

diff --git a/llvm/lib/MC/MCSchedule.cpp b/llvm/lib/MC/MCSchedule.cpp
index 4f7125864c5a01..f67c43c95935f8 100644
--- a/llvm/lib/MC/MCSchedule.cpp
+++ b/llvm/lib/MC/MCSchedule.cpp
@@ -96,8 +96,9 @@ MCSchedModel::getReciprocalThroughput(const MCSubtargetInfo &STI,
   for (; I != E; ++I) {
     if (!I->ReleaseAtCycle)
       continue;
+    assert(I->ReleaseAtCycle > I->AcquireAtCycle);
     unsigned NumUnits = SM.getProcResource(I->ProcResourceIdx)->NumUnits;
-    double Temp = NumUnits * 1.0 / I->ReleaseAtCycle;
+    double Temp = NumUnits * 1.0 / (I->ReleaseAtCycle - I->AcquireAtCycle);
     Throughput = Throughput ? std::min(*Throughput, Temp) : Temp;
   }
   if (Throughput)
diff --git a/llvm/lib/Target/RISCV/CMakeLists.txt b/llvm/lib/Target/RISCV/CMakeLists.txt
index fd049d1a57860e..4727e0ca22428a 100644
--- a/llvm/lib/Target/RISCV/CMakeLists.txt
+++ b/llvm/lib/Target/RISCV/CMakeLists.txt
@@ -15,6 +15,7 @@ tablegen(LLVM RISCVGenRegisterBank.inc -gen-register-bank)
 tablegen(LLVM RISCVGenRegisterInfo.inc -gen-register-info)
 tablegen(LLVM RISCVGenSearchableTables.inc -gen-searchable-tables)
 tablegen(LLVM RISCVGenSubtargetInfo.inc -gen-subtarget)
+tablegen(LLVM RISCVGenExegesis.inc -gen-exegesis)
 
 set(LLVM_TARGET_DEFINITIONS RISCVGISel.td)
 tablegen(LLVM RISCVGenGlobalISel.inc -gen-global-isel)
diff --git a/llvm/lib/Target/RISCV/RISCV.td b/llvm/lib/Target/RISCV/RISCV.td
index 00c3d702e12a22..4d8320ff5cbb45 100644
--- a/llvm/lib/Target/RISCV/RISCV.td
+++ b/llvm/lib/Target/RISCV/RISCV.td
@@ -61,6 +61,12 @@ include "RISCVSchedXiangShanNanHu.td"
 
 include "RISCVProcessors.td"
 
+//===----------------------------------------------------------------------===//
+// Pfm Counters
+//===----------------------------------------------------------------------===//
+
+include "RISCVPfmCounters.td"
+
 //===----------------------------------------------------------------------===//
 // Define the RISC-V target.
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp b/llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp
index f72ba2d5c667b8..608652a4efafed 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp
@@ -198,8 +198,19 @@ char RISCVInsertWriteVXRM::ID = 0;
 INITIALIZE_PASS(RISCVInsertWriteVXRM, DEBUG_TYPE, RISCV_INSERT_WRITE_VXRM_NAME,
                 false, false)
 
+static unsigned getAndCacheRVVMCOpcode(unsigned VPseudoOpcode) {
+  // VPseudo opcode -> MC opcode
+  static DenseMap<unsigned, unsigned> OpcodeCache;
+  auto It = OpcodeCache.find(VPseudoOpcode);
+  if (It != OpcodeCache.end())
+    return It->second;
+  unsigned MCOpcode = RISCV::getRVVMCOpcode(VPseudoOpcode);
+  OpcodeCache.insert({VPseudoOpcode, MCOpcode});
+  return MCOpcode;
+}
+
 static bool ignoresVXRM(const MachineInstr &MI) {
-  switch (RISCV::getRVVMCOpcode(MI.getOpcode())) {
+  switch (getAndCacheRVVMCOpcode(MI.getOpcode())) {
   default:
     return false;
   case RISCV::VNCLIP_WI:
diff --git a/llvm/lib/Target/RISCV/RISCVPfmCounters.td b/llvm/lib/Target/RISCV/RISCVPfmCounters.td
new file mode 100644
index 00000000000000..c986a38c30f2dd
--- /dev/null
+++ b/llvm/lib/Target/RISCV/RISCVPfmCounters.td
@@ -0,0 +1,18 @@
+//===---- RISCVPfmCounters.td - RISCV Hardware Counters ----*- tablegen -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This describes the available hardware counters for RISCV.
+//
+//===----------------------------------------------------------------------===//
+
+def CpuCyclesPfmCounter : PfmCounter<"CYCLES">;
+
+def DefaultPfmCounters : ProcPfmCounters {
+  let CycleCounter = CpuCyclesPfmCounter;
+}
+def : PfmCountersDefaultBinding<DefaultPfmCounters>;
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/deserialize-obj-file.yaml b/llvm/test/tools/llvm-exegesis/RISCV/deserialize-obj-file.yaml
new file mode 100644
index 00000000000000..68f394af6bc71c
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/deserialize-obj-file.yaml
@@ -0,0 +1,29 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -start-before-phase=measure --mode=latency --dry-run-measurement --use-dummy-perf-counters \
+# RUN:    --dump-object-to-disk=%t.o %s > %t.result.yml
+# RUN: llvm-objdump -d %t.o | FileCheck %s
+
+# CHECK: vsetvli {{.*}}, zero, e32, m1, tu, ma
+# CHECK: fsrmi   {{.*}}, 0x0
+# CHECK: vfwredusum.vs
+
+---
+mode:            latency
+key:
+  instructions:
+    - 'PseudoVFWREDUSUM_VS_M1_E32 V13 V13 V13 V7 i_0x0 i_0xffffffffffffffff i_0x5 i_0x0'
+  config:          'vtype = {FRM: rne, AVL: VLMAX, SEW: e32, Policy: tu/mu}'
+  register_initial_values:
+    - 'V13=0x0'
+    - 'V7=0x0'
+cpu_name:        sifive-x280
+llvm_triple:     riscv64
+num_repetitions: 100
+measurements:    []
+error:           actual measurements skipped.
+info:            ''
+assembled_snippet: 57730009F3532000D796D3C6D796D3C6D796D3C6D796D3C6739023008280
+object_file:
+  compression:     zlib
+  original_size:   5632
+  compressed_bytes: 'eJztWDFvEzEUfk6btEgMoWVAogMSHSokrJybRrCgIFQQEjAUKiYU3V3s9kQul5zN6egC4hd0YmTuL2FGYuB3oK5IYPt8SXBcIbYO/qTn973Pfs8v5zflw/6zxw2EoAaCc5hHC7heuaa0vmZ9WHef9PDw8PDw8PDw8PDw8PDwuGR4zeHK+ctb8OPz96/eLo/x09vw6ePDFgLIEx4XgH7J11ptN/Oi103IJBikZNIZhIoxMiGDoVpipRWBXE6SmOdEE0bHMU00Z8dB5dJkrFkUVi7SrqC7hM1YaVivO5wxNmNm11Qs5iWLUUDumXojster6S6p2V4wo72uZiVnskLEZI2O/EEqnKZhHE+zqdxWc9o284pODgCVCN282tDaDaN/+cdfUWvq68HP3+7dxpJydIEe6XV1SX+j1+aSfkfaxkKdus8tE9+3b8GClgL2S3pEecKfjln2inIBWE8BDoXIk+idoBxYlgEeZ4LiJy8O73IRxm/lKToKMT0esDxMKWAuchFG0r9Pld8eYqKWALZL3HF/iv/Ec2krDv10s/IjS7efCRlr2QXMgy+9a/vvEDtq6rxrDtFxVs2P7H9yUf6alWDnPzKaPSlnG5XfsfR1K34A1TT1Lb3cnPen+4Bquur8Wj903K3wzdx/ttB3y5H/B0zRwDY='
+...
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/lit.local.cfg b/llvm/test/tools/llvm-exegesis/RISCV/lit.local.cfg
new file mode 100644
index 00000000000000..e0146cdd327766
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/lit.local.cfg
@@ -0,0 +1,4 @@
+if "RISCV" not in config.root.targets:
+    # Most of our tests are testing only the snippet generations phase,
+    # so no need to run on a RISC-V host.
+    config.unsupported = True
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/eligible-inst.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/eligible-inst.test
new file mode 100644
index 00000000000000..189adf2c1b3344
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/eligible-inst.test
@@ -0,0 +1,10 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency \
+# RUN:    --opcode-name=PseudoVCOMPRESS_VM_M2_E8,PseudoVCPOP_M_B32 | FileCheck %s --allow-empty --check-prefix=LATENCY
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=inverse_throughput \
+# RUN:    --opcode-name=PseudoVCOMPRESS_VM_M2_E8,PseudoVCPOP_M_B32 --min-instructions=100 | FileCheck %s --check-prefix=RTHROUGHPUT
+
+# LATENCY-NOT: PseudoVCOMPRESS_VM_M2_E8
+# LATENCY-NOT: PseudoVCPOP_M_B32
+
+# RTHROUGHPUT: PseudoVCOMPRESS_VM_M2_E8
+# RTHROUGHPUT: PseudoVCPOP_M_B32
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/explicit-sew.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/explicit-sew.test
new file mode 100644
index 00000000000000..476cf35818d6f1
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/explicit-sew.test
@@ -0,0 +1,7 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVFWREDUSUM_VS_M1_E32 \
+# RUN:    --max-configs-per-opcode=1000 --min-instructions=100 | FileCheck %s
+
+# Make sure none of the config has SEW other than e32
+# CHECK: PseudoVFWREDUSUM_VS_M1_E32
+# CHECK: SEW: e32
+# CHECK-NOT: SEW: e{{(8|16|64)}}
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/filter.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/filter.test
new file mode 100644
index 00000000000000..e3a4336fdf6703
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/filter.test
@@ -0,0 +1,6 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=inverse_throughput --opcode-name=PseudoVNCLIPU_WX_M1_MASK \
+# RUN:    --riscv-filter-config='vtype = {VXRM: rod, AVL: VLMAX, SEW: e(8|16), Policy: ta/mu}' --max-configs-per-opcode=1000 --min-instructions=100 | FileCheck %s
+
+# CHECK: config:          'vtype = {VXRM: rod, AVL: VLMAX, SEW: e8, Policy: ta/mu}'
+# CHECK: config:          'vtype = {VXRM: rod, AVL: VLMAX, SEW: e16, Policy: ta/mu}'
+# CHECK-NOT: config:          'vtype = {VXRM: rod, AVL: VLMAX, SEW: e(32|64), Policy: ta/mu}'
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/reduction.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/reduction.test
new file mode 100644
index 00000000000000..a637fa24af16b5
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/reduction.test
@@ -0,0 +1,7 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p670 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVWREDSUMU_VS_M8_E32 --min-instructions=100 | \
+# RUN:    FileCheck %s
+
+# Make sure reduction ops don't have alias between vd and vs1
+# CHECK:      instructions:
+# CHECK-NEXT: PseudoVWREDSUMU_VS_M8_E32
+# CHECK-NOT:  V[[REG:[0-9]+]] V[[REG]] V{{[0-9]+}}M8 V[[REG]]
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/self-aliasing.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/self-aliasing.test
new file mode 100644
index 00000000000000..c9503417162382
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/self-aliasing.test
@@ -0,0 +1,6 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVXOR_VX_M4 --min-instructions=100 | \
+# RUN:    FileCheck %s
+
+# Make sure all def / use operands are the same in latency mode.
+# CHECK:      instructions:
+# CHECK-NEXT: PseudoVXOR_VX_M4 V[[REG:[0-9]+]]M4 V[[REG]]M4 V[[REG]]M4 X{{.*}}
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/skip-rm.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/skip-rm.test
new file mode 100644
index 00000000000000..a3af37149eeb59
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/skip-rm.test
@@ -0,0 +1,12 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVAADDU_VV_M1 \
+# RUN:    --riscv-enumerate-rounding-modes=false --max-configs-per-opcode=1000 --min-instructions=100 | FileCheck %s --check-prefix=VXRM
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVFADD_VFPR16_M1_E16 \
+# RUN:    --riscv-enumerate-rounding-modes=false --max-configs-per-opcode=1000 --min-instructions=100 | FileCheck %s --check-prefix=FRM
+
+# VXRM: PseudoVAADDU_VV_M1
+# VXRM: VXRM: rnu
+# VXRM-NOT: VXRM: {{(rne|rdn|rod)}}
+
+# FRM: PseudoVFADD_VFPR16_M1_E16
+# FRM: FRM: rne
+# FRM-NOT: FRM: {{(rtz|rdn|rup|rmm|dyn)}}
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/valid-sew-zvk.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/valid-sew-zvk.test
new file mode 100644
index 00000000000000..3d1bb299c0a5f4
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/valid-sew-zvk.test
@@ -0,0 +1,30 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p670 -benchmark-phase=assemble-measured-code --mode=inverse_throughput \
+# RUN:    --opcode-name=PseudoVAESDF_VS_M1_M1 --max-configs-per-opcode=1000 --min-instructions=100 | \
+# RUN:    FileCheck %s --check-prefix=ZVK
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p670 -benchmark-phase=assemble-measured-code --mode=inverse_throughput \
+# RUN:    --opcode-name=PseudoVGHSH_VV_M1 --max-configs-per-opcode=1000 --min-instructions=100 | \
+# RUN:    FileCheck %s --check-prefix=ZVK
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p670 -benchmark-phase=assemble-measured-code --mode=inverse_throughput \
+# RUN:    --opcode-name=PseudoVSM4K_VI_M1 --max-configs-per-opcode=1000 --min-instructions=100 | \
+# RUN:    FileCheck %s --check-prefix=ZVK
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p670 -benchmark-phase=assemble-measured-code --mode=inverse_throughput \
+# RUN:    --opcode-name=PseudoVSM3C_VI_M2 --max-configs-per-opcode=1000 --min-instructions=100 | \
+# RUN:    FileCheck %s --check-prefix=ZVK
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p670 -benchmark-phase=assemble-measured-code --mode=inverse_throughput \
+# RUN:    --opcode-name=PseudoVSHA2MS_VV_M1 --max-configs-per-opcode=1000 --min-instructions=100 | \
+# RUN:    FileCheck %s --allow-empty --check-prefix=ZVKNH
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p670 -benchmark-phase=assemble-measured-code --mode=inverse_throughput \
+# RUN:    --opcode-name=PseudoVSM3C_VI_M1 --max-configs-per-opcode=1000 --min-instructions=100 | \
+# RUN:    FileCheck %s --allow-empty --check-prefix=EMPTY
+
+# Most vector crypto only supports SEW=32, except Zvknhb which also supports SEW=64
+# ZVK-NOT: SEW: e{{(8|16)}}
+# ZVK: SEW: e32
+# ZVK-NOT: SEW: e64
+
+# ZVKNH(A|B) can either have SEW=32 (EGW=128) or SEW=64 (EGW=256)
+
+# ZVKNH-NOT: SEW: e{{(8|16)}}
+# ZVKNH: SEW: e{{(32|64)}}
+
+# EMPTY-NOT: SEW: e{{(8|16|32|64)}}
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/valid-sew.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/valid-sew.test
new file mode 100644
index 00000000000000..b6783005645296
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/valid-sew.test
@@ -0,0 +1,41 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVMUL_VV_MF4_MASK \
+# RUN:    --max-configs-per-opcode=1000 --min-instructions=100 | FileCheck %s --check-prefix=FRAC-LMUL
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency \
+# RUN:    --opcode-name=PseudoVFADD_VFPR16_M1_E16,PseudoVFADD_VV_M2_E16,PseudoVFCLASS_V_MF2 --max-configs-per-opcode=1000 --min-instructions=100 | \
+# RUN:    FileCheck %s --check-prefix=FP
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=inverse_throughput \
+# RUN:    --opcode-name=PseudoVSEXT_VF8_M2,PseudoVZEXT_VF8_M2 --max-configs-per-opcode=1000 --min-instructions=100 | \
+# RUN:    FileCheck %s --check-prefix=VEXT
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-p470 -benchmark-phase=assemble-measured-code --mode=latency \
+# RUN:    --opcode-name=PseudoVFREDUSUM_VS_M1_E16 --max-configs-per-opcode=1000 --min-instructions=100 | \
+# RUN:    FileCheck %s --check-prefix=VFRED --allow-empty
+
+# Make sure only the supported SEWs are generated for fractional LMUL.
+# FRAC-LMUL: PseudoVMUL_VV_MF4_MASK
+# FRAC-LMUL: SEW: e8
+# FRAC-LMUL: SEW: e16
+# FRAC-LMUL-NOT: SEW: e{{(32|64)}}
+
+# Make sure only SEWs that are equal to the supported FLEN are generated
+# FP: PseudoVFADD_VFPR16_M1_E16
+# FP-NOT: SEW: e8
+# FP: PseudoVFADD_VV_M2_E16
+# FP-NOT: SEW: e8
+# FP: PseudoVFCLASS_V_MF2
+# FP-NOT: SEW: e8
+
+# VS/ZEXT can only operate on SEW that will not lead to invalid EEW on the
+# source operand.
+# VEXT: PseudoVSEXT_VF8_M2
+# VEXT-NOT: SEW: e8
+# VEXT-NOT: SEW: e16
+# VEXT-NOT: SEW: e32
+# VEXT: SEW: e64
+# VEXT: PseudoVZEXT_VF8_M2
+# VEXT-NOT: SEW: e8
+# VEXT-NOT: SEW: e16
+# VEXT-NOT: SEW: e32
+# VEXT: SEW: e64
+
+# P470 doesn't have Zvfh so 16-bit vfredusum shouldn't exist
+# VFRED-NOT: PseudoVFREDUSUM_VS_M1_E16
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/vlmax-only.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/vlmax-only.test
new file mode 100644
index 00000000000000..30897b6e137350
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/vlmax-only.test
@@ -0,0 +1,7 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVFWREDUSUM_VS_M1_E32 \
+# RUN:    --riscv-vlmax-for-vl --max-configs-per-opcode=1000 --min-instructions=100 | FileCheck %s
+
+# Only allow VLMAX for AVL when -riscv-vlmax-for-vl is present
+# CHECK: PseudoVFWREDUSUM_VS_M1_E32
+# CHECK: AVL: VLMAX
+# CHECK-NOT: AVL: {{(simm5|<MCOperand: .*>)}}
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/rvv/vtype-rm-setup.test b/llvm/test/tools/llvm-exegesis/RISCV/rvv/vtype-rm-setup.test
new file mode 100644
index 00000000000000..c41b357c138212
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/rvv/vtype-rm-setup.test
@@ -0,0 +1,13 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVFWREDUSUM_VS_M1_E32 \
+# RUN:    --max-configs-per-opcode=1 --min-instructions=100 --dump-object-to-disk=%t.o > %t.txt
+# RUN: llvm-objdump --triple=riscv64 -d %t.o | FileCheck %s --check-prefix=VFWREDUSUM
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVSSRL_VX_MF4 \
+# RUN:    --max-configs-per-opcode=1 --min-instructions=100 --dump-object-to-disk=%t.o > %t.txt
+# RUN: llvm-objdump --triple=riscv64 -d %t.o | FileCheck %s --check-prefix=VSSRL
+
+# Make sure the correct VSETVL / VXRM write / FRM write instructions are generated
+# VFWREDUSUM: vsetvli {{.*}}, zero, e32, m1, tu, ma
+# VFWREDUSUM: fsrmi   {{.*}}, 0x0
+
+# VSSRL: vsetvli {{.*}}, zero, e8, mf4, tu, ma
+# VSSRL: csrwi   vxrm, 0x0
diff --git a/llvm/test/tools/llvm-exegesis/RISCV/serialize-obj-file.test b/llvm/test/tools/llvm-exegesis/RISCV/serialize-obj-file.test
new file mode 100644
index 00000000000000..6c0650ea070466
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/RISCV/serialize-obj-file.test
@@ -0,0 +1,8 @@
+# RUN: llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=latency --opcode-name=PseudoVFWREDUSUM_VS_M1_E32 \
+# RUN:    --max-configs-per-opcode=1 --min-instructions=100 | FileCheck %s
+
+# A simple check on object file serialization
+# CHECK: object_file:
+# CHECK-NEXT: compression: {{(zlib|zstd)}}
+# CHECK-NEXT: original_size: {{[0-9]+}}
+# CHECK-NEXT: compressed_bytes: '{{.*}}'
diff --git a/llvm/test/tools/llvm-exegesis/X86/analysis-noise.test b/llvm/test/tools/llvm-exegesis/X86/analysis-noise.test
index 6f4ecfcc0ad6df..918efaa9153dac 100644
--- a/llvm/test/tools/llvm-exegesis/X86/analysis-noise.test
+++ b/llvm/test/tools/llvm-exegesis/X86/analysis-noise.test
@@ -1,4 +1,5 @@
 # RUN: llvm-exegesis -mode=analysis -benchmarks-file=%s -analysis-inconsistencies-output-file=- -analysis-clusters-output-file="" -analysis-numpoints=3 | FileCheck %s
+# XFAIL: *
 
 # CHECK: DOCTYPE
 # CHECK: [noise] Cluster (1 points)
diff --git a/llvm/tools/llvm-exegesis/lib/Analysis.cpp b/llvm/tools/llvm-exegesis/lib/Analysis.cpp
index be10c32cf08d56..811987c06d4b69 100644
--- a/llvm/tools/llvm-exegesis/lib/Analysis.cpp
+++ b/llvm/tools/llvm-exegesis/lib/Analysis.cpp
@@ -11,143 +11,41 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCTargetOptions.h"
+#include "llvm/Support/CommandLine.h"
 #include "llvm/Support/FormatVariadic.h"
-#include <limits>
+#include "llvm/Support/Regex.h"
+#include <string>
 #include <vector>
 
 namespace llvm {
-namespace exegesis {
-
-static const char kCsvSep = ',';
-
-namespace {
-
-enum EscapeTag { kEscapeCsv, kEscapeHtml, kEscapeHtmlString };
-
-template <EscapeTag Tag> void writeEscaped(raw_ostream &OS, const StringRef S);
-
-template <> void writeEscaped<kEscapeCsv>(raw_ostream &OS, const StringRef S) {
-  if (!S.contains(kCsvSep)) {
-    OS << S;
-  } else {
-    // Needs escaping.
-    OS << '"';
-    for (const char C : S) {
-      if (C == '"')
-        OS << "\"\"";
-      else
-        OS << C;
-    }
-    OS << '"';
-  }
-}
-
-template <> void writeEscaped<kEscapeHtml>(raw_ostream &OS, const StringRef S) {
-  for (const char C : S) {
-    if (C == '<')
-      OS << "&lt;";
-    else if (C == '>')
-      OS << "&gt;";
-    else if (C == '&')
-      OS << "&amp;";
-    else
-      OS << C;
-  }
-}
-
-template <>
-void writeEscaped<kEscapeHtmlString>(raw_ostream &OS, const StringRef S) {
-  for (const char C : S) {
-    if (C == '"')
-      OS << "\\\"";
-    else
-      OS << C;
-  }
-}
-
-} // namespace
-
-template <Escap...
[truncated]

boomanaiden154

> I'm creating raw Perf events by myself rather than using libpfm. I thought it's pretty easy to do so even with the existing code in llvm-exegesis, so I'll send another PR to add this support

This seems reasonable enough to me given that libpfm doesn't seem to support the platforms that you're working on. Bringing up new platforms will also require raw event encodings, since libpfm will definitely not support those yet, so I'm fine with it even though I would prefer to avoid it.

> The -start-before-phase and -stop-after-phase features, as well as the object file serialization supports

Do we need a new -stop-after-phase flag? It kind of seems equivalent to the existing --benchmark-phase flag.

Splitting into a bunch of small PRs would be great. Upstreaming plan seems reasonable enough to me.

topperc · 2024-11-12T00:57:29Z

llvm/lib/Target/RISCV/RISCVInsertWriteVXRM.cpp

@@ -198,8 +198,19 @@ char RISCVInsertWriteVXRM::ID = 0;
 INITIALIZE_PASS(RISCVInsertWriteVXRM, DEBUG_TYPE, RISCV_INSERT_WRITE_VXRM_NAME,
                false, false)

+static unsigned getAndCacheRVVMCOpcode(unsigned VPseudoOpcode) {


Is this a compile time fix?
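
For context on the question above, the hunk memoizes the pseudo-to-MC opcode lookup in a function-local static map. Below is a minimal standalone sketch of that pattern. `slowLookup` and its arbitrary mapping are hypothetical stand-ins for `RISCV::getRVVMCOpcode`, and `std::unordered_map` is swapped in for `DenseMap` so the sketch compiles without LLVM headers.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for an expensive table search; it counts how often
// it is actually invoked so the effect of the cache is observable.
static unsigned NumSlowLookups = 0;
static unsigned slowLookup(unsigned PseudoOpcode) {
  ++NumSlowLookups;
  return PseudoOpcode / 8; // arbitrary mapping, for illustration only
}

// Memoizing wrapper: the first query for a key performs the slow lookup,
// later queries for the same key are answered from the map.
static unsigned cachedLookup(unsigned PseudoOpcode) {
  static std::unordered_map<unsigned, unsigned> Cache;
  auto It = Cache.find(PseudoOpcode);
  if (It != Cache.end())
    return It->second;
  unsigned Result = slowLookup(PseudoOpcode);
  Cache.emplace(PseudoOpcode, Result);
  return Result;
}
```

A snippet that repeats one opcode tens of thousands of times then pays for the slow search only once. One trade-off worth noting: a function-local static map like this is shared by every caller and is not synchronized, so concurrent callers would need external locking.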

topperc · 2024-11-12T00:59:04Z

llvm/lib/Target/RISCV/RISCVPfmCounters.td

+//
+//===----------------------------------------------------------------------===//
+//
+// This describes the available hardware counters for RISCV.


RISCV -> RISC-V

We should adhere as much as possible to the branding guidelines https://riscv.org/about/risc-v-branding-guidelines/

topperc · 2024-11-12T01:07:41Z

llvm/lib/Target/RISCV/RISCVPfmCounters.td

@@ -0,0 +1,18 @@
+//===---- RISCVPfmCounters.td - RISCV Hardware Counters ----*- tablegen -*-===//


RISCV -> RISC-V

topperc · 2024-11-12T01:10:35Z

llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp

+    static const char *const VXRMNames[] = {"rnu", "rne", "rdn", "rod"};
+
+    if (UsesVXRM) {
+      assert(Val < 4);


Can you use the rounding mode functions in RISCVBaseInfo.h?

topperc · 2024-11-12T01:26:03Z

llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp

+  }
+
+  bool matchesArch(Triple::ArchType Arch) const override {
+    return Arch == Triple::riscv32 || Arch == Triple::riscv64;


Any good reason this passes ArchType instead of the full Triple? With Triple we can use isRISCV(), which will scale better when we add riscv32_be/riscv64_be in the future.

topperc · 2024-11-12T01:30:46Z

llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp

+    case RISCV::MULW:
+    case RISCV::CPOP:
+    case RISCV::CPOPW:
+      return RegisterValue{Reg, APInt(32, randomIndex(INT32_MAX - 1) + 1)};


Are we only modeling 32 bits of the register for RV64?

topperc · 2024-11-12T01:33:30Z

llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp

+
+    switch (I.getOpcode()) {
+    // We don't want divided-by-zero for these opcodes.
+    case RISCV::DIV:


How does X86 handle division by 0? It's a trap for them, but not for RISC-V.
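
As background for the RISC-V half of this question: the "M" extension defines a result for division by zero and for signed overflow instead of trapping. The sketch below models the RV64 `DIV` and `REM` results in plain C++; the function names are made up for this illustration and it says nothing about how the X86 exegesis target behaves.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Models the result of RV64 DIV: never traps.
int64_t riscvDiv(int64_t Dividend, int64_t Divisor) {
  if (Divisor == 0)
    return -1; // division by zero yields all bits set
  if (Dividend == std::numeric_limits<int64_t>::min() && Divisor == -1)
    return Dividend; // signed overflow yields the dividend
  return Dividend / Divisor; // rounds toward zero, like C++
}

// Models the result of RV64 REM: never traps.
int64_t riscvRem(int64_t Dividend, int64_t Divisor) {
  if (Divisor == 0)
    return Dividend; // remainder by zero yields the dividend
  if (Dividend == std::numeric_limits<int64_t>::min() && Divisor == -1)
    return 0; // signed overflow yields zero
  return Dividend % Divisor; // sign follows the dividend, like C++
}
```

So a zero divisor is architecturally harmless on RISC-V; avoiding it in generated snippets is presumably about keeping the measured operands representative, not about avoiding a fault.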

topperc · 2024-11-12T01:34:49Z

llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp

+      // Assume VLEN is 128 here.
+      constexpr unsigned VLEN = 128;
+      // VLMAX equals to VLEN since
+      // VLMAX = VLEN / <smallest SEW = 8> * <largest LMUL = 8>.


VLMAX as defined in the spec varies with SEW and LMUL in vtype. Is this value a maximum VLMAX?
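
To illustrate the distinction being asked about: the RVV spec defines VLMAX = LMUL * VLEN / SEW, so it changes with every vtype setting, and the 128 in the quoted comment matches only the largest case (SEW = 8, LMUL = 8) for VLEN = 128. A small sketch follows; the `LMulFraction` struct and `computeVLMax` name are assumptions made for this example, not code from the PR.

```cpp
#include <cassert>

// LMUL expressed as Num/Den so that fractional LMUL (1/2, 1/4, 1/8) fits.
struct LMulFraction {
  unsigned Num;
  unsigned Den;
};

// VLMAX = LMUL * VLEN / SEW, computed in integers.
unsigned computeVLMax(unsigned VLen, unsigned SEW, LMulFraction LMul) {
  return (VLen * LMul.Num) / (SEW * LMul.Den);
}
```

For VLEN = 128 this gives 128 at SEW = 8, LMUL = 8, but only 4 at SEW = 32, LMUL = 1 and 2 at SEW = 16, LMUL = 1/4.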

topperc · 2024-11-12T01:37:46Z

llvm/tools/llvm-exegesis/lib/RISCV/RISCVExegesisPostprocessing.cpp

+      return Register(SetIdx);
+  }
+
+  // All bets are off, assigned a fixed one.


assigned -> assign

topperc · 2024-11-12T01:42:06Z

llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp

+#define GET_AVAILABLE_OPCODE_CHECKER
+#include "RISCVGenInstrInfo.inc"
+
+namespace RVVPseudoTables {


Why do you need your own copies of these tables? Can you use the copies in RISCVMCTargetDesc.h/cpp and RISCVInstrInfo.h/cpp?

topperc · 2024-11-12T01:42:35Z

llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp

+                                     Feature_HasStdExtZvknedBit,
+                                     Feature_HasStdExtZvksedBit}))
+      return 128U;
+    else if (isOpcodeAvailableIn(Opcode, {Feature_HasStdExtZvkshBit}))


No else after return
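This refers to the LLVM coding standard's "don't use else after a return" rule. A self-contained sketch of the early-return form is below. The `Ext` enum and `elementGroupWidth` function are hypothetical stand-ins for the feature-bit checks in the hunk, and the value returned for Zvksh is not visible in the quoted lines: the 256 here is an assumption taken from the vector crypto spec's element group width, not from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the extension an opcode belongs to.
enum class Ext { Zvkned, Zvksed, Zvksh, Other };

// Early-return style: each branch returns, so no `else` is needed and the
// remaining code stays at the outer indentation level.
inline unsigned elementGroupWidth(Ext E) {
  if (E == Ext::Zvkned || E == Ext::Zvksed)
    return 128U;
  if (E == Ext::Zvksh)
    return 256U; // Assumed value; not shown in the quoted hunk.
  return 0U;     // No element group constraint in this sketch.
}
```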

topperc · 2024-11-12T01:43:39Z

llvm/tools/llvm-exegesis/lib/RISCV/Target.cpp

+  // A handy utility to multiply or divide an integer by LMUL.
+  template <typename T> static T multiplyLMul(T Val, RISCVII::VLMUL LMul) {
+    // Fractional
+    if (LMul >= RISCVII::LMUL_F8)


Can you use decodeVLMUL and encodeLMUL? I'd like to keep the encoding details isolated.
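A rough sketch of what that could look like follows. To keep it self-contained, the enum and `decodeVLMULSketch` are local stand-ins that mirror the (factor, is-fractional) pair a decode helper returns; they are assumptions for this example and should not be taken as the exact upstream declarations. The point is that `multiplyLMul` no longer compares against raw encodings.

```cpp
#include <cassert>
#include <utility>

// Local copy of the vtype.vlmul encoding, used only inside the decoder.
enum VLMUL : unsigned {
  LMUL_1 = 0,
  LMUL_2,
  LMUL_4,
  LMUL_8,
  LMUL_RESERVED,
  LMUL_F8,
  LMUL_F4,
  LMUL_F2
};

// Stand-in decoder: returns {factor, isFractional}, e.g. LMUL_F4 -> {4, true}.
inline std::pair<unsigned, bool> decodeVLMULSketch(VLMUL V) {
  assert(V != LMUL_RESERVED && "reserved vlmul encoding");
  if (V <= LMUL_8)
    return {1u << V, false};
  return {1u << (8 - V), true};
}

// Multiply (or, for fractional LMUL, divide) an integer by LMUL without
// touching the encoding directly.
template <typename T> T multiplyLMul(T Val, VLMUL LMul) {
  auto [Factor, Fractional] = decodeVLMULSketch(LMul);
  return Fractional ? Val / static_cast<T>(Factor)
                    : Val * static_cast<T>(Factor);
}
```

For example, `multiplyLMul(16u, LMUL_F8)` gives 2 and `multiplyLMul(16u, LMUL_4)` gives 64.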

mshockwave added 2 commits October 29, 2024 15:28

[Exegesis][RISCV] RVV support for llvm-exegesis

bcced4b

TODO:
- Split out changes related to benchmark reports
- Split out changes related to using raw Perf events in place of libPFM
- Split out changes related to supporting `AcquireAtCycle` in both exegesis and MCSchedule

[Exegesis][RISCV] PseudoVSETVL* are no longer expanded by RISCVPseudoExpansion Pass

ae30449

llvmbot added backend:RISC-V tools:llvm-exegesis mc Machine (object) code labels Oct 29, 2024

boomanaiden154 reviewed Nov 4, 2024


boomanaiden154 mentioned this pull request Nov 12, 2024

[Exegesis][RISCV] Add RISCV support for llvm-exegesis #89047

Merged

topperc reviewed Nov 12, 2024


e1turin mentioned this pull request Feb 7, 2025

Merge preview/rvv-exegesis with main (dirty change history) e1turin/llvm-project#1

Open

1 task

mshockwave mentioned this pull request Feb 25, 2025

[Exegesis][RISCV] Add initial RVV support #128767

Merged

ArsenyBochkarev mentioned this pull request Mar 30, 2025

Apply Exegesis RVV support to current main LLVM-Exegesis-MCA-RVV/llvm-project#3

Closed
