Reland: "[Exegesis] Add the ability to dry-run the measurement phase (#121991)" #122775

mshockwave · 2025-01-13T19:36:53Z

This relands f8f8598

Follow up on #122371:
The problem here is a little subtle: when we dry-run the measurement phase, we create an LLJIT instance without actually executing the snippets. The key is that LLJIT has its own TargetMachine, which uses the triple designated by LLVM_TARGET_ARCH (which defaults to the host). On a machine that llvm-exegesis does not support, LLJIT would fail to create its TargetMachine because llvm-exegesis doesn't even register the host's target!

Putting this test into any of the target-specific folders won't help, because the problem is about the host. And personally I don't really want to use exegesis-can-execute-<arch> for generic tests like this -- it's too strict, as we don't actually need to execute the snippet.

My solution here is to create another test feature, which is added only when LLVM_TARGET_ARCH is supported by llvm-exegesis. This feature sits somewhere between <arch>-registered-target and exegesis-can-execute-<arch>.
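
To make the "in between" relationship concrete, here is a minimal sketch of the three gating levels. It is a simplified model, not the real lit configuration: the function `exegesis_features`, its parameters, and the `can_execute` callback are hypothetical stand-ins. Only the feature names and the list of supported native targets are taken from this PR.

```python
# Hypothetical model of the three gating levels; not the real lit.local.cfg.
# The target list below is copied from this PR's lit.local.cfg change.
EXEGESIS_TARGETS = ["AArch64", "Mips", "PowerPC", "RISCV", "X86"]


def exegesis_features(native_target, built_targets, can_execute):
    """Return the lit features a host would get in this simplified model.

    native_target: LLVM name of the host architecture, e.g. "X86".
    built_targets: LLVM targets compiled into this build.
    can_execute:   callable(arch) -> bool, a stand-in for probing the host.
    """
    features = set()
    # Weakest level: the backend is merely compiled in.
    for target in built_targets:
        features.add(target.lower() + "-registered-target")
    # Middle level (added by this PR): the host itself is an exegesis target,
    # so LLJIT can create a TargetMachine for it.
    if native_target in EXEGESIS_TARGETS:
        features.add("native-registered-exegesis-target")
    # Strongest level: snippets can really be executed on this host.
    for arch in ["aarch64", "mips", "powerpc", "x86_64"]:
        if can_execute(arch):
            features.add("exegesis-can-execute-" + arch)
    return features
```

In this model, an X86 host that has the RISC-V backend built in but cannot execute snippets still gets `riscv-registered-target` and `native-registered-exegesis-target`, which is exactly what the new test requires.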

With the new benchmark phase, dry-run-measurement, llvm-exegesis can run everything except the actual snippet execution. This is useful when we want to test the parts of the code between the assemble-measured-code and measure phases without actually running on native platforms.
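
The patch relies on the new phase being ordered after measure, so that the existing setup checks can switch from an equality test to `>=`. A rough sketch of that ordering follows. The `Phase` class, its numeric values, and both helper functions are illustrative assumptions rather than code from llvm-exegesis, and the first member is inferred from the `prepare-snippet` phase mentioned in the docs.

```python
from enum import IntEnum


class Phase(IntEnum):
    # Simplified mirror of the phase ordering; numeric values are illustrative.
    PREPARE_SNIPPET = 0
    PREPARE_AND_ASSEMBLE_SNIPPET = 1
    ASSEMBLE_MEASURED_CODE = 2
    MEASURE = 3
    DRY_RUN_MEASURE = 4  # appended last, so it compares >= MEASURE


def needs_measurement_setup(phase):
    """True when counter and feature checks should run (measure and dry-run)."""
    return phase >= Phase.MEASURE


def executes_snippet(phase):
    """Only the real measure phase calls into the generated code."""
    return phase == Phase.MEASURE
```

With this ordering, a dry run goes through the same setup checks as a real measurement, but the snippet itself is never invoked.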

llvmbot · 2025-01-13T19:42:23Z

@llvm/pr-subscribers-llvm-binary-utilities

@llvm/pr-subscribers-tools-llvm-exegesis

Author: Min-Yih Hsu (mshockwave)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/122775.diff

7 Files Affected:

(modified) llvm/docs/CommandGuide/llvm-exegesis.rst (+1)
(added) llvm/test/tools/llvm-exegesis/dry-run-measurement.test (+11)
(modified) llvm/test/tools/llvm-exegesis/lit.local.cfg (+6)
(modified) llvm/tools/llvm-exegesis/lib/BenchmarkResult.h (+1)
(modified) llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp (+25-8)
(modified) llvm/tools/llvm-exegesis/lib/Target.cpp (+2-2)
(modified) llvm/tools/llvm-exegesis/llvm-exegesis.cpp (+6-3)

diff --git a/llvm/docs/CommandGuide/llvm-exegesis.rst b/llvm/docs/CommandGuide/llvm-exegesis.rst
index 8266d891a5e6b1..d357c2ceea4189 100644
--- a/llvm/docs/CommandGuide/llvm-exegesis.rst
+++ b/llvm/docs/CommandGuide/llvm-exegesis.rst
@@ -301,6 +301,7 @@ OPTIONS
   * ``prepare-and-assemble-snippet``: Same as ``prepare-snippet``, but also dumps an excerpt of the sequence (hex encoded).
   * ``assemble-measured-code``: Same as ``prepare-and-assemble-snippet``. but also creates the full sequence that can be dumped to a file using ``--dump-object-to-disk``.
   * ``measure``: Same as ``assemble-measured-code``, but also runs the measurement.
+  * ``dry-run-measurement``: Same as measure, but does not actually execute the snippet.
 
 .. option:: --x86-lbr-sample-period=<nBranches/sample>
 
diff --git a/llvm/test/tools/llvm-exegesis/dry-run-measurement.test b/llvm/test/tools/llvm-exegesis/dry-run-measurement.test
new file mode 100644
index 00000000000000..02e1ec521cf276
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/dry-run-measurement.test
@@ -0,0 +1,11 @@
+# RUN: llvm-exegesis --mtriple=riscv64 --mcpu=sifive-p470 --mode=latency --opcode-name=ADD --use-dummy-perf-counters --benchmark-phase=dry-run-measurement | FileCheck %s
+# REQUIRES: riscv-registered-target && native-registered-exegesis-target
+
+# This test makes sure that llvm-exegesis doesn't execute "cross-compiled" snippets in the presence of
+# --dry-run-measurement. RISC-V was chosen simply because most of the time we run tests on X86 machines.
+
+# Should not contain misleading results.
+# CHECK: measurements:    []
+
+# Should not contain error messages like "snippet crashed while running: Segmentation fault".
+# CHECK: error:           ''
diff --git a/llvm/test/tools/llvm-exegesis/lit.local.cfg b/llvm/test/tools/llvm-exegesis/lit.local.cfg
index a51a2d73442fa0..343f34c58673ec 100644
--- a/llvm/test/tools/llvm-exegesis/lit.local.cfg
+++ b/llvm/test/tools/llvm-exegesis/lit.local.cfg
@@ -30,6 +30,12 @@ def can_use_perf_counters(mode, extra_options=[]):
         print("could not exec llvm-exegesis")
         return False
 
+# LLJIT builds its own TargetMachine using arch designated by LLVM_TARGET_ARCH, which
+# is default to host. We don't want tests that use LLJIT (but not necessarily
+# execute the snippets) to run on machines that are not even supported by
+# exegesis.
+if config.root.native_target in ["AArch64", "Mips", "PowerPC", "RISCV", "X86"]:
+    config.available_features.add("native-registered-exegesis-target")
 
 for arch in ["aarch64", "mips", "powerpc", "x86_64"]:
     if can_execute_generated_snippets(arch):
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h b/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
index 3c09a8380146e5..5480d856168784 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
@@ -38,6 +38,7 @@ enum class BenchmarkPhaseSelectorE {
   PrepareAndAssembleSnippet,
   AssembleMeasuredCode,
   Measure,
+  DryRunMeasure,
 };
 
 enum class BenchmarkFilter { All, RegOnly, WithMem };
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
index a7771b99e97b1a..cc46f7feb6cf7f 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
@@ -99,7 +99,7 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
   static Expected<std::unique_ptr<InProcessFunctionExecutorImpl>>
   create(const LLVMState &State, object::OwningBinary<object::ObjectFile> Obj,
          BenchmarkRunner::ScratchSpace *Scratch,
-         std::optional<int> BenchmarkProcessCPU) {
+         std::optional<int> BenchmarkProcessCPU, bool DryRun) {
     Expected<ExecutableFunction> EF =
         ExecutableFunction::create(State.createTargetMachine(), std::move(Obj));
 
@@ -107,14 +107,17 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
       return EF.takeError();
 
     return std::unique_ptr<InProcessFunctionExecutorImpl>(
-        new InProcessFunctionExecutorImpl(State, std::move(*EF), Scratch));
+        new InProcessFunctionExecutorImpl(State, std::move(*EF), Scratch,
+                                          DryRun));
   }
 
 private:
   InProcessFunctionExecutorImpl(const LLVMState &State,
                                 ExecutableFunction Function,
-                                BenchmarkRunner::ScratchSpace *Scratch)
-      : State(State), Function(std::move(Function)), Scratch(Scratch) {}
+                                BenchmarkRunner::ScratchSpace *Scratch,
+                                bool DryRun)
+      : State(State), Function(std::move(Function)), Scratch(Scratch),
+        DryRun(DryRun) {}
 
   static void accumulateCounterValues(const SmallVector<int64_t, 4> &NewValues,
                                       SmallVector<int64_t, 4> *Result) {
@@ -143,9 +146,14 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
       CrashRecoveryContext CRC;
       CrashRecoveryContext::Enable();
       const bool Crashed = !CRC.RunSafely([this, Counter, ScratchPtr]() {
-        Counter->start();
-        this->Function(ScratchPtr);
-        Counter->stop();
+        if (DryRun) {
+          Counter->start();
+          Counter->stop();
+        } else {
+          Counter->start();
+          this->Function(ScratchPtr);
+          Counter->stop();
+        }
       });
       CrashRecoveryContext::Disable();
       PS.reset();
@@ -177,6 +185,7 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
   const LLVMState &State;
   const ExecutableFunction Function;
   BenchmarkRunner::ScratchSpace *const Scratch;
+  bool DryRun = false;
 };
 
 #ifdef __linux__
@@ -664,6 +673,9 @@ Expected<std::unique_ptr<BenchmarkRunner::FunctionExecutor>>
 BenchmarkRunner::createFunctionExecutor(
     object::OwningBinary<object::ObjectFile> ObjectFile,
     const BenchmarkKey &Key, std::optional<int> BenchmarkProcessCPU) const {
+  bool DryRun =
+      BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::DryRunMeasure;
+
   switch (ExecutionMode) {
   case ExecutionModeE::InProcess: {
     if (BenchmarkProcessCPU.has_value())
@@ -671,7 +683,8 @@ BenchmarkRunner::createFunctionExecutor(
                                  "support benchmark core pinning.");
 
     auto InProcessExecutorOrErr = InProcessFunctionExecutorImpl::create(
-        State, std::move(ObjectFile), Scratch.get(), BenchmarkProcessCPU);
+        State, std::move(ObjectFile), Scratch.get(), BenchmarkProcessCPU,
+        DryRun);
     if (!InProcessExecutorOrErr)
       return InProcessExecutorOrErr.takeError();
 
@@ -679,6 +692,10 @@ BenchmarkRunner::createFunctionExecutor(
   }
   case ExecutionModeE::SubProcess: {
 #ifdef __linux__
+    if (DryRun)
+      return make_error<Failure>("The subprocess execution mode cannot "
+                                 "dry-run measurement at this moment.");
+
     auto SubProcessExecutorOrErr = SubProcessFunctionExecutorImpl::create(
         State, std::move(ObjectFile), Key, BenchmarkProcessCPU);
     if (!SubProcessExecutorOrErr)
diff --git a/llvm/tools/llvm-exegesis/lib/Target.cpp b/llvm/tools/llvm-exegesis/lib/Target.cpp
index 29e58692f0e92b..e2251ff978888b 100644
--- a/llvm/tools/llvm-exegesis/lib/Target.cpp
+++ b/llvm/tools/llvm-exegesis/lib/Target.cpp
@@ -98,7 +98,7 @@ ExegesisTarget::createBenchmarkRunner(
     return nullptr;
   case Benchmark::Latency:
   case Benchmark::InverseThroughput:
-    if (BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::Measure &&
+    if (BenchmarkPhaseSelector >= BenchmarkPhaseSelectorE::Measure &&
         !PfmCounters.CycleCounter) {
       const char *ModeName = Mode == Benchmark::Latency
                                  ? "latency"
@@ -116,7 +116,7 @@ ExegesisTarget::createBenchmarkRunner(
         State, Mode, BenchmarkPhaseSelector, ResultAggMode, ExecutionMode,
         ValidationCounters, BenchmarkRepeatCount);
   case Benchmark::Uops:
-    if (BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::Measure &&
+    if (BenchmarkPhaseSelector >= BenchmarkPhaseSelectorE::Measure &&
         !PfmCounters.UopsCounter && !PfmCounters.IssueCounters)
       return make_error<Failure>(
           "can't run 'uops' mode, sched model does not define uops or issue "
diff --git a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
index fa37e05956be8c..07bd44ee64f1f2 100644
--- a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
+++ b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
@@ -132,7 +132,10 @@ static cl::opt<BenchmarkPhaseSelectorE> BenchmarkPhaseSelector(
         clEnumValN(
             BenchmarkPhaseSelectorE::Measure, "measure",
             "Same as prepare-measured-code, but also runs the measurement "
-            "(default)")),
+            "(default)"),
+        clEnumValN(
+            BenchmarkPhaseSelectorE::DryRunMeasure, "dry-run-measurement",
+            "Same as measure, but does not actually execute the snippet")),
     cl::init(BenchmarkPhaseSelectorE::Measure));
 
 static cl::opt<bool>
@@ -476,7 +479,7 @@ static void runBenchmarkConfigurations(
 }
 
 void benchmarkMain() {
-  if (BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::Measure &&
+  if (BenchmarkPhaseSelector >= BenchmarkPhaseSelectorE::Measure &&
       !UseDummyPerfCounters) {
 #ifndef HAVE_LIBPFM
     ExitWithError(
@@ -501,7 +504,7 @@ void benchmarkMain() {
 
   // Preliminary check to ensure features needed for requested
   // benchmark mode are present on target CPU and/or OS.
-  if (BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::Measure)
+  if (BenchmarkPhaseSelector >= BenchmarkPhaseSelectorE::Measure)
     ExitOnErr(State.getExegesisTarget().checkFeatureSupport());
 
   if (ExecutionMode == BenchmarkRunner::ExecutionModeE::SubProcess &&

boomanaiden154

LGTM.

nico · 2025-01-14T14:40:49Z

This seems to break tests on macOS: http://45.33.8.238/macm1/98920/step_10.txt

Please take a look and revert for now if it takes a while to fix.

nico · 2025-01-14T14:41:49Z

In regular cmake builds too: https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket/8725770427201973633/+/u/package_clang/stdout?format=raw

nico · 2025-01-14T14:54:08Z

Also on green dragon, which I think are LLVM's official mac bots: https://green.lab.llvm.org/job/llvm.org/view/All/job/clang-stage1-cmake-RA-incremental/lastCompletedBuild/testReport/

Prabhuk · 2025-01-14T18:09:57Z

We are seeing MAC test failures as well in Fuchsia toolchain builders:

https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-host-mac-x64/b8725747943317383073/overview


mshockwave · 2025-01-14T18:47:00Z

> We are seeing MAC test failures as well in Fuchsia toolchain builders:
>
> https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-host-mac-x64/b8725747943317383073/overview

This was hot-fixed by 63d3bd6.
I simply moved this test into test/tools/llvm-exegesis/X86 and gated it on exegesis-can-execute-x86_64.

mshockwave · 2025-01-14T18:55:13Z

> This seems to break tests on macOS: http://45.33.8.238/macm1/98920/step_10.txt
>
> Please take a look and revert for now if it takes a while to fix.

The root cause of this problem was, again, relatively subtle: the RISC-V snippet object was generated by an MC assembler that is separate from the LLJIT engine, and the LLJIT engine was configured using the host triple. The host (macOS in these cases) therefore expects symbols to be prefixed with an underscore, while RISC-V does not use that convention at this moment.
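
The mismatch can be illustrated with a toy symbol lookup. This is only a sketch of the naming problem, not how LLJIT resolves symbols internally: `mangle`, `lookup`, the symbol name `snippet_fn`, and the address are all made-up placeholders. The one fact it depends on is that Mach-O platforms prefix global symbols with an underscore while ELF-style RISC-V objects do not.

```python
def mangle(name, global_prefix):
    """Prepend the object format's global symbol prefix (e.g. '_' on Mach-O)."""
    return global_prefix + name


def lookup(symbol_table, name, global_prefix):
    """Look up `name` the way a JIT configured with `global_prefix` would."""
    return symbol_table.get(mangle(name, global_prefix))


# Symbols as the RISC-V assembler would emit them: no prefix.
riscv_object_symbols = {"snippet_fn": 0x1000}  # placeholder name and address

# A lookup that follows the object's own convention finds the symbol.
found_with_elf_rules = lookup(riscv_object_symbols, "snippet_fn", "")

# A lookup that follows the macOS host convention asks for "_snippet_fn",
# which the RISC-V object never defined.
found_with_macho_rules = lookup(riscv_object_symbols, "snippet_fn", "_")
```

Here the second lookup returns `None`, which corresponds to the JIT failing to find the snippet function on macOS hosts.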

dyung · 2025-01-14T22:10:50Z

The relanded test seems to be failing on x86-64 MacOS: https://lab.llvm.org/buildbot/#/builders/23/builds/6632


mshockwave · 2025-01-14T22:20:15Z

> The relanded test seems to be failing on x86-64 MacOS: https://lab.llvm.org/buildbot/#/builders/23/builds/6632

Yeah, I just reverted both this PR and the hot fix. I'm gonna dig into this.

mshockwave requested a review from boomanaiden154 January 13, 2025 19:36

llvmbot added tools:llvm-exegesis llvm:binary-utilities labels Jan 13, 2025

boomanaiden154 approved these changes Jan 13, 2025

View reviewed changes

mshockwave merged commit a39aaf3 into llvm:main Jan 13, 2025
10 of 11 checks passed

mshockwave deleted the patch/exegesis/fix-dry-run-measurement branch January 13, 2025 21:43

mshockwave added a commit that referenced this pull request Jan 14, 2025

[Exegesis] Fix test failures from #122775 on MacOSX and Fuchsia

63d3bd6

I'm making this test to run only when exegesis-can-execute-x86_64.

mshockwave added a commit that referenced this pull request Jan 14, 2025

Revert "Reland: "[Exegesis] Add the ability to dry-run the measuremen…

72225ca

…t phase (#121991)" (#122775)" This reverts commit a39aaf3 and 63d3bd6. Due to test failures on MacOSX.
