Reland: "[Exegesis] Add the ability to dry-run the measurement phase (#121991)" #122775
With the new benchmark phase, dry-run-measurement, llvm-exegesis can run everything except the actual snippet execution. This is useful when we want to test the code between the assemble-measured-code and measure phases without actually running snippets on native platforms.
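The relationship between the phases can be sketched as an ordered enumeration. This is only an illustrative Python model, not the tool's code: the member names mirror the phase names in the documentation, the first member is assumed from the docs' mention of prepare-snippet, and the two helper functions are hypothetical.

```python
from enum import IntEnum


class BenchmarkPhase(IntEnum):
    # Each phase does everything the previous one does, plus one more step.
    PREPARE_SNIPPET = 0  # assumed first member, based on the docs
    PREPARE_AND_ASSEMBLE_SNIPPET = 1
    ASSEMBLE_MEASURED_CODE = 2
    MEASURE = 3
    DRY_RUN_MEASURE = 4  # appended last, so it compares >= MEASURE


def needs_measurement_setup(phase):
    """Hypothetical helper: phases that go through the measurement setup."""
    return phase >= BenchmarkPhase.MEASURE


def executes_snippet(phase):
    """Hypothetical helper: only the real measure phase runs the snippet."""
    return phase == BenchmarkPhase.MEASURE
```

Because the new phase sits after the measure phase in the ordering, a check written as "at least measure" covers both, while the execution itself stays exclusive to the measure phase.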
@llvm/pr-subscribers-llvm-binary-utilities @llvm/pr-subscribers-tools-llvm-exegesis
Author: Min-Yih Hsu (mshockwave)
Changes
This relands f8f8598
Follow up on #122371: Putting this test into any of the target-specific folders won't help, because it's about the host. And personally I don't really want to use exegesis-can-execute-<arch> for generic tests like this; it's too strict, as we don't actually need to execute the snippet.
My solution here is to create another test feature which is added only when LLVM_TARGET_ARCH is supported by llvm-exegesis. This feature is something in between <arch>-registered-target and exegesis-can-execute-<arch>.
Full diff: https://github.com/llvm/llvm-project/pull/122775.diff
7 Files Affected:
diff --git a/llvm/docs/CommandGuide/llvm-exegesis.rst b/llvm/docs/CommandGuide/llvm-exegesis.rst
index 8266d891a5e6b1..d357c2ceea4189 100644
--- a/llvm/docs/CommandGuide/llvm-exegesis.rst
+++ b/llvm/docs/CommandGuide/llvm-exegesis.rst
@@ -301,6 +301,7 @@ OPTIONS
* ``prepare-and-assemble-snippet``: Same as ``prepare-snippet``, but also dumps an excerpt of the sequence (hex encoded).
* ``assemble-measured-code``: Same as ``prepare-and-assemble-snippet``. but also creates the full sequence that can be dumped to a file using ``--dump-object-to-disk``.
* ``measure``: Same as ``assemble-measured-code``, but also runs the measurement.
+ * ``dry-run-measurement``: Same as measure, but does not actually execute the snippet.
.. option:: --x86-lbr-sample-period=<nBranches/sample>
diff --git a/llvm/test/tools/llvm-exegesis/dry-run-measurement.test b/llvm/test/tools/llvm-exegesis/dry-run-measurement.test
new file mode 100644
index 00000000000000..02e1ec521cf276
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/dry-run-measurement.test
@@ -0,0 +1,11 @@
+# RUN: llvm-exegesis --mtriple=riscv64 --mcpu=sifive-p470 --mode=latency --opcode-name=ADD --use-dummy-perf-counters --benchmark-phase=dry-run-measurement | FileCheck %s
+# REQUIRES: riscv-registered-target && native-registered-exegesis-target
+
+# This test makes sure that llvm-exegesis doesn't execute "cross-compiled" snippets in the presence of
+# --dry-run-measurement. RISC-V was chosen simply because most of the time we run tests on X86 machines.
+
+# Should not contain misleading results.
+# CHECK: measurements: []
+
+# Should not contain error messages like "snippet crashed while running: Segmentation fault".
+# CHECK: error: ''
diff --git a/llvm/test/tools/llvm-exegesis/lit.local.cfg b/llvm/test/tools/llvm-exegesis/lit.local.cfg
index a51a2d73442fa0..343f34c58673ec 100644
--- a/llvm/test/tools/llvm-exegesis/lit.local.cfg
+++ b/llvm/test/tools/llvm-exegesis/lit.local.cfg
@@ -30,6 +30,12 @@ def can_use_perf_counters(mode, extra_options=[]):
print("could not exec llvm-exegesis")
return False
+# LLJIT builds its own TargetMachine using arch designated by LLVM_TARGET_ARCH, which
+# is default to host. We don't want tests that use LLJIT (but not necessarily
+# execute the snippets) to run on machines that are not even supported by
+# exegesis.
+if config.root.native_target in ["AArch64", "Mips", "PowerPC", "RISCV", "X86"]:
+ config.available_features.add("native-registered-exegesis-target")
for arch in ["aarch64", "mips", "powerpc", "x86_64"]:
if can_execute_generated_snippets(arch):
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h b/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
index 3c09a8380146e5..5480d856168784 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkResult.h
@@ -38,6 +38,7 @@ enum class BenchmarkPhaseSelectorE {
PrepareAndAssembleSnippet,
AssembleMeasuredCode,
Measure,
+ DryRunMeasure,
};
enum class BenchmarkFilter { All, RegOnly, WithMem };
diff --git a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
index a7771b99e97b1a..cc46f7feb6cf7f 100644
--- a/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
+++ b/llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
@@ -99,7 +99,7 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
static Expected<std::unique_ptr<InProcessFunctionExecutorImpl>>
create(const LLVMState &State, object::OwningBinary<object::ObjectFile> Obj,
BenchmarkRunner::ScratchSpace *Scratch,
- std::optional<int> BenchmarkProcessCPU) {
+ std::optional<int> BenchmarkProcessCPU, bool DryRun) {
Expected<ExecutableFunction> EF =
ExecutableFunction::create(State.createTargetMachine(), std::move(Obj));
@@ -107,14 +107,17 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
return EF.takeError();
return std::unique_ptr<InProcessFunctionExecutorImpl>(
- new InProcessFunctionExecutorImpl(State, std::move(*EF), Scratch));
+ new InProcessFunctionExecutorImpl(State, std::move(*EF), Scratch,
+ DryRun));
}
private:
InProcessFunctionExecutorImpl(const LLVMState &State,
ExecutableFunction Function,
- BenchmarkRunner::ScratchSpace *Scratch)
- : State(State), Function(std::move(Function)), Scratch(Scratch) {}
+ BenchmarkRunner::ScratchSpace *Scratch,
+ bool DryRun)
+ : State(State), Function(std::move(Function)), Scratch(Scratch),
+ DryRun(DryRun) {}
static void accumulateCounterValues(const SmallVector<int64_t, 4> &NewValues,
SmallVector<int64_t, 4> *Result) {
@@ -143,9 +146,14 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
CrashRecoveryContext CRC;
CrashRecoveryContext::Enable();
const bool Crashed = !CRC.RunSafely([this, Counter, ScratchPtr]() {
- Counter->start();
- this->Function(ScratchPtr);
- Counter->stop();
+ if (DryRun) {
+ Counter->start();
+ Counter->stop();
+ } else {
+ Counter->start();
+ this->Function(ScratchPtr);
+ Counter->stop();
+ }
});
CrashRecoveryContext::Disable();
PS.reset();
@@ -177,6 +185,7 @@ class InProcessFunctionExecutorImpl : public BenchmarkRunner::FunctionExecutor {
const LLVMState &State;
const ExecutableFunction Function;
BenchmarkRunner::ScratchSpace *const Scratch;
+ bool DryRun = false;
};
#ifdef __linux__
@@ -664,6 +673,9 @@ Expected<std::unique_ptr<BenchmarkRunner::FunctionExecutor>>
BenchmarkRunner::createFunctionExecutor(
object::OwningBinary<object::ObjectFile> ObjectFile,
const BenchmarkKey &Key, std::optional<int> BenchmarkProcessCPU) const {
+ bool DryRun =
+ BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::DryRunMeasure;
+
switch (ExecutionMode) {
case ExecutionModeE::InProcess: {
if (BenchmarkProcessCPU.has_value())
@@ -671,7 +683,8 @@ BenchmarkRunner::createFunctionExecutor(
"support benchmark core pinning.");
auto InProcessExecutorOrErr = InProcessFunctionExecutorImpl::create(
- State, std::move(ObjectFile), Scratch.get(), BenchmarkProcessCPU);
+ State, std::move(ObjectFile), Scratch.get(), BenchmarkProcessCPU,
+ DryRun);
if (!InProcessExecutorOrErr)
return InProcessExecutorOrErr.takeError();
@@ -679,6 +692,10 @@ BenchmarkRunner::createFunctionExecutor(
}
case ExecutionModeE::SubProcess: {
#ifdef __linux__
+ if (DryRun)
+ return make_error<Failure>("The subprocess execution mode cannot "
+ "dry-run measurement at this moment.");
+
auto SubProcessExecutorOrErr = SubProcessFunctionExecutorImpl::create(
State, std::move(ObjectFile), Key, BenchmarkProcessCPU);
if (!SubProcessExecutorOrErr)
diff --git a/llvm/tools/llvm-exegesis/lib/Target.cpp b/llvm/tools/llvm-exegesis/lib/Target.cpp
index 29e58692f0e92b..e2251ff978888b 100644
--- a/llvm/tools/llvm-exegesis/lib/Target.cpp
+++ b/llvm/tools/llvm-exegesis/lib/Target.cpp
@@ -98,7 +98,7 @@ ExegesisTarget::createBenchmarkRunner(
return nullptr;
case Benchmark::Latency:
case Benchmark::InverseThroughput:
- if (BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::Measure &&
+ if (BenchmarkPhaseSelector >= BenchmarkPhaseSelectorE::Measure &&
!PfmCounters.CycleCounter) {
const char *ModeName = Mode == Benchmark::Latency
? "latency"
@@ -116,7 +116,7 @@ ExegesisTarget::createBenchmarkRunner(
State, Mode, BenchmarkPhaseSelector, ResultAggMode, ExecutionMode,
ValidationCounters, BenchmarkRepeatCount);
case Benchmark::Uops:
- if (BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::Measure &&
+ if (BenchmarkPhaseSelector >= BenchmarkPhaseSelectorE::Measure &&
!PfmCounters.UopsCounter && !PfmCounters.IssueCounters)
return make_error<Failure>(
"can't run 'uops' mode, sched model does not define uops or issue "
diff --git a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
index fa37e05956be8c..07bd44ee64f1f2 100644
--- a/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
+++ b/llvm/tools/llvm-exegesis/llvm-exegesis.cpp
@@ -132,7 +132,10 @@ static cl::opt<BenchmarkPhaseSelectorE> BenchmarkPhaseSelector(
clEnumValN(
BenchmarkPhaseSelectorE::Measure, "measure",
"Same as prepare-measured-code, but also runs the measurement "
- "(default)")),
+ "(default)"),
+ clEnumValN(
+ BenchmarkPhaseSelectorE::DryRunMeasure, "dry-run-measurement",
+ "Same as measure, but does not actually execute the snippet")),
cl::init(BenchmarkPhaseSelectorE::Measure));
static cl::opt<bool>
@@ -476,7 +479,7 @@ static void runBenchmarkConfigurations(
}
void benchmarkMain() {
- if (BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::Measure &&
+ if (BenchmarkPhaseSelector >= BenchmarkPhaseSelectorE::Measure &&
!UseDummyPerfCounters) {
#ifndef HAVE_LIBPFM
ExitWithError(
@@ -501,7 +504,7 @@ void benchmarkMain() {
// Preliminary check to ensure features needed for requested
// benchmark mode are present on target CPU and/or OS.
- if (BenchmarkPhaseSelector == BenchmarkPhaseSelectorE::Measure)
+ if (BenchmarkPhaseSelector >= BenchmarkPhaseSelectorE::Measure)
ExitOnErr(State.getExegesisTarget().checkFeatureSupport());
if (ExecutionMode == BenchmarkRunner::ExecutionModeE::SubProcess &&
LGTM.
This seems to break tests on macOS: http://45.33.8.238/macm1/98920/step_10.txt Please take a look and revert for now if it takes a while to fix.
Also on Green Dragon, which I think hosts LLVM's official Mac bots: https://green.lab.llvm.org/job/llvm.org/view/All/job/clang-stage1-cmake-RA-incremental/lastCompletedBuild/testReport/
We are seeing Mac test failures as well in the Fuchsia toolchain builders:
I'm making this test run only when exegesis-can-execute-x86_64 is available.
It's hotfixed by 63d3bd6.
The root cause of this problem was, again, relatively subtle: the RISC-V snippet object was generated by an MC assembler separate from the LLJIT engine, which was configured using the host triple. Therefore the host (macOS in these cases) would expect symbols to be prefixed by an underscore, while RISC-V doesn't use that convention in any way at this moment.
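The mismatch described above can be illustrated with a toy lookup. This is a simplified Python model and not LLJIT's actual behavior: the prefix table, the function names, the symbol name snippet_fn, and the address are all made up for illustration. The only facts it relies on are that the macOS convention prefixes global symbols with an underscore and the convention used for the RISC-V object does not.

```python
# Global symbol prefix per naming convention (simplified, for illustration).
GLOBAL_PREFIX = {"macho": "_", "elf": ""}


def mangle(name, convention):
    """Apply the convention's global prefix to a symbol name."""
    return GLOBAL_PREFIX[convention] + name


def jit_lookup(object_symbols, name, jit_convention):
    """Look up `name` the way a JIT using `jit_convention` would mangle it."""
    return object_symbols.get(mangle(name, jit_convention))


# The snippet object was assembled for RISC-V, so its symbol has no prefix.
riscv_object_symbols = {mangle("snippet_fn", "elf"): 0x1000}

# A JIT configured for a macOS host searches for "_snippet_fn" and misses.
macos_result = jit_lookup(riscv_object_symbols, "snippet_fn", "macho")
# A JIT using the same convention as the object finds the symbol.
elf_result = jit_lookup(riscv_object_symbols, "snippet_fn", "elf")
```

In this model the macOS-style lookup returns nothing even though the symbol exists, which matches the shape of the failure reported on the Mac bots.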
The relanded test seems to be failing on x86-64 macOS: https://lab.llvm.org/buildbot/#/builders/23/builds/6632
Yeah, I just reverted both this PR and the hot fix. I'm gonna dig into this.
This relands f8f8598
Follow up on #122371:
The problem here is a little subtle: when we dry-run the measurement phase, we create an LLJIT instance without actually executing the snippets. The key is that LLJIT has its own TargetMachine, which uses the triple designated by LLVM_TARGET_ARCH (which defaults to the host). On a machine that does not support Exegesis, LLJIT would fail to create its TargetMachine because llvm-exegesis doesn't even register the host's target!
Putting this test into any of the target-specific folders won't help, because it's about the host. And personally I don't really want to use exegesis-can-execute-<arch> for generic tests like this; it's too strict, as we don't actually need to execute the snippet.
My solution here is to create another test feature which is added only when LLVM_TARGET_ARCH is supported by llvm-exegesis. This feature is something in between <arch>-registered-target and exegesis-can-execute-<arch>.
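How the three kinds of features relate can be sketched in plain Python. This is a rough model under stated assumptions, not the real lit configuration: the target list is copied from the lit.local.cfg change in the diff, while exegesis_features, requirements_met, and the example hosts are hypothetical, and a real REQUIRES line is a boolean expression rather than a list.

```python
# Native targets named in the lit.local.cfg change in the diff.
EXEGESIS_NATIVE_TARGETS = ["AArch64", "Mips", "PowerPC", "RISCV", "X86"]


def exegesis_features(native_target, registered_targets, can_execute):
    """Hypothetical helper: the features a given host would advertise."""
    features = set()
    # Weakest: the backend is merely compiled into this LLVM build.
    for arch in registered_targets:
        features.add(f"{arch}-registered-target")
    # In between: the host's own architecture is one exegesis supports.
    if native_target in EXEGESIS_NATIVE_TARGETS:
        features.add("native-registered-exegesis-target")
    # Strongest: the host can actually execute generated snippets.
    for arch in can_execute:
        features.add(f"exegesis-can-execute-{arch}")
    return features


def requirements_met(features, requires):
    """Simplified REQUIRES check: every listed feature must be present."""
    return all(r in features for r in requires)


REQUIRES = ["riscv-registered-target", "native-registered-exegesis-target"]

# Example hosts (made up): an X86 host and a SystemZ host, both with the
# RISC-V backend built in; neither executes RISC-V snippets natively.
x86_host = exegesis_features("X86", ["riscv", "x86"], ["x86_64"])
systemz_host = exegesis_features("SystemZ", ["riscv", "systemz"], [])
```

Under this model the test would run on the X86 host without requiring exegesis-can-execute-riscv64, and would be skipped on the SystemZ host, where the in-between feature is absent.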