Skip to content

[llvm-exegesis][AArch64] Check for PAC keys before disabling them #138643

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

abhilash1910
Copy link
Contributor

[llvm-exegesis][AArch64] Check for PAC keys before disabling them.

Tries to resolve aarch64 error arising from accessible PAC keys from PR: Disable pauth and ldgm as unsupported instructions fixed. The suspected reason is in some systems, the pac keys are present and need to be checked before setting them.

This is followup to merge the changes with following changes to attempt to fixup the arch64 failure.

Changes:

Apply checks to verify if PAC is enabled in some systems and disabling them .

@lakshayk-nv please help to verify and review.
cc @sjoerdmeijer @davemgreen

@llvmbot
Copy link
Member

llvmbot commented May 6, 2025

@llvm/pr-subscribers-tools-llvm-exegesis

Author: Abhilash Majumder (abhilash1910)

Changes

[llvm-exegesis][AArch64] Check for PAC keys before disabling them.

Tries to resolve aarch64 error arising from accessible PAC keys from PR: Disable pauth and ldgm as unsupported instructions fixed. The suspected reason is in some systems, the pac keys are present and need to be checked before setting them.

This is followup to merge the changes with following changes to attempt to fixup the arch64 failure.

Changes:

Apply checks to verify if PAC is enabled in some systems and disabling them .

@lakshayk-nv please help to verify and review.
cc @sjoerdmeijer @davemgreen


Full diff: https://github.com/llvm/llvm-project/pull/138643.diff

1 Files Affected:

  • (modified) llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp (+17-7)
diff --git a/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp b/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
index a1eb5a46f21fc..d1da6f41461a7 100644
--- a/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
+++ b/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
@@ -207,16 +207,26 @@ class ExegesisAArch64Target : public ExegesisTarget {
 
     if (isPointerAuth(Opcode)) {
 #if defined(__aarch64__) && defined(__linux__)
+      
+      // Fix for some systems where PAC/PAG keys are present but is 
+      // explicitly getting disabled
+      unsigned long pac_keys = 0;
+      if (prctl(PR_PAC_GET_ENABLED_KEYS, &pac_keys, 0, 0, 0) < 0) {
+        return "Failed to get PAC key status";
+      }
+      
       // Disable all PAC keys. Note that while we expect the measurements to
       // be the same with PAC keys disabled, they could potentially be lower
       // since authentication checks are bypassed.
-      if (prctl(PR_PAC_SET_ENABLED_KEYS,
-                PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY |
-                    PR_PAC_APDBKEY, // all keys
-                0,                  // disable all
-                0, 0) < 0) {
-        return "Failed to disable PAC keys";
-      }
+      if (pac_keys != 0) {
+        if (prctl(PR_PAC_SET_ENABLED_KEYS,
+                  PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY |
+                      PR_PAC_APDBKEY, // all keys
+                  0,                  // disable all
+                  0, 0) < 0) {
+          return "Failed to disable PAC keys";
+        }
+      } 
 #else
       return "Unsupported opcode: isPointerAuth";
 #endif

Copy link

github-actions bot commented May 6, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please run clang-format over the patch to fix the code formatting.

@@ -207,16 +207,26 @@ class ExegesisAArch64Target : public ExegesisTarget {

if (isPointerAuth(Opcode)) {
#if defined(__aarch64__) && defined(__linux__)

// Fix for some systems where PAC/PAG keys are present but is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment could probably be more descriptive and instead of just stating that it is a fix, state the case that it is handling and what would happen if we do not handle that case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, please help to review.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really describe the why that I am asking about. This being "better" or "safer" doesn't really mean anything to someone who doesn't already understand the pointer authentication flow.

// For systems without PAC, this is a No-op but with PAC, it is
// safer to check the existing key state and then disable/enable them.
// Hence the guard placed for switching.
unsigned long pac_keys = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asl this is a linux pac question so I'm not sure if this is correct behavior and will leave it to you

I am curious though - this seems like a PAC behavior choice based on compile time process configuration rather than specified target info? Is this expected behavior (I don't understand how ptrauth works on linux)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Tagging @atrosinenko

Essentially we expect two things:

  • If full pauth ABI is enabled, it is done in platform-specific manner. E.g. via ABI field in target quadruple for common platforms, or some other way for downstream
  • PAC-RET is expected to work regardless of settings since it is an ABI-neutral change.

My expectation is that PAC instructions in hint space should be handled gracefully regardless of PAC state. But maybe I am missing some context.

@asl asl requested a review from atrosinenko May 7, 2025 08:31
Copy link
Contributor

@atrosinenko atrosinenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If my understanding is correct, the problem is that llvm-exegesis may fail with an error like "AUTIA: Failed to disable PAC keys", depending on the CPU features. It looks like I have managed to reproduce this issue in QEMU (in full-system emulation mode), the results are below.

As far as I see, PAuth-related prctls are not documented in man pages well, but thankfully the kernel sources are quite readable on themselves:

Each configuration was tested by running

strace -f -e prctl \
    /path/to/llvm-project/build-aarch64/bin/llvm-exegesis \
        -mtriple=aarch64-linux-gnu \
        -mcpu=neoverse-v2 \
        -mode=latency \
        --opcode-name=AUTIA \
        --use-dummy-perf-counters

On the main branch (commit cd6c4b6), with emulated CPU set to neoverse-v1,pauth-impdef=on (Neoverse-V2 is not supported in QEMU 9.2.1, but Neoverse-V1 seems to be enough to reproduce the issue. Similarly, pauth-impdef=on should not affect the issue, but makes emulation faster) I got

prctl(PR_PAC_SET_ENABLED_KEYS, PR_PAC_APIAKEY|PR_PAC_APIBKEY|PR_PAC_APDAKEY|PR_PAC_APDBKEY, 0, 0, 0) = 0
---
mode:            latency
key:
  instructions:
    - 'AUTIA X23 X23 X15'
  config:          ''
  register_initial_values:
    - 'X23=0x0'
    - 'X15=0x0'
cpu_name:        neoverse-v2
llvm_triple:     aarch64-unknown-linux-gnu
min_instructions: 10000
measurements:    []
error:           ''
info:            Repeating a single implicitly serial instruction
assembled_snippet: F70F1FF8170080D20F0080D2F711C1DAF711C1DAF711C1DAF711C1DAF70741F8C0035FD6
...
+++ exited with 0 +++

and setting CPU to cortex-a72 gives this output:

prctl(PR_PAC_SET_ENABLED_KEYS, PR_PAC_APIAKEY|PR_PAC_APIBKEY|PR_PAC_APDAKEY|PR_PAC_APDBKEY, 0, 0, 0) = -1 EINVAL (Invalid argument)
AUTIA: Failed to disable PAC keys
+++ exited with 0 +++

(by the way, return code is 0, which is a bit surprising...). This is probably because of if (!system_supports_address_auth()) return -EINVAL; check in the kernel, assuming system_supports_address_auth() means "FEAT_PAUTH was detected at run-time" (instead of "the kernel was compiled with PAuth support" or something like this).

This PR (commit 2a9af0f) produces an error like the following with both CPU models:

prctl(PR_PAC_GET_ENABLED_KEYS, 0x7ff2234598, 0, 0, 0) = -1 EINVAL (Invalid argument)
AUTIA: Failed to get PAC key status
+++ exited with 0 +++

Turned out, all arguments should be set to 0 and the bitmask is returned as the return value of prctl.

I applied a quick fix:

diff --git a/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp b/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
index 7e6382693c3e..3d59955177cb 100644
--- a/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
+++ b/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
@@ -213,8 +213,8 @@ private:
       // For systems without PAC, this is a No-op but with PAC, it is
       // safer to check the existing key state and then disable/enable them.
       // Hence the guard for switching.
-      unsigned long PacKeys = 0;
-      if (prctl(PR_PAC_GET_ENABLED_KEYS, &PacKeys, 0, 0, 0) < 0) {
+      unsigned long PacKeys = prctl(PR_PAC_GET_ENABLED_KEYS, 0, 0, 0, 0);
+      if ((long)PacKeys < 0) {
         return "Failed to get PAC key status";
       }
 

With CPU set to neoverse-v1,pauth-impdef=on the following output is produced:


prctl(PR_PAC_GET_ENABLED_KEYS, 0, 0, 0, 0) = 0xf (PR_PAC_APIAKEY|PR_PAC_APIBKEY|PR_PAC_APDAKEY|PR_PAC_APDBKEY)
prctl(PR_PAC_SET_ENABLED_KEYS, PR_PAC_APIAKEY|PR_PAC_APIBKEY|PR_PAC_APDAKEY|PR_PAC_APDBKEY, 0, 0, 0) = 0
---
mode:            latency
key:
  instructions:
    - 'AUTIA X23 X23 X0'
  config:          ''
  register_initial_values:
    - 'X23=0x0'
    - 'X0=0x0'
cpu_name:        neoverse-v2
llvm_triple:     aarch64-unknown-linux-gnu
min_instructions: 10000
measurements:    []
error:           ''
info:            Repeating a single implicitly serial instruction
assembled_snippet: F70F1FF8170080D2000080D21710C1DA1710C1DA1710C1DA1710C1DAF70741F8C0035FD6
...
+++ exited with 0 +++

but with cortex-a72 it fails with:

prctl(PR_PAC_GET_ENABLED_KEYS, 0, 0, 0, 0) = -1 EINVAL (Invalid argument)
AUTIA: Failed to get PAC key status
+++ exited with 0 +++

Probably, EINVAL is always returned when FEAT_PAUTH is not supported by the CPU.

Comment on lines 225 to 229
if (prctl(PR_PAC_SET_ENABLED_KEYS,
PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY |
PR_PAC_APDBKEY, // all keys
0, // disable all
0, 0) < 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] It may be more readable to define a wrapper with the default argument values in C++, as in this example these three literal 0s have different meaning: the first one is a bitmask and the rest are "the argument is unused, pass 0 now, other values to de defined in the future"

static int prctl_wrapper(int op, long arg2 = 0, long arg3 = 0, long arg4 = 0, long arg5 = 0) {
  return prctl(op, arg2, arg3, arg4, arg5);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the descriptive details , have put a error check for EINVAL and also applied the above refactoring.
Since this is a kernel level call to system_supports_address_auth() , I am open to suggestions on how to handle this better.

@tstellar
Copy link
Collaborator

Since this has been broken for a while now can we revert the commit that introduced this failure while the fix is being worked on?

@@ -199,6 +200,11 @@ class ExegesisAArch64Target : public ExegesisTarget {
PM.add(createAArch64ExpandPseudoPass());
}

static long prctl_wrapper(int op, long arg2 = 0, long arg3 = 0, long arg4 = 0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this doing much, or can it be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just acting as a wrapper class based on @atrosinenko 's suggestion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davemgreen @abhilash1910 The idea was to leverage C++ ability to specify the default argument values instead of passing zeros explicitly with an "unused" comment (and to convert variadic arguments to long along the way). It may be a good idea to explain this in a brief comment before prctl_wrapper (especially the conversion to long). By the way, arg4 and arg5 could probably be omitted for brevity, as this function is not a generic prctl wrapper anyway: for example, some non-PAuth prctl operations may involve more complicated argument types.

Something along the lines:

  // Converts variadic arguments to `long` and passes zeros for the unused
  // arg2-arg5, as tested by the Linux kernel.
  static long prctl_wrapper(int op, long arg2 = 0, long arg3 = 0) {
    return prctl(op, arg2, arg3, /*arg4=*/0L, /*arg5=*/0L);
  }

0); // unused
if ((long)PacKeys < 0) {
if (errno == EINVAL) {
return "PAuth not supported on this system";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this return nullptr because if they cannot be disabled, we can still run them (they are already disabled, or they will not be supported by the CPU which goes through the normal unsupported method)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of having nullptr check as well ; it rather differentiates between :

  • PAuth is not supported and we stop (or) -> nullptr can be returned (?)
  • PAuth is not supported but we can continue

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a system where pacbti is not enabled at the system level (the linux distro does not support it yet), but it is fine to check the latency/throughput of the instructions - they are just already disabled. Trying this patch leads to a "AUTDA: PAuth not supported on this system" but we might as well allow people to try and test the instructions if they are available.

Copy link
Contributor

@atrosinenko atrosinenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abhilash1910 Thanks for the updates, I left a few comments on prctl_wrapper so far, will re-test the patch in QEMU slightly later.

@@ -199,6 +200,11 @@ class ExegesisAArch64Target : public ExegesisTarget {
PM.add(createAArch64ExpandPseudoPass());
}

static long prctl_wrapper(int op, long arg2 = 0, long arg3 = 0, long arg4 = 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davemgreen @abhilash1910 The idea was to leverage C++ ability to specify the default argument values instead of passing zeros explicitly with an "unused" comment (and to convert variadic arguments to long along the way). It may be a good idea to explain this in a brief comment before prctl_wrapper (especially the conversion to long). By the way, arg4 and arg5 could probably be omitted for brevity, as this function is not a generic prctl wrapper anyway: for example, some non-PAuth prctl operations may involve more complicated argument types.

Something along the lines:

  // Converts variadic arguments to `long` and passes zeros for the unused
  // arg2-arg5, as tested by the Linux kernel.
  static long prctl_wrapper(int op, long arg2 = 0, long arg3 = 0) {
    return prctl(op, arg2, arg3, /*arg4=*/0L, /*arg5=*/0L);
  }

abhilash1910 and others added 5 commits May 12, 2025 18:15
Co-authored-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Co-authored-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Copy link
Contributor

@atrosinenko atrosinenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just realized that the original issue was probably related to FEAT_FPAC (by the way, support for FEAT_FPAC is among the differences between emulated Neoverse-V1 that I used for testing and Neoverse-V2 mentioned it tests). I'm not sure if QEMU defines any "real" CPU model with FEAT_FPAC, but the max pseudo CPU model should work, though.

At the commit fd254b0, I get

AUTIA: PAuth not supported on this system

for cortex-a72 emulated CPU model and successfully executed snippets for neoverse-v1,pauth-impdef=on and max,pauth-impdef=on. If multiple opcodes are passed like in --opcode-name=AUTIASP,AUTIA, two prctl system calls ("get" and then "set") are performed for the first opcode and one more "get" prctl is performed for each subsequent opcode (or exactly one failing prctl per pauth opcode in case of cortex-a72, of course).

On the other hand, when executed on cortex-a72 emulated CPU, llvm-exegesis -mtriple=aarch64-linux-gnu -mcpu=neoverse-v2 -mode=latency --opcode-name=AUTIA --benchmark-phase=assemble-measured-code returns AUTIA: PAuth not supported on this system, even though no snippet code is to be executed on the CPU, which is not correct.

It looks like the high-level logic of this code should be improved (and the description in getIgnoredOpcodeReasonOrNull updated). As far as I see, here are the issues that we want to avoid:

  • if the CPU implements FEAT_PAuth with FEAT_FPAC, any snippet involving authentication instruction is likely to crash. As far as I understand, this does not crash the entire llvm-exegsis process, as it is able to handle signals that are normally fatal, but at least it reports errors that may be unexpected by the user.
  • if the CPU does not implement FEAT_PAuth at all, llvm-exegesis still should be able to perform all tasks not involving actual snippet execution
    • it is even possible to run snippet generation (but not execution, of course) for the same instructions on x86_64 computer!
  • error diagnostic is a plus

Given the above points, a reasonable solution could probably be keeping the approach implemented by this PR (request the enabled keys and, if any key is currently enabled, disable everything). But no errors should probably be reported on failure, not even "Unsupported opcode: isPointerAuth" on non-AArch64 systems. Additionally, it seems a good idea to give the user an option to opt out of these prctls altogether.

  • getIgnoredOpcodeReasonOrNull is probably intended to return error messages for the opcodes whose benchmarking is meaningless unless some custom support is implemented someday. It seems to be called even if merely generating a snippet on a foreign CPU architecture
  • disabling PAuth keys as a side effect of getIgnoredOpcodeReasonOrNull call is probably OK, provided this does not crash llvm-exegesis
    • I guess, this could be a problem if llvm-exegesis executable or any shared object up the call stack is built with pac-ret - this probably only applies to libc stuff (including thread entry functions, if any)
  • a failure to disable the keys when FEAT_FPAC is implemented, on the other hand, is probably not too harmful, as something like error: 'snippet crashed while running: Illegal instruction' would be printed in the report
    • as a reasonable tradeoff, you could return nullptr silently if prctl(PR_PAC_GET_ENABLED_KEYS) either returns 0 or sets errno to EINVAL (to rule out all "not applicable" cases we know about), but return an error message if the second prctl call was performed, and it failed (as it likely indicates we do something wrong)
  • I doubt it is safe to assume every CPU would have the same timings with keys being enabled vs. disabled
    • it may be reasonable to perform prctl calls to disable all the keys by default, but even then it is probably a good idea to print a free-form warning to the console (like it is already done by some other part of llvm-exegesis for AUTIASP: setRegTo is not implemented, results will be unreliable)
    • or the default behavior could be to keep everything as-is and disable the keys if a special command line option is given

Comment on lines +203 to +207
// Converts variadic arguments to `long` and passes zeros for the unused
// arg2-arg5, as tested by the Linux kernel.
static long prctl_wrapper(int op, long arg2 = 0, long arg3 = 0) {
return prctl(op, arg2, arg3, /*arg4=*/0L, /*arg5=*/0L);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please surround this in #if ... #endif, as otherwise compilation fails for non-AArch64 systems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have also added the nullptr return so as to not throw the exceptions.
For having something like: --aarch64-disable-pac-control
I was thinking something like this:

const char *getIgnoredOpcodeReasonOrNull(const LLVMState &State,
                                           unsigned Opcode) const override {
    bool DisablePACControl = !State.getExegesisTarget().getOptions().AArch64DisablePACControl;
    
    if (const char *Reason =
            ExegesisTarget::getIgnoredOpcodeReasonOrNull(State, Opcode))
      return Reason;

    if (isPointerAuth(Opcode)) {
#if defined(__aarch64__) && defined(__linux__)
   if (DisablePACControl) {
   
   //rest of the code as is///....
   }
#endif

@atrosinenko @davemgreen what do you think ?

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable enough to me at this point.

Please get approval from someone who actually understands pointer authentication before merging though.

@ojhunt
Copy link
Contributor

ojhunt commented May 15, 2025

The linux PAC launch path was explained to me on discord and my concerns were cleared up, I can't speak for the PR code itself, my concerns were with how the actual process launch was being handled structurally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants