-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[clang][ARM][AArch64] Define intrinsics guarded by __has_builtin on all platforms #128222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-backend-x86 Author: Nick Sarnie (sarnex) ChangesApply a preemptive workaround for a possible behavior change in Full diff: https://github.com/llvm/llvm-project/pull/128222.diff 1 Files Affected:
diff --git a/clang/lib/Headers/arm_acle.h b/clang/lib/Headers/arm_acle.h
index b1dc90f84ad36..fbd0020258722 100644
--- a/clang/lib/Headers/arm_acle.h
+++ b/clang/lib/Headers/arm_acle.h
@@ -27,6 +27,8 @@
extern "C" {
#endif
+#if !defined(__CUDA__)
+
/* 7 SYNCHRONIZATION, BARRIER AND HINT INTRINSICS */
/* 7.3 Memory barriers */
#if !__has_builtin(__dmb)
@@ -70,6 +72,7 @@ static __inline__ void __attribute__((__always_inline__, __nodebug__)) __yield(v
__builtin_arm_yield();
}
#endif
+#endif // #if !defined(__CUDA__)
#if defined(__ARM_32BIT_STATE) && __ARM_32BIT_STATE
#define __dbg(t) __builtin_arm_dbg(t)
|
@llvm/pr-subscribers-clang Author: Nick Sarnie (sarnex) ChangesApply a preemptive workaround for a possible behavior change in Full diff: https://github.com/llvm/llvm-project/pull/128222.diff 1 Files Affected:
diff --git a/clang/lib/Headers/arm_acle.h b/clang/lib/Headers/arm_acle.h
index b1dc90f84ad36..fbd0020258722 100644
--- a/clang/lib/Headers/arm_acle.h
+++ b/clang/lib/Headers/arm_acle.h
@@ -27,6 +27,8 @@
extern "C" {
#endif
+#if !defined(__CUDA__)
+
/* 7 SYNCHRONIZATION, BARRIER AND HINT INTRINSICS */
/* 7.3 Memory barriers */
#if !__has_builtin(__dmb)
@@ -70,6 +72,7 @@ static __inline__ void __attribute__((__always_inline__, __nodebug__)) __yield(v
__builtin_arm_yield();
}
#endif
+#endif // #if !defined(__CUDA__)
#if defined(__ARM_32BIT_STATE) && __ARM_32BIT_STATE
#define __dbg(t) __builtin_arm_dbg(t)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a comment for why we need to check for the CUDA arch, but I'll defer to @Artem-B for the final +1.
clang/lib/Headers/arm_acle.h
Outdated
#if !defined(__CUDA_ARCH__) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this will be observable to any code that happens to need those builtins, and they are actually not provided by the compiler (i.e. __has_builtin(__wfi)
would be host on the host).
In that case, any use of __wfi()
in the host code will break compilation because host-side code will not be able to find it.
I think a more robust fix would be to provide only declarations for these maybe-builtin functions, so the compiler always sees the correct signature of the functions that are conditionally provided depending on __has_builtin()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean we unconditionally declare the functions, regardless of compilation mode/host/device/__has_builtin
result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#if __CUDA_ARCH__
// Make sure the declarations here are in sync with the functions provided under #else.
<declarations>
#else
<existing code>
#endif
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually wondering if the header needs __has_builtin()
... They are coming from the same compiler, and, if they are missing, they are declared in terms of __builtin*()
functions.
Looking at their definitions, it appears that those builtins are defined when we're targeting windows:
llvm-project/clang/include/clang/Basic/BuiltinsARM.def
Lines 212 to 220 in 79261d4
LANGBUILTIN(__yield, "v", "", ALL_MS_LANGUAGES) | |
LANGBUILTIN(__wfe, "v", "", ALL_MS_LANGUAGES) | |
LANGBUILTIN(__wfi, "v", "", ALL_MS_LANGUAGES) | |
LANGBUILTIN(__sev, "v", "", ALL_MS_LANGUAGES) | |
LANGBUILTIN(__sevl, "v", "", ALL_MS_LANGUAGES) | |
LANGBUILTIN(__dmb, "vUi", "nc", ALL_MS_LANGUAGES) | |
LANGBUILTIN(__dsb, "vUi", "nc", ALL_MS_LANGUAGES) | |
LANGBUILTIN(__isb, "vUi", "nc", ALL_MS_LANGUAGES) |
If we have to define them on non-windows targets anyways, perhaps a better fix is to let clang define them everywhere, and avoid this special case in the headers.
@compnerd -- do you recall what's the story with these non-builtin builtins here?
They were added specifically for MS targets in 4bddd9d
But five years later, @Bigcheese added these wrappers in the header:
16af23f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@compnerd @Bigcheese Do you mind taking a look at Artem's questions above? Thx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rnk Reid, would you happen to have an idea what's up with these builtins on Windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rnk Ping on this, thx!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have to define them on non-windows targets anyways, perhaps a better fix is to let clang define them everywhere, and avoid this special case in the headers.
Yes, we should go this direction. There are some examples of this for x86 builtins. Here's an example of one I did recently for x86:
ee92122
In this example, we leave behind an _m_prefetchw
declaration, and the builtin is defined as a "library builtin" which has an associated header. This forces callers to include the relevant header, in this case arm_acle.h
, to make the builtin callable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, ill try to implement this assuming it's relatively easy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i just pushed a commit which attempts to implement this, please take a look, thanks!
✅ With the latest revision this PR passed the C/C++ code formatter. |
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
clang/test/Headers/ms-intrin.cpp
Outdated
@@ -36,6 +36,28 @@ typedef __SIZE_TYPE__ size_t; | |||
|
|||
#include <intrin.h> | |||
|
|||
#ifdef __ARM_ACLE | |||
// arm_acle.h needs some stdint types, but -ffreestanding prevents us from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't that be fixed in arm_acle.h itself so it includes the headers with the types it needs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arm_acle.h
already includes the correct header, stdint.h
, the problem here is this test uses a custom stdint.h
in the Inputs/include
directory that doesn't have the required types, a better fix is to just add this code to that custom stdint.h
, I'll do that in the next commit, thanks.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I think this is a good direction.
This breaks real-world code with MSVC headers. Testcase: #include <windows.h>
#include <winnt.h>
void func(void) { _InstructionSynchronizationBarrier(); }
|
Investigating, thanks |
Fix #136742 |
Instead of defining ARM ACLE intrinsics only on MSVC and guarding wrapper functions in headers with
__has_builtin
, universally define the intrinsics as target header builtins.