-
Notifications
You must be signed in to change notification settings - Fork 13.4k
[Clang][AArch64] Add pessimistic vscale_range for sve/sme #137624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…eatures The "target-features" function attribute is not currently considered when adding vscale_range to a function. When +sve is pushed onto functions with "#pragma attribute push(+sve)", the function potentially misses out on optimizations that rely on vscale_range being present.
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-backend-aarch64 Author: Matthew Devereau (MDevereau) Changes…eatures The "target-features" function attribute is not currently considered when adding vscale_range to a function. When +sve is pushed onto functions with "#pragma attribute push(+sve)", the function potentially misses out on optimizations that rely on vscale_range being present. Full diff: https://github.com/llvm/llvm-project/pull/137624.diff 8 Files Affected:
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index 8c3dcda25bc8d..ba24f2aebcf5b 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -48,6 +48,7 @@
namespace llvm {
struct fltSemantics;
+class Function;
}
namespace clang {
@@ -1037,8 +1038,8 @@ class TargetInfo : public TransferrableTargetInfo,
/// Returns target-specific min and max values VScale_Range.
virtual std::optional<std::pair<unsigned, unsigned>>
- getVScaleRange(const LangOptions &LangOpts,
- bool IsArmStreamingFunction) const {
+ getVScaleRange(const LangOptions &LangOpts, bool IsArmStreamingFunction,
+ llvm::Function *F = nullptr) const {
return std::nullopt;
}
/// The __builtin_clz* and __builtin_ctz* built-in
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 3633bab6e0df9..045df30e6c77a 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -19,6 +19,7 @@
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringSwitch.h"
+#include "llvm/IR/Function.h"
#include "llvm/TargetParser/AArch64TargetParser.h"
#include "llvm/TargetParser/ARMTargetParserCommon.h"
#include <optional>
@@ -794,7 +795,8 @@ AArch64TargetInfo::getTargetBuiltins() const {
std::optional<std::pair<unsigned, unsigned>>
AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts,
- bool IsArmStreamingFunction) const {
+ bool IsArmStreamingFunction,
+ llvm::Function *F) const {
if (LangOpts.VScaleMin || LangOpts.VScaleMax)
return std::pair<unsigned, unsigned>(
LangOpts.VScaleMin ? LangOpts.VScaleMin : 1, LangOpts.VScaleMax);
@@ -802,6 +804,13 @@ AArch64TargetInfo::getVScaleRange(const LangOptions &LangOpts,
if (hasFeature("sve") || (IsArmStreamingFunction && hasFeature("sme")))
return std::pair<unsigned, unsigned>(1, 16);
+ if (F && F->hasFnAttribute("target-features")) {
+ StringRef Str = F->getFnAttribute("target-features").getValueAsString();
+ for (const auto &s : llvm::split(Str, ",")) {
+ if (s == "+sve")
+ return std::pair<unsigned, unsigned>(1, 16);
+ }
+ }
return std::nullopt;
}
diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h
index 2fab88cfca901..eb9fc24e51638 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -18,6 +18,10 @@
#include "llvm/TargetParser/AArch64TargetParser.h"
#include <optional>
+namespace llvm {
+class Function;
+}
+
namespace clang {
namespace targets {
@@ -187,8 +191,8 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo {
llvm::SmallVector<Builtin::InfosShard> getTargetBuiltins() const override;
std::optional<std::pair<unsigned, unsigned>>
- getVScaleRange(const LangOptions &LangOpts,
- bool IsArmStreamingFunction) const override;
+ getVScaleRange(const LangOptions &LangOpts, bool IsArmStreamingFunction,
+ llvm::Function *F = nullptr) const override;
bool doesFeatureAffectCodeGen(StringRef Name) const override;
bool validateCpuSupports(StringRef FeatureStr) const override;
bool hasFeature(StringRef Feature) const override;
diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp
index 390ef0f3ac884..95211cc7c5554 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -332,7 +332,8 @@ bool RISCVTargetInfo::initFeatureMap(
std::optional<std::pair<unsigned, unsigned>>
RISCVTargetInfo::getVScaleRange(const LangOptions &LangOpts,
- bool IsArmStreamingFunction) const {
+ bool IsArmStreamingFunction,
+ llvm::Function *F) const {
// RISCV::RVVBitsPerBlock is 64.
unsigned VScaleMin = ISAInfo->getMinVLen() / llvm::RISCV::RVVBitsPerBlock;
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index c26aa19080162..b072ccfe28ac4 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -20,6 +20,10 @@
#include "llvm/TargetParser/Triple.h"
#include <optional>
+namespace llvm {
+class Function;
+}
+
namespace clang {
namespace targets {
@@ -99,8 +103,8 @@ class RISCVTargetInfo : public TargetInfo {
const std::vector<std::string> &FeaturesVec) const override;
std::optional<std::pair<unsigned, unsigned>>
- getVScaleRange(const LangOptions &LangOpts,
- bool IsArmStreamingFunction) const override;
+ getVScaleRange(const LangOptions &LangOpts, bool IsArmStreamingFunction,
+ llvm::Function *F = nullptr) const override;
bool hasFeature(StringRef Feature) const override;
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 4d29ceace646f..e9b1231e79289 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1117,7 +1117,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
// Add vscale_range attribute if appropriate.
std::optional<std::pair<unsigned, unsigned>> VScaleRange =
getContext().getTargetInfo().getVScaleRange(
- getLangOpts(), FD ? IsArmStreamingFunction(FD, true) : false);
+ getLangOpts(), FD ? IsArmStreamingFunction(FD, true) : false, CurFn);
if (VScaleRange) {
CurFn->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs(
getLLVMContext(), VScaleRange->first, VScaleRange->second));
diff --git a/clang/test/CodeGen/AArch64/cpu-supports-target.c b/clang/test/CodeGen/AArch64/cpu-supports-target.c
index a39ffd4e4a74d..9b551a0714e74 100644
--- a/clang/test/CodeGen/AArch64/cpu-supports-target.c
+++ b/clang/test/CodeGen/AArch64/cpu-supports-target.c
@@ -220,7 +220,7 @@ int test_versions() {
//.
// CHECK: attributes #[[ATTR0]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
// CHECK: attributes #[[ATTR1]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon" }
-// CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+sve" }
+// CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone vscale_range(1,16) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+sve" }
//.
// CHECK: [[META0:![0-9]+]] = !{i32 1, !"wchar_size", i32 4}
// CHECK: [[META1:![0-9]+]] = !{!"{{.*}}clang version {{.*}}"}
diff --git a/clang/test/CodeGen/AArch64/targetattr.c b/clang/test/CodeGen/AArch64/targetattr.c
index cfe115bf97ed3..7052f2917cf71 100644
--- a/clang/test/CodeGen/AArch64/targetattr.c
+++ b/clang/test/CodeGen/AArch64/targetattr.c
@@ -201,21 +201,21 @@ void applem4() {}
//.
// CHECK: attributes #[[ATTR0]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+lse,+neon,+ras,+rdm,+v8.1a,+v8.2a,+v8a" }
-// CHECK: attributes #[[ATTR1]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a" }
-// CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8a" }
-// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" }
-// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve-bitperm,+sve2,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
+// CHECK: attributes #[[ATTR1]] = { noinline nounwind optnone vscale_range(1,16) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+v8.1a,+v8.2a,+v8a" }
+// CHECK: attributes #[[ATTR2]] = { noinline nounwind optnone vscale_range(1,16) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+crc,+fp-armv8,+fullfp16,+lse,+neon,+ras,+rdm,+sve,+sve2,+v8.1a,+v8.2a,+v8a" }
+// CHECK: attributes #[[ATTR3]] = { noinline nounwind optnone vscale_range(1,16) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" }
+// CHECK: attributes #[[ATTR4]] = { noinline nounwind optnone vscale_range(1,16) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a710" "target-features"="+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+ete,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+mte,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+ssbs,+sve,+sve-bitperm,+sve2,+trbe,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+v9a" }
// CHECK: attributes #[[ATTR5]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="cortex-a710" }
// CHECK: attributes #[[ATTR6]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+ete,+fp-armv8,+neon,+trbe,+v8a" }
// CHECK: attributes #[[ATTR7]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="generic" }
// CHECK: attributes #[[ATTR8]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+crc,+dotprod,+fp-armv8,+fullfp16,+lse,+neon,+perfmon,+ras,+rcpc,+rdm,+sha2,+spe,+ssbs,+v8.1a,+v8.2a,+v8a" "tune-cpu"="cortex-a710" }
-// CHECK: attributes #[[ATTR9]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+sve" "tune-cpu"="cortex-a710" }
-// CHECK: attributes #[[ATTR10]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+ccdp,+ccidx,+ccpp,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a" }
+// CHECK: attributes #[[ATTR9]] = { noinline nounwind optnone vscale_range(1,16) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+sve" "tune-cpu"="cortex-a710" }
+// CHECK: attributes #[[ATTR10]] = { noinline nounwind optnone vscale_range(1,16) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+ccdp,+ccidx,+ccpp,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a" }
// CHECK: attributes #[[ATTR11]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+ccdp,+ccidx,+ccpp,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a,-sve" }
-// CHECK: attributes #[[ATTR12]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+sve" }
+// CHECK: attributes #[[ATTR12]] = { noinline nounwind optnone vscale_range(1,16) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+sve" }
// CHECK: attributes #[[ATTR13]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16" }
-// CHECK: attributes #[[ATTR14]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
-// CHECK: attributes #[[ATTR15]] = { noinline nounwind optnone "branch-target-enforcement" "guarded-control-stack" "no-trapping-math"="true" "sign-return-address"="non-leaf" "sign-return-address-key"="a_key" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
+// CHECK: attributes #[[ATTR14]] = { noinline nounwind optnone vscale_range(1,16) "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
+// CHECK: attributes #[[ATTR15]] = { noinline nounwind optnone vscale_range(1,16) "branch-target-enforcement" "guarded-control-stack" "no-trapping-math"="true" "sign-return-address"="non-leaf" "sign-return-address-key"="a_key" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
// CHECK: attributes #[[ATTR16]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
// CHECK: attributes #[[ATTR17]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="-v9.3a" }
// CHECK: attributes #[[ATTR18]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="apple-m4" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fpac,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+sha3,+sme,+sme-f64f64,+sme-i16i64,+sme2,+spe-eef,+ssbs,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8.7a,+v8a,+wfxt" }
|
clang/lib/Basic/Targets/AArch64.cpp
Outdated
if (F && F->hasFnAttribute("target-features")) { | ||
StringRef Str = F->getFnAttribute("target-features").getValueAsString(); | ||
for (const auto &s : llvm::split(Str, ",")) { | ||
if (s == "+sve") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also be doing something similar for sme in streaming mode, i.e.
if (s == "+sve" || (IsArmStreamingFunction && s == "+sme"))
? Might also be worth having a test for this if you agree it makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it makes sense, but the reason I hesitated was IsArmStreamingFunction looks like it could become an internal function of FunctionDecl which should help keep CodeGenFunction.cpp target agnostic. I thought it might be better to submit the changes to SME streaming mode as a separate patch along with any refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough!
nit: Looks like the title for the PR has spilled over into the commit message. |
Normally the way we handle target features is through ASTContext::getFunctionFeatureMap(); does that work here? |
It works for AArch64, but it's causing errors in unexpected places. CodeGen/X86/avx512-error.c and CodeGen/target-avx-abi-diag.c fail with
This is just when doing some simple setup: if (FD){
getContext().getFunctionFeatureMap(FeatureMap, FD);
IsArmStreaming = IsArmStreamingFunction(FD, true);
}
std::optional<std::pair<unsigned, unsigned>> VScaleRange =
getContext().getTargetInfo().getVScaleRange(
getLangOpts(), IsArmStreaming, &FeatureMap);
if (VScaleRange) {
CurFn->addFnAttr(llvm::Attribute::getWithVScaleRangeArgs(
getLLVMContext(), VScaleRange->first, VScaleRange->second));
} I assume |
Those tests are already generating that warning; I assume the "new" warning is a duplicate? That seems like a bug in the x86 handling for target features: it should generate a warning at most once per function, and the warning should point to the function in question. (Currently, it generates three warnings per function with no source location. CC @phoebewang @e-kud @KanRobert .) It might make sense to refactor the way getFunctionFeatureMap works; maybe add some kind of cache so we aren't constantly recomputing it. |
clang/lib/Basic/Targets/AArch64.cpp
Outdated
if (LangOpts.VScaleMin || LangOpts.VScaleMax) | ||
return std::pair<unsigned, unsigned>( | ||
LangOpts.VScaleMin ? LangOpts.VScaleMin : 1, LangOpts.VScaleMax); | ||
|
||
if (hasFeature("sve") || (IsArmStreamingFunction && hasFeature("sme"))) | ||
if (hasFeature("sve") || (IsArmStreamingFunction && hasFeature("sme")) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you're here, please also fix the SME feature check.
The "target-features" function attribute is not currently considered when adding vscale_range to a function. When +sve/+sme are pushed onto functions with "#pragma attribute push(+sve/+sme)", the function potentially misses out on optimizations that rely on vscale_range being present.