-
Notifications
You must be signed in to change notification settings - Fork 13.3k
[DataLayout] Introduce sentinel pointer value #131557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-lld @llvm/pr-subscribers-llvm-ir Author: Shilei Tian (shiltian) ChangesThe value of a null pointer is not always We do have This PR introduces the concept of a sentinel pointer value to
A follow-up patch series will introduce an equivalent of Full diff: https://github.com/llvm/llvm-project/pull/131557.diff 5 Files Affected:
diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index a42b4589fb5ac..1506387a41d96 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -32,9 +32,9 @@ static const char *const DataLayoutStringR600 =
"-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1";
static const char *const DataLayoutStringAMDGCN =
- "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32"
- "-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:"
- "32-v48:64-v96:128"
+ "e-p:64:64-p1:64:64-p2:32:32-p3:32:32:32:32:f-p4:64:64-p5:32:32:32:32:f"
+ "-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16"
+ "-v24:32-v32:32-v48:64-v96:128"
"-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1"
"-ni:7:8:9";
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index c0567090fdd2a..a22020da38f83 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -3134,7 +3134,7 @@ as follows:
``A<address space>``
Specifies the address space of objects created by '``alloca``'.
Defaults to the default address space of 0.
-``p[n]:<size>:<abi>[:<pref>][:<idx>]``
+``p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]]``
This specifies the *size* of a pointer and its ``<abi>`` and
``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
@@ -3143,7 +3143,10 @@ as follows:
specified, the default index size is equal to the pointer size. All sizes
are in bits. The address space, ``n``, is optional, and if not specified,
denotes the default address space 0. The value of ``n`` must be
- in the range [1,2^24).
+ in the range [1,2^24). The fifth parameter ``<sentinel>`` specifies the
+ sentinel value of the pointer for the corresponding address space. It
+ currently accepts two values: ``0`` for an all-zero value and ``f`` for a
+ full-bit set value. The default sentinel pointer value is all-zero.
``i<size>:<abi>[:<pref>]``
This specifies the alignment for an integer type of a given bit
``<size>``. The value of ``<size>`` must be in the range [1,2^24).
@@ -13045,9 +13048,9 @@ This instruction requires several arguments:
- Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
- The call is in tail position (ret immediately follows call and ret
uses value of call or is void).
- - Option ``-tailcallopt`` is enabled, ``llvm::GuaranteedTailCallOpt`` is
+ - Option ``-tailcallopt`` is enabled, ``llvm::GuaranteedTailCallOpt`` is
``true``, or the calling convention is ``tailcc``.
- - `Platform-specific constraints are met.
+ - `Platform-specific constraints are met.
<CodeGenerator.html#tail-call-optimization>`_
#. The optional ``notail`` marker indicates that the optimizers should not add
diff --git a/llvm/include/llvm/IR/DataLayout.h b/llvm/include/llvm/IR/DataLayout.h
index 2ad080e6d0cd2..1257ffb32a4df 100644
--- a/llvm/include/llvm/IR/DataLayout.h
+++ b/llvm/include/llvm/IR/DataLayout.h
@@ -78,6 +78,7 @@ class DataLayout {
Align ABIAlign;
Align PrefAlign;
uint32_t IndexBitWidth;
+ APInt SentinelValue;
/// Pointers in this address space don't have a well-defined bitwise
/// representation (e.g. may be relocated by a copying garbage collector).
/// Additionally, they may also be non-integral (i.e. containing additional
@@ -148,7 +149,7 @@ class DataLayout {
/// Sets or updates the specification for pointer in the given address space.
void setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth, Align ABIAlign,
Align PrefAlign, uint32_t IndexBitWidth,
- bool IsNonIntegral);
+ APInt SentinelValue, bool IsNonIntegral);
/// Internal helper to get alignment for integer of given bitwidth.
Align getIntegerAlignment(uint32_t BitWidth, bool abi_or_pref) const;
@@ -552,6 +553,11 @@ class DataLayout {
///
/// This includes an explicitly requested alignment (if the global has one).
Align getPreferredAlign(const GlobalVariable *GV) const;
+
+ /// Returns the sentinel pointer value for a given address space. If the
+ /// address space is invalid, it defaults to the sentinel pointer value of
+ /// address space 0, aligning with the behavior of \p getPointerSpec.
+ APInt getSentinelPointerValue(unsigned AS) const;
};
inline DataLayout *unwrap(LLVMTargetDataRef P) {
diff --git a/llvm/lib/IR/DataLayout.cpp b/llvm/lib/IR/DataLayout.cpp
index 0cf0bfc9702d3..058775d68465b 100644
--- a/llvm/lib/IR/DataLayout.cpp
+++ b/llvm/lib/IR/DataLayout.cpp
@@ -206,9 +206,10 @@ constexpr DataLayout::PrimitiveSpec DefaultVectorSpecs[] = {
};
// Default pointer type specifications.
-constexpr DataLayout::PointerSpec DefaultPointerSpecs[] = {
- // p0:64:64:64:64
- {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, false},
+const DataLayout::PointerSpec DefaultPointerSpecs[] = {
+ // p0:64:64:64:64:0
+ {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, APInt(64, 0),
+ false},
};
DataLayout::DataLayout()
@@ -296,6 +297,22 @@ static Error parseSize(StringRef Str, unsigned &BitWidth,
return Error::success();
}
+static Error parseSentinelValue(StringRef Str, APInt &V) {
+ if (Str.empty())
+ return createStringError("sentinel value component cannot be empty");
+ if (Str.size() != 1)
+ return createStringError("sentinel value component must be a '0' or 'f'");
+ if (Str[0] == '0') {
+ V.clearAllBits();
+ return Error::success();
+ }
+ if (Str[0] == 'f') {
+ V.setAllBits();
+ return Error::success();
+ }
+ return createStringError("sentinel value component must be a '0' or 'f'");
+}
+
/// Attempts to parse an alignment component of a specification.
///
/// On success, returns the value converted to byte amount in \p Alignment.
@@ -409,13 +426,14 @@ Error DataLayout::parseAggregateSpec(StringRef Spec) {
}
Error DataLayout::parsePointerSpec(StringRef Spec) {
- // p[<n>]:<size>:<abi>[:<pref>[:<idx>]]
+ // p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]]
SmallVector<StringRef, 5> Components;
assert(Spec.front() == 'p');
Spec.drop_front().split(Components, ':');
- if (Components.size() < 3 || Components.size() > 5)
- return createSpecFormatError("p[<n>]:<size>:<abi>[:<pref>[:<idx>]]");
+ if (Components.size() < 3 || Components.size() > 6)
+ return createSpecFormatError(
+ "p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]]");
// Address space. Optional, defaults to 0.
unsigned AddrSpace = 0;
@@ -454,8 +472,14 @@ Error DataLayout::parsePointerSpec(StringRef Spec) {
return createStringError(
"index size cannot be larger than the pointer size");
+ APInt SentinelValue(BitWidth, 0);
+ if (Components.size() > 5) {
+ if (Error Err = parseSentinelValue(Components[5], SentinelValue))
+ return Err;
+ }
+
setPointerSpec(AddrSpace, BitWidth, ABIAlign, PrefAlign, IndexBitWidth,
- false);
+ SentinelValue, /*IsNonIntegral=*/false);
return Error::success();
}
@@ -631,7 +655,7 @@ Error DataLayout::parseLayoutString(StringRef LayoutString) {
// the spec for AS0, and we then update that to mark it non-integral.
const PointerSpec &PS = getPointerSpec(AS);
setPointerSpec(AS, PS.BitWidth, PS.ABIAlign, PS.PrefAlign, PS.IndexBitWidth,
- true);
+ PS.SentinelValue, /*IsNonIntegral=*/true);
}
return Error::success();
@@ -679,16 +703,19 @@ DataLayout::getPointerSpec(uint32_t AddrSpace) const {
void DataLayout::setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth,
Align ABIAlign, Align PrefAlign,
- uint32_t IndexBitWidth, bool IsNonIntegral) {
+ uint32_t IndexBitWidth, APInt SentinelValue,
+ bool IsNonIntegral) {
auto I = lower_bound(PointerSpecs, AddrSpace, LessPointerAddrSpace());
if (I == PointerSpecs.end() || I->AddrSpace != AddrSpace) {
PointerSpecs.insert(I, PointerSpec{AddrSpace, BitWidth, ABIAlign, PrefAlign,
- IndexBitWidth, IsNonIntegral});
+ IndexBitWidth, SentinelValue,
+ IsNonIntegral});
} else {
I->BitWidth = BitWidth;
I->ABIAlign = ABIAlign;
I->PrefAlign = PrefAlign;
I->IndexBitWidth = IndexBitWidth;
+ I->SentinelValue = SentinelValue;
I->IsNonIntegral = IsNonIntegral;
}
}
@@ -1020,3 +1047,10 @@ Align DataLayout::getPreferredAlign(const GlobalVariable *GV) const {
}
return Alignment;
}
+
+APInt DataLayout::getSentinelPointerValue(unsigned AS) const {
+ auto I = lower_bound(PointerSpecs, AS, LessPointerAddrSpace());
+ if (I != PointerSpecs.end() || I->AddrSpace == AS)
+ return I->SentinelValue;
+ return PointerSpecs[0].SentinelValue;
+}
diff --git a/llvm/unittests/IR/DataLayoutTest.cpp b/llvm/unittests/IR/DataLayoutTest.cpp
index 16a603ff6416f..79d46ebfd7d8e 100644
--- a/llvm/unittests/IR/DataLayoutTest.cpp
+++ b/llvm/unittests/IR/DataLayoutTest.cpp
@@ -313,11 +313,12 @@ TEST(DataLayout, ParsePointerSpec) {
EXPECT_THAT_EXPECTED(DataLayout::parse(Str), Succeeded());
for (StringRef Str :
- {"p", "p0", "p:32", "p0:32", "p:32:32:32:32:32", "p0:32:32:32:32:32"})
+ {"p", "p0", "p:32", "p0:32", "p:32:32:32:32:32:0", "p0:32:32:32:32:32:0"})
EXPECT_THAT_EXPECTED(
DataLayout::parse(Str),
- FailedWithMessage("malformed specification, must be of the form "
- "\"p[<n>]:<size>:<abi>[:<pref>[:<idx>]]\""));
+ FailedWithMessage(
+ "malformed specification, must be of the form "
+ "\"p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]]\""));
// address space
for (StringRef Str : {"p0x0:32:32", "px:32:32:32", "p16777216:32:32:32:32"})
@@ -401,6 +402,19 @@ TEST(DataLayout, ParsePointerSpec) {
EXPECT_THAT_EXPECTED(
DataLayout::parse(Str),
FailedWithMessage("index size cannot be larger than the pointer size"));
+
+ // sentinel value
+ for (StringRef Str :
+ {"p:32:32:32:32:a", "p0:32:32:32:32:ab", "p42:32:32:32:32:123"})
+ EXPECT_THAT_EXPECTED(
+ DataLayout::parse(Str),
+ FailedWithMessage("sentinel value component must be a '0' or 'f'"));
+
+ for (StringRef Str :
+ {"p:32:32:32:32:", "p0:32:32:32:32:", "p42:32:32:32:32:"})
+ EXPECT_THAT_EXPECTED(
+ DataLayout::parse(Str),
+ FailedWithMessage("sentinel value component cannot be empty"));
}
TEST(DataLayoutTest, ParseNativeIntegersSpec) {
|
You can test this locally with the following command:git-clang-format --diff 3757ecf5f16c0d9b8cbfc1c9d41965df537d43e6 55227d747544a2ccc5c981cff879063f9cebc253 --extensions c,cpp,h -- clang/lib/Basic/Targets/AMDGPU.cpp clang/test/CodeGen/target-data.c llvm/include/llvm/IR/DataLayout.h llvm/lib/IR/DataLayout.cpp llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp llvm/unittests/IR/DataLayoutTest.cpp View the diff from clang-format here.diff --git a/llvm/unittests/IR/DataLayoutTest.cpp b/llvm/unittests/IR/DataLayoutTest.cpp
index 090baac4b6..f01f37b20d 100644
--- a/llvm/unittests/IR/DataLayoutTest.cpp
+++ b/llvm/unittests/IR/DataLayoutTest.cpp
@@ -312,8 +312,8 @@ TEST(DataLayout, ParsePointerSpec) {
"p16777215:32:32:64:8", "p16777215:16777215:32768:32768:16777215"})
EXPECT_THAT_EXPECTED(DataLayout::parse(Str), Succeeded());
- for (StringRef Str :
- {"p", "p0", "p:32", "p0:32", "p:32:32:32:32:32:0", "p0:32:32:32:32:32:0"})
+ for (StringRef Str : {"p", "p0", "p:32", "p0:32", "p:32:32:32:32:32:0",
+ "p0:32:32:32:32:32:0"})
EXPECT_THAT_EXPECTED(
DataLayout::parse(Str),
FailedWithMessage(
|
c966236
to
86cd48c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs an RFC. For reference a previous attempt was at #83109
llvm/docs/LangRef.rst
Outdated
in the range [1,2^24). The fifth parameter ``<sentinel>`` specifies the | ||
sentinel value of the pointer for the corresponding address space. It | ||
currently accepts two values: ``0`` for an all-zero value and ``f`` for a | ||
full-bit set value. The default sentinel pointer value is all-zero. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"no sentinel pointer" should also be a valid option. We also should be able to specify the default for unlisted address spaces (I thought the old PR had that part implemented?)
I'm also not sure if we should just allow an arbitrary bit pattern here.
This also should define what the sentinel pointer means. Also clarify how this interacts with null_pointer_is_valid and the various flavors of nonnull
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also should be able to specify the default for unlisted address spaces (I thought the old PR had that part implemented?)
The implementation has it set to 0 at the moment by default if nothing is specified. I will mention this here.
I'm also not sure if we should just allow an arbitrary bit pattern here.
We could extend it in the future IMHO. There is nothing practical in the upstream use. There is indeed one comment used in downstream in the RFC mentioned arbitrary bit for nullptr.
This also should define what the sentinel pointer means. Also clarify how this interacts with null_pointer_is_valid and the various flavors of nonnull
I'll resolve them next.
The RFC was posted, as mentioned in a previous comment. I discussed offline with the author of the previous attempt and took it over. |
bc7c7a3
to
c3f38d0
Compare
efda127
to
015964e
Compare
The value of a null pointer is not always `0`. For example, on AMDGPU, the null pointer in address spaces 3 and 5 is `0xffffffff`. Currently, there is no target-independent way to get this information, making it difficult and error-prone to handle null pointers in target-agnostic code. We do have `ConstantPointerNull`, but it might be a little confusing and misleading. It represents a pointer with an all-zero value rather than necessarily a real `nullptr`. This PR introduces the concept of a *sentinel pointer value* to `DataLayout`, representing the actual `nullptr` value for a given address space. The changes include: - A new interface function: ``` APInt getSentinelPointerValue(unsigned AS) ``` This function returns an `APInt` representing the sentinel pointer value for the given address space `AS`. An `APInt` is used instead of a literal integer to support cases where pointers are wider than 64 bits (e.g., AMDGPU’s address space 8). - An extension to the data layout string format: ``` p[n]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]] ``` The new `<sentinel>` component specifies the sentinel value for the corresponding pointer. It currently supports two values: - `0` for an all-zero value - `f` for a full-bit set value These two values are the most common representations of `nullptr`. It is unlikely that any target would define `nullptr` as a random value. A follow-up patch series will introduce an equivalent of `ConstantPointerNull` that represents the actual `nullptr`, built on top of this PR.
015964e
to
55227d7
Compare
The value of a null pointer is not always
0
. For example, on AMDGPU, the nullpointer in address spaces 3 and 5 is
0xffffffff
. Currently, there is notarget-independent way to get this information, making it difficult and
error-prone to handle null pointers in target-agnostic code.
We do have
ConstantPointerNull
, but it might be a little confusing andmisleading. It represents a pointer with an all-zero value rather than
necessarily a real
nullptr
.This PR introduces the concept of a sentinel pointer value to
DataLayout
,representing the actual
nullptr
value for a given address space. The changesinclude:
A new interface function:
This function returns an
APInt
representing the sentinel pointer value forthe given address space
AS
. AnAPInt
is used instead of a literal integerto support cases where pointers are wider than 64 bits (e.g., AMDGPU’s address
space 8).
An extension to the data layout string format:
The new
<sentinel>
component specifies the sentinel value for thecorresponding pointer. It currently supports two values:
z
for an all-zero valuef
for a full-bit set valueThese two values are the most common representations of
nullptr
. It isunlikely that any target would define
nullptr
as a random value.A follow-up patch series will introduce an equivalent of
ConstantPointerNull
that represents the actual
nullptr
, built on top of this PR.