Description
Occasionally, people ask about a pattern where a program has extern statics like:
extern "C" {
static mut BEGIN: u32;
static mut END: u32;
}
and the intention is not that these are two integers, but that these indicate an array of integers between the two addresses.
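To make the intended use concrete, here is a minimal sketch of the address arithmetic involved. The helper `region_as_slice` and the `demo` function are hypothetical names, not taken from any real codebase, and the sketch uses a local array so that it is well-defined regardless of how the question in this issue is answered.

```rust
use core::slice;

/// Hypothetical helper: view the memory in `[begin, end)` as a slice of `u32`.
///
/// # Safety
/// `begin` and `end` must point into (or one past the end of) the same
/// allocation, with `begin <= end`, and the memory in between must be
/// aligned, initialized, and valid for reads for the lifetime `'a`.
unsafe fn region_as_slice<'a>(begin: *const u32, end: *const u32) -> &'a [u32] {
    // Number of `u32` elements between the two addresses.
    let len = unsafe { end.offset_from(begin) } as usize;
    // Reinterpret the region as a slice of that many elements.
    unsafe { slice::from_raw_parts(begin, len) }
}

/// Stand-in for the linker-provided region: both pointers are derived from
/// one array, so they are known to cover the whole region.
fn demo() -> Vec<u32> {
    let data = [10u32, 20, 30, 40];
    let range = data.as_ptr_range();
    let region = unsafe { region_as_slice(range.start, range.end) };
    region.to_vec()
}
```

In the pattern described above, the two pointers would instead be obtained by taking the addresses of BEGIN and END. Whether a pointer derived from BEGIN may then be used to access memory beyond its first 4 bytes is the open question here; the sketch does not answer it.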
Is that legal, or may the compiler assume that accesses outside of the u32 are out-of-bounds and thus UB?
That boils down to two questions:
- Does LLVM assume, based on the type of a static, that the static is only as big as its type says, so that accesses beyond that size are UB?
- If no, does Rust introduce such an assumption?
The former is something that hopefully the LLVM docs can answer. If the answer is "yes", Rust would have to follow suit or lobby for changing LLVM. If the answer is "no", we can decide either way.
The pattern is also a bit weird in that it promises that there are at least 4 bytes of memory available at each of BEGIN and END. If the actual region of memory is smaller than that, I am fairly sure that at least any use of the static is UB -- BEGIN denotes a place of size 4, and it is (currently) UB to create dangling places. If we wanted to change this, we would again need to start by figuring out what LLVM's rules are.