Description
For what it's worth, I think it would be nice if the compiler could do scalar-replacement-of-aggregates (i.e. splitting a structure-typed variable into individual variables for each of the fields, which can then potentially be stored in registers rather than memory) even when a reference to one of the fields escapes.
My response is:
Note that repr(Rust) doesn't save us here; repr(Rust) means the layout is unspecified but offset_of! should still work. I am not sure how to specify a language in a reasonably simple way that allows this.
(I think C may allow this optimization, making container_of-style code illegal in C. But of course tons of code still does it in practice and we all just hope things don't fall apart. I do not consider that an acceptable outcome for Rust.)
and then @comex asks
Well, have you thought about which other optimizations may be impacted by this?
Not sure what you mean -- "this" = having byte-level untyped memory? This is what LLVM does, and to my knowledge all its optimizations are compatible with that model, so I'd say the impact is not big. But I am not an expert in what kinds of optimizations compilers (want to) do. I am coming more from the perspective of how to specify the language to enable optimizations that others tell me are desirable. Given that most languages have an ambiguous or no specification of what exactly constitutes UB, I feel like that's a niche I need to fill. ;)
Being able to separate the fields of a struct even when there are pointers to it is easy in languages with a high-level view of memory, but immensely difficult in C/C++/Rust which allow byte-wise memory manipulations. While it is possible to do so, I think the resulting semantics are inscrutable for programmers. So the real question here is, how much complexity in the spec (with the associated amount of bugs caused by misunderstandings, and complexity in Miri-like tools that aim to reflect the spec) are we willing to accept to enable such optimizations?
Activity
Diggsey commentedon Aug 6, 2020
Why would
offset_of!
break that optimisation? If a reference to a field in a struct escapes, then the provenance of that reference means that the called code cannot access any other fields on the struct, so the compiler could happily put them in registers.RalfJung commentedon Aug 6, 2020
That would use provenance to justify the optimization, which I think is very different from what @comex imagined.
Also, this doesn't work when we talk about a raw pointer to the field instead of a reference.
comex commentedon Aug 6, 2020
Sorry, to clarify:
By "this" I meant
container_of
-type operations, i.e. computing an outer object pointer from an inner object pointer.I was talking about separating the fields of a struct even if there is an escaped pointer to a field – but not pointers to the struct itself, or to any other fields. So the field which had pointers to it would be laid out as normal, but the rest of the struct might not be.
If
container_of
-type operations are forbidden for references, and if the pointer-to-field was computed using references (&foo.bar
not&raw foo.bar
), this optimization should be sound... at least in some cases.At minimum, it should be fully semantics-preserving to allocate stack space for the entire struct, but only write to the field in question, leaving the rest as uninitialized memory. Really that's just a special case of aliasing assumptions combined with dead store elimination.
As a stretch goal, it would also be nice to only allocate stack space for the field, but I suppose there are a few speed bumps. Even if the user wasn't allowed to dereference the result of
container_of
, they could still compute it, and make assumptions about the pointer value itself:I don't know how feasible it is to actually enforce these restrictions, so in practice they might rule out the "only allocate space tor the field" optimization altogether. It might be interesting to consider whether the memory model could be adjusted to free the compiler from having to enforce them. But again, that's a stretch goal; wasting a bit of stack space is usually not a big deal.
RalfJung commentedon Aug 8, 2020
That is what I assumed. This relies on a view of memory not as a chunk of bytes, but as something "tree-like" that is structured according to the types of the data stored in there. C tries to do that. It is a mess.
The question is, how do you forbid them? Languages are not defined by saying with optimizations are allowed, they are defined by a concrete operational semantics that says what happens when you execute the program. To permit optimizations like the one you describe, that operational semantics has to somehow make it such that pointer arithmetic starting at one field of a struct cannot reach other fields of that struct. Provenance can do that, but only when aliasing assumptions come into play -- not when raw pointers are used.
In that case the only avenue left is to make it so that the fields of a struct are not laid out next to each other in some byte-wise fashion, but instead form a tree in memory so that the fields are not next to each other linearly. But then you have to explain why
memcpy
and things like that work, and you end up making a lot of real-world code UB.If the pointer-to-field is raw, there are no aliasing assumptions we can make.
comex commentedon Aug 12, 2020
As I said, this optimization would not be applicable for raw pointers to fields.
RalfJung commentedon Aug 12, 2020
So you'd only do it if there were references to fields, but not if there were any raw pointers that escaped?
Yeah that should work, if we adopt Stacked Borrows or something similar. That would not rely on a different view of memory, so it is a fundamentally very different proposal.
It is in conflict with #134.
[-]What about: optimizations that rely on a higher-level view of memory[/-][+]What about: splitting struct into separate fields despite escaped references (maybe even raw pointers)[/+]&T
rust-lang/rust-memory-model#19Jules-Bertholet commentedon Jul 14, 2023
Varargs potentially wants something like this: https://rust-lang.zulipchat.com/#narrow/stream/136281-t-opsem/topic/Varargs.20SROA.20sadness
Dante-Broggi commentedon Jul 16, 2023
IIUC, Swift does not have field splitting / SROA optimization for unsafe pointers, but does for inout and shared arguments, because all values are semantically "in registers" and the only ways to semantically ensure a value actually has an address is the withUnsafe*Pointer(to:) family of intrinsic functions, or if the value is in a static variable.
A particular example of this is described in a blog post by Matt Gallagher, where only taking the address of the first element of an array passed to
strcpy
causes the tail of the array to simply not be allocated.Reflecting back to Rust, this would result in new reference types
&?addr T
/&?addr mut T
, where the existing reference types would be&addr T
/&addr mut T
if the default would change in a new edition, where the new reference types could only be convertible to the current reference types in limited ways which would not expose the address, perhaps by an intrinsic with a callback API and not directly to pointers or integers.11 remaining items