Stacked Borrows: do we even want protectors? #372

Open
@RalfJung

Stacked Borrows 'protectors' are a mechanism ensuring that references passed to a function (including inside ADTs, Cc #125) must outlive that function -- if they are being invalidated while the function still runs, we have immediate UB.

In my view this is by far the most surprising UB in Stacked Borrows that I don't see a good fix for. Most of the other issues, in particular around mutable references prematurely invalidating things and raw pointers being too limited in the range of memory they can access, are fixable either without impacting the basic reordering optimizations, or only impacting some of the more obscure ones (such as moving a write up across an unknown function call without there already being a write before that call).

Protectors, however, are needed for all reorderings that move accesses down across unknown function calls -- even reads:

```rust
fn foo(x: &i32) -> i32 {
  let val = *x;
  unknown();
  return val; // can we return `*x` here, and not use a register for `val`?
}
```
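To make the question in that comment concrete, here is a minimal sketch of the reordering written out at source level. The names `foo_original` and `foo_reordered` and the empty body of `unknown` are assumptions made for illustration; a real `unknown` would be code the compiler cannot see into.

```rust
// Hypothetical stand-in for an opaque call. `#[inline(never)]` only
// discourages inlining; it does not make the body truly unknown.
#[inline(never)]
fn unknown() {}

// As written in the issue: read before the call and keep the value
// live across it.
fn foo_original(x: &i32) -> i32 {
    let val = *x;
    unknown();
    val
}

// After the reordering: the read has moved down past the call, so no
// value has to stay live across it.
fn foo_reordered(x: &i32) -> i32 {
    unknown();
    *x
}

fn main() {
    let n = 42;
    // With a well-behaved `unknown`, both versions agree.
    assert_eq!(foo_original(&n), foo_reordered(&n));
}
```

The two versions are only interchangeable if `unknown` is not allowed to change or free the memory behind `x` while `foo` runs, which is the guarantee protectors provide.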

Without protectors, unknown could just invalidate x by writing to it (through another alias that has write permissions), and there'd be no UB from that. (If there is a second read of x after the call to unknown, then even without protectors we can optimize assuming both reads return the same value. It is only optimizations that extend the liveness of x that need protectors.)
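A sketch of how such an alias could reach `unknown`, assuming a thread-local slot named `ALIAS` (hypothetical, chosen for illustration). The code as written leaves the slot empty, so it only executes the well-defined path; the problematic call is described in a comment rather than performed.

```rust
use std::cell::Cell;
use std::ptr;

thread_local! {
    // Hypothetical side channel through which `unknown` can reach an alias.
    static ALIAS: Cell<*mut i32> = Cell::new(ptr::null_mut());
}

fn unknown() {
    let p = ALIAS.with(|a| a.get());
    if !p.is_null() {
        // A write through the raw alias invalidates shared references
        // to the same memory.
        unsafe { *p = 99 };
    }
}

fn foo(x: &i32) -> i32 {
    let val = *x;
    unknown();
    val // `x` is not used again after the call
}

fn main() {
    let mut n = 1;
    let p = ptr::addr_of_mut!(n);
    // ALIAS is still null, so `unknown` writes nothing and this is fine.
    let r = foo(unsafe { &*p });
    assert_eq!(r, 1);
    // Storing `p` in ALIAS before the call above would make `unknown`
    // write to `n` while `x` is still protected. That is the case this
    // issue is about: UB with protectors, no UB without them.
}
```

As far as I understand the model, the variant described in the final comment is what Miri reports as a protector violation under Stacked Borrows, even though `foo` never touches `x` again after `unknown` returns.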

Furthermore, protectors are used to justify the dereferenceable attribute in LLVM, which indicates that the reference is dereferenceable for the entire duration of foo. LLVM has a long-standing plan of also adding support for an attribute which means that x is only dereferenceable when foo starts running, but no such attribute has landed yet -- I guess they are struggling with keeping the code quality up under that weaker assumption, but @nikic might know more. It would definitely become a lot harder to analyze foo if unknown were allowed to deallocate x.

So as of today, it seems we are faced with a hard choice: either we have some super subtle UB, or we lose the dereferenceable attribute and make it a lot harder for the compiler to reason about references after an unknown function got called. On the one hand, I'd like to make unsafe code authors' lives easier by not putting the burden of such subtle rules on them; on the other hand, I don't want to pessimize optimizations in all code (safe and unsafe) just to enable some really obscure, barely needed patterns.

I'm wondering what others think here, and also what evidence we might have that could help us decide one way or the other.

Labels: A-aliasing-model (Topic: Related to the aliasing model, e.g. Stacked/Tree Borrows); A-dereferenceable (Topic: when exactly does a reference need to point to regular dereferenceable memory?); C-open-question (Category: An open question that we should revisit)
