Skip to content

implement UniqueEntityVec #17549

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jan 28, 2025
Merged

Conversation

Victoronz
Copy link
Contributor

@Victoronz Victoronz commented Jan 26, 2025

Objective

In #16547, we added EntitySets/EntitySetIterators. We can know whenever an iterator only contains unique entities, however we do not yet have the ability to collect and reuse these without either the unsafe UniqueEntityIter::from_iterator_unchecked, or the expensive HashSet::from_iter.
An important piece for being able to do this is a Vec that maintains the uniqueness property, can be collected into, and is itself EntitySet.

A lot of entity collections are already intended to be "unique", but have no way of expressing that when stored, other than using an aforementioned HashSet. Such a type helps by limiting or even removing the need for unsafe on the user side when not using a validated Set type, and makes it easier to interface with other infrastructure like f.e. RelationshipSourceCollections.

Solution

We implement UniqueEntityVec.

This is a wrapper around Vec, that only ever contains unique elements. It mirrors the API of Vec, however restricts any mutation as to not violate the uniqueness guarantee. Meaning:

  • Any inherent method which can introduce new elements or mutate existing ones is now unsafe, f.e.: insert, retain_mut
  • Methods that are impossible to use safely are omitted, f.e.: fill, extend_from_within

A handful of the unsafe methods can do element-wise mutation (retain_mut, dedup_by), which can be an unwind safety hazard were the element-wise operation to panic. For those methods, we require that each individual execution of the operation upholds uniqueness, not just the entire method as a whole.

To be safe for mutable usage, slicing and the associated slice methods require a matching UniqueEntitySlice type , which we leave for a follow-up PR.

Because this type will deref into the UniqueEntitySlice type, we also offer the immutable Vec methods on this type (which only amount to a handful). "as inner" functionality is covered by additional as_vec/as_mut_vec methods + AsRef/Borrow trait impls.
Like UniqueEntityIter::from_iterator_unchecked, this type has a from_vec_unchecked method as well.

The canonical way to safely obtain this type however is via EntitySetIterator::collect_set or UniqueEntityVec::from_entity_set_iter. Like mentioned in #17513, these are named suboptimally until supertrait item shadowing arrives, since a normal collect will still run equality checks.

@Victoronz Victoronz added C-Feature A new feature, making something new possible A-ECS Entities, components, systems, and events C-Usability A targeted quality-of-life change that makes Bevy easier to use D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes D-Unsafe Touches with unsafe code in some way S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Jan 26, 2025
@alice-i-cecile
Copy link
Member

I'd like to add support for this as a relations storage (for e.g. Children) too. I think that's a really powerful and sensible invariant to uphold there (and something we should probably swap to internally in a follow-up).

Copy link
Member

@alice-i-cecile alice-i-cecile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent stuff: this hits all the notes I was hoping for here. Great docs, clear and unsurprising API, easy conversion back into a normal vec, simple construction from trusted set-like data structures.

Copy link
Contributor

@chescock chescock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I left a few questions and suggestions, but nothing blocking.

/// This type is best obtained by its `FromEntitySetIterator` impl, via either
/// `EntityIterator::collect_set` or `UniqueEntityVec::from_entity_iter`.
///
/// While this type can be constructed via `Iterator::collect`, doing so is inefficient,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The O(N^2) FromIterator and Extend implementations seem like a bit of a performance footgun. Would it make sense to only offer them as named functions so that folks don't use them accidentally?

... Oh, I see, FromIterator is a supertrait of FromEntitySetIterator. Why is that a supertrait? UniqueEntityVec seems like an example of a type that should be FromEntitySetIterator but should not be FromIterator.

Copy link
Contributor Author

@Victoronz Victoronz Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FomEntitySetIterator is a subtrait of FromIterator because it can't exist without it!
It represents "This can be constructed from an iterator, however we can do better by skipping validation".
Any call to collect/from_iter really should evaluate to the FromEntitySetIterator version. And that is what supertrait item shadowing is for! That RFC has been accepted, and is in the process of moving through the rust dev pipeline.

The O(n^2) impls aren't necessarily wrong either, given that this struct is not bounded on either Ord or Hash. While rare, we want to leave the possibility for T to only implement Eq open.
Performance-wise, these impls can make sense for very low N, where the construction cost of things like a HashSet outweigh the few direct comparisons.

I did document that normal collect should be avoided on the struct itself, and the very concept of the struct should tell that extending this struct is not free, however I've added further documentation to these trait method impls.

It'd be nice if unintentional use of collect over collect_set could be linted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FomEntitySetIterator is a subtrait of FromIterator because it can't exist without it!
It represents "This can be constructed from an iterator, however we can do better by skipping validation".

I don't follow. Why couldn't it instead represent "this can only be constructed from an iterator of unique values"?

Not every function that accepts an EntitySetIterator can accept a plain Iterator. That's why we have these traits in the first place! So it seems perfectly reasonable to have a collection that can be created from an EntitySetIterator but not from a plain Iterator. If you want to build one from a non-unique iterator then you should call a method to validate or de-duplicate it first.

given that this struct is not bounded on either Ord or Hash

It's bounded on TrustedEntityBorrow, though, so we can always get the Entity and hash or sort that.

I did document that normal collect should be avoided on the struct itself

Yeah, but not everyone is going to read the documentation. :) They'll just write .collect() to make the type checker happy and assume it's good enough. It would be good to push them into the pit of success instead!

Copy link
Contributor Author

@Victoronz Victoronz Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FomEntitySetIterator is a subtrait of FromIterator because it can't exist without it!
It represents "This can be constructed from an iterator, however we can do better by skipping validation".

I don't follow. Why couldn't it instead represent "this can only be constructed from an iterator of unique values"?

Not every function that accepts an EntitySetIterator can accept a plain Iterator. That's why we have these traits in the first place! So it seems perfectly reasonable to have a collection that can be created from an EntitySetIterator but not from a plain Iterator. If you want to build one from a non-unique iterator then you should call a method to validate or de-duplicate it first.

At first I also thought to exclude these FromIterator and Extend impls, however what made add them back is the interaction with the greater ecosystem: Rust generally structures things by composing them, and excluding these impls means we lose on composition:
What is a collection? Different crates and traits define this differently. Sometimes everything that impls FromIterator, i.e. can be "collected" is a collection. Sometimes, if something can be made into an iterator (IntoIterator), it is a collection.
Sometimes if can be extended (Extend) it is a collection, and often, it is some composition of those three.
By excluding these trait impls, UniqueEntityVec will not qualify as a "collection" in other places.
An important example is RelationshipSourceCollection: This trait defines "collection" differently than the EntitySet logic does. Whether that is the best way to do it remains to be seen! However right now, it has a safe add method.
If UniqueEntityVec does not have a safe way to add elements, then it should not be RelationshipSourceCollection, because the safety contract cannot be propagated to a safe trait method.
The same goes for other traits: If a FromIterator impl does not exist, then an unsatisfied bound might make it not at all be possible to use UniqueEntityVec. In many cases a default/provided trait method impl can be overridden, collect is just one of the unfortunate exceptions.

I agree that this a footgun, but it is a temporary, documented, and a lintable one! (I've added UniqueEntityVec::from_iter and EntitySetIterator::collect::<UniqueEntityVec> to disallowed-methods in clippy-toml).
It is definitely a trade-off, but I lean towards not making useful things impossible, and linting/documenting sketchy uses instead.

/// This type is best obtained by its `FromEntitySetIterator` impl, via either
/// `EntityIterator::collect_set` or `UniqueEntityVec::from_entity_iter`.
///
/// While this type can be constructed via `Iterator::collect`, doing so is inefficient,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, would it be useful to have a method that does a safe O(N log N) collection using sort and dedup? It can't be the default since it reorders the input, but I think that will often be the most efficient approach for untrusted input, and I don't think the current API exposes a way to do it without unsafe.

Copy link
Contributor Author

@Victoronz Victoronz Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have another trait I want to add, ToEntitySetIterator. Since all iterators can be made unique, it is simplest to do so with a trait, it would do this with a HashSet though, since that can be built iteratively, and you could define a set of entities to exclude (like parent).
That would be the main way to get a trusted EntitySet from untrusted input. I think there are still specific cases where a sort + dedup would be preferred, but I'd consider them rare enough that it is okay to leave them up to the user to then do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more, I bet what we'll actually want for sort + dedup is something like SortedVecSet from #16784. Then you could collect into a SortedVecSet want that kind of deduplication, and do a cheap trusted conversion to UniqueEntityVec if you need that instead.

(That's all future stuff, obviously, and shouldn't affect this PR.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think about it, "all elements are in sorted order" is a property a collection holds, just like "all elements are unique".
As long as a property belongs to a collection, it can be managed and preserved the same way we now do for uniqueness.
Although, for "sortedness" you'd additionally need to carry along "what it is sorted by".
Having that information would then be useful for insertion, modification and again, skipping validation.
You could even compose these properties, like in a SortedUniqueVec.
Definitely useful, though whether that should be done can be left for the future!

@alice-i-cecile alice-i-cecile added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Jan 27, 2025
@alice-i-cecile alice-i-cecile added this pull request to the merge queue Jan 28, 2025
Merged via the queue into bevyengine:main with commit b039bf6 Jan 28, 2025
28 checks passed
mrchantey pushed a commit to mrchantey/bevy that referenced this pull request Feb 4, 2025
# Objective

In bevyengine#16547, we added `EntitySet`s/`EntitySetIterator`s. We can know
whenever an iterator only contains unique entities, however we do not
yet have the ability to collect and reuse these without either the
unsafe `UniqueEntityIter::from_iterator_unchecked`, or the expensive
`HashSet::from_iter`.
An important piece for being able to do this is a `Vec` that maintains
the uniqueness property, can be collected into, and is itself
`EntitySet`.

A lot of entity collections are already intended to be "unique", but
have no way of expressing that when stored, other than using an
aforementioned `HashSet`. Such a type helps by limiting or even removing
the need for unsafe on the user side when not using a validated `Set`
type, and makes it easier to interface with other infrastructure like
f.e. `RelationshipSourceCollection`s.

## Solution

We implement `UniqueEntityVec`. 

This is a wrapper around `Vec`, that only ever contains unique elements.
It mirrors the API of `Vec`, however restricts any mutation as to not
violate the uniqueness guarantee. Meaning:
- Any inherent method which can introduce new elements or mutate
existing ones is now unsafe, f.e.: `insert`, `retain_mut`
- Methods that are impossible to use safely are omitted, f.e.: `fill`,
`extend_from_within`

A handful of the unsafe methods can do element-wise mutation
(`retain_mut`, `dedup_by`), which can be an unwind safety hazard were
the element-wise operation to panic. For those methods, we require that
each individual execution of the operation upholds uniqueness, not just
the entire method as a whole.

To be safe for mutable usage, slicing and the associated slice methods
require a matching `UniqueEntitySlice` type , which we leave for a
follow-up PR.

Because this type will deref into the `UniqueEntitySlice` type, we also
offer the immutable `Vec` methods on this type (which only amount to a
handful). "as inner" functionality is covered by additional
`as_vec`/`as_mut_vec` methods + `AsRef`/`Borrow` trait impls.
Like `UniqueEntityIter::from_iterator_unchecked`, this type has a
`from_vec_unchecked` method as well.

The canonical way to safely obtain this type however is via
`EntitySetIterator::collect_set` or
`UniqueEntityVec::from_entity_set_iter`. Like mentioned in bevyengine#17513, these
are named suboptimally until supertrait item shadowing arrives, since a
normal `collect` will still run equality checks.
github-merge-queue bot pushed a commit that referenced this pull request Feb 5, 2025
# Objective

Follow-up to #17549 and #16547.

A large part of `Vec`s usefulness is behind its ability to be sliced,
like sorting f.e., so we want the same to be possible for
`UniqueEntityVec`.

## Solution

Add a `UniqueEntitySlice` type. It is a wrapper around `[T]`, and itself
a DST.

Because `mem::swap` has a `Sized` bound, DSTs cannot be swapped, and we
can freely hand out mutable subslices without worrying about the
uniqueness invariant of the backing collection!
`UniqueEntityVec` and the relevant `UniqueEntityIter`s now have methods
and trait impls that return `UniqueEntitySlice`s.
`UniqueEntitySlice` itself can deref into normal slices, which means we
can avoid implementing the vast majority of immutable slice methods.

Most of the remaining methods:
- split a slice/collection in further unique subsections/slices
- reorder the slice: `sort`, `rotate_*`, `swap`
- construct/deconstruct/convert pointer-like types: `Box`, `Arc`, `Rc`,
`Cow`
- are comparison trait impls

As this PR is already larger than I'd like, we leave several things to
follow-ups:
- `UniqueEntityArray` and the related slice methods that would return it
    - denoted by "chunk", "array_*" for iterators
- Methods that return iterators with `UniqueEntitySlice` as their item 
    - `windows`, `chunks` and `split` families
- All methods that are capable of actively mutating individual elements.
While they could be offered unsafely, subslicing makes their safety
contract weird enough to warrant its own discussion.
- `fill_with`, `swap_with_slice`, `iter_mut`, `split_first/last_mut`,
`select_nth_unstable_*`

Note that `Arc`, `Rc` and `Cow` are not fundamental types, so even if
they contain `UniqueEntitySlice`, we cannot write direct trait impls for
them.
On top of that, `Cow` is not a receiver (like `self: Arc<Self>` is) so
we cannot write inherent methods for it either.
mrchantey pushed a commit to mrchantey/bevy that referenced this pull request Feb 17, 2025
# Objective

Follow-up to bevyengine#17549 and bevyengine#16547.

A large part of `Vec`s usefulness is behind its ability to be sliced,
like sorting f.e., so we want the same to be possible for
`UniqueEntityVec`.

## Solution

Add a `UniqueEntitySlice` type. It is a wrapper around `[T]`, and itself
a DST.

Because `mem::swap` has a `Sized` bound, DSTs cannot be swapped, and we
can freely hand out mutable subslices without worrying about the
uniqueness invariant of the backing collection!
`UniqueEntityVec` and the relevant `UniqueEntityIter`s now have methods
and trait impls that return `UniqueEntitySlice`s.
`UniqueEntitySlice` itself can deref into normal slices, which means we
can avoid implementing the vast majority of immutable slice methods.

Most of the remaining methods:
- split a slice/collection in further unique subsections/slices
- reorder the slice: `sort`, `rotate_*`, `swap`
- construct/deconstruct/convert pointer-like types: `Box`, `Arc`, `Rc`,
`Cow`
- are comparison trait impls

As this PR is already larger than I'd like, we leave several things to
follow-ups:
- `UniqueEntityArray` and the related slice methods that would return it
    - denoted by "chunk", "array_*" for iterators
- Methods that return iterators with `UniqueEntitySlice` as their item 
    - `windows`, `chunks` and `split` families
- All methods that are capable of actively mutating individual elements.
While they could be offered unsafely, subslicing makes their safety
contract weird enough to warrant its own discussion.
- `fill_with`, `swap_with_slice`, `iter_mut`, `split_first/last_mut`,
`select_nth_unstable_*`

Note that `Arc`, `Rc` and `Cow` are not fundamental types, so even if
they contain `UniqueEntitySlice`, we cannot write direct trait impls for
them.
On top of that, `Cow` is not a receiver (like `self: Arc<Self>` is) so
we cannot write inherent methods for it either.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible C-Usability A targeted quality-of-life change that makes Bevy easier to use D-Modest A "normal" level of difficulty; suitable for simple features or challenging fixes D-Unsafe Touches with unsafe code in some way S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants