
Upper bound on lifetime of operation states #70

Open

Description

@ericniebler

Issue by msimberg
Tuesday Sep 12, 2023 at 10:49 GMT
Originally opened as NVIDIA/stdexec#1076


TL;DR: operation states currently have a lower bound on their lifetime (they must stay alive at least until a completion function, set_value/set_error/set_stopped, is called on the associated receiver). Should they also have an upper bound?

  • Should some adaptors release operation states as soon as they can? Typical examples of adaptors that could do this are the "two-part" adaptors like schedule_from, let_value, ensure_started, split, when_all.
  • Should all adaptors release operation states as soon as they can?
  • Is it always safe to release operation states early, or can it lead to issues elsewhere? The order of destruction changes, but I don't know whether that can cause problems somewhere.

FWIW, in HPX and pika we release some operation states early to avoid problems like the ones below. We rediscovered these problems when trying to use stdexec in place of the current implementation.

A few motivating examples below.

split

Once the r receiver of split has received a completion signal, op_state2 does not necessarily need to be kept around, because the values sent by the predecessor have been stored in the split shared state (https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2300r7.html#spec-execution.senders.adapt.split). stdexec currently does keep op_state2 around until the shared state itself is released. This can lead to some, IMO, surprising behaviour.

The following example avoids deep recursion in the "forward" direction by explicitly transferring to new tasks. It creates a DAG that roughly looks like this:

o_1 --> o --> o --> o ... o_n
   \     \     \     \
    o     o     o     o ...

The unintuitive thing is that even though the work itself is done without recursing deeply, once all the work has finished there is recursion proportional to n when the operation states of o_n through o_1 are released. On regular OS threads this isn't a huge problem, but the stack is quite easy to exhaust on e.g. fibers, where the stack is a lot smaller.

  using any_sender_of_void =
    any_sender_of<ex::set_value_t(int), ex::set_error_t(std::exception_ptr), ex::set_stopped_t()>;
  exec::static_thread_pool pool{4};
  auto sched = pool.get_scheduler();

  std::size_t const n = 10000;
  std::vector<any_sender_of_void> senders;
  senders.reserve(n);
  any_sender_of_void cur = ex::just(1);

  for (std::size_t i = 0; i < n; ++i) {
    auto split = std::move(cur) | ex::split();
    auto transfer1 = split | ex::transfer(sched) | ex::then([](int x) { std::cerr << x << '\n'; return x; });
    auto transfer2 = std::move(split) | ex::transfer(sched) | ex::then([](int x) { return x + 1; });
    // store the per-iteration branch so it can be waited on below
    senders.push_back(any_sender_of_void(std::move(transfer1)));
    cur = any_sender_of_void(std::move(transfer2));
  }

  for (auto& s: senders) {
    // The last sync_wait ends up freeing a whole stack of n split op states,
    // even though they could be freed incrementally because the split op state
    // takes ownership of values sent from predecessors
    stdexec::sync_wait(std::move(s));
  }
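
To make the recursion concrete, here is a minimal standalone sketch with plain unique_ptr ownership standing in for chained operation states (not stdexec code): each node owns its predecessor, so destroying the head of the chain unwinds all n nodes recursively, one stack frame per node, just like releasing the op states of o_n through o_1 above.

  #include <memory>

  struct node {
    // analogous to an op state owning the op state of the sender it was connected to
    std::unique_ptr<node> predecessor;
  };

  int main() {
    auto head = std::make_unique<node>();
    for (int i = 0; i < 1000000; ++i) {
      auto next = std::make_unique<node>();
      next->predecessor = std::move(head);
      head = std::move(next);
    }
    // Destroying head recurses: ~node() destroys predecessor, whose ~node()
    // destroys its predecessor, and so on. With a small (e.g. fiber) stack,
    // a chain this long overflows it.
    head.reset();
  }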

schedule_from/transfer

One can trigger the same problem with a simpler example. The previous example was reduced from a real use case, which is why it was a bit more involved.

schedule_from also does not release op_state2 as soon as it could (https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2300r7.html#spec-execution.senders.adaptors.schedule_from), leading to deep recursion in the destructors of the operation states, even though deep recursion in the forward direction is avoided by scheduling new tasks.

  using any_sender_of_void =
    any_sender_of<ex::set_value_t(int), ex::set_error_t(std::exception_ptr), ex::set_stopped_t()>;
  exec::static_thread_pool pool{4};
  auto sched = pool.get_scheduler();

  std::size_t const n = 10000;
  any_sender_of_void cur = ex::just(1);

  for (std::size_t i = 0; i < n; ++i) {
    cur = any_sender_of_void(
      std::move(cur) | ex::transfer(sched) | ex::then([](int x) { return x + 1; }));
  }

  stdexec::sync_wait(std::move(cur));

let_value

This example shows the (again, IMO) unintuitive lifetimes of the values kept alive by the let_value adaptor. One of the use cases of let_value is to keep something alive for the duration of an asynchronous operation returned by the callable passed to let_value. It does that, but it keeps the values alive until the operation state of let_value is released, which can be much later. The example below uses memory as the precious resource that is kept alive longer than necessary, but the same applies to anything else you might want to release as soon as possible (file handles, asynchronous locks, etc.).

One can work around this by manually releasing the values (a sketch of that workaround follows the example), but that requires knowing in the first place that the values are kept alive that long.

  exec::static_thread_pool pool{4};
  auto sched = pool.get_scheduler();

  auto s = ex::schedule(sched) | ex::then([]() { return std::vector<int>(1000000, 42); })
         | ex::let_value([](auto& v) {
             return ex::just() | ex::then([&]() { /* do something with v */ });
             // could instead do this to release v early:
             // | ex::then([&]() { auto v2 = std::move(v); });
           })
         | ex::then([]() {
             // We are not using v anymore here, but it stays alive in the
             // let_value op state
             std::this_thread::sleep_for(std::chrono::seconds(5));
           });
  // The large allocation is freed only when the sync_wait op state is freed
  stdexec::sync_wait(std::move(s));
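
A sketch of that workaround applied to the example above (names reused from above; the only change is moving v into a scoped local once it is no longer needed, so the allocation is released before the trailing then runs):

  auto s2 = ex::schedule(sched) | ex::then([]() { return std::vector<int>(1000000, 42); })
          | ex::let_value([](auto& v) {
              return ex::just()
                   | ex::then([&]() { /* do something with v */ })
                   // move v into a local that is destroyed immediately, freeing the large buffer here
                   | ex::then([&]() { auto v2 = std::move(v); });
            })
          | ex::then([]() {
              // v's allocation is already released by the time this runs
              std::this_thread::sleep_for(std::chrono::seconds(5));
            });
  stdexec::sync_wait(std::move(s2));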

Activity

lewissbaker (Collaborator) commented on Jul 24, 2024

This seems to be a duplicate of (or at least closely related to) #239.

lewissbaker (Collaborator) commented on Jul 24, 2024

@msimberg Would the adoption of the statement() algorithm listed in #239 help fix the use-cases listed above?

The split example could be modified as follows:

   using any_sender_of_void =
     any_sender_of<ex::set_value_t(int), ex::set_error_t(std::exception_ptr), ex::set_stopped_t()>;
   exec::static_thread_pool pool{4};
   auto sched = pool.get_scheduler();
 
   std::size_t const n = 10000;
   std::vector<any_sender_of_void> senders;
   senders.reserve(n);
   any_sender_of_void cur = ex::just(1);
 
   for (std::size_t i = 0; i < n; ++i) {
-    auto split = std::move(cur) | ex::split();
+    auto split = ex::statement(std::move(cur)) | ex::split();
     auto transfer1 = split | ex::transfer(sched) | ex::then([](int x) { std::cerr << x << '\n'; return x; });
     auto transfer2 = std::move(split) | ex::transfer(sched) | ex::then([](int x) { return x + 1; });
     senders.push_back(any_sender_of_void(std::move(transfer1)));
     cur = any_sender_of_void(std::move(transfer2));
   }

   for (auto& s: senders) {
     // The last sync_wait ends up freeing a whole stack of n split op states,
     // even though they could be freed incrementally because the split op state
     // takes ownership of values sent from predecessors
     stdexec::sync_wait(std::move(s));
   }

The alternative is to make split always apply logic equivalent to statement() around the wrapped operation.

Similarly, the transfer example could be rewritten as follows:

   using any_sender_of_void =
     any_sender_of<ex::set_value_t(int), ex::set_error_t(std::exception_ptr), ex::set_stopped_t()>;
   exec::static_thread_pool pool{4};
   auto sched = pool.get_scheduler();
 
   std::size_t const n = 10000;
   any_sender_of_void cur = ex::just(1);
 
   for (std::size_t i = 0; i < n; ++i) {
     cur = any_sender_of_void(
-      std::move(cur) | ex::transfer(sched) | ex::then([](int x) { return x + 1; }));
+      ex::statement(std::move(cur)) | ex::transfer(sched) | ex::then([](int x) { return x + 1; }));
   }
 
   stdexec::sync_wait(std::move(cur));

And the let_value example could be modified as follows:

   exec::static_thread_pool pool{4};
   auto sched = pool.get_scheduler();
 
-  auto s = ex::schedule(sched) | ex::then([]() { return std::vector<int>(1000000, 42); })
+  auto s = ex::statement(ex::schedule(sched) | ex::then([]() { return std::vector<int>(1000000, 42); })
          | ex::let_value([](auto& v) {
              return ex::just() | ex::then([&]() { /* do something with v */ });
              // could instead do this to release v early:
              // | ex::then([&]() { auto v2 = std::move(v); });
            })
+          )
          | ex::then([]() {
              // We are not using v anymore here, but it stays alive in the
              // let_value op state
              std::this_thread::sleep_for(std::chrono::seconds(5));
            });
   // The large allocation is freed only when the sync_wait op state is freed
   stdexec::sync_wait(std::move(s));

msimberg (Contributor) commented on Jul 24, 2024

@lewissbaker yeah, absolutely. That's essentially what we're using at the moment when using stdexec, which doesn't release operation states as early as pika's implementation does.

This is of course only syntactic, but I'm slightly partial to a pipeable version, mostly because that's what I'm used to. I suppose when piped into, it's more of an ex::semicolon, but I doubt that name is going to work very well 🙃

inbal2l (Member) commented on Jul 31, 2024

The alternative is to make split always apply logic equivalent to statement() around the wrapped operation.

@lewissbaker - do you happen to have an example of wording to reflect this option?

lewissbaker (Collaborator) commented on Jul 31, 2024

The alternative is to make split always apply logic equivalent to statement() around the wrapped operation.

@lewissbaker - do you happen to have an example of wording to reflect this option?

We would need to tweak the wording for shared-state::notify() in exec.split p16 to have it destroy the op_state member before calling p->notify() on the list of registered receivers.

This would also require changing the op_state data member to an optional or some other type that allows its lifetime to be managed manually.
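
Roughly, such a change could take the following shape (an illustration only, not proposed wording; notify_all is a hypothetical stand-in for the loop that calls p->notify() on each registered receiver in [exec.split] p16):

  #include <optional>

  template <class ChildOpState, class ReceiverList>
  struct shared_state {
    std::optional<ChildOpState> op_state;  // result of connecting the wrapped sender
    ReceiverList waiting;                  // receivers registered before completion
    // ... storage for the completion result ...

    void notify() noexcept {
      // The completion result has already been stored in the shared state, so the
      // child op state is no longer needed; destroy it before notifying waiters.
      op_state.reset();
      waiting.notify_all();  // hypothetical stand-in for calling p->notify() on each waiter
    }
  };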

inbal2l (Member) commented on Oct 23, 2024

Paper published: "P3373: Of Operation States and Their Lifetimes" (Robert Leahy)

