-
Notifications
You must be signed in to change notification settings - Fork 17
Fix issue with actor restarting instead of stopping #1223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v1.x.x
Are you sure you want to change the base?
Fix issue with actor restarting instead of stopping #1223
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes an issue where actors would restart instead of stopping when exceptions occurred during cancellation. Key changes include the introduction of a new actor subclass that raises an exception during cancellation, updates to the actor’s internal logic to properly handle cancelled state, and a new test that verifies the correct stopping behavior.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
tests/actor/test_actor.py | New actor subclass and test to check actor stops on cancellation. |
src/frequenz/sdk/actor/_actor.py | Updates in actor startup and exception handling during cancellation. |
RELEASE_NOTES.md | Updated release notes to describe the bug fix. |
When an actor throws an exception during cancellation, it restarts rather than stopping as expected. Signed-off-by: Elzbieta Kotulska <elzbieta.kotulska@frequenz.com>
4feaf6d
to
51c6e89
Compare
When an actor threw an exception during cancellation, it restarted rather than stopping as expected. Now it stops and raises an unhandled exception. Signed-off-by: Elzbieta Kotulska <elzbieta.kotulska@frequenz.com>
51c6e89
to
1045eea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a bug where an actor restarted instead of stopping when an exception was raised during cancellation. Key changes include the addition of a new test case in tests/actor/test_actor.py, an adjustment in cancellation and exception logging logic in src/frequenz/sdk/actor/_actor.py, and updated release notes.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
tests/actor/test_actor.py | Added a new test and a specialized actor (RaiseExceptionOnCancelActor) to validate proper stop behavior upon cancellation errors. |
src/frequenz/sdk/actor/_actor.py | Introduced a cancellation flag, refined exception handling in the run loop, and updated the cancel method to prevent actor restart. |
RELEASE_NOTES.md | Updated the bug fix section to reflect that actors now stop and surface the unhandled exception during cancellation. |
@@ -94,6 +114,12 @@ async def _run_loop(self) -> None: | |||
_logger.info("Actor %s: Cancelled.", self) | |||
raise | |||
except Exception: # pylint: disable=broad-except | |||
if self._is_cancelled: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exception handling branch in the run loop logs the error and breaks when the actor is cancelled. It would be beneficial to add a comment explaining that this behavior is intentional to prevent restarts, which will help future maintainers understand the rationale.
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks sketchy to me, it looks like supporting a random exception to be raised during cancellation is just a fix for a but we just created by raising this exception.
|
||
if self._is_cancelled: | ||
raise RuntimeError( | ||
"Actor %s: was canceled and can't be re-run." | ||
"Please create another actor instance.", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't go for this at the actor level yet. We have many actors that assume they can be restarted and count on this. I would like to make a big overhaul of how we handle tasks and actors, I would avoid making any breaking changes until then as adapting all existing code to any breaking changes here will be very costly.
name: The name of this background service. | ||
""" | ||
super().__init__(name=name) | ||
self._is_cancelled = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we save the main actor task instead and use self._main_task.cancelled()
as a single source of truth?
except asyncio.CancelledError as exc: | ||
raise RuntimeError("Actor should stop.") from exc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we need to support this, this feels wrong, if a task is cancelled, it should not raise some other exception. Cancellation should always be supported and do nothing if it was already cancelled.
When an actor threw an exception during cancellation,
it restarted rather than stopping as expected.
Now it stops and raises and just prints exception.