Move unpromotable relocations to its own transport action #127330
base: main
Relates ES-10339
Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)
Hi @fcofdez, I've created a changelog YAML for you.
Looks good though I have a comment on the cleanup.
onGoingRecoveries.markRecoveryAsDone(recoveryId);
return null;
}), indexShard::preRecovery);
try (onCompletion) {
I would think this releases the recovery monitor and the recovery-ref too soon? My intuition would be that it should only be done when the action completes?
My understanding is that the RecoveryTarget would be retained until the recovery is marked as done (since the initial refCount=1 from the AbstractRefCounted corresponds to that decRef). But just to be on the safe side I've reverted to the previous behaviour that would release the RecoveryRef once the action returns.
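To make the ref-counting argument above concrete, here is a minimal, self-contained sketch. It is not the Elasticsearch code: `SketchTarget` and `SketchRef` are hypothetical stand-ins for the RecoveryTarget and RecoveryRef, and the counter only mimics the "initial refCount=1" behaviour described in the comment. Under those assumptions, releasing the per-action reference early does not close the target; only the final decRef (modelling the recovery being marked as done) does.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountSketch {

    /** Hypothetical stand-in for a ref-counted recovery target (not the real class). */
    static class SketchTarget {
        // Starts at 1: this initial reference models the one that is only
        // dropped when the recovery is marked as done.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile boolean closed = false;

        /** Takes an extra reference unless the target has already been closed. */
        boolean tryIncRef() {
            while (true) {
                int current = refCount.get();
                if (current <= 0) {
                    return false;
                }
                if (refCount.compareAndSet(current, current + 1)) {
                    return true;
                }
            }
        }

        /** Drops one reference; the target closes when the count reaches zero. */
        void decRef() {
            int remaining = refCount.decrementAndGet();
            if (remaining == 0) {
                closed = true;
            } else if (remaining < 0) {
                throw new IllegalStateException("reference released too many times");
            }
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Hypothetical per-action reference, released through try-with-resources. */
    static class SketchRef implements AutoCloseable {
        private final SketchTarget target;

        SketchRef(SketchTarget target) {
            if (target.tryIncRef() == false) {
                throw new IllegalStateException("target already closed");
            }
            this.target = target;
        }

        @Override
        public void close() {
            target.decRef();
        }
    }

    public static void main(String[] args) {
        SketchTarget target = new SketchTarget();
        try (SketchRef ref = new SketchRef(target)) {
            System.out.println("inside action, closed=" + target.isClosed());
        }
        // The per-action reference is gone, but the initial reference remains.
        System.out.println("after ref released, closed=" + target.isClosed());
        // Models marking the recovery as done: drops the initial reference.
        target.decRef();
        System.out.println("after marked done, closed=" + target.isClosed());
    }
}
```

Whether the recovery monitor follows the same lifetime is a separate question that this sketch does not model, which is a reasonable argument for releasing the reference only once the action returns.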
…-relocation-handoff
This reverts commit 9ef9621.
LGTM
(though I'd like Iraklis to have a look at RecoveriesCollection if possible).
@@ -167,7 +167,6 @@ public RecoveryRef getRecoverySafe(long id, ShardId shardId) {
throw new IndexShardClosedException(shardId);
}
assert recoveryRef.target().shardId().equals(shardId);
assert recoveryRef.target().indexShard().routingEntry().isPromotableToPrimary();
Not off the top of my head. But going back to the code, I see we've made a special branch in PeerRecoveryTargetService#doRecovery() with if (indexShard.routingEntry().isPromotableToPrimary() == false) { for unpromotables that basically skips all recovery stages and closes the RecoveryRef as well. So the point of the assertion at the time was that unpromotables should need no other coordination that would justify getting the RecoveryRef.
Seeing, though, that this PR now introduces some sort of coordination between unpromotables, it probably makes sense to remove the assertion.
(I did not fully review this PR, but feel free to tell me if I should)