-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Allow balancing weights to be set per tier #125824
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow balancing weights to be set per tier #125824
Conversation
diskUsageBalanceFactor | ||
); | ||
Balancer balancer = new Balancer(writeLoadForecaster, allocation, weightFunction, threshold); | ||
WeightFunction weightFunction = new SpecialisedWeightFunction(settings); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we wanted to, we could have separate WeightFunctionFactory
implementations for stateful and stateless. The stateful one would just return a SingleWeightFunction
with the non-tier-specific weights, and the stateless one returns a SpecialisedWeightFunction
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a comment, may think more about it.
while (true) { | ||
final ModelNode minNode = modelNodes[lowIdx]; | ||
final ModelNode maxNode = modelNodes[highIdx]; | ||
final float localThreshold = sorter.minWeightDelta(minNode) * threshold; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a great deal of implied knowledge here in that it assumes that indexing and search shards are separated on separate nodes and that allocation decisions are made for that. And that the minWeightDelta
will give the same for any node in the same tier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can instead overload BalancedShardsAllocator.allocate
(in stateless) and then invoke the logic twice, once for each tier? It could pass down a stripped version of the nodes (per tier) and a specialized weight function. It is also not 100% separate (shard allocator decisions still important), but somewhat less intrusive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had another go at this. I investigated partitioning the RoutingAllocation
but that seemed like trying to unscramble an egg. It occurred to me we could do the partitioning at the NodeSorter
level.
It still feels a bit hacky, but less so than this approach. I did a prototype to test it out here #126091
I ran out of time today but I think the idea is there. I'd be interested to know if it seems sound to you. cc @pxsalehi
Property.Dynamic, | ||
Property.NodeScope | ||
); | ||
public static final Setting<Float> SEARCH_TIER_WRITE_LOAD_BALANCE_FACTOR_SETTING = Setting.floatSetting( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we'd need a flag (like a feature flag for this new behaviour) and one new setting like the INDEXING_TIER_SHARD_BALANCE_FACTOR_SETTING
that you have. And that should be enough. I mean we could add SEARCH_TIER_SHARD_BALANCE_FACTOR_SETTING
to make this cleaner sure. But why do we need SEARCH_TIER_WRITE_LOAD_BALANCE_FACTOR_SETTING
? That just conceptually doesn't make sense and IMO should be zero if the flag for this behaviour is turned on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I guess it was just in case we wanted to investigate if write load was somehow a useful predictor of search activity (e.g. if it happens that the most actively updated indices are also the most actively searched, or similar). Perhaps it might be useful to have the option to tweak it to values other than zero?
It seems unlikely, but I didn't want to pre-empt the experiment/investigation. Most likely it'll just be set to zero.
|
||
public float calculateNodeWeight( | ||
float calculateNodeWeight( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is only this function that would be different depending on if we want per tier function or one for both tiers, right? I wonder if initially we could limit the change to this? Or is that too hacky? That reduces the size of the PR and we could potentially follow up with a refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think minWeightDelta
should also be specific. That calculation includes shardWriteLoad
, so the deltas would be different in each tier.
In saying that, the POC only set the forecasted write load on search nodes to zero and left the rest untouched.
Superseded by #126091 |
First cut of approach for allowing weights to be configured per tier
Relates: ES-11367