Add new metric to help debug HA problems #10988
Open
What this PR does
Adds a new metric to track problems with HA. According to this note, if the first series in a request is configured incorrectly, HA will not work properly for any of the other series in that request. If I understand this correctly, a user with many series, only a minority of which are misconfigured, can have a very hard time tracking down the problem. This PR addresses that with a new metric that exposes the metric names of the misconfigured series and the reason HA fails (for example, whether the cluster label or the replica label is missing). It also adds a log line with more details.
This could potentially create a lot of new series and log lines, so it's behind a per-tenant config flag, track_ha_failures (off by default), so we can turn it on only temporarily for users as needed.
Will update changelog after getting feedback on this approach.
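To make the proposal easier to discuss, here is a minimal, standard-library-only sketch of the idea described above: classify why a series fails HA, and count failures per tenant, metric name, and reason only when the tenant has track_ha_failures enabled. It is not the code in this PR. The type and function names (haFailureTracker, haFailureReason, failureKey), the reason strings, and the use of cluster and __replica__ as the HA label names are all assumptions made for illustration; a real implementation would presumably use a Prometheus counter vector rather than a map.

```go
package main

import (
	"fmt"
	"sync"
)

// Assumed HA label names for this sketch; in practice they are configurable.
const (
	clusterLabel = "cluster"
	replicaLabel = "__replica__"
)

// failureKey is a hypothetical label set for the new metric.
type failureKey struct {
	tenant, metricName, reason string
}

// haFailureTracker is a hypothetical stand-in for a counter vector
// gated by the per-tenant track_ha_failures flag.
type haFailureTracker struct {
	mu      sync.Mutex
	enabled map[string]bool // tenant -> track_ha_failures
	counts  map[failureKey]int
}

func newHAFailureTracker(enabled map[string]bool) *haFailureTracker {
	return &haFailureTracker{enabled: enabled, counts: map[failureKey]int{}}
}

// haFailureReason returns an empty string when both HA labels are present,
// otherwise a short reason suitable for use as a metric label value.
func haFailureReason(labels map[string]string) string {
	_, hasCluster := labels[clusterLabel]
	_, hasReplica := labels[replicaLabel]
	switch {
	case !hasCluster && !hasReplica:
		return "missing_cluster_and_replica_label"
	case !hasCluster:
		return "missing_cluster_label"
	case !hasReplica:
		return "missing_replica_label"
	}
	return ""
}

// observe records a failure only for tenants that opted in, which keeps
// the cardinality of the new metric at zero for everyone else.
func (t *haFailureTracker) observe(tenant string, labels map[string]string) {
	if !t.enabled[tenant] {
		return
	}
	reason := haFailureReason(labels)
	if reason == "" {
		return
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	t.counts[failureKey{tenant, labels["__name__"], reason}]++
}

func main() {
	tracker := newHAFailureTracker(map[string]bool{"tenant-a": true})
	tracker.observe("tenant-a", map[string]string{"__name__": "up", "cluster": "c1"})
	for key, n := range tracker.counts {
		fmt.Printf("%s %s %s: %d\n", key.tenant, key.metricName, key.reason, n)
	}
}
```

Keying the counter by metric name is what makes a misconfigured minority of series visible, and it is also the source of the cardinality concern, which is why the sketch checks the per-tenant flag before doing any work.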
Which issue(s) this PR fixes or relates to
N/A
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
about-versioning.md updated with experimental features.