Skip to content

querymiddleware: cache instant query errors #11120

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 16, 2025

Conversation

narqo
Copy link
Contributor

@narqo narqo commented Apr 6, 2025

What this PR does

The PR adds errors caching middleware for instant query. This is basically the amendment to the original #9028 (I haven't tested it yet, but I think it should just work).

Update I ran some manual testing in a dev environment, and it works as I'd expected:

Details

Non-cachable errors aren't cached:
Screenshot 2025-04-11 at 14 30 04

Cachable errors are stored in the cache on first occurrence:
Screenshot 2025-04-11 at 14 58 00

...and are retrived from the cache on subsequent requests
Screenshot 2025-04-11 at 14 59 04

Which issue(s) this PR fixes or relates to

Fixes #11029

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@narqo narqo requested a review from a team as a code owner April 6, 2025 13:03
@narqo narqo requested a review from 56quarters April 6, 2025 15:57
Copy link
Contributor

@56quarters 56quarters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM assuming you've tested locally. Do you mind expanding the unit tests for this middleware to make sure instant queries behave how we expect?

@narqo narqo force-pushed the vldmr/query-fe-instant-query-error-caching branch from 022f49c to 1fb59dc Compare April 11, 2025 11:46
@narqo narqo requested a review from 56quarters April 11, 2025 13:56
@narqo
Copy link
Contributor Author

narqo commented Apr 11, 2025

@56quarters I've updated the tests and also ran the tests (see the updated description of the PR). PHAL

Copy link
Contributor

@56quarters 56quarters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding tests but I don't think this will work as written. The way the cache key for errors is built includes req.GetStart() and req.GetEnd() in the key. For an instant query this is the timestamp the query is executed at which will be different for every instant query.

Additionally, the screenshots of testing you did indicate another bug. A context deadline exceeded error is a timeout and shouldn't be cached.

@56quarters
Copy link
Contributor

Thanks for adding tests but I don't think this will work as written. The way the cache key for errors is built includes req.GetStart() and req.GetEnd() in the key. For an instant query this is the timestamp the query is executed at which will be different for every instant query.

Additionally, the screenshots of testing you did indicate another bug. A context deadline exceeded error is a timeout and shouldn't be cached.

The first issue can be fix by changes to DefaultCacheKeyGenerator which can detect instant queries by comparing GetStart() and GetEnd() and build the key without start/end/step parameters.

56quarters added a commit that referenced this pull request Apr 11, 2025
Discovered during #11120

Related #10537

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
56quarters added a commit that referenced this pull request Apr 11, 2025
Discovered during #11120

Related #10537

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
@56quarters
Copy link
Contributor

Additionally, the screenshots of testing you did indicate another bug. A context deadline exceeded error is a timeout and shouldn't be cached.

Opened #11198 to fix this

56quarters added a commit that referenced this pull request Apr 11, 2025
Discovered during #11120

Related #10537

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
narqo added 5 commits April 16, 2025 12:23
Signed-off-by: Vladimir Varankin <vladimir.varankin@grafana.com>
Signed-off-by: Vladimir Varankin <vladimir.varankin@grafana.com>
Signed-off-by: Vladimir Varankin <vladimir.varankin@grafana.com>
Signed-off-by: Vladimir Varankin <vladimir.varankin@grafana.com>
@narqo narqo force-pushed the vldmr/query-fe-instant-query-error-caching branch from 6bc02e5 to 5bccd34 Compare April 16, 2025 10:24
@narqo
Copy link
Contributor Author

narqo commented Apr 16, 2025

For an instant query this is the timestamp the query is executed at which will be different for every instant query.

That's a great point. I've updated the default cache key generator to handle that.

@narqo narqo requested a review from 56quarters April 16, 2025 10:30
Copy link
Contributor

@56quarters 56quarters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a few comments. Thanks!

narqo added 2 commits April 16, 2025 20:35
Signed-off-by: Vladimir Varankin <vladimir.varankin@grafana.com>
Signed-off-by: Vladimir Varankin <vladimir.varankin@grafana.com>
@narqo narqo merged commit 72cd7eb into main Apr 16, 2025
26 checks passed
@narqo narqo deleted the vldmr/query-fe-instant-query-error-caching branch April 16, 2025 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Idea: cache errors from instant queries
3 participants