Skip to content

query-frontend: Ensure queries are properly closed #11157

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

jhesketh
Copy link
Contributor

@jhesketh jhesketh commented Apr 9, 2025

Make sure for each execution of a query we properly call the Close() interface. This is necessary where an engine (such as MQE) pools result slices.

What this PR does

Which issue(s) this PR fixes or relates to

Fixes: #5713

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

jhesketh added 4 commits April 9, 2025 16:17
Make sure for each execution of a query we properly call the Close()
interface. This is necessary where an engine (such as MQE) pools
result slices.

Fixes: grafana#5713
Comment on lines +93 to +95
// FIXME: newSeriesSetFromEmbeddedQueriesResults copies samples, but does not do a deep copy of histogram spans.
// We either need to do a deep copy of the histogram spans or somehow wrap the SeriesSets with a closer.
// Labels also need closer inspection as they appear to be pointers too.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to suggest we do a deep copy in these cases. I would have to investigate the impact, but the complexity of changing the Queryable interface or doing something else seems quite high.

That coupled with the possibility that any change here may be short lived after the introduction of the planner.

CC @charleskorn @julienduchesne

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That coupled with the possibility that any change here may be short lived after the introduction of the planner.

I think this is the deciding factor for me, and the fact that histograms aren't yet widely used - so I'm OK with the idea of doing a deep copy of histograms.

Labels also need closer inspection as they appear to be pointers too.

Can you elaborate on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on this?

Labels are not copied, they are also pointers:

set = append(set, series.NewConcreteSeries(mimirpb.FromLabelAdaptersToLabels(stream.Labels), samples, histograms))

func FromLabelAdaptersToLabels(ls []LabelAdapter) labels.Labels {

Although likely not an issue for MQE at the moment, if we start pooling label resources it may become one. Thus we likely need to consider deep copying labels as well at this point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labels.Labels instances are not pointers:

func FromLabelAdaptersToLabels(ls []LabelAdapter) labels.Labels {
is only used if stringlabels is not being used, and stringlabels has been the default for some time (might be worth proposing removing it to avoid confusion).
func FromLabelAdaptersToLabels(ls []LabelAdapter) labels.Labels {
is the equivalent for stringlabels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labels.Labels instances are not pointers

No, but if MQE starts to pool them and re-use them we need to be aware of when we Close the query. My point was more that we might need to put some protection or be careful here, not that they are currently an issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't pool labels.Labels instances, as they're immutable.

@@ -1014,6 +1036,7 @@ func TestSplitAndCacheMiddleware_ResultsCacheFuzzy(t *testing.T) {
}

func TestSplitAndCacheMiddleware_ResultsCache_ExtentsEdgeCases(t *testing.T) {
t.Skip()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this test skipped?

Comment on lines +93 to +95
// FIXME: newSeriesSetFromEmbeddedQueriesResults copies samples, but does not do a deep copy of histogram spans.
// We either need to do a deep copy of the histogram spans or somehow wrap the SeriesSets with a closer.
// Labels also need closer inspection as they appear to be pointers too.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That coupled with the possibility that any change here may be short lived after the introduction of the planner.

I think this is the deciding factor for me, and the fact that histograms aren't yet widely used - so I'm OK with the idea of doing a deep copy of histograms.

Labels also need closer inspection as they appear to be pointers too.

Can you elaborate on this?

@jhesketh jhesketh marked this pull request as ready for review April 14, 2025 12:14
@jhesketh jhesketh requested a review from a team as a code owner April 14, 2025 12:14
require.Equal(t, expected.ContentLength, actual.ContentLength)

// Verify that body types match
require.Equal(t, reflect.TypeOf(expected.Body), reflect.TypeOf(actual.Body))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to compare the body types? Can't we just confirm they contain the same or equivalent content?

@@ -146,7 +146,10 @@ type PrometheusResponseExtractor struct{}

// Extract extracts response for specific a range from a response.
func (PrometheusResponseExtractor) Extract(start, end int64, from Response) Response {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we expand the comment here to clarify whose responsibility it is to close resp, and how this relates to the returned Response? (eg. does closing resp also close the returned value?)

(same comment applies to ResponseWithoutHeaders)

@@ -1698,6 +1729,9 @@ func (q roundTripper) RoundTrip(r *http.Request) (*http.Response, error) {
if err != nil {
return nil, err
}
// TODO Check if it's safe to close the response here. I suspect EncodeMetricsQueryResponse does not do a deep copy and thus it is not yet safe.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this TODO still relevant?

@@ -179,9 +185,14 @@ func (q *spinOffSubqueriesQuerier) Select(ctx context.Context, _ bool, hints *st
if err != nil {
return storage.ErrSeriesSet(fmt.Errorf("error running subquery: %w", err))
}
promRes, ok := resp.(*PrometheusResponse)
// FIXME: newSeriesSetFromEmbeddedQueriesResults copies samples, but does not do a deep copy of histogram spans.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this FIXME still apply?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Close never called on PromQL Query used in query splitting or sharding
2 participants