Releases: grafana/mimir
2.15.2
This release contains 4 PRs from 3 authors. Thank you!
Changelog
2.15.2
Grafana Mimir
- [BUGFIX] Update module golang.org/x/net to v0.36.0 to address CVE-2025-22870. #10875
- [BUGFIX] Update module github.com/golang-jwt/jwt/v5 to v5.2.2 to address CVE-2025-30204. #11045
All changes in this release: mimir-2.15.1...mimir-2.15.2
2.16.0
This release contains 463 PRs from 69 authors, including new contributors Alessandro Verzicco, Alex Greenbank, André Pires, Bjorn Stout, Bruno FERNANDO, Casie Chen, Dustin Wilson, Edwin Tye, Kenny Trytek, Leszek Błażewski, Markus Opolka, Matthew Jacobson, Matt Veitas, mimir-github-bot[bot], Moustafa Baiou, Ryan Brady, TheRealNoob. Thank you!
Grafana Mimir version 2.16.0 release notes
Features and enhancements
In rulers, when rule concurrency is enabled for a rule group, its rules will now be reordered and run in batches based on their dependencies. This increases the number of rules that can potentially run concurrently. Note that the global and tenant-specific limits around the number of rule groups and rules per group still apply.
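As a rough illustration of dependency-based batching, the sketch below groups rules so that each batch only reads the output of earlier batches. It is not Mimir's or Prometheus' implementation: the `batch_rules` function, the dependency map, and the rule names are hypothetical, and how the ruler actually detects dependencies is not covered here.

```python
def batch_rules(dependencies):
    """Group rules into batches; each batch depends only on earlier batches.

    `dependencies` maps a rule name to the set of rule names whose output it
    reads. Dependencies on names outside the group are ignored (assumption).
    """
    known = set(dependencies)
    remaining = {rule: set(deps) & known for rule, deps in dependencies.items()}
    batches = []
    while remaining:
        # Rules with no unresolved dependencies can run concurrently.
        ready = sorted(rule for rule, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle between rules")
        batches.append(ready)
        for rule in ready:
            del remaining[rule]
        for deps in remaining.values():
            deps.difference_update(ready)
    return batches


# Hypothetical rule group: two independent recording rules feed a ratio,
# and an alert reads the ratio.
rules = {
    "job:requests:rate5m": set(),
    "job:errors:rate5m": set(),
    "job:error_ratio": {"job:requests:rate5m", "job:errors:rate5m"},
    "HighErrorRatio": {"job:error_ratio"},
}
batches = batch_rules(rules)
```

In this sketch the two independent recording rules land in the first batch and can run together, while the ratio and the alert each wait for the batch before them.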
Analyzing Grafana dashboards with mimirtool now supports bar chart, pie chart, state timeline, status history, histogram, candlestick, canvas, flame graph, geomap, node graph, trend, and XY chart panels.
Important changes
In Grafana Mimir 2.16, the following behavior has changed:
Grafana Mimir only provides container images based on distroless images. Alpine Linux-based container images were deprecated in the 2.12 release and are no longer built.
How experimental PromQL functions are enabled has changed.
- The experimental CLI flags -querier.promql-experimental-functions-enabled and -query-frontend.block-promql-experimental-functions and respective YAML configuration have been removed from query-frontends and queriers.
- Experimental PromQL functions are disabled by default but can be enabled using only the per-tenant setting enabled_promql_experimental_functions.
Support for native histograms and out-of-order native histograms is enabled by default in ingesters.
Distributors discard float and histogram samples with duplicated timestamps from each timeseries in a request before the request is forwarded to ingesters. Discarded samples are tracked by cortex_discarded_samples_total metrics with the reason sample_duplicate_timestamp.
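The effect of discarding duplicated timestamps can be sketched for a single series as follows. This is only an illustration: the `drop_duplicate_timestamps` function is hypothetical, and keeping the first sample seen for each timestamp is an assumption of the sketch rather than documented distributor behavior.

```python
def drop_duplicate_timestamps(samples):
    """Return (kept, discarded_count) for one series' samples in one request.

    `samples` is a list of (timestamp_ms, value) tuples. Keeping the first
    sample seen for each timestamp is an assumption of this sketch.
    """
    seen = set()
    kept = []
    discarded = 0
    for timestamp_ms, value in samples:
        if timestamp_ms in seen:
            # A real distributor would count this in a discarded-samples metric.
            discarded += 1
            continue
        seen.add(timestamp_ms)
        kept.append((timestamp_ms, value))
    return kept, discarded


kept, discarded = drop_duplicate_timestamps([(1000, 1.0), (2000, 2.0), (1000, 3.0)])
```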
Experimental features
Grafana Mimir 2.16 includes some features that are experimental and disabled by default. Use these features with caution and report any issues that you encounter:
Distributors now include experimental support for the Influx line protocol.
Query-frontends now include experimental support to "spin off" subqueries as actual range queries, so that they benefit from query acceleration techniques such as sharding, splitting, and caching.
Bug fixes
- Distributor: Use a boolean to track changes while merging the ReplicaDesc components, rather than comparing the objects directly. #10185
- Querier: Fix timeout responding to the query-frontend when the response size is within a few hundred bytes of -querier.frontend-client.grpc-max-send-msg-size. #10154
- Query-frontend and querier: Show warning and info annotations in some cases where they were missing (if a lazy querier was used). #10277
- Query-frontend: Fix an issue where transient errors are inadvertently cached. #10537 #10631
- Ruler: Fix indeterminate rules always, instead of never, running concurrently when -ruler.max-independent-rule-evaluation-concurrency is set. prometheus/prometheus#15560 #10258
- PromQL: Fix various UTF-8 bugs related to quoting. prometheus/prometheus#15531 #10258
- Ruler: Fix an issue when using the experimental -ruler.max-independent-rule-evaluation-concurrency feature, where if a rule group was eligible for concurrency, it would flap between running concurrently or not based on the time it took after running concurrently. #9726 #10189
- Mimirtool: remote-read commands now return data. #10286
- PromQL: Fix deriv, predict_linear and double_exponential_smoothing with histograms. prometheus/prometheus#15686 #10383
- MQE: Fix deriv with histograms. #10383
- PromQL: Fix <aggr_over_time> functions with histograms. prometheus/prometheus#15711 #10400
- MQE: Fix <aggr_over_time> functions with histograms. #10400
- Distributor: Return HTTP status 415, Unsupported Media Type, instead of 200, Success, for Remote Write 2.0 until we support it. #10423 #10916
- Query-frontend: Add -query-frontend.prom2-range-compat flag and corresponding YAML to rewrite queries with ranges that worked in Prometheus 2 but are invalid in Prometheus 3. #10445 #10461 #10502
- Distributor: Fix edge case at the HA-tracker with memberlist as KVStore, where when a replica in the KVStore is marked as deleted but not yet removed, it fails to update the KVStore. #10443
- Distributor: Fix panics in DurationWithJitter util functions when computed variance is zero. #10507
- Ingester: Fixed a race condition in the PostingsForMatchers cache that may have infrequently returned expired cached postings. #10500
- Distributor: Report partially converted OTLP requests with status 400, Bad Request. #10588
- Ruler: Fix issue where rule evaluations could be missed while shutting down a ruler instance if that instance owns many rule groups. prometheus/prometheus#15804 #10762
- Ingester: Add additional check on reactive limiter queue sizes. #10722
- TSDB: Fix unknown series errors and possible lost data during WAL replay when series are removed from the head due to inactivity and reappear before the next WAL checkpoint. prometheus/prometheus#16060 #10824
- Querier: Fix issue where label_join could incorrectly return multiple series with the same labels rather than failing with "vector cannot contain metrics with the same labelset". prometheus/prometheus#15975 #10826
- Querier: Fix issue where counter resets on native histograms could be incorrectly under or over-counted when using subqueries. prometheus/prometheus#15987 #10871
- Ingester: Fix goroutines and memory leak when experimental ingest storage is enabled and a server-side error occurs during metrics ingestion. #10915
- Mimirtool: Fix issue where MIMIR_HTTP_PREFIX environment variable was ignored and the value from MIMIR_MIMIR_HTTP_PREFIX was used instead. #10207
Helm chart improvements
The Grafana Mimir and Grafana Enterprise Metrics Helm chart is released independently.
Refer to the Grafana Mimir Helm chart documentation.
Changelog
2.16.0
Grafana Mimir
- [CHANGE] Querier: pass context to queryable IsApplicable hook. #10451
- [CHANGE] Distributor: OTLP and push handler replace all non-UTF8 characters with the unicode replacement character \uFFFD in error messages before propagating them. #10236
- [CHANGE] Querier: pass query matchers to queryable IsApplicable hook. #10256
- [CHANGE] Build: removed Mimir Alpine Docker image and related CI tests. #10469
- [CHANGE] Query-frontend: Add topic label to cortex_ingest_storage_strong_consistency_requests_total, cortex_ingest_storage_strong_consistency_failures_total, and cortex_ingest_storage_strong_consistency_wait_duration_seconds metrics. #10220
- [CHANGE] Ruler: cap the rate of retries for remote query evaluation to 170/sec. This is configurable via -ruler.query-frontend.max-retries-rate. #10375 #10403
- [CHANGE] Query-frontend: Add topic label to cortex_ingest_storage_reader_last_produced_offset_requests_total, cortex_ingest_storage_reader_last_produced_offset_failures_total, cortex_ingest_storage_reader_last_produced_offset_request_duration_seconds, cortex_ingest_storage_reader_partition_start_offset_requests_total, cortex_ingest_storage_reader_partition_start_offset_failures_total, cortex_ingest_storage_reader_partition_start_offset_request_duration_seconds metrics. #10462
- [CHANGE] Ingester: Set -ingester.ooo-native-histograms-ingestion-enabled to true by default. #10483
- [CHANGE] Ruler: Add user and reason labels to cortex_ruler_write_requests_failed_total and cortex_ruler_queries_failed_total; add user to cortex_ruler_write_requests_total and cortex_ruler_queries_total metrics. #10536
- [CHANGE] Querier / Query-frontend: Remove experimental -querier.promql-experimental-functions-enabled and -query-frontend.block-promql-experimental-functions CLI flags and respective YAML configuration options to enable experimental PromQL functions. Instead, access to experimental PromQL functions is always blocked. You can enable them using the per-tenant setting enabled_promql_experimental_functions. #10660 #10712
- [CHANGE] Store-gateway: Include posting sampling rate in sparse index headers. When the sampling rate isn't set in a sparse index header, store gateway rebuilds the sparse header with the configured blocks-storage.bucket-store.posting-offsets-in-mem-sampling value. If the sparse header's sampling rate is set but doesn't match the configured rate, store gateway either rebuilds the sparse header or downsamples to the configured sampling rate. #10684 #10878
- [CHANGE] Distributor: Return specific error message when burst size limit is exceeded. #10835
- [CHANGE] Ingester: enable native histograms ingestion by default, meaning ingester.native-histograms-ingestion-enabled defaults to true. #10867
- [FEATURE] Ingester/Distributor: Add support for exporting cost attribution metrics (cortex_ingester_attributed_active_series, cortex_distributor_received_attributed_samples_total, and cortex_discarded_attributed_samples_total) with labels specified by customers to a custom Prometheus registry. This feature enables more flexible billing data tracking. #10269 #10702
- [FEATURE] Ruler: Added /ruler/tenants endpoints to list the discovered tenants with rule groups. #10738
- [FEATURE] Distributor: Add experimental Influx handler. #10153
- [ENHANCEMENT]...
2.15.1
This release contains 8 PRs from 3 authors. Thank you!
Changelog
2.15.1
Grafana Mimir
- [BUGFIX] Update module github.com/golang/glog to v1.2.4 to address CVE-2024-45339. #10541
- [BUGFIX] Update module github.com/go-jose/go-jose/v4 to v4.0.5 to address CVE-2025-27144. #10783
- [BUGFIX] Update module golang.org/x/oauth2 to v0.27.0 to address CVE-2025-22868. #10803
- [BUGFIX] Update module golang.org/x/crypto to v0.35.0 to address CVE-2025-22869. #10804
- [BUGFIX] Upgrade Go to 1.23.7 to address CVE-2024-45336, CVE-2024-45341, and CVE-2025-22866. #10862
All changes in this release: mimir-2.15.0...mimir-2.15.1
2.14.3
This release contains 4 PRs from 3 authors. Thank you!
Changelog
2.14.3
Grafana Mimir
- [BUGFIX] Update golang.org/x/crypto to address CVE-2024-45337. #10251
- [BUGFIX] Update golang.org/x/net to address CVE-2024-45338. #10298
All changes in this release: mimir-2.14.2...mimir-2.14.3
2.15.0
This release contains 487 PRs from 61 authors, including new contributors Alexander Akhmetov, Daan Schipper, Daniel Kovacs, I. Elisa Pasaoglu, Jay Clifford, Jorge Alberto Díaz Orozco (Akiel), Martin Valiente Ainz, Mia, Michael Tweten, Nikos Angelopoulos, Santi Leira, cui fliter, elsoa-invitech, madhu-reddy-peram. Thank you!
Grafana Mimir version 2.15.0 release notes
Grafana Labs is excited to announce version 2.15 of Grafana Mimir.
The highlights that follow include the top features, enhancements, and bug fixes in this release.
For the complete list of changes, refer to the CHANGELOG.
Features and enhancements
S2 compression for gRPC is now supported using the following flags:
-alertmanager.alertmanager-client.grpc-compression=s2
-ingester.client.grpc-compression=s2
-querier.frontend-client.grpc-compression=s2
-querier.scheduler-client.grpc-compression=s2
-query-frontend.grpc-client-config.grpc-compression=s2
-query-scheduler.grpc-client-config.grpc-compression=s2
-ruler.client.grpc-compression=s2
-ruler.query-frontend.grpc-client-config.grpc-compression=s2
Distributors now support lz4 OTLP compression, and you can deploy them in multiple availability zones.
The ruler's <prometheus-http-prefix>/api/v1/rules endpoint now supports the exclude_alerts, group_limit, and group_next_token parameters.
mimirtool's analyze ruler and analyze prometheus commands now support bearer tokens.
You can now tune HTTP client settings for GCS and Azure backends via an http block or corresponding CLI flags.
The compactor now refreshes deletion marks concurrently when updating the bucket index.
You can now set the number of Memcached replicas for each type of cache using configuration settings when using jsonnet.
Important changes
In Grafana Mimir 2.15, the following behavior has changed:
The following alertmanager metrics are not exported for a given user when the metric value is zero:
cortex_alertmanager_alerts_received_total
cortex_alertmanager_alerts_invalid_total
cortex_alertmanager_partial_state_merges_total
cortex_alertmanager_partial_state_merges_failed_total
cortex_alertmanager_state_replication_total
cortex_alertmanager_state_replication_failed_total
cortex_alertmanager_alerts
cortex_alertmanager_silences
PromQL compatibility has been upgraded from Prometheus 2.0 to 3.0. For more details, refer to the Prometheus documentation. The following changes are of note:
- The . pattern in regular expressions in PromQL now matches newline characters.
- Lookback and range selectors are left-open and right-closed. They were previously left-closed and right-closed.
- Native histograms now use exponential interpolation.
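To make the change to range selector boundaries concrete, the following sketch compares which sample timestamps fall inside a 60-second range under the previous left-closed semantics and the new left-open semantics. The `select_samples` helper and the timestamps are made up for illustration and do not reflect the PromQL engine's internals.

```python
def select_samples(timestamps, eval_time, range_seconds, left_open=True):
    """Return the timestamps covered by a range selector ending at eval_time.

    left_open=True models Prometheus 3 semantics: (eval_time - range, eval_time].
    left_open=False models Prometheus 2 semantics: [eval_time - range, eval_time].
    """
    start = eval_time - range_seconds
    if left_open:
        return [t for t in timestamps if start < t <= eval_time]
    return [t for t in timestamps if start <= t <= eval_time]


# Samples scraped every 15 seconds, with a 60-second range evaluated at t=60.
timestamps = [0, 15, 30, 45, 60]
prom3 = select_samples(timestamps, eval_time=60, range_seconds=60)
prom2 = select_samples(timestamps, eval_time=60, range_seconds=60, left_open=False)
```

In this example the sample sitting exactly on the left boundary (t=0) is selected under the old semantics but not under the new one, so a range that is an exact multiple of the scrape interval covers one sample fewer than before.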
Backwards compatibility in dashboards and alerts for thanos_memcached_-prefixed metrics has been removed. These metrics were removed in 2.12 in favor of thanos_cache_-prefixed metrics.
Experimental support for Redis as a cache backend has been removed from jsonnet.
The following deprecated configuration options were removed in this release:
- The -distributor.direct-otlp-translation-enabled option, which has been enabled by default since 2.13 and is now considered stable.
- The -query-scheduler.prioritize-query-components option, which is now always enabled.
- The -api.get-request-for-ingester-shutdown-enabled option, a deprecated experimental flag which has been marked for removal in 2.15.
Experimental features
Grafana Mimir 2.15 includes some features that are experimental and disabled by default.
Use these features with caution and report any issues that you encounter:
You can now enable Mimir's experimental PromQL engine with -querier.query-engine=mimir. This new engine provides improved performance and reduced querier resource consumption. However, it only supports a subset of all PromQL features. It falls back to the regular Prometheus engine for queries containing unsupported features.
You can now use the query-frontend to cache non-transient errors using the experimental flags -query-frontend.cache-errors and -query-frontend.results-cache-ttl-for-errors.
The query-frontend and querier both support an experimental PromQL function, double_exponential_smoothing, which you can enable by setting -querier.promql-experimental-functions-enabled=true and -query-frontend.promql-experimental-functions-enabled=true.
The ingester now supports out-of-order native histogram ingestion via the flag -ingester.ooo-native-histograms-ingestion-enabled.
The ingester can now build 24h blocks for out-of-order data which is more than 24 hours old, using the setting -blocks-storage.tsdb.bigger-out-of-order-blocks-for-old-samples.
The ruler now supports caching the contents of rule groups via the setting -ruler-storage.cache.rule-group-enabled.
The distributor now supports promotion of OTel resource attributes to labels via the setting -distributor.promote-otel-resource-attributes.
Bug fixes
- Alerts: Fix autoscaling metrics joins in MimirAutoscalerNotActive when series churn.
- Alerts: Exclude failed cache "add" operations from alerting since failures are expected in normal operation.
- Alerts: Exclude read-only replicas from IngesterInstanceHasNoTenants alert.
- Alerts: Use resident set memory for the EtcdAllocatingTooMuchMemory alert so that ephemeral file cache memory doesn't cause the alert to misfire.
- Dashboards: Fix autoscaling metrics joins when series churn.
- Distributor: Fix pooling buffer reuse logic when -distributor.max-request-pool-buffer-size is set.
- Ingester: Fix issue where active series requests error when encountering a stale posting.
- Ingester: Fix race condition in per-tenant TSDB creation.
- Ingester: Fix race condition in exemplar adding.
- Ingester: Fix race condition in native histogram appending.
- Ingester: Fix bug in concurrent fetching where a failure to list topics on startup would cause an invalid topic ID (0x00000000000000000000000000000000) to be used.
- Ingester: Fix data loss bug in the experimental ingest storage when a Kafka Fetch is split into multiple requests and some of them return an error.
- Ingester: Fix bug where chunks could have one unnecessary zero byte at the end.
- OTLP: Support integer exemplar value type.
- OTLP receiver: Preserve colons and combine multiple consecutive underscores into one when generating metric names in suffix adding mode (-distributor.otel-metric-suffixes-enabled).
- Prometheus: Fix issue where negation of native histograms (e.g. -some_native_histogram_series) did nothing.
- Prometheus: Always return unknown hint for first sample in non-gauge native histograms chunk to avoid incorrect counter reset hints when merging chunks from different sources.
- Prometheus: Ensure native histograms counter reset hints are corrected when merging results from different sources.
- PromQL: Fix issue where functions such as rate() over native histograms could return incorrect values if a float stale marker was present in the selected range.
- PromQL: round now removes the metric name again.
- PromQL: Fix issue where the "metric might not be a counter, name does not end in _total/_sum/_count/_bucket" annotation would be emitted even if rate or increase did not have enough samples to compute a result.
- Querier: Fix the behavior of binary operators between native histograms and floats.
- Querier: Fix stddev+stdvar aggregations to always ignore native histograms, and to treat infinity consistently.
- Query-frontend: Fix issue where sharded queries could return annotations with incorrect or confusing position information.
- Query-frontend: Fix issue where downstream consumers may not generate correct cache keys for experimental error caching.
- Query-frontend: Support X-Read-Consistency-Offsets on labels queries too.
- Ruler: Fix issue when using the experimental -ruler.max-independent-rule-evaluation-concurrency feature, where the ruler could panic as it updates a running ruleset or shuts down.
Helm chart improvements
The Grafana Mimir and Grafana Enterprise Metrics Helm charts are released independently.
Refer to the Grafana Mimir Helm chart documentation.
Changelog
2.15.0
Grafana Mimir
- [CHANGE] Alertmanager: the following metrics are not exported for a given user when the metric value is zero: #9359
cortex_alertmanager_alerts_received_total
cortex_alertmanager_alerts_invalid_total
cortex_alertmanager_partial_state_merges_total
cortex_alertmanager_partial_state_merges_failed_total
cortex_alertmanager_state_replication_total
cortex_alertmanager_state_replication_failed_total
cortex_alertmanager_alerts
cortex_alertmanager_silences
- [CHANGE] Distributor: Drop experimental -distributor.direct-otlp-translation-enabled flag, since direct OTLP translation is well tested at this point. #9647
- [CHANGE] Ingester: Change -initial-delay for circuit breakers to begin when the first request is received, rather than at breaker activation. #9842
- [CHANGE] Query-frontend: apply query pruning before query sharding instead of after. #9913
- [CHANGE] Ingester: remove experimental flags -ingest-storage.kafka.ongoing-records-per-fetch and -ingest-storage.kafka.startup-records-per-fetch. They are removed in favour of -ingest-storage.kafka.max-buffered-bytes. #9906
- [CHANGE] Ingester: Replace cortex_discarded_samples_total label from sample-out-of-bounds to sample-timestamp-too-old. #9885
- [CHANGE] Ruler: the /prometheus/config/v1/rules does not return an error anymore if a rule group is missing in the object storage after being successfully returned by listing the storage,...
2.15.0-rc.0
This release contains 493 PRs from 61 authors, including new contributors Alexander Akhmetov, Daan Schipper, Daniel Kovacs, I. Elisa Pasaoglu, Jay Clifford, Jorge Alberto Díaz Orozco (Akiel), Martin Valiente Ainz, Mia, Michael Tweten, Nikos Angelopoulos, Santi Leira, cui fliter, elsoa-invitech, madhu-reddy-peram. Thank you!
Grafana Mimir version 2.15.0-rc.0 release notes
Grafana Labs is excited to announce version 2.15 of Grafana Mimir.
The highlights that follow include the top features, enhancements, and bug fixes in this release.
For the complete list of changes, refer to the CHANGELOG.
Features and enhancements
S2 compression for gRPC is now supported using the following flags:
-alertmanager.alertmanager-client.grpc-compression=s2
-ingester.client.grpc-compression=s2
-querier.frontend-client.grpc-compression=s2
-querier.scheduler-client.grpc-compression=s2
-query-frontend.grpc-client-config.grpc-compression=s2
-query-scheduler.grpc-client-config.grpc-compression=s2
-ruler.client.grpc-compression=s2
-ruler.query-frontend.grpc-client-config.grpc-compression=s2
Distributors now support lz4 OTLP compression, and can be deployed in multiple availability zones.
The ruler's <prometheus-http-prefix>/api/v1/rules endpoint now supports exclude_alerts, group_limit, and group_next_token parameters.
mimirtool's analyze ruler/prometheus commands now support bearer tokens.
HTTP client settings can now be tuned for GCS and Azure backends via an http block or corresponding CLI flags.
The compactor now refreshes deletion marks concurrently when updating the bucket index.
The number of Memcached replicas for each type of cache can now be set using configuration settings when using jsonnet.
Important changes
In Grafana Mimir 2.15, the following behavior has changed:
The following alertmanager metrics are not exported for a user when the metric value is zero:
cortex_alertmanager_alerts_received_total
cortex_alertmanager_alerts_invalid_total
cortex_alertmanager_partial_state_merges_total
cortex_alertmanager_partial_state_merges_failed_total
cortex_alertmanager_state_replication_total
cortex_alertmanager_state_replication_failed_total
cortex_alertmanager_alerts
cortex_alertmanager_silences
PromQL compatibility has been upgraded from Prometheus 2.0 to 3.0. More details can be found in the Prometheus documentation. The following changes are of note:
- The . pattern in regular expressions in PromQL now matches newline characters.
- Lookback and range selectors are left open and right closed (previously left closed and right closed).
- Native histograms now use exponential interpolation.
Backwards compatibility in dashboards and alerts for thanos_memcached_-prefixed metrics has been removed. These metrics were removed in 2.12 in favor of thanos_cache_-prefixed metrics.
Experimental support for Redis as a cache backend has been removed from jsonnet.
The following deprecated configuration options were removed in this release:
- The -distributor.direct-otlp-translation-enabled option, which has been enabled by default since 2.13 and is now considered stable.
- The -query-scheduler.prioritize-query-components option, which is always enabled now.
- The -ingest-storage.kafka.ongoing-records-per-fetch and -ingest-storage.kafka.startup-records-per-fetch options, which have been removed in favour of -ingest-storage.kafka.max-buffered-bytes.
- The -api.get-request-for-ingester-shutdown-enabled option, a deprecated experimental flag which has been marked for removal in 2.15.
Experimental features
Grafana Mimir 2.15 includes some features that are experimental and disabled by default.
Use these features with caution and report any issues that you encounter:
Mimir's experimental PromQL engine can now be enabled with -querier.query-engine=mimir. This new engine provides improved performance and reduced querier resource consumption; however, it supports only a subset of all PromQL features. It will fall back to Prometheus' engine for queries containing unsupported features.
The query-frontend can now cache non-transient errors using the experimental flags -query-frontend.cache-errors and -query-frontend.results-cache-ttl-for-errors.
The query-frontend and querier both support an experimental PromQL function, double_exponential_smoothing, which can be enabled by setting -querier.promql-experimental-functions-enabled=true and -query-frontend.promql-experimental-functions-enabled=true.
The ingester can now support out-of-order native histogram ingestion via the flag -ingester.ooo-native-histograms-ingestion-enabled.
The ingester can now build 24h blocks for out-of-order data which is >24h old, using the setting -blocks-storage.tsdb.bigger-out-of-order-blocks-for-old-samples.
The ruler now supports caching the contents of rule groups via the setting -ruler-storage.cache.rule-group-enabled.
The distributor now supports promotion of OTel resource attributes to labels via the setting -distributor.promote-otel-resource-attributes.
Bug fixes
- Alerts: Fix autoscaling metrics joins in MimirAutoscalerNotActive when series churn.
- Alerts: Exclude failed cache "add" operations from alerting since failures are expected in normal operation.
- Alerts: Exclude read-only replicas from IngesterInstanceHasNoTenants alert.
- Alerts: Use resident set memory for the EtcdAllocatingTooMuchMemory alert so that ephemeral file cache memory doesn't cause the alert to misfire.
- Dashboards: Fix autoscaling metrics joins when series churn.
- Distributor: Fix pooling buffer reuse logic when -distributor.max-request-pool-buffer-size is set.
- Ingester: Fix issue where active series requests error when encountering a stale posting.
- Ingester: Fix race condition in per-tenant TSDB creation.
- Ingester: Fix race condition in exemplar adding.
- Ingester: Fix race condition in native histogram appending.
- Ingester: Fix bug in concurrent fetching where a failure to list topics on startup would cause an invalid topic ID (0x00000000000000000000000000000000) to be used.
- Ingester: Fix data loss bug in the experimental ingest storage when a Kafka Fetch is split into multiple requests and some of them return an error.
- Ingester: Fix bug where chunks could have one unnecessary zero byte at the end.
- OTLP: Support integer exemplar value type.
- OTLP receiver: Preserve colons and combine multiple consecutive underscores into one when generating metric names in suffix adding mode (-distributor.otel-metric-suffixes-enabled).
- Prometheus: Fix issue where negation of native histograms (e.g. -some_native_histogram_series) did nothing.
- Prometheus: Always return unknown hint for first sample in non-gauge native histograms chunk to avoid incorrect counter reset hints when merging chunks from different sources.
- Prometheus: Ensure native histograms counter reset hints are corrected when merging results from different sources.
- PromQL: Fix issue where functions such as rate() over native histograms could return incorrect values if a float stale marker was present in the selected range.
- PromQL: round now removes the metric name again.
- PromQL: Fix issue where the "metric might not be a counter, name does not end in _total/_sum/_count/_bucket" annotation would be emitted even if rate or increase did not have enough samples to compute a result.
- Querier: Fix the behaviour of binary operators between native histograms and floats.
- Querier: Fix stddev+stdvar aggregations to always ignore native histograms, and to treat Infinity consistently.
- Query-frontend: Fix issue where sharded queries could return annotations with incorrect or confusing position information.
- Query-frontend: Fix issue where downstream consumers may not generate correct cache keys for experimental error caching.
- Query-frontend: Support X-Read-Consistency-Offsets on labels queries too.
- Ruler: Fix issue when using the experimental -ruler.max-independent-rule-evaluation-concurrency feature, where the ruler could panic as it updates a running ruleset or shuts down.
Helm chart improvements
The Grafana Mimir and Grafana Enterprise Metrics Helm charts are released independently.
Refer to the Grafana Mimir Helm chart documentation.
Changelog
2.15.0-rc.0
Grafana Mimir
- [CHANGE] Alertmanager: the following metrics are not exported for a given user when the metric value is zero: #9359
cortex_alertmanager_alerts_received_total
cortex_alertmanager_alerts_invalid_total
cortex_alertmanager_partial_state_merges_total
cortex_alertmanager_partial_state_merges_failed_total
cortex_alertmanager_state_replication_total
cortex_alertmanager_state_replication_failed_total
cortex_alertmanager_alerts
cortex_alertmanager_silences
- [CHANGE] Distributor: Drop experimental -distributor.direct-otlp-translation-enabled flag, since direct OTLP translation is well tested at this point. #9647
- [CHANGE] Ingester: Change -initial-delay for circuit breakers to begin when the first request is received, rather than at breaker activation. #9842
- [CHANGE] Query-frontend: apply query pruning before query sharding instead of after. #9913
- [CHANGE] Ingester: remove experimental flags -ingest-storage.kafka.ongoing-records-per-fetch and -ingest-storage.kafka.startup-records-per-fetch. They are removed in favour of -ingest-storage.kafka.max-buffered-bytes. #9906
- [CHANGE] Ingester: Replace cortex_discarded_samples_total label from sample-out-of-bounds to sample-timestamp-too-old. #9885
- [CHANGE] Ruler: the /prometheus/config/v1/rules do...
2.13.1
This release contains 9 PRs from 4 authors. Thank you!
Changelog
2.13.1
Grafana Mimir
- [BUGFIX] Upgrade Go to 1.22.9 to address CVE-2024-34156. #10097
- [BUGFIX] Update module google.golang.org/grpc to v1.64.1 to address GHSA-xr7q-jx4m-x55m. #8717
- [BUGFIX] Upgrade github.com/rs/cors to v1.11.0 to address GHSA-mh55-gqvf-xfwm. #8611
All changes in this release: mimir-2.13.0...mimir-2.13.1
2.14.2
This release contains 3 PRs from 2 authors. Thank you!
Changelog
2.14.2
Grafana Mimir
- [BUGFIX] Query-frontend: Do not break scheduler connection on malformed queries. #9833
All changes in this release: mimir-2.14.1...mimir-2.14.2
2.14.1
This release contains 2 PRs from 2 authors. Thank you!
Changelog
2.14.1
Grafana Mimir
- [BUGFIX] Update objstore library to resolve issues observed for some S3-compatible object stores, which respond to StatObject with Range incorrectly. #9625
All changes in this release: mimir-2.14.0...mimir-2.14.1
2.14.0
This release contains 599 PRs from 66 authors, including new contributors Adrian Berger, Albert Kerr, Alexander Davis, Alyssa Wada, Aofei Sheng, Bailhache Pierre, Bradley, David Stevens, Davin Kevin, Dennis Haney, Felipe Ferreira, Jeongseup, Nicholas Kress, Paul Farver, Pooya, Rajguru, Sephia Laureencia, Sviat Loginov, Taehyun Kim, Taylor C, Tito Lins, Willem Gillis, William Travis Holton, William Wernert, Yuri Tseretyan. Thank you!
Grafana Mimir version 2.14.0 release notes
Grafana Labs is excited to announce version 2.14 of Grafana Mimir.
The highlights that follow include the top features, enhancements, and bug fixes in this release. For the complete list of changes, refer to the CHANGELOG.
Features and enhancements
The streaming of chunks from store-gateways to queriers is now enabled by default. This reduces the memory usage in queriers. This feature has been experimental since Mimir 2.10 and is now considered stable.
The compactor adds a new cortex_compactor_disk_out_of_space_errors_total counter metric that tracks how many times a compaction fails due to the compactor being out of disk space.
The distributor now replies with the Retry-After header on retriable errors by default. This protects Mimir from clients, including Prometheus, that default to retrying very quickly, making recovering from an outage easier. The feature was originally added as experimental in Mimir 2.11.
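On the client side, honoring Retry-After could look roughly like the sketch below. The `retry_delay_seconds` function, its default values, and the fallback to exponential backoff are assumptions for illustration; the sketch only handles the delay-in-seconds form of the header and expects a plain dict keyed exactly by "Retry-After".

```python
def retry_delay_seconds(headers, attempt, base=1.0, cap=60.0):
    """Pick how long a client should wait before retrying a failed push.

    Prefers the server's Retry-After value (in seconds). Falls back to capped
    exponential backoff when the header is missing or not a plain number.
    """
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return max(0.0, float(value))
        except ValueError:
            pass  # The HTTP-date form of the header is not handled in this sketch.
    return min(cap, base * (2 ** attempt))
```

For example, a response carrying `Retry-After: 5` yields a 5-second wait regardless of the attempt number, while a response without the header on the fourth attempt (`attempt=3`) falls back to 8 seconds.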
Incoming OTLP requests were previously size-limited with the distributor's -distributor.max-recv-msg-size configuration. The distributor has a new -distributor.max-otlp-request-size configuration for limiting OTLP requests. The default value is 100 MiB.
Ingesters can be marked as read-only as part of their downscaling procedure. The new prepare-instance-ring-downscale
endpoint updates the read-only status of an ingester in the ring.
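As an illustration only, a downscaling script could build the request for this endpoint as sketched below. The /ingester/ path prefix, the use of POST to set and DELETE to clear the read-only status, and the host name in the usage note are assumptions not stated in these notes; check the Mimir HTTP API reference before relying on them.

```python
import urllib.request

# Assumption: the endpoint is served under /ingester/ on the ingester's HTTP port.
DOWNSCALE_PATH = "/ingester/prepare-instance-ring-downscale"


def build_downscale_request(base_url, read_only=True):
    """Build (but do not send) a request that toggles an ingester's read-only status.

    Assumption: POST marks the ingester as read-only and DELETE reverts it.
    """
    method = "POST" if read_only else "DELETE"
    url = base_url.rstrip("/") + DOWNSCALE_PATH
    return urllib.request.Request(url, method=method)
```

Sending the request would then be a matter of passing it to urllib.request.urlopen, for example with a hypothetical base URL such as http://ingester-0:8080.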
Important changes
In Grafana Mimir 2.14, the following behavior has changed:
When running a remote read request, the querier honors the time range specified in the read hints.
The default inactivity timeout of active series in ingesters, controlled by the -ingester.active-series-metrics-idle-timeout configuration, is increased from 10m to 20m.
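To see why the reported number of active series can rise after this change, consider the simplified model below. It is not Mimir's implementation; count_active_series and its inputs are hypothetical, and it assumes a series counts as active when its most recent sample arrived within the idle timeout.

```python
from datetime import datetime, timedelta

IDLE_TIMEOUT = timedelta(minutes=20)  # new default per the notes above; previously 10m


def count_active_series(last_sample_at, now, idle_timeout=IDLE_TIMEOUT):
    """Count series whose most recent sample is no older than idle_timeout.

    last_sample_at maps a series name to the time of its latest sample.
    """
    return sum(1 for ts in last_sample_at.values() if now - ts <= idle_timeout)
```

In this model, a series that last received a sample 15 minutes ago is inactive under the old 10m default but active under the new 20m default.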
The following store-gateway defaults have changed: -blocks-storage.bucket-store.max-concurrent-queue-timeout is set to five seconds; -blocks-storage.bucket-store.index-header.lazy-loading-concurrency-queue-timeout is set to five seconds; and -blocks-storage.bucket-store.max-concurrent is set to 200.
The experimental support for Redis caching is now deprecated and set to be removed in the next major release. Users are encouraged to switch to Memcached.
The following deprecated configuration options were removed in this release:
- The -ingester.return-only-grpc-errors option in the ingester
- The -ingester.client.circuit-breaker.* options in the ingester
- The -ingester.limit-inflight-requests-using-grpc-method-limiter option in the ingester
- The -ingester.client.report-grpc-codes-in-instrumentation-label-enabled option in the distributor and ruler
- The -distributor.limit-inflight-requests-using-grpc-method-limiter option in the distributor
- The -distributor.enable-otlp-metadata-storage option in the distributor
- The -ruler.drain-notification-queue-on-shutdown option in the ruler
- The -querier.max-query-into-future option in the querier
- The -querier.prefer-streaming-chunks-from-store-gateways option in the querier and the store-gateway
- The -query-scheduler.use-multi-algorithm-query-queue option in the query-scheduler
- The YAML configuration frontend.align_queries_with_step in the query-frontend
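Before upgrading, it can help to scan existing command-line arguments for the removed options. The following is a rough sketch, not an official tool: find_removed_flags is a hypothetical helper, the flag list is copied from the list above, and the YAML-only option frontend.align_queries_with_step is not covered.

```python
# CLI flags removed in 2.14, as listed above. A trailing ".*" matches a whole group.
REMOVED_FLAGS = [
    "-ingester.return-only-grpc-errors",
    "-ingester.client.circuit-breaker.*",
    "-ingester.limit-inflight-requests-using-grpc-method-limiter",
    "-ingester.client.report-grpc-codes-in-instrumentation-label-enabled",
    "-distributor.limit-inflight-requests-using-grpc-method-limiter",
    "-distributor.enable-otlp-metadata-storage",
    "-ruler.drain-notification-queue-on-shutdown",
    "-querier.max-query-into-future",
    "-querier.prefer-streaming-chunks-from-store-gateways",
    "-query-scheduler.use-multi-algorithm-query-queue",
]


def find_removed_flags(args):
    """Return the flag names in args that match a removed option."""
    hits = []
    for arg in args:
        name = arg.split("=", 1)[0]
        if name.startswith("--"):
            name = name[1:]  # Go's flag package accepts -flag and --flag alike
        for removed in REMOVED_FLAGS:
            if removed.endswith(".*"):
                matched = name.startswith(removed[:-1])  # keep the trailing dot
            else:
                matched = name == removed
            if matched:
                hits.append(name)
                break
    return hits
```

Passing the arguments of each Mimir component to find_removed_flags returns the options that need to be dropped from the deployment before moving to 2.14.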
Experimental features
Grafana Mimir 2.14 includes some features that are experimental and disabled by default. Use these features with caution and report any issues that you encounter:
The ingester added an experimental -ingester.ignore-ooo-exemplars
configuration. When set, out-of-order exemplars are no longer reported to the remote write client.
The querier supports the experimental limitk()
and limit_ratio()
PromQL functions. This feature is disabled by default, but you can enable it with the -querier.promql-experimental-functions-enabled=true
setting in the query-frontend and the querier.
Bug fixes
- Alertmanager: fix configuration validation gap around unreferenced templates.
- Alertmanager: fix goroutine leak when stored configuration fails to apply and there is no existing tenant alertmanager.
- Alertmanager: fix receiver firewall to detect 0.0.0.0 and IPv6 interface-local multicast address as local addresses.
- Alertmanager: fix per-tenant silence limits not reloaded during runtime.
- Alertmanager: fix bugs in silences that could cause an existing silence to expire/be deleted when updating the silence fails. This could happen when the updated silence was invalid or exceeded limits.
- Alertmanager: fix help message for utf-8-strict-mode.
- Compactor: fix a race condition between different compactor replicas that may cause a deleted block to be referenced as non-deleted in the bucket index.
- Configuration: multi-line environment variables are flattened during injection to be compatible with YAML syntax.
- HA Tracker: store correct timestamp for the last-received request from the elected replica.
- Ingester: fix the sporadic not found error causing an internal server error if label names are queried with matchers during head compaction.
- Ingester, store-gateway: fix case insensitive regular expressions not correctly matching some Unicode characters.
- Ingester: fixed timestamp reported in the "the sample has been rejected because its timestamp is too old" error when the write request contains only histograms.
- Query-frontend: fix -querier.max-query-lookback and -compactor.blocks-retention-period enforcement in query-frontend when one of the two is not set.
- Query-frontend: "query stats" log includes the actual status_code when the request fails due to an error occurring in the query-frontend itself.
- Query-frontend: ensure that internal errors result in an HTTP 500 response code instead of a 422 response code.
- Query-frontend: return annotations generated during evaluation of sharded queries.
- Query-scheduler: fix a panic in request queueing.
- Querier: fix the issue where "context canceled" is logged for trace spans for requests to store-gateways that return no series when chunks streaming is enabled.
- Querier: fix issue where queries can return incorrect results if a single store-gateway returns overlapping chunks for a series.
- Querier: do not return grpc: the client connection is closing errors as HTTP 499.
- Querier: fix issue where some native histogram-related warnings were not emitted when rate() was used over native histograms.
- Querier: fix invalid query results when multiple chunks are merged.
- Querier: support optional start and end times on /prometheus/api/v1/labels, /prometheus/api/v1/label/<label>/values, and /prometheus/api/v1/series when max_query_into_future: 0.
- Querier: fix issue where both recently compacted blocks and their source blocks can be skipped during querying if store-gateways are restarting.
- Ruler: add support for draining any outstanding alert notifications before shutting down. Enable this setting with the -ruler.drain-notification-queue-on-shutdown=true CLI flag.
- Store-gateway: fixed a case where, on a quick subsequent restart, the previous lazy-loaded index header snapshot was overwritten by a partially loaded one.
- Store-gateway: store sparse index headers atomically to disk.
- Ruler: map invalid org-id errors to the 400 status code.
Helm chart improvements
The Grafana Mimir and Grafana Enterprise Metrics Helm charts are released independently. Refer to the Grafana Mimir Helm chart documentation.
Changelog
2.14.0
Grafana Mimir
- [CHANGE] Update minimal supported version of Go to 1.22. #9134
- [CHANGE] Store-gateway / querier: enable streaming chunks from store-gateways to queriers by default. #6646
- [CHANGE] Querier: honor the start/end time range specified in the read hints when executing a remote read request. #8431
- [CHANGE] Querier: return only samples within the queried start/end time range when executing a remote read request using "SAMPLES" mode. Previously, samples outside of the range could have been returned. Samples outside of the queried time range may still be returned when executing a remote read request using "STREAMED_XOR_CHUNKS" mode. #8463
- [CHANGE] Querier: Set minimum for -querier.max-concurrent to four to prevent queue starvation with querier-worker queue prioritization algorithm; values below the minimum four are ignored and set to the minimum. #9054
- [CHANGE] Store-gateway: enabled -blocks-storage.bucket-store.max-concurrent-queue-timeout by default with a timeout of 5 seconds. #8496
- [CHANGE] Store-gateway: enabled -blocks-storage.bucket-store.index-header.lazy-loading-concurrency-queue-timeout by default with a timeout of 5 seconds. #8667
- [CHANGE] Distributor: Incoming OTLP requests were previously size-limited by using limit from -distributor.max-recv-msg-size option. We have added option -distributor.max-otlp-request-size for limiting OTLP requests, with default value of 100 MiB. #8574
- [CHANGE] Distributor: remove metric cortex_distributor_sample_delay_seconds. #8698
- [CHANGE] Query-frontend: Remove deprecated frontend.align_queries_with_step YAML configuration. The configuration option has been moved to per-tenant and default limits since Mimir 2.12. #8733 #8735
- [CHANGE] Store-gateway: Change default of -blocks-storage.bucket-store.max-concurrent to 200. #8768
- [CHANGE] Added new metric `cortex_compactor_disk_ou...