Idea: Bloom Filter - CPU usage reduction

### What is the problem you are trying to solve?

Our Grafana Mimir deployment has been suffering from high CPU usage on the query path. One way to lower CPU usage is to add a Bloom filter.

### Which solution do you envision (roughly)?

1. An existing, well-established solution would be to incorporate a Bloom filter: a probabilistic data structure that can answer immediately when the requested time series data does not exist (it can return false positives but never false negatives). It is already implemented in [Loki](https://grafana.com/docs/loki/latest/operations/bloom-filters/), [Thanos](https://github.com/thanos-io/thanos/issues/1611), [M3DB](https://m3db.io/docs/architecture/m3db/storage/), etc.

  In Thanos, a TSDB with a very similar setup (also multi-tenant, with Prometheus inside), we incorporated a Cuckoo filter (a relative of the Bloom filter) on metric names only, and CPU usage dropped immediately from 50% to <20%, a reduction of more than 30 percentage points from this simple feature. See this [PR](https://github.com/thanos-io/thanos/pull/7787) for a reference implementation.

  I have also worked extensively with M3DB, which keeps a more robust Bloom filter bitset of all series contained in each fileset, so it can quickly decide whether to attempt retrieving a series from that fileset volume. In our experience with M3DB, CPU usage was never a problem.

2. A second approach, orthogonal to the Bloom filter, would be to separate storage from the query engine. Right now the ingester handles both write and read traffic, which makes it very heavy and critical. Separating the write and read paths would help a lot, not only with resource usage management but also with isolation, reducing the chance of failures on both the read and write paths.

![Image](https://github.com/user-attachments/assets/586354dc-1e3a-4cdd-a1a9-81d917c6f44b)
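To make the first option more concrete, here is a minimal sketch of a Bloom filter keyed on metric names. It is only an illustration under assumptions: the type `metricNameFilter`, its methods, and the sizing parameters are hypothetical names, not Mimir or Thanos APIs, and it is a plain Bloom filter rather than the Cuckoo filter used in the Thanos PR. The idea is that a component could consult `MayContain` before doing any index lookup and skip the work entirely when the answer is `false`.

```go
package main

import "hash/fnv"

// metricNameFilter is a hypothetical Bloom filter over metric names.
// It can report false positives, but never false negatives.
type metricNameFilter struct {
	bits []uint64 // bitset of m bits, packed into 64-bit words
	m    uint64   // total number of bits
	k    uint64   // number of probes per key
}

// newMetricNameFilter builds an empty filter with m bits and k probes.
// Zero values are bumped to 1 to keep the filter usable.
func newMetricNameFilter(m, k uint64) *metricNameFilter {
	if m == 0 {
		m = 1
	}
	if k == 0 {
		k = 1
	}
	return &metricNameFilter{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two 64-bit values from name with FNV-1a.
// The i-th probe position is (h1 + i*h2) mod m (double hashing).
func hashes(name string) (uint64, uint64) {
	h := fnv.New64a()
	h.Write([]byte(name))
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the input to obtain a second value
	h2 := h.Sum64() | 1   // force odd so consecutive probes differ
	return h1, h2
}

// Add records a metric name in the filter.
func (f *metricNameFilter) Add(name string) {
	h1, h2 := hashes(name)
	for i := uint64(0); i < f.k; i++ {
		pos := (h1 + i*h2) % f.m
		f.bits[pos/64] |= uint64(1) << (pos % 64)
	}
}

// MayContain returns false only if name was definitely never added.
func (f *metricNameFilter) MayContain(name string) bool {
	h1, h2 := hashes(name)
	for i := uint64(0); i < f.k; i++ {
		pos := (h1 + i*h2) % f.m
		if f.bits[pos/64]&(uint64(1)<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}
```

A caller would populate the filter with `Add` as series are written (or when a block is loaded) and call `MayContain` on the query path. The bit count and probe count trade memory against false-positive rate, so they would need tuning against real tenant cardinality.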

### Have you considered any alternatives?

_No response_

### Any additional context to share?

_No response_

### How long do you think this would take to be developed?

Small (<= 1 month dev)

### What are the documentation dependencies?

_No response_

### Proposer?

_No response_

Idea: Bloom Filter - CPU usage reduction #11182
