Idea: Bloom Filter - CPU usage reduction #11182

Open
@sfc-gh-chli

What is the problem you are trying to solve?

Our Grafana Mimir deployment has been suffering from high CPU usage on the query path. One way to lower that CPU usage is a Bloom filter.

Which solution do you envision (roughly)?

  1. A well-established solution would be to incorporate a Bloom filter, a probabilistic data structure that can quickly report that a time series definitely does not exist (at the cost of occasional false positives), so queries for non-existing series can return early. This is already implemented in Loki, Thanos, M3DB, etc.

In a very similar TSDB setup, Thanos (also multi-tenant, with Prometheus inside), we have incorporated a Cuckoo filter (a relative of the Bloom filter) on metric names only, and CPU usage dropped immediately from 50% to <20%, a reduction of more than 30 percentage points from this simple feature. See this PR for a reference implementation.

I have also worked extensively with M3DB, which has a more robust approach: each fileset carries a Bloom filter bitset of all the series it contains, so the database can quickly decide whether to attempt retrieving a series from that fileset volume at all. With M3DB, we never had a problem with CPU usage.

  2. A second approach, orthogonal to the Bloom filter, would be to separate storage from the query engine. Right now the ingester handles both write and read traffic, which makes it very heavy and critical. Separating the write and read paths would help a lot, not only for resource usage management but also for better isolation and a lower chance of failures on both the read and write paths.
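
To make the first approach more concrete, here is a minimal sketch of a Bloom filter over metric names in Go. It is not code from Mimir, Thanos, or M3DB: the `BloomFilter` type, its `Add` and `MayContain` methods, the FNV-based double hashing, and the sizing (1024 bits, 4 probes) are all assumptions picked for illustration only.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// BloomFilter is a hypothetical fixed-size bit set probed at k positions per key.
type BloomFilter struct {
	bits []uint64
	m    uint64 // total number of bits
	k    uint64 // number of probes per key
}

// NewBloomFilter allocates a filter with m bits and k probes (both at least 1).
func NewBloomFilter(m, k uint64) *BloomFilter {
	if m == 0 {
		m = 1
	}
	if k == 0 {
		k = 1
	}
	return &BloomFilter{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// baseHashes derives two 64-bit values from one FNV-1a pass (an illustrative choice).
func baseHashes(key string) (uint64, uint64) {
	h := fnv.New64a()
	h.Write([]byte(key))
	h1 := h.Sum64()
	h.Write([]byte{0xff}) // extend the input to obtain a second value
	h2 := h.Sum64() | 1   // force the step to be odd
	return h1, h2
}

// Add sets the k bits selected by double hashing: h1 + i*h2 mod m.
func (b *BloomFilter) Add(key string) {
	h1, h2 := baseHashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		b.bits[pos/64] |= uint64(1) << (pos % 64)
	}
}

// MayContain returns false only if the key was definitely never added.
// A true result can be a false positive and must be confirmed by a real lookup.
func (b *BloomFilter) MayContain(key string) bool {
	h1, h2 := baseHashes(key)
	for i := uint64(0); i < b.k; i++ {
		pos := (h1 + i*h2) % b.m
		if b.bits[pos/64]&(uint64(1)<<(pos%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloomFilter(1024, 4)
	for _, name := range []string{"http_requests_total", "up"} {
		f.Add(name)
	}
	fmt.Println("up:", f.MayContain("up")) // prints "up: true"
	// For a name that was never added, the answer is usually false,
	// but a false positive is possible by design.
	fmt.Println("no_such_metric:", f.MayContain("no_such_metric"))
}
```

On the query path, a `false` from `MayContain` would let the expensive series lookup be skipped entirely, while a `true` still falls through to the normal lookup. That is why the filter can only save work and never change query results.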

Have you considered any alternatives?

No response

Any additional context to share?

No response

How long do you think this would take to be developed?

Small (<= 1 month dev)

What are the documentation dependencies?

No response

Proposer?

No response

Labels: enhancement