Skip to content

Issue #1024: Calculate baseline by the fastest benchmark #2171

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

Serg046
Copy link
Contributor

@Serg046 Serg046 commented Oct 26, 2022

Fixes #1024
The idea is inside Summary.cs

Copy link
Contributor

@YegorStepanov YegorStepanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the following benchmark will be executed instead of the failed validation:

[AutomaticBaseline(AutomaticBaselineMode.Fastest)]
public class Benchmarks
{
    [Benchmark(Baseline = true)] public void M1() { }
    [Benchmark] public void M2() { Thread.Sleep(1);}
}

You can set a baseline for a Job, check it out too:

[AutomaticBaseline(AutomaticBaselineMode.Fastest)]
[SimpleJob(RuntimeMoniker.Net60, baseline: true)]
public class Benchmarks { ... }

For maintainers:

maybe we should add Slowest for consistency?
maybe we should add "Fastest"/"Slowest" for some other columns, like Allocation?

var fastestReport = reports.First();
foreach (var report in reports.Skip(1))
{
if (report.ResultStatistics.Mean < fastestReport.ResultStatistics.Mean)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResultStatistics can be null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of the failed validation:

I actually thought about it but decided to keep a new feature overriding more local configurations. Added a validation.

maybe we should add "Fastest"/"Slowest" for some other columns, like Allocation?

Could make sense but not to me. Ratio column is enough for me, the fastest is just a baseline (ratio == 1)

ResultStatistics can be null.

Fixed

Thanks for your review Yegor 👍

Reports = BenchmarksCases.Select(b => ReportMap[b]).ToImmutableArray(); // we use sorted collection to re-create reports list
BaseliningStrategy = BaseliningStrategy.Create(BenchmarksCases);
Style = GetConfiguredSummaryStyleOrDefaultOne(BenchmarksCases).WithCultureInfo(cultureInfo);
Table = GetTable(Style);
AllRuntimes = BuildAllRuntimes(HostEnvironmentInfo, Reports);
}

private static BenchmarkCase GetFastestBenchmarkCase(ImmutableArray<BenchmarkReport> reports)
{
if (reports.Any() && reports.All(r => r.BenchmarkCase.Config.AutomaticBaselineMode == AutomaticBaselineMode.Fastest))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reports.All(r => r.BenchmarkCase.Config.AutomaticBaselineMode == AutomaticBaselineMode.Fastest))

Is it correctly works when 2 benchmark classes have different AutomaticBaselineMode?

[AutomaticBaseline(AutomaticBaselineMode.Fastest)]
public class C1 { }

[AutomaticBaseline(AutomaticBaselineMode.None)]
public class C2 { }

To join summaries, use:

  1. --join (for cmd)
  2. BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).RunAllJoined()

Copy link
Contributor Author

@Serg046 Serg046 Oct 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea itself works because the benchmarks are already grouped by the type so that Summary contains the cases of either C1 or C2 only. However, this particular example doesn't print the ratio for all the methods. I guess BDN trims columns that don't exist in all results (C1's benchmarks contain ration but C2's ones don't). If you set AutomaticBaselineMode.Fastest for C2, it prints correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ManualConfig.Union(...) resets the AutomaticBaselineMode:

config = ManualConfig.Union(config, configFromAttribute.Config) // C1, set the value to "Fastest"
config = ManualConfig.Union(config, configFromAttribute.Config) // C2, set the value to "None" (even if the `[AutomaticBaseline(None)]` is not used)

The problem is that the final config has only one value of AutomaticBaselineMode, but it should have a different value for each benchmark class to allow:

[AutomaticBaseline(AutomaticBaselineMode.Fastest)] public class B1 { }
public class B2 { } //implicit AutomaticBaselineMode.None
[AutomaticBaseline(AutomaticBaselineMode.Slowest)] public class B3 { }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is not like that, the configs are different. I see that this inferredBaselineBenchmarkCase is calculated properly. There is probably a need to add a column or something like that, I will try to identify the problem.

Copy link
Contributor

@YegorStepanov YegorStepanov Oct 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is like that :)

Apply it first:

Suggested change
if (reports.Any() && reports.All(r => r.BenchmarkCase.Config.AutomaticBaselineMode == AutomaticBaselineMode.Fastest))
if (reports.Any() && reports.Any(r => r.BenchmarkCase.Config.AutomaticBaselineMode == AutomaticBaselineMode.Fastest))

Sample

[DryJob]
// [AutomaticBaseline(AutomaticBaselineMode.Fastest)]
[GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByCategory)] // it applies to all summaries, we may not duplicate it
[BenchmarkCategory("A")] // without this, the table is not split by classes
public class MyClass1
{
    [Benchmark] public void MyMethod1() => Thread.Sleep(100);
    [Benchmark] public void MyMethod2() => Thread.Sleep(200);
}

[DryJob]
// [AutomaticBaseline(AutomaticBaselineMode.Fastest)]
[BenchmarkCategory("B")]
public class MyClass2
{
    [Benchmark] public void MyMethod1() => Thread.Sleep(400);
    [Benchmark] public void MyMethod2() => Thread.Sleep(200);
}

If we uncomment AutomaticBaseline for MyClass1 only:

image

If we uncomment AutomaticBaseline for MyClass2 only (this is the correct table):

image

Debug these lines to see how IConfig.AutomaticBaselineMode changes:

foreach (var configFromAttribute in methodAttributes)
config = ManualConfig.Union(config, configFromAttribute.Config);

Copy link
Contributor Author

@Serg046 Serg046 Oct 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, you are right. I didn't notice that Summary is built the third time (where you have all 4 methods). I was checking first two Summary instances (first two benchmarks, then second two). Thank you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is what I suggest we do d20c117

  1. We skip all defaut (None) values to build configs.
  2. We take the latest non-default value to determine whether we need to override baseline functionality or we don't.

@adamsitnik
Copy link
Member

I am not very familiar with this part of BDN and if possible I would prefer @AndreyAkinshin to review it. Thanks!

Copy link
Member

@AndreyAkinshin AndreyAkinshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this feature, it looks cool! Thanks for the contribution!

My primary concern is about the tests. Thread.Sleep-based integration tests are too fragile and too slow, we should avoid them whenever it's possible. We need fast regular unit tests to cover this feature. In order to do it, we can use MockFactory to create an artifical summary based on the given type and check the column values manually. It would be a fast, reliable and isolated testing approach.

@Serg046
Copy link
Contributor Author

Serg046 commented Oct 31, 2022

I tried to fix all the notes we have here, please let me know if there is anything else to deal with @YegorStepanov @AndreyAkinshin @adamsitnik

@YegorStepanov
Copy link
Contributor

Cons of the current implementation:

We cannot extend the enum, because attribute actually set a boolean flag in config.
It make sense to drop AutomaticBaselineMode completely. (Attribute will set IConfig.IsAutomaticBaseline = true).

I like the current attribute API but it will need to rename the attribute, if we drop the enum.

I'm thinking of a more powerful solution:

Not for all users Mean is the most interesting column:

  • In the real-time apps such as games, allocation is more painful than mean time.
  • Andrey will like this: Mean is never interesting. Compare methods by percentage (like P90) or at least by Median.
  • Indeed, all the math columns can be interesting here.

API:

[AutomaticBaseline(string columnName, Strategy strategy = Maximum)]

enum Strategy { Minimum, Maximum } // AutomaticBaselineStrategy or AutomaticStrategy or BaselineStrategy?

how to use:

[AutomaticBaseline("P90")]
[AutomaticBaseline(Column.P90))]

We can compare values from the string IColumn.GetValue(...) method (if IColumn.IsNumeric == true than we do TryParse and compare them by number, not by string).
Some columns such as Runtime need special handling.

Questions:

Does it make sense to add support for assembly: to attribute?

[assembly: [AutomaticBaseline("Mean")]]

public Benchmarks
{
    // currently, it would be an error: "Do not use both AutomaticBaseline and [Benchmark(Baseline = true)]"
    [Benchmark(Baseline = true)] public void B1() { }
    [Benchmark] public void B1() => Thread.Sleep(100);
}

How can this be resolved:

  1. make assembly: attribute overridable by [Benchmark(Baseline = true)] and [class: AutomaticBaseline("another-column")]
  2. add parameter: [AutomaticBaseLine(..., bool overridable = false)]

If Serg046 does not have so much time, then I can do it within a few months.

Serg046 and others added 2 commits October 31, 2022 16:08
Co-authored-by: Yegor Stepanov <yegor.stepanov@outlook.com>
@Serg046
Copy link
Contributor Author

Serg046 commented Oct 31, 2022

The idea looks very interesting to me. If mainterners like it too, I am also open to make it happen. The only thought I have is that it looks as a separate feature. I think this should be working for both automatic and explicit/manual baseline modes, while this PR is focused on a new auto one only. We can create a new issue and discuss a feature design there. However, if we want to do something here, I am still ready for that!

@YegorStepanov
Copy link
Contributor

Are both solutions really needed? They look very similar:

[AutomaticBaseline(AutomaticBaselineMode.Fastest)]
[AutomaticBaseline("Mean")] // how to name it?

AutomaticBaselineMode.Fastest

As I wrote above, it is not clear what the Fastest means.

Why is it compared by the Mean and not the Median?
Maybe BDN calculates Fastest benchmark with complex formulas based on benchmark iterations?

If we change it to the Median, users may think that is a bug, because by default the Median column is not shown, only the Mean column.

@Serg046
Copy link
Contributor Author

Serg046 commented Oct 31, 2022

Are both solutions really needed?

No, I meant it would be great to suport it for manual/explicit mode as well as for the new AutomaticBaseline attribute. Something like that:

public Benchmarks
{
    // Existing functionality with a new feature
    [Benchmark(Baseline = true, BaselineMetric = "Mean")] public void B1() { }
}

Why is it compared by the Mean and not the Median?

Just my choice during development. I am fine to change this default to something else, e.g. Median like you say.

If we change it to the Median, users may think that is a bug, because by default the Median column is not shown, only the Mean column.

Exactly

@AndreyAkinshin
Copy link
Member

The Ratio column is not just the ratio of means (and it shouldn't be a ratio of other statistics). The current implementation can be found in BenchmarkDotNet.Mathematics.Statistics.Divide, but I'm not happy with it. I'm working on a new approach based on ratio functions (which is a part of a new summary table with a better statistical engine; the research notes can be found in my blog).
tl;dr: avoid using specific metrics in the baseline API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Calculate baseline by fastest with config
4 participants