Token throughput monitor assumes batch size is fixed but does not raise meaningful error #20235

Open
@alex-hh

Bug description

When the token throughput monitor is used with a variable batch size, the samples counter is computed incorrectly, so the sample count may not increase monotonically. The docs do say that the batch size should be fixed, but there is no explicit check for this, so the failure surfaces as an error message that is hard to understand.

For example, if the batch sizes are 1, 2, 1, then the samples values passed to throughput in update are 1, 4, 3, and a ValueError is raised:
ValueError: Expected the value to increase, last: 4, current: 3
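
The numbers above are consistent with the counter being computed as (number of iterations so far) x (current batch size). That formula is an assumption inferred from the reported values, not something confirmed against the library source. A minimal sketch in plain Python, with hypothetical helper names and no Lightning imports, reproduces the arithmetic and compares it with a running sum of batch sizes:

```python
from itertools import accumulate


def samples_assuming_fixed_batch(batch_sizes):
    # Hypothetical helper: iterations so far times the *current* batch size.
    # This is only correct when every batch has the same size.
    return [(i + 1) * b for i, b in enumerate(batch_sizes)]


def samples_cumulative(batch_sizes):
    # Hypothetical alternative: running total of the samples actually seen.
    return list(accumulate(batch_sizes))


def is_strictly_increasing(values):
    # True when each value is larger than the one before it.
    return all(a < b for a, b in zip(values, values[1:]))


batch_sizes = [1, 2, 1]
fixed = samples_assuming_fixed_batch(batch_sizes)
running = samples_cumulative(batch_sizes)

print(fixed, is_strictly_increasing(fixed))
print(running, is_strictly_increasing(running))
```

With positive batch sizes a running sum always increases, so accumulating the per-batch sizes would avoid the error in this scenario. Whether that fits the monitor's internal bookkeeping is a separate question.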

Is there any reason not to support variable batch sizes in the throughput monitor?

What version are you seeing the problem on?

v2.4
