Phase end cutoff mitigation #1161

ArneTR · 2025-04-21T12:00:18Z

This PR provides a mitigation method for measurement cut-offs.

This is particularly helpful if your sampling resolution is prone to undersampling.

We have seen issues for instance in:

The mechanism implemented does the following:

After each phase ends, a sleep is introduced that is as long as the sampling interval of the slowest configured provider.
Then a padding equal to the sleep just issued is applied to the cut-off time of the ending phase.
- This approach guarantees that, after running all commands in a step, we capture the last tick of the slowest provider.
The RUNTIME phase can no longer be built as before, by simply taking all values between the start and end markers of the RUNTIME phase. Since that window now contains overhead and artificial sleeps, it must be reconstructed.
- Reconstruction is done by summing up the runtime sub-phases.
- MEAN values are combined as weighted averages.
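The steps above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the helper names, the dictionary layout, the metric keys, and the choice of microsecond markers are all assumptions, and weighting MEAN values by sub-phase duration is one plausible reading of "weighted averages".

```python
import time


def slowest_resolution_ms(resolutions_ms):
    """Largest sampling interval among providers (hypothetical helper).

    Providers without a known resolution (None) are ignored.
    """
    known = [r for r in resolutions_ms if r is not None]
    return max(known, default=0)


def end_phase(phase, end_us, padding_ms, sleep=time.sleep):
    """Sleep for the padding, then push the phase cut-off back by the same amount.

    Assumes phase markers are stored in microseconds.
    """
    sleep(padding_ms / 1000)  # give the slowest provider time to emit its last tick
    phase["end"] = end_us + padding_ms * 1000
    return phase


def reconstruct_runtime(sub_phases):
    """Rebuild RUNTIME statistics from its sub-phases (hypothetical layout).

    Totals are summed; means are averaged with sub-phase duration as weight.
    """
    total_duration = sum(p["duration_us"] for p in sub_phases)
    if total_duration == 0:
        raise ValueError("cannot reconstruct a runtime phase without sub-phase durations")
    weighted = sum(p["mean_power_mw"] * p["duration_us"] for p in sub_phases)
    return {
        "duration_us": total_duration,
        "energy_uj": sum(p["energy_uj"] for p in sub_phases),
        "mean_power_mw": weighted / total_duration,
    }
```

Because the reconstruction only looks at the sub-phases, the sleeps and any overhead between them never enter the RUNTIME totals.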

Greptile Summary

This PR addresses measurement cut-off issues by implementing phase timing adjustments and runtime phase reconstruction. Key changes include:

Added _sampling_interval_padding in scenario_runner.py to store slowest provider's resolution and extend phase windows
Added reconstruct_runtime_phase() in phase_stats.py to rebuild runtime statistics by aggregating sub-phase data
Added check_largest_sampling_rate() in system_checks.py to warn when providers exceed 1000ms sampling rate
Modified phase handling to ensure last measurements are captured by extending phase windows with sleep duration
Improved runtime phase reconstruction by summing sub-phases instead of using direct measurements
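As an illustration of the sampling-rate warning mentioned in the summary, a check of this kind might look like the sketch below. It is not the body of check_largest_sampling_rate(): the function names, the provider mapping, and the use of Python's warnings module are assumptions made for the example. Only the 1000 ms threshold comes from the summary.

```python
import warnings

# Threshold taken from the summary above; everything else here is hypothetical.
MAX_SAMPLING_RATE_MS = 1000


def find_slow_providers(sampling_rates_ms, threshold_ms=MAX_SAMPLING_RATE_MS):
    """Return providers whose sampling rate exceeds the threshold.

    Providers without a configured rate (None) are skipped.
    """
    return {
        name: rate
        for name, rate in sampling_rates_ms.items()
        if rate is not None and rate > threshold_ms
    }


def warn_on_slow_providers(sampling_rates_ms):
    """Emit one warning per slow provider and return the offending mapping."""
    slow = find_slow_providers(sampling_rates_ms)
    for name, rate in slow.items():
        warnings.warn(
            f"Provider {name} samples every {rate} ms, which is above "
            f"{MAX_SAMPLING_RATE_MS} ms and will lengthen the padding after each phase"
        )
    return slow
```

A rate of exactly 1000 ms is not flagged here, since the summary speaks of providers that exceed that value.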


* main: (68 commits) Guard clause that runner.run_until may never be used without a context Software add now accepts but logs unknown endpoints other than GitHub and GitLab Added volumes key allow_unsafe and skip_unsafe technically usable via configuration. Not exposed Bump psycopg[binary] from 3.2.8 to 3.2.9 (#1188) Bump redis from 5.2.1 to 6.1.0 (#1187) Added statistical significance schedule mode [skip ci] (#1186) Bump deepdiff from 8.4.2 to 8.5.0 (#1181) Bump pylint from 3.3.6 to 3.3.7 (#1175) Adding dev-no-sleeps and dev-no-optimization to the cluster (#1185) Shm size (#1184) Shutdown in the cluster happens now dynamically depending on what is configured in shutdown_on_job_no Memory resource limits may also be plain int or float\nWill then be interpreted as bytes Bump playwright/python in /docker/auxiliary-containers/gcb_playwright (#1174) Bump psycopg[binary] from 3.2.7 to 3.2.8 (#1183) Bump hiredis from 3.1.0 to 3.1.1 (#1182) Temperature errors are reset now inline (fix): Version is allowed to be int, float or datetime too Added --dev-no-save flag (#1179) Removed duplicate ln ...


greptile-apps

3 file(s) reviewed, 3 comment(s)

lib/scenario_runner.py

lib/phase_stats.py

lib/system_checks.py

* main: (22 commits) Hotfix: Compare values were 3 orders of magnitude to low due to double division (#1191) Sampling rate rework (#1194) Phase padding can now be turned on and off (#1193) User 0 should have flow_process_duration and total duration only at 30 minutes and no data in json 'measurement' AI-Tests can now activated and deactivated in tests (Testing QoL): JS errors in frontend tests are now reported typo added no-else-raise Checking in more cases now if github detected even if path broken AI Optimisations Frontend added to FOSS version as appetizer (#1192) Allow repo URLs with unknown schemes but issues warning Revert "Test fix\nwe changed from failing on unknowns to allowing them due to allowing other vendors or private repos with reduced capbility tokens that might be cloneable but do not expose the API" general wording Runtime phase reconstruction only when runtime phase is present (fix): shutdown_on_job_no must only be non false (fix): Null check for resolution must also be in system_checks (fix): Providers without resolution must also be mappable to _sampling_interval_padding Test fix\nwe changed from failing on unknowns to allowing them due to allowing other vendors or private repos with reduced capbility tokens that might be cloneable but do not expose the API Phase end cutoff mitigation (#1161) Guard clause that runner.run_until may never be used without a context ...

ArneTR added 5 commits April 21, 2025 11:53

Checking for sampling rates > 1000 ms

be91301

Runtime phase is now reconstructed instead of just calculated inline with potential overhead

2795990

Added sleep padding to runner

7fb6126

Phase extension was not in us; Added some debug infos in detailed metrics

50c7c1f

ArneTR marked this pull request as ready for review May 15, 2025 09:37

greptile-apps bot reviewed May 15, 2025

lib/scenario_runner.py (resolved)

lib/phase_stats.py (outdated, resolved)

lib/system_checks.py (resolved)

ArneTR added 4 commits May 15, 2025 11:53

Added coalesce [skip ci]

77f595b

Test fixes

02ce250

Reworked how SCI is saved

7ef32bf

(fix): SCI was for all phases, not only for runtime

b07edde

ArneTR merged commit 68394b6 into main May 15, 2025
1 check failed

ArneTR deleted the phase-end-cutoff-mitigation branch May 15, 2025 15:24
