Phase end cutoff mitigation #1161

Merged
merged 9 commits into from
May 15, 2025

ArneTR
Member

@ArneTR ArneTR commented Apr 21, 2025

This PR provides a mitigation method for measurement cut-offs.

This is particularly helpful if your sampling resolution is prone to undersampling.

We have seen issues for instance in:

The implemented mechanism does the following:

  • After each phase ends, a sleep is introduced that is as long as the sampling interval of the slowest configured provider.
  • A padding equal to the sleep just issued is then applied to the cut-off time of the ending phase.
    • This guarantees that, after all commands in a step have run, the last tick of the slowest provider is captured.
  • The RUNTIME phase can no longer be built as it used to be, by simply taking all values between the start and end markers of the RUNTIME phase. Because it now contains overhead and artificial sleeps, it must be reconstructed.
    • Reconstruction is done by summing up the runtime sub-phases.
    • MEAN values are combined as weighted averages.
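
The padding and reconstruction steps above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `SubPhase`, `padded_phase_end`, and `reconstruct_runtime` are hypothetical names, and weighting MEAN values by sub-phase duration is an assumption about how the weighted average is formed.

```python
from dataclasses import dataclass


@dataclass
class SubPhase:
    """Hypothetical container for one runtime sub-phase."""
    name: str
    duration_us: int     # length of the sub-phase, excluding padding
    energy_total: float  # a summed (TOTAL-type) metric
    power_mean: float    # an averaged (MEAN-type) metric


def padded_phase_end(end_us: int, padding_ms: int) -> int:
    """Extend a phase's cut-off time by the sleep issued after the phase."""
    return end_us + padding_ms * 1000


def reconstruct_runtime(sub_phases: list[SubPhase]) -> dict:
    """Rebuild RUNTIME from its sub-phases instead of from raw markers."""
    total_duration = sum(p.duration_us for p in sub_phases)
    if total_duration == 0:
        raise ValueError("cannot reconstruct a runtime phase of zero duration")
    # TOTAL-type metrics are simply summed over the sub-phases.
    total_energy = sum(p.energy_total for p in sub_phases)
    # MEAN-type metrics: weighted average, weighted by duration (assumption).
    weighted_mean = (
        sum(p.power_mean * p.duration_us for p in sub_phases) / total_duration
    )
    return {
        "duration_us": total_duration,
        "energy_total": total_energy,
        "power_mean": weighted_mean,
    }
```

Summing only the sub-phases keeps the inserted sleeps and inter-phase overhead out of the RUNTIME figures, which is the stated reason the phase can no longer be read directly from its start and end markers.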

Greptile Summary

This PR addresses measurement cut-off issues by implementing phase timing adjustments and runtime phase reconstruction. Key changes include:

  • Added _sampling_interval_padding in scenario_runner.py to store slowest provider's resolution and extend phase windows
  • Added reconstruct_runtime_phase() in phase_stats.py to rebuild runtime statistics by aggregating sub-phase data
  • Added check_largest_sampling_rate() in system_checks.py to warn when providers exceed 1000ms sampling rate
  • Modified phase handling to ensure last measurements are captured by extending phase windows with sleep duration
  • Improved runtime phase reconstruction by summing sub-phases instead of using direct measurements
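
As a rough sketch of how the padding value and the sampling-rate warning described in this summary might be derived: the function names below are hypothetical and do not mirror the real `check_largest_sampling_rate()` signature, and both skipping providers without a resolution (modelled here as `None`) and falling back to a padding of 0 are assumptions.

```python
from typing import Iterable, Optional


def slowest_resolution_ms(resolutions: Iterable[Optional[int]]) -> int:
    """Return the largest known sampling interval, or 0 if none is known."""
    # Providers without a resolution are ignored (assumption).
    known = [r for r in resolutions if r is not None]
    return max(known, default=0)


def exceeds_warning_threshold(
    resolutions: Iterable[Optional[int]], threshold_ms: int = 1000
) -> bool:
    """True if the slowest provider samples less often than the threshold."""
    return slowest_resolution_ms(resolutions) > threshold_ms
```

With resolutions of 99 ms and 500 ms, the padding would be 500 ms and no warning would be raised; a single provider at 2000 ms would trigger the warning.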

ArneTR added 5 commits April 21, 2025 11:53
* main: (68 commits)
  Guard clause that runner.run_until may never be used without a context
  Software add now accepts but logs unknown endpoints other than GitHub and GitLab
  Added volumes key
  allow_unsafe and skip_unsafe technically usable via configuration. Not exposed
  Bump psycopg[binary] from 3.2.8 to 3.2.9 (#1188)
  Bump redis from 5.2.1 to 6.1.0 (#1187)
  Added statistical significance schedule mode [skip ci] (#1186)
  Bump deepdiff from 8.4.2 to 8.5.0 (#1181)
  Bump pylint from 3.3.6 to 3.3.7 (#1175)
  Adding dev-no-sleeps and dev-no-optimization to the cluster (#1185)
  Shm size (#1184)
  Shutdown in the cluster happens now dynamically depending on what is configured in shutdown_on_job_no
  Memory resource limits may also be plain int or float\nWill then be interpreted as bytes
  Bump playwright/python in /docker/auxiliary-containers/gcb_playwright (#1174)
  Bump psycopg[binary] from 3.2.7 to 3.2.8 (#1183)
  Bump hiredis from 3.1.0 to 3.1.1 (#1182)
  Temperature errors are reset now inline
  (fix): Version is allowed to be int, float or datetime too
  Added --dev-no-save flag (#1179)
  Removed duplicate ln
  ...
@ArneTR ArneTR marked this pull request as ready for review May 15, 2025 09:37
Contributor

@greptile-apps greptile-apps bot left a comment

3 file(s) reviewed, 3 comment(s)

@ArneTR ArneTR merged commit 68394b6 into main May 15, 2025
1 check failed
@ArneTR ArneTR deleted the phase-end-cutoff-mitigation branch May 15, 2025 15:24
ArneTR added a commit that referenced this pull request May 18, 2025
* main: (22 commits)
  Hotfix: Compare values were 3 orders of magnitude to low due to double division (#1191)
  Sampling rate rework (#1194)
  Phase padding can now be turned on and off (#1193)
  User 0 should have flow_process_duration and total duration only at 30 minutes and no data in json 'measurement'
  AI-Tests can now activated and deactivated in tests
  (Testing QoL): JS errors in frontend tests are now reported
  typo
  added no-else-raise
  Checking in more cases now if github detected even if path broken
  AI Optimisations Frontend added to FOSS version as appetizer (#1192)
  Allow repo URLs with unknown schemes but issues warning
  Revert "Test fix\nwe changed from failing on unknowns to allowing them due to allowing other vendors or private repos with reduced capbility tokens that might be cloneable but do not expose the API"
  general wording
  Runtime phase reconstruction only when runtime phase is present
  (fix): shutdown_on_job_no must only be non false
  (fix): Null check for resolution must also be in system_checks
  (fix): Providers without resolution must also be mappable to _sampling_interval_padding
  Test fix\nwe changed from failing on unknowns to allowing them due to allowing other vendors or private repos with reduced capbility tokens that might be cloneable but do not expose the API
  Phase end cutoff mitigation (#1161)
  Guard clause that runner.run_until may never be used without a context
  ...