Releases: sustainable-computing-io/kepler

release-v0.0.3 (reboot)

15 Apr 16:58
3949007

What's Changed

🚀 Features

  • feat(node): node level power from RAPL by @sthaha in #1996

Full Changelog: v0.0.2-reboot...v0.0.3-reboot

release-v0.0.2 (reboot)

14 Apr 14:49
v0.0.2-reboot
75a40a3

What's Changed

🚀 Features

Full Changelog: v0.0.1-reboot...v0.0.2-reboot

release-v0.0.1 (reboot)

10 Apr 12:44
v0.0.1-reboot
2825e77

v0.0.1 is out now. Please see the changelog below for details! 🚀

What's Changed

✨ Added

🐛 Fixed

Full Changelog: v0.0.0-reboot...v0.0.1-reboot

release-0.8.0

18 Mar 07:09
bfae0a7
  • b7add5d fix: false detection of Grace Hopper GPU
  • e3fb529 fix(openshift): patch deployment issue
  • b43260a fix(collector): data race when updating maps simultaneously
  • 92b8592 feat(config): Make /proc and /sys paths configurable
  • 7212f9a fix: Redfish: Use Chassis instead of Systems to obtain chassis ID
  • e86161e Fix: Bump golang.org/x/net from 0.28.0 to 0.34.0
  • 1440cc8 feat: add TLS endpoint to kepler exporter
  • 98275d0 [fix]: enhance kubelet package impl (#1894)
  • c9c2aba Fixes paths for grace arm
  • c9524a8 [fix]: try to reduce escapes to heap (#1886)
  • cf469c2 feat(sensor): support NVIDIA Grace Hopper
  • 662b0d5 fix: pre-commit auto update commit author (#1873)
  • a447077 [fix]: adjust gosec permissions settings according to GHA ref (#1867)
  • e45f6c4 [fix]: add test case with darwin OS (#1856)
  • 2ddabe2 [fix]: fix with permissions for PR pipeline
  • 133f219 [fix]: update gha permission settings
  • 59b2cfa [fix]: fix image libbpf build version (#1860)
  • 16fdd9f [fix]: base image elfutils version issue (#1859)
  • 46272c6 fix(cpu): fixed reading cpus.yaml
  • 11711e7 feat(config): allow config dir to be passed as argument (#1850)
  • 9ef9706 fix(yamllint): remove excess spaces
  • e2c3c6f fix(yamllint): disable yamllint quote rule
  • f4f5eb4 clean(process-exporter): Add yamllint fixes
  • fdcea6b fix(yamllint): Fix yamllint errors
  • 35e160b feat(process-exporter): Add process-exporter to dev and metal
  • 75b9533 fix(bpf): exclude swapper process bpf_cpu_time (#1830)
  • 6a968cd fix: epel-release installation via rpm
  • 2beddaf feat: move from ubi to ubi-minimal
  • fe78fd5 feat(validator): incorporate MAE in validations
  • 91c8e62 fix: remove .dockerignore to ensure clean version computation (#1809)
  • 26de449 feat(metrics): refactor idle power exposure

Full Changelog: v0.7.12...v0.8.0

release-0.7.12

15 Oct 06:26
3eb6297

  • 908db28 fix(bpf_collector): fix command name in case of kernel processes
  • b90a63b fix(bpf): exclude bpf overhead in bpf_cpu_time
  • 987a139 fix(metrics): Remove resource usage check for skipping bpf metrics
  • c20c23c fix: error initializing dcgm
  • 4121376 fix(aa66ada): reading libraries to the builder
  • 4c63b43 fix(validator): update the bpf cpu time query
  • 097541c fix: gosec failures (#1778)
  • 5b95acb fix: nvml/dcgm builds
  • 9d14581 fix: habana image build
  • aa66ada fix(dockerfile): remove redundant habanalabs installation steps
  • 34d27b8 fix: do not probe for power-meters when disabled
  • ecd5f54 feat(models): update acpi dyn to 0.7.11
  • 7303454 feat(models): update acpi abspower to 0.7.11
  • fcc8e0a feat(models): update intel-rapl abspower to 0.7.11
  • c84bfd0 feat(models): log model source url
  • 7b07762 feat: get trainer from model_name in weight
  • a7b9892 fix(models): add predictor name in errors
  • aa822ee feat: compute core ratio for local regressor (#1743)
  • 5d6ecc0 fix(config): trim spaces and new lines in MODEL_CONFIG
  • b2926d1 fix: limit max core ratio to 1
  • 28d42a4 feat: compute idle power with core ratio (#1732)
  • d8a6c14 feat: add machine spec generator/reader for model weight request
  • 11ff51d feat: add --disable-power-meter option
  • 414533c fix: set default trainer only for local regressor
  • d446231 fix: format ComponentModelWeights
  • a6f75a4 feat: add model_name attribute to ComponentModelWeights
  • 6858c58 fix: watcher resubmit items to workqueue (#1686)
  • eb5a72a fix(bpf): Fix overhead when sampling (#1685)
  • a97030f fix(bpf): use prev_tgid to register process
  • 5432a39 feat: customize vm_id with libvirt metadata
  • 73367d3 fix: correct regex path name for VM
  • 6a6017b feat: save validation result as json and show the static dashboard using js
  • c99e399 fix(pkg/bpf): Use channel to process events (#1671)
  • cde7833 fix: resolve pid 0 to system_processes
  • b424607 feat(kubernetes): Use workqueues
  • a39ae55 fix: typo in filename
  • d73094e fix: apply suggestions from code review
  • 5a113b2 fix(bpf): tgid is in the upper 32 bits

Full Changelog: v0.7.11...v0.7.12

release-0.7.11

11 Jul 06:10
bf1f62d

Changes

  • 1506691 - feat(validator): trigger validator workflow on changes (#1591)
  • c7b3ddb - fix(collector): convert cpu time in collection time instead of reporting time to avoid inconsistent use of cpu time in models
  • 9c80387 - bpf: account all running state processes (#1546)
  • 91fc8d4 - feat(validator): Add workflow for validator tests (#1570)
  • a2289d2 - fix: fallback to reading cpus.yaml relative to current dir (#1572)
  • bfaadae - pick up the go mod vendor changes
  • fb7ef35 - feat(metrics): selectively expose prom metrics to reduce overhead
  • d412bfb - fix: vendor/github.com/jaypipes/ghw/Dockerfile to reduce vulnerabilities (#1578)
  • 365ac03 - bpf: remove tgid map
  • c427a47 - fix(manifest): uncomment openshift SCC (#1575)
  • ec2a775 - fix(validator): improve the validator config sample (#1569)
  • 0e22839 - fix: update the VERSION variable assignment method (#1552)
  • 96dd443 - fix: Fix uncomment of YAML in hack/build-manifests.sh
  • 8931d61 - feat(validator): load validations from validations.yaml
  • 4a7bc31 - fix(compose): enable bpf cgroup id
  • a57041c - fix(bpf): Fix kepler_write_page_cache attach
  • fbe9b3c - fix(bpf): Access __state from task_struct (#1550)
  • 0b0b215 - fix(bpf): Use BTF-Defined Raw Tracepoints (#1542)
  • a08a5f6 - deps: Fix usage of textparse.NewPromParser
  • aad6964 - fix(bpf): Fix map lookup for IRQ/Page Cache
  • 9114e75 - Fix MSE and MAPE Single Queries (#1522)
  • 330a531 - fix(bpf): restore command label in process metrics
  • edd4d04 - review feedback: fix mse queries
  • d6420d5 - bump up local_dev_cluster_version version
  • 4ced508 - bpf-collector: change log verbosity to easily show it in CI
  • 34889bb - libbpf: update to use microseconds instead of milliseconds in the ebpf code because the low precision is identifying that the process was not active
  • 4337a5e - bpf: remove task time
  • 0426e8f - feat(exporter): Graceful Shutdown
  • aec3ab5 - report validator results
  • 07636b1 - Replace expected and actual query with single query (#1489)
  • 1759cca - feat(compose): add build arguments for Kepler image
  • c678217 - use pmu name to get arm cpu id since archspec does not help here
  • 8bb405f - stats: update the verbosity of annoying key error message due to missing gpu metrics (#1480)
  • 59af568 - Add Test Cases for Prometheus, Config, Stresser for Validator (#1461)
  • bdd44b1 - bpf: fix the process parameter order to match the c and go code (#1479)
  • 747e7eb - fix: ensure all entries from bpf map is copied (#1477)
  • c092204 - make: quote ldflags
  • 468ed25 - add vm name option to validator (#1474)
  • 244ae8b - feat: expose version label in kepler_build_info (#1473)
  • 6ae21a0 - update validator usage; remove job from prom query
  • 3ac4f6b - feat(cgroup): Add podman support (#1455)
  • ada7884 - fix platform power return unit (#1468)
  • 3c7e777 - fix(collector): Fix use of waitgroups
  • b134a84 - fix(cmd/validator): Don't add when passing a wg
  • 0158b0b - fix(dev-dashboard): update and correct metrics in dev dashboard
  • f92532a - add new maintainers per 05/21 community meeting vote results (#1462)
  • efad46f - provide a simple template for maintainer nominate (#1463)
  • 9e957f3 - finish kepler on rhel tests
  • dcf78e6 - fix: remove logging while collecting GPU metrics
  • 49acca9 - fix(model): Use correct variable in IsNodeComponentPowerModelEnabled() (#1458)
  • 1b93eb1 - Adding New Metric Cases to Case module (#1453)
  • ea3e2f8 - add equinix metal instance to CI
  • 53d06d4 - add PR review bot (#1446)
  • 0baec47 - feat: Fixed eBPF Feature Detection (#1443)
  • 5f59172 - fix(bpf): cleanup initialising structs and nested ifs (#1444)
  • 2bca8dc - update hack/libbpf-headers.sh script to pull v1.3.0

Full Changelog: v0.7.10...v0.7.11

release-0.7.10

15 May 23:29
54f3613

Summary

  • fix(bpfassets): Fix object file lookup (#1419)
  • feat(bpf): Build for bpfel and bpfeb
  • feat(bpf): Bump up libbpf to 1.3.0
  • fix(dashboard): show metal and VM metrics correctly (#1395)
  • doc(dev): add section on how to profile (#1396)
  • feat(bpf): Portable eBPF Probes
  • feat(test): initial version of validator tool
  • dev(compose): add manifests for validation
  • fix(collector): Fix Segmentation fault when collecting CPU Freq from BPF (#1387)
  • feat(kepler): enable pprof (#1383)
  • fix habana installation
  • fix previous pid of finish_task_switch (#1370)
  • fix: update dashboard for docker-compose
  • fix(build): reduce image size by squashing install and clean steps
  • feat(compose): add docker-compose for easier local development
  • feat(exporter): log listening port
  • fix(build): reduce container image size (#1336)

Full Changelog: v0.7.9...v0.7.10

release-0.7.8

08 Apr 17:55
bot: Updated coverage badge.

Signed-off-by: sustainable-computing-bot <bot@sustainable-computing.io>

release-0.7.8

04 Mar 14:38
bot: Updated coverage badge.

Signed-off-by: sustainable-computing-bot <bot@sustainable-computing.io>

release-0.7.7

23 Feb 14:55
c34c19a
revert rpm source (#1254)

Signed-off-by: Huamin Chen <hchen@redhat.com>