Skip to content

Update e2e-resources and e2e for local Windows testing #1185

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

TmNguyen12
Copy link
Contributor

@TmNguyen12 TmNguyen12 commented Mar 24, 2025

Description

This PR adds several resources to the e2e-resources chart that deploy onto Windows-based nodes (Windows Server 2019 & Windows Server 2022) when enabled.

Resource types include

  • deployment / replicaset
  • statefulset
  • cronjob
  • daemonset
  • failing jobs

It also includes e2e tests meant to run locally, separately from the standard e2e tests. Our current automated e2e tests run using Minikube, which does not support Windows nodes, so we follow the basic local test script pointing at a different exceptions file & specs file for Windows. Automating this will be a little involved, and will come later.

Note that several tests present for Linux are not present for Windows due to known limitations in available metrics (eg most netTx/netRx/netErrors), tests that are not relevant for Windows (anything to do with cpuCfs), & some that I haven't looked into yet (PVC, container pending). This should cover the bulk of the tests, though.

Known issues:

  • Until the PR updating the nri-kubernetes helm chart is merged, the main integration must be installed separately from the windowsUpdateHelmChart branch.
  • I don't like the way I've laid out the 2019 vs 2022 tests, but it should do for now.
  • There are only exceptions files for 1.29 & 1.32, as we do not currently have canaries for other k8s versions
  • Resource requests/limits are fairly high for the new Windows resources. Powershell won't run on the resource requests/limits we had in place for linux-based resources (too low), but we could probably tune these requests down further.

Type of change

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • New feature / enhancement (non-breaking change which adds functionality)
  • Security fix
  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • Add changelog entry following the contributing guide
  • Documentation has been updated
  • This change requires changes in testing:
    • unit tests
    • E2E tests

@TmNguyen12 TmNguyen12 requested a review from a team as a code owner March 24, 2025 20:37
@kondracek-nr kondracek-nr force-pushed the windows-e2e-tests branch 3 times, most recently from 1533135 to 897bcfa Compare April 4, 2025 23:45
@kondracek-nr kondracek-nr force-pushed the windows-e2e-tests branch 3 times, most recently from 3631c0d to 2b6f008 Compare April 9, 2025 18:51
@kondracek-nr kondracek-nr changed the title WIP: update for windows e2e testing Update e2e-resources and e2e for local Windows testing Apr 9, 2025
@kondracek-nr kondracek-nr force-pushed the windows-e2e-tests branch 2 times, most recently from 4c49647 to 35b7b07 Compare April 9, 2025 19:20
| persistentVolumeClaim.enabled | bool | `true` | Create PVCs |
| scraper.enabled | bool | `false` | Deploy the scraper pod |
| statefulSet.enabled | bool | `true` | Deploy a dummy statefulSet |
| windows.is2019 | bool | `false` | Deploy resources on Windows Server 2019 nodes |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to keep nit picking on this, but can we not use tolerations here? Really trying to avoid baking in new config options that we'll end up needing to support in the future. If the plan is to drop theses keys in a follow up PR because we're currently testing Windows, I'm okay with this. Just checking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd like to keep the option in e2e-resources because we might not always want to deploy resources for windows. is there an expectation that we need to support e2e-resources for customers?

Copy link
Contributor

@dbudziwojskiNR dbudziwojskiNR Apr 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that, but wouldn't the windows resources not deploy anyways due to tolerations? I don't think we do need to have customer support on this, it's more that I think we should minimize configs that may be "redundant".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because they are deployments they deploy and are stuck in pending because they cant' find a node.

if they were daemonsets they would not deploy

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only if the cluster the user is deploying e2e-resources to doesn't have windows nodes! our canaries, for example, have a windows node by default, so it's possible someone's just trying to test something real quick in linux nodes. i'd rather have more levers to pull to target testing, like we do with whether or not we want deployments/statefulsets/daemonsets in general.

@TmNguyen12 TmNguyen12 mentioned this pull request Apr 11, 2025
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants