-
Notifications
You must be signed in to change notification settings - Fork 11
Enable configuring containerd plugin runtime options #387
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Enable configuring containerd plugin runtime options #387
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allowing users/operators to configure shim-specific options on the Shim CRD seems like a great fit to me. The generic map[string]string approach seems fitting, since these are intended to be runtime-specific options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick re-review to also note the failing smoke test. I believe this is due to the RCM Helm chart under deploy/helm
being managed manually -- so we'll need to update the chart's crd yaml with the new runtimeOptions
config. That should be the only Helm chart update needed, right?
(side note: spotted this #390)
Is there anything we'd want to explicitly check in the case of the e2e helm install test regarding this feature? I'm aware of one higher-level piece of functionality that relies on this -- pod metrics for eg scaling -- but if there's some quick check we can add, that'd be great. Otherwise, if an automated check/test would be non-trivial and manual verification is the best option for now, fine to discuss in a follow-up. |
764326c
to
cfe6290
Compare
@vdice can you point me to where e2e testing is happening in CI? Checking that the containerd config is updated to contain a specific line is an option |
@kate-goldenring here's the current e2e smoke test: https://github.com/spinframework/runtime-class-manager/blob/main/.github/workflows/helm-chart-smoketest.yml |
|
b6d0fd7
to
500d68f
Compare
internal/containerd/install_dbus.go
Outdated
// NOTE: this limits support to systems using systemctl to manage systemd | ||
func UsesSystemd() bool { | ||
// Check if is a systemd system | ||
cmd := nsenterCmd("systemctl", "is-system-running", "--quiet") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might consider using a different command for this.
On the one hand, this current logic can be considered appropriate per the is-system-running cmd which will only exit 0 when systemd is fully up and running. On the other hand, systemd may be in other states (and this cmd will exit non-zero and fail) -- but systemd could still be considered as in use on the host. (I discovered this by pilfering the logic here for my PR and then testing on an AKS cluster, where systemd happened to be in a 'degraded' state, thus this function returned 'false'... I didn't dig into why my cluster's systemd was in this state, but it seemed to function fine otherwise.)
I ended up just using list-units, assuming systemctl
itself would not be installed on a host that doesn't use systemd, though that might not be foolproof either...
It looks like the minikube variant in the smoke test is failing on the dbus install. Seeing these logs from the installer pod (from the
|
@vdice we had this same issue in the spin shim due to expired SUSE keys. The workaround is to use a custom minikube image: spinframework/containerd-shim-spin@cc2e3de. Pushed up the fix |
@@ -17,6 +17,9 @@ spec: | |||
anonHttp: | |||
location: "https://github.com/spinframework/containerd-shim-spin/releases/download/v0.19.0/containerd-shim-spin-v2-linux-aarch64.tar.gz" | |||
|
|||
runtimeOptions: | |||
SystemdCgroup: "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far, RCM only supports containerD-based clusters. This is because the nodeinstaller's implementation is limited (for now). However, technically, the nodeinstaller could handle, e.g., OpenShift, which is based on cri-o.
Up until now, I think the shim spec was very container runtime agnostic. However, with the introduction of runtimeOption
, I'm not sure.
I'm not a CRI-O expert, but I can imagine that it requires different configurations or a different configuration format.
My question: Do we need to introduce a type to specify which runtime this configuration is for?
I'm aware that this is very hypothetical, especially since the CRD is in an alpha state and we have never talked in depth about supporting open shift until now. But I'm nevertheless curious about your thoughts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats a good point. I could see a few options:
- update this to
containerdRuntimeOptions
or something else containerd specific - in the future if we support a cri-o shim, we could add a field called
runtimeType
that has the optioncontainerd
orcrio
. This informs how to useruntimeOptions
- We make a breaking change when the time comes
Signed-off-by: Kate Goldenring <kate.goldenring@fermyon.com>
Signed-off-by: Kate Goldenring <kate.goldenring@fermyon.com>
Signed-off-by: Kate Goldenring <kate.goldenring@fermyon.com>
Signed-off-by: Kate Goldenring <kate.goldenring@fermyon.com>
Signed-off-by: Kate Goldenring <kate.goldenring@fermyon.com>
Signed-off-by: Kate Goldenring <kate.goldenring@fermyon.com>
fe6d153
to
4ddb0a1
Compare
with: | ||
container-runtime: containerd | ||
kubernetes-version: ${{ env.K8S_VERSION }} | ||
run: | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an upstream (minikube?) issue we'd monitor in anticipation of returning to the GH action in the future? (Or is this probably a long-term fix?)
0480f80
to
ac78c23
Compare
@vdice it looks like the issue of the rcm pod restarting and erroring is popping up again:
|
@kate-goldenring can you capture logs (from the |
@kate-goldenring oh sorry, I now realize that was from CI on this PR. Check out the
I'm not sure why/how one of the pods finished with |
Signed-off-by: Kate Goldenring <kate.goldenring@fermyon.com>
ac78c23
to
51e1176
Compare
Signed-off-by: Kate Goldenring <kate.goldenring@fermyon.com>
Describe your changes
Fixes #386 by enabling users to define runtime options for their shim in the
Shim
CRD. Ultimately, the node installer will inject them into the containerd config like the following to configure the SystemdCgroup.The Spin shim currently accepts a cgroup configuration option like above. Other shims may accept other options. Therefore, this implementation supports setting a generic map of strings as options.
Draft as I am a first time contributor and the maintainers may want to take a different approach. I also have not tested this locally on a cluster yet. Want to see if I am on track first.
Issue ticket number and link
Fixes #386
Checklist before requesting a review