
Conversation


@ehearne-redhat ehearne-redhat commented Oct 20, 2025

See https://issues.redhat.com/browse/OCPBUGS-62726 for reference.

What:

  • A CEL expression was added to enforce the name cluster on KubeDescheduler instances.

How:

  • A CEL validation expression was added to pkg/apis/descheduler/v1/types_descheduler.go to enforce metadata.name == 'cluster'. A sketch of the marker follows below.
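
A minimal sketch of what this could look like on the API type (assuming the standard kubebuilder XValidation marker; the type and field names follow the usual pattern and may differ slightly from the actual file):

    // KubeDescheduler is the Schema for the kubedeschedulers API.
    // The root-level rule below rejects any instance not named "cluster".
    // +kubebuilder:validation:XValidation:rule="self.metadata.name == 'cluster'",message="kubedescheduler is a singleton, .metadata.name must be 'cluster'"
    type KubeDescheduler struct {
        metav1.TypeMeta   `json:",inline"`
        metav1.ObjectMeta `json:"metadata,omitempty"`

        Spec   KubeDeschedulerSpec   `json:"spec,omitempty"`
        Status KubeDeschedulerStatus `json:"status,omitempty"`
    }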

Why:

  • If the name is not cluster, no error is surfaced to the user. This change ensures the user knows exactly why their KubeDescheduler instance did not start when the name was not cluster.

How to test

  1. Clone ehearne-redhat's fork of this repository.
  2. Check out the OCPBUGS-62726-cel-enforce-singleton-naming branch.
  3. Launch a 4.20 cluster.
  4. Log into the cluster via CLI and Console.

Test via Console

  1. Install the Kube Descheduler Operator from OperatorHub. You can find this in Ecosystem --> Software Catalog, then search for kube descheduler.
  2. Apply the manifest manifests/kube-descheduler-operator.crd.yaml to the cluster via the CLI --> oc apply -f manifests/kube-descheduler-operator.crd.yaml.
  3. Try to create a Kube Descheduler instance in the console with an invalid name.
    a. Go to Ecosystem --> Installed Operators.
    b. Change the project to openshift-kube-descheduler-operator.
    c. Click on the Kube Descheduler Operator.
    d. Click on the Kube Descheduler tab, then click on the blue Create KubeDescheduler button.
    e. Change the Name field to something other than cluster, e.g. not-cluster. Scroll to the bottom and click Create.
    f. You should see an error message stating: kubedescheduler is a singleton, .metadata.name must be 'cluster'.
    g. Now try to create the instance using the name cluster. The instance should be created as normal. Delete the instance and test via the CLI below.

Test via CLI

  1. Create a YAML file using this format (a minimal example follows this list).
  2. Change .metadata.name to something other than cluster.
  3. Apply the YAML file --> oc apply -f <your yaml filename>.yaml.
  4. You should receive the following error: The KubeDescheduler "not-cluster" is invalid: <nil>: Invalid value: "object": kubedescheduler is a singleton, .metadata.name must be 'cluster'
  5. Change .metadata.name to cluster and reapply using the same command as above.
  6. You should be able to create the instance and see a message similar to: kubedescheduler.operator.openshift.io/cluster created.
  7. You can now shut down the cluster.
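
For reference, a minimal manifest for step 1 might look like the following (the spec values are illustrative; the format linked in step 1 is authoritative):

    apiVersion: operator.openshift.io/v1
    kind: KubeDescheduler
    metadata:
      name: not-cluster            # invalid on purpose; change to "cluster" for step 5
      namespace: openshift-kube-descheduler-operator
    spec:
      deschedulingIntervalSeconds: 3600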

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 20, 2025
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Oct 20, 2025
@openshift-ci-robot

@ehearne-redhat: This pull request references Jira Issue OCPBUGS-62726, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @kasturinarra

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci bot commented Oct 20, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ehearne-redhat
Once this PR has been reviewed and has the lgtm label, please assign ingvagabund for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ehearne-redhat
Author

Will test the change on a 4.20 cluster through a custom-built image and update the description with how to test it tomorrow.


@everettraven everettraven left a comment

You'll also need to regenerate the CustomResourceDefinition.

Looks like make regen-crd is what you'll need [1].

  1. regen-crd:
    go build -o _output/tools/bin/controller-gen ./vendor/sigs.k8s.io/controller-tools/cmd/controller-gen
    cp manifests/kube-descheduler-operator.crd.yaml manifests/operator.openshift.io_kubedeschedulers.yaml
    ./_output/tools/bin/controller-gen crd paths=./pkg/apis/descheduler/v1/... schemapatch:manifests=./manifests output:crd:dir=./manifests
    mv manifests/operator.openshift.io_kubedeschedulers.yaml manifests/kube-descheduler-operator.crd.yaml

@ehearne-redhat ehearne-redhat force-pushed the OCPBUGS-62726-cel-enforce-singleton-naming branch from 0784906 to c3027f1 on October 21, 2025 08:14
@ehearne-redhat
Author

Had some build issues - will update tomorrow.

@ehearne-redhat
Author

/retest

@ehearne-redhat
Author

ehearne-redhat commented Oct 22, 2025

Hello - I can confirm that the CEL expression works in the console and through the CLI.

Console

(screenshot of the console validation error)

CLI

ehearne-mac:cluster-kube-descheduler-operator ehearne$ oc apply -f file.yaml 
The KubeDescheduler "not-cluster" is invalid: <nil>: Invalid value: "object": kubedescheduler is a singleton, .metadata.name must be 'cluster'

For some reason, .metadata.name is treated as an object and not a string. This results in an unclear message on both the CLI and the console. Is there a way around this?

@openshift-ci-robot

@ehearne-redhat: This pull request references Jira Issue OCPBUGS-62726, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @kasturinarra

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@ehearne-redhat ehearne-redhat changed the title from "[WIP] OCPBUGS-62726: add CEL expression to enforce name cluster on singletons" to "OCPBUGS-62726: add CEL expression to enforce name cluster on singletons" on Oct 22, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 22, 2025
@ehearne-redhat
Author

@ingvagabund kindly requesting your review on this PR as the validation steps are complete :)

@ehearne-redhat
Author

@everettraven would you have any thoughts on the strange error format seen in the comment above?

@ingvagabund
Member

Generating the CRD is a semi-automatic step here. I suppose this could be improved, yet there's still something new to learn about the generators. Besides those few comments, this looks good. Thank you for improving this :)

@ehearne-redhat ehearne-redhat force-pushed the OCPBUGS-62726-cel-enforce-singleton-naming branch from c3027f1 to f3c037b on October 22, 2025 12:35
@everettraven

everettraven commented Oct 22, 2025

@everettraven would you have any thoughts on the strange error format seen in an above comment?

@ehearne-redhat It is likely because of where the validation is placed (i.e. KubeDescheduler is an OpenAPI "object"). If you want a more granular error message, I believe you can set a field path to point directly to the field that is in error.

See https://book.kubebuilder.io/reference/markers/crd-validation for more information. It is easiest to find if you search the page for XValidation.
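
For example, the marker might then become something like this sketch (assuming the fieldPath argument described in the kubebuilder docs; whether the path needs a leading dot is discussed further down):

    // +kubebuilder:validation:XValidation:rule="self.metadata.name == 'cluster'",message="kubedescheduler is a singleton, .metadata.name must be 'cluster'",fieldPath=".metadata.name"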

@ehearne-redhat
Author

So it looks like, from the discussions here, that it is not possible for a CEL expression to include the metadata.name value within the error message, although it does validate the field. That's why <nil> shows up in the error message.

Getting a cleaner error message would require a validating webhook. Otherwise the user will see this unclear message when creating a KubeDescheduler instance with an invalid name through the CLI or in the console.
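
For context, a minimal sketch of what such a webhook handler could look like in Go (the handler name and wiring are illustrative and not part of this PR):

    import (
        "encoding/json"
        "fmt"
        "net/http"

        admissionv1 "k8s.io/api/admission/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // validateName rejects KubeDescheduler objects whose name is not "cluster",
    // with an error message that can reference the offending name directly.
    func validateName(w http.ResponseWriter, r *http.Request) {
        var review admissionv1.AdmissionReview
        if err := json.NewDecoder(r.Body).Decode(&review); err != nil {
            http.Error(w, err.Error(), http.StatusBadRequest)
            return
        }
        resp := &admissionv1.AdmissionResponse{UID: review.Request.UID, Allowed: true}
        if review.Request.Name != "cluster" {
            resp.Allowed = false
            resp.Result = &metav1.Status{
                Message: fmt.Sprintf("KubeDescheduler %q is invalid: .metadata.name must be 'cluster'", review.Request.Name),
            }
        }
        review.Response = resp
        json.NewEncoder(w).Encode(&review)
    }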

Is having <nil> in the error message acceptable? @ingvagabund If so, I'm happy to re-request a review and hopefully get this merged.

@ehearne-redhat
Author

/retest

@everettraven

everettraven commented Oct 22, 2025

@ehearne-redhat Even if you specify +kubebuilder:validation:XValidation:rule="...",message="...",fieldPath=".metadata.name" (the field path may not need to start with ., I don't recall exactly), it still shows the nil field?

@ingvagabund
Member

If there's a way, it's better to replace <nil> with a more readable alternative.

@ehearne-redhat
Author

@everettraven @ingvagabund I will try this out and update you shortly.

@openshift-ci
Contributor

openshift-ci bot commented Oct 22, 2025

@ehearne-redhat: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-operator f3c037b link true /test e2e-aws-operator
ci/prow/images f3c037b link true /test images

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ehearne-redhat
Author

@everettraven so I can add fieldPath='metadata.name' and make regen-crd does regenerate the CRD, but applying it to the cluster throws the following error:

ehearne-mac:cluster-kube-descheduler-operator ehearne$ oc apply -f manifests/kube-descheduler-operator.crd.yaml
Warning: resource customresourcedefinitions/kubedeschedulers.operator.openshift.io is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by oc apply. oc apply should only be used on resources created declaratively by either oc create --save-config or oc apply. The missing annotation will be patched automatically.
The CustomResourceDefinition "kubedeschedulers.operator.openshift.io" is invalid: spec.validation.openAPIV3Schema.x-kubernetes-validations[0].fieldPath: Invalid value: "metadata.name": fieldPath must be a valid path

I have checked other operators built with similar logic to this, such as the Kueue Operator, which shows a similar error (screenshot of the Kueue Operator's equivalent <nil> error).

I was able to track down when they first implemented similar logic here. However, I could not find any comment about this behaviour there.

So it looks like this way of enforcing the name cluster on singletons is quite common within OpenShift. It still seems strange that we would allow <nil> error messages to be seen by the user.

I am happy to implement a better solution using a validating webhook if preferred, but given the above I will leave it to you @ingvagabund to decide on that. :)

@everettraven

@ehearne-redhat I wonder if it is considering that path invalid because it is missing the leading dot? Looking at the tests for path validation in https://github.com/kubernetes/apiextensions-apiserver/blob/4c7c8214a2fa680ac4f485e8ed8c52a248bafb7a/pkg/apiserver/schema/cel/validation_test.go#L3750-L3756 it looks like it wants a leading dot in the path.

Does using .metadata.name (note that this starts with .) resolve the issue?

@everettraven

If not, I don't think it is a huge deal. It is pretty standard practice for us to have this check for cluster singletons and I don't think going down the path of a validating webhook is worth it for this small of an issue.

@ehearne-redhat
Author

@everettraven I should have mentioned that I did try different combinations for fieldPath, such as name, Name, .metadata.name, etc., until metadata.name was accepted by controller-gen, but applying the CRD still failed.

I think that makes sense, because it is so commonly used anyway.
