JoelSpeed
Contributor

@JoelSpeed JoelSpeed commented Aug 29, 2025

A number of our markers require information about the type before they can be applied. For example, +kubebuilder:validation:Minimum requires the type of the schema to be an integer or number.

However, if I'm using a local type declaration

type Port int32

And I then use this in a field

// +kubebuilder:validation:Minimum:=1
Port Port `json:"port"`

Currently, this doesn't work.

The reason for this is that while we fetch the type schema, we return a link to the schema, and then try to apply the markers for the field.

Because we only have a link to the schema, the type of the schema is empty, and the markers fail to apply.

This change effectively moves some of the flattening logic earlier in the process, so that rather than returning a link to the type schema to be flattened later, we return the actual type schema with its markers already applied.
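To illustrate the ordering problem, here is a minimal sketch. It does not use the real controller-tools types: Schema, applyMinimum, and resolve are hypothetical stand-ins, and the sketch only assumes that a type-sensitive marker rejects a schema whose Type is empty.

```go
package main

import (
	"errors"
	"fmt"
)

// Schema is a hypothetical, pared-down stand-in for a JSON schema node.
type Schema struct {
	Ref     string   // set when the node only links to a named type schema
	Type    string   // "integer", "number", "string", ...
	Minimum *float64 // nil when no minimum is set
}

// applyMinimum mimics a marker that needs to know the schema type first.
func applyMinimum(s *Schema, min float64) error {
	if s.Type != "integer" && s.Type != "number" {
		return fmt.Errorf("must apply minimum to a numeric value, found %q", s.Type)
	}
	s.Minimum = &min
	return nil
}

// resolve returns an independent copy of the referenced type schema,
// so that field-level markers can be applied without touching the type.
func resolve(s Schema, defs map[string]Schema) (Schema, error) {
	if s.Ref == "" {
		return s, nil
	}
	target, ok := defs[s.Ref]
	if !ok {
		return Schema{}, errors.New("unknown ref " + s.Ref)
	}
	if target.Minimum != nil {
		v := *target.Minimum
		target.Minimum = &v
	}
	return target, nil
}

func main() {
	defs := map[string]Schema{"Port": {Type: "integer"}}
	field := Schema{Ref: "Port"}

	// Applying the marker to the bare link fails: Type is empty.
	fmt.Println("ref only:", applyMinimum(&field, 1))

	// Resolving first gives the marker a concrete type to check.
	flat, err := resolve(field, defs)
	if err != nil {
		fmt.Println("resolve failed:", err)
		return
	}
	fmt.Println("flattened:", applyMinimum(&flat, 1))
}
```

Resolving into a copy is a deliberate choice in this sketch: the field gets its own schema, so a field-level marker never mutates the shared type definition.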

This allows field-level markers to override what was on the schema for the type (breaking change!). Previously, the generator would create an allOf validation, so there was no way to override locally: field-level markers could only tighten, not loosen, the validation from an existing type's markers. Because field-level markers could only tighten validation, this shouldn't actually be a breaking change for APIs that have done this in the past, but their schemas will change and will now represent only the tighter validation.

This could be an issue, however, if folks previously tried to override locally (loosening) and did not realise that their field-level markers had no effect: with this change in place, the schema will change to loosen the validation.

/hold Since this could be breaking, this needs some discussion between maintainers and the community

CC @sbueringer @erikgb @alvaroaleman

Fixes #1269

@k8s-ci-robot added the do-not-merge/hold and cncf-cla: yes labels Aug 29, 2025
@k8s-ci-robot added the approved and size/L labels Aug 29, 2025
@JoelSpeed
Contributor Author

/approve cancel

As above, needs discussion before we merge this

@k8s-ci-robot removed the approved label Aug 29, 2025
@dprotaso
Contributor

I believe you'll need to adjust the min and max markers to follow refs, along with any other markers that assert JSON schema types

eg.

if !hasNumericType(schema) {
	return fmt.Errorf("must apply minimum to a numeric value, found %s", schema.Type)
}
if schema.Type == "integer" && !isIntegral(m.Value()) {
	return fmt.Errorf("cannot apply non-integral minimum validation (%v) to integer value", m.Value())
}

@sbueringer
Member

sbueringer commented Aug 29, 2025

@JoelSpeed
In case of

// +kubebuilder:validation:Minimum:=1
Port Port `json:"port"`

it was simply not possible to have these markers on the field (#1269). So for this case this wouldn't be a breaking change. I assume there are other cases where it was possible to define the markers and now the behavior changes?

P.S. Just a single data point. I ran this against Cluster API and there it leads to no changes

@JoelSpeed
Contributor Author

> I assume there are other cases where it was possible to define the markers and now the behavior changes?

If I were to create a type alias as

// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=255
type StringAliasWithValidation = string

And then use that in a field

// This tests that field-level overrides are handled correctly
// for local type aliases.
// +kubebuilder:validation:MinLength=10
// +kubebuilder:validation:MaxLength=10
FieldLevelAliasOverride StringAliasWithValidation `json:"fieldLevelAliasOverride,omitempty"`

It would generate the schema validation

fieldLevelAliasOverride:
  allOf:
  - maxLength: 255
    minLength: 1
  - maxLength: 10
    minLength: 10

And then with this PR, instead would generate

fieldLevelAliasOverride:
  maxLength: 10
  minLength: 10

The effect on this example is that the validation actually remains the same. To satisfy the allOf, you have to satisfy the second entry (min/max 10) which also happens to satisfy the first.

If, however, you set the field MinLength to 0, it would create

fieldLevelAliasOverride:
  allOf:
  - maxLength: 255
    minLength: 1
  - maxLength: 10
    minLength: 0

Now I can set this field to any string with a length in the range {1...10}

With this PR, it would then become

fieldLevelAliasOverride:
  maxLength: 10
  minLength: 0

And now the valid range is {0...10}, loosening the validation.
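The difference between the two outputs can be checked with a small sketch. LengthBounds and allOf are hypothetical helpers written for this comparison, not generator code; they only model "every allOf entry must pass" against "the field-level bounds alone must pass".

```go
package main

import "fmt"

// LengthBounds is a hypothetical pair of minLength/maxLength constraints.
type LengthBounds struct {
	Min, Max int
}

// allows reports whether a string of length n satisfies the bounds.
func (b LengthBounds) allows(n int) bool {
	return n >= b.Min && n <= b.Max
}

// allOf models the previous output: every entry must be satisfied.
func allOf(n int, entries ...LengthBounds) bool {
	for _, e := range entries {
		if !e.allows(n) {
			return false
		}
	}
	return true
}

func main() {
	typeLevel := LengthBounds{Min: 1, Max: 255} // markers on the alias
	fieldLevel := LengthBounds{Min: 0, Max: 10} // markers on the field

	for _, n := range []int{0, 1, 10, 11} {
		fmt.Printf("len=%d allOf=%t override=%t\n",
			n, allOf(n, typeLevel, fieldLevel), fieldLevel.allows(n))
	}
}
```

For these bounds the two semantics disagree only on the empty string: allOf rejects length 0 because the type-level minLength of 1 still applies, while the override accepts it.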

Now you could argue that this was always the intention, but it may have been a mistake that folks never picked up, and this change would reveal it

@JoelSpeed
Contributor Author

> I believe you'll need to adjust the min & max markers to follow refs

To avoid sprawling ref following across all the markers (and forgetting to do this in the future for new markers), I've effectively flattened the refs in this PR before the markers are applied

@sbueringer
Member

I would be fine with this change

@alvaroaleman
Member

> And now the valid range is {0...10}, loosening the validation

Is it possible to detect this situation from the generator (with reasonable effort)?

@JoelSpeed
Contributor Author

> Is it possible to detect this situation from the generator (with reasonable effort)?

Potentially? We could update the markers to emit a warning if they overwrite an existing value?

Though I think that's actually a desired use case? Think about the port example where we know a port number must be in the range {1-65535}. Sometimes you might want to re-use this but limit the range (making it tighter), but you may also want to allow the 0 value, so you'd want to loosen it.

I wouldn't want to prevent the use case where someone locally loosens the validation at the field level
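As a rough sketch of what such a warning could look like (purely illustrative: StringSchema and setMaxLength are hypothetical names, and nothing here reflects how the markers are actually implemented), the setter below always lets the field-level value win and only reports whether it tightened or loosened the inherited one:

```go
package main

import "fmt"

// StringSchema is a hypothetical schema fragment holding only maxLength.
type StringSchema struct {
	MaxLength *int64
}

// setMaxLength applies a field-level maxLength. The field-level value always
// wins; a non-empty warning is returned when it replaces a different
// inherited value, noting whether the change tightens or loosens validation.
func setMaxLength(s *StringSchema, v int64) string {
	var warning string
	if s.MaxLength != nil && *s.MaxLength != v {
		effect := "tightens"
		if v > *s.MaxLength {
			effect = "loosens"
		}
		warning = fmt.Sprintf("field-level maxLength %d %s inherited maxLength %d",
			v, effect, *s.MaxLength)
	}
	s.MaxLength = &v
	return warning
}

func main() {
	inherited := int64(255)
	s := &StringSchema{MaxLength: &inherited}

	if w := setMaxLength(s, 10); w != "" {
		fmt.Println("warning:", w)
	}
	fmt.Println("effective maxLength:", *s.MaxLength)
}
```

Because loosening is a legitimate use case, the sketch only reports the override and never rejects it.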

@JoelSpeed
Contributor Author

> Is it possible to detect this situation from the generator (with reasonable effort)?

To my knowledge we don't have the prior state so we can't compare A/B before and after to see if the PR is breaking someone

@sbueringer
Member

sbueringer commented Aug 29, 2025

Q: If this leads to "effective" changes for someone, they are able to "fix" that situation for them by adjusting their "local" markers on the fields accordingly, right?

@dprotaso
Contributor

> Is it possible to detect this situation from the generator (with reasonable effort)?
>
> Potentially? We could update the markers to emit a warning if they overwrite an existing value?
>
> Though I think that's actually a desired use case? Think about the port example where we know a port number must be in the range {1-65535}. Sometimes you might want to re-use this but limit the range (making it tighter), but you may also want to allow the 0 value, so you'd want to loosen it.
>
> I wouldn't want to prevent the use case where someone locally loosens the validation at the field level

This is exactly Gateway API's use case. In general, I'd expect default validation to be tighter and then potentially loosened at different uses of a type.

@JoelSpeed
Contributor Author

> Q: If this leads to "effective" changes for someone, they are able to "fix" that situation for them by adjusting their "local" markers on the fields accordingly, right?

Yes. All they need to do is adjust the markers at the field level. For markers such as Min/Max{Length,Items,Properties}, the field level will now always be the source of truth when the marker is defined on both the type and the field.

In theory this makes the project more flexible, not less

@erikgb
Member

erikgb commented Aug 29, 2025

I don't fully understand the code change here, but I agree with all the comments and examples above. Most/all projects using controller-gen have generated files under source control. Some might hide the diff for generated CRDs in PRs, but hopefully have tests to catch any regression. I am 👍 for this change.

@erikgb
Member

erikgb commented Sep 1, 2025

Is there any chance of getting this fix into a patch release of controller-tools, or would we have to wait for the next major release? I am really eager to upgrade to K8s 1.34 in cert-manager, as we also have downstream projects currently blocked by this. I opened up kubernetes-sigs/gateway-api#4049 as a workaround (thanks, Joel), and I also have another potential workaround available. But I was hoping to avoid working around this issue.

@k8s-ci-robot added and removed the needs-rebase label Sep 4, 2025
@sbueringer
Member

sbueringer commented Sep 8, 2025

I don't know if we even have consensus to merge this PR (I'm in favor though)

In general it's also possible to use controller-gen from the main branch

@cbandy
Contributor

cbandy commented Sep 9, 2025

AFAICT, Kubernetes considers type: string, allOf: [{ maxLength: 63 },{ maxLength: 55 }] an unbounded string, significantly limiting what can be done in CEL

I have this type alias that I tried to use later with tighter field validations:

// ---
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=63
// +kubebuilder:validation:Pattern=`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`
type DNS1123Label = string

type Example struct {
	// ---
	// We prepend "example-" to avoid collisions with other [corev1.PodSpec.Volumes],
	// so the maximum is 8 less than the inherited 63.
	// +kubebuilder:validation:MaxLength=55
	// +required
	Name DNS1123Label `json:"name"`
}

Touching that name field with a CEL rule produces an invalid CRD:

spec.versions[0].schema.openAPIV3Schema.properties[spec].x-kubernetes-validations[2].rule: Forbidden: estimated rule cost exceeds budget by factor of 5.0x

crd-schema-checker estimates similarly:

info: "MustNotExceedCostBudget": ^.spec: Rule 2 raw cost is 50332987. Estimated total cost of 50332987. The maximum allowable value is 10000000. Rule is 503.33% of allowed budget.

Removing the field-level MaxLength changes the schema to type: string, maxLength: 63 making the CRD valid, and crd-schema-checker agrees:

info: "MustNotExceedCostBudget": ^.spec: Rule 2 raw cost is 5467. Estimated total cost of 5467. The maximum allowable value is 10000000. Rule is 0.05% of allowed budget.

@sbueringer
Member

Sounds like a bug, can you please report this to k/k?

@JoelSpeed
Contributor Author

Agreed, sounds like a bug in K/K, but also, this PR would fix the generated schema for you

@cbandy
Contributor

cbandy commented Sep 10, 2025

I'll try to make a minimal reproduction. 🤞🏻 I hope not to get a "don't do that" when I show a schema with two maxLength on one field.

@cbandy
Contributor

cbandy commented Sep 10, 2025

Regarding this PR, I suspect that there is no approach to combining markers that can express every OpenAPI schema. I like the simplicity of fields "overriding" types.

🤔 I'm struggling to imagine a case where allOf is useful.

Edit: Rather, every way I imagine using allOf can also be expressed without it. The allOf property is an excellent way for the schema generator to combine a collection of opaque constraints.

@cbandy
Contributor

cbandy commented Sep 12, 2025

📝 exclusiveMinimum and exclusiveMaximum must be considered together with minimum and maximum.

Consider a type with min: 0, excl: true which is then used as a field with min: 1. That might look like the following today:

type: integer
allOf:
  # smallest value is one
  - minimum: 0
    exclusiveMinimum: true

  # smallest value is one
  - minimum: 1
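A small sketch of why the pair matters once fields override types. IntSchema and both override helpers are hypothetical and say nothing about how this PR handles the case; the point is only that replacing minimum while keeping an inherited exclusiveMinimum shifts the bound.

```go
package main

import "fmt"

// IntSchema is a hypothetical integer schema with a (required) minimum.
type IntSchema struct {
	Minimum          *int64
	ExclusiveMinimum bool
}

// smallest returns the smallest integer the schema admits.
// It assumes Minimum is set.
func smallest(s IntSchema) int64 {
	if s.ExclusiveMinimum {
		return *s.Minimum + 1
	}
	return *s.Minimum
}

// overrideMinimumOnly replaces Minimum but keeps the inherited
// ExclusiveMinimum flag, which is the pitfall described above.
func overrideMinimumOnly(base IntSchema, min int64) IntSchema {
	base.Minimum = &min
	return base
}

// overrideMinimumPair replaces Minimum and ExclusiveMinimum together.
func overrideMinimumPair(base IntSchema, min int64, exclusive bool) IntSchema {
	base.Minimum = &min
	base.ExclusiveMinimum = exclusive
	return base
}

func main() {
	zero := int64(0)
	typeLevel := IntSchema{Minimum: &zero, ExclusiveMinimum: true}

	fmt.Println("type only:          ", smallest(typeLevel))
	fmt.Println("minimum overridden: ", smallest(overrideMinimumOnly(typeLevel, 1)))
	fmt.Println("pair overridden:    ", smallest(overrideMinimumPair(typeLevel, 1, false)))
}
```

In this sketch, overriding only minimum silently raises the smallest allowed value from 1 to 2, whereas treating the two properties as one unit keeps it at 1.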

@JoelSpeed
Contributor Author

@alvaroaleman Any strong opinion on whether this should or shouldn't move forward? @sbueringer (#1270 (comment)) and I are both ok with moving it forward I believe

@alvaroaleman
Member

No strong opinion either way

@sbueringer
Member

/lgtm
/approve
/hold cancel

@k8s-ci-robot removed the do-not-merge/hold label Sep 16, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer


@k8s-ci-robot added the lgtm and approved labels Sep 16, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2e74693c106978b1c41c5737dcd17ce3b3dba4c6

@sbueringer
Member

> Is there any chance of getting this fix into a patch release of controller-tools, or would we have to wait for the next major release? I am really eager to upgrade to K8s 1.34 in cert-manager, as we also have downstream projects currently blocked by this. I opened up kubernetes-sigs/gateway-api#4049 as a workaround (thanks, Joel), and I also have another potential workaround available. But I was hoping to avoid working around this issue.

I would recommend using controller-gen from a main branch commit

@k8s-ci-robot merged commit f1c7919 into kubernetes-sigs:main Sep 16, 2025
15 checks passed
Development

Successfully merging this pull request may close these issues.

Unable to generate CRD with min/max validation for field with int32 alias type
7 participants