From e88760239cea16c2165b1a3851e25e66a23c4a72 Mon Sep 17 00:00:00 2001
From: Michael Pleshakov
Date: Mon, 27 Mar 2023 17:12:01 -0700
Subject: [PATCH 1/8] Document how Gateway API resources are validated

This commit documents how Gateway API resources are validated.

Fixes https://github.com/nginxinc/nginx-kubernetes-gateway/issues/364
---
 docs/resource-validation.md | 137 ++++++++++++++++++++++++++++++++++++
 1 file changed, 137 insertions(+)
 create mode 100644 docs/resource-validation.md

diff --git a/docs/resource-validation.md b/docs/resource-validation.md
new file mode 100644
index 0000000000..b1992c8a68
--- /dev/null
+++ b/docs/resource-validation.md
@@ -0,0 +1,137 @@
+# Gateway API Resource Validation
+
+This document describes how NGINX Kubernetes Gateway (NKG) validates Gateway API resources.
+
+## Overview
+
+There are several reasons why NKG validates Gateway API resources:
+
+- *Robustness*, to gracefully handle invalid resources.
+- *Security*, to prevent malicious input from propagating to the NGINX configuration.
+- *Correctness*, to conform to the Gateway API specification for handling invalid resources.
+
+Ultimately, the goal is to ensure that NGINX continues to handle traffic even if invalid Gateway API resources were
+created.
+
+A Gateway API resource (a new resource or an update for the existing one) is validated by the following steps, some of
+which are provided by the Gateway API, while others are done by NKG:
+
+1. OpenAPI schema validation by the Kubernetes API server.
+2. Webhook validation by the Gateway API webhook.
+3. Webhook validation by NKG.
+4. Validation by NKG.
+
+To confirm that a resource is valid and accepted by NKG, check the `Accepted` condition in the resource status,
+which must have the Status field set to `True`. For example, in a status of a valid HTTPRoute, if NKG accepts a
+parentRef, the status of that parentRef will look like this:
+```
+Status:
+  Parents:
+    Conditions:
+      Last Transition Time:  2023-03-30T23:18:00Z
+      Message:               The route is accepted
+      Observed Generation:   2
+      Reason:                Accepted
+      Status:                True
+      Type:                  Accepted
+    Controller Name:         k8s-gateway.nginx.org/nginx-gateway-controller
+    Parent Ref:
+      Group:         gateway.networking.k8s.io
+      Kind:          Gateway
+      Name:          gateway
+      Namespace:     default
+      Section Name:  http
+```
+
+> Make sure that the reported observed generation is the same as the resource generation.
+
+The remaining part of this document describes each step in detail with examples of how validation errors are reported.
+
+### Step 1 - OpenAPI Schema Validation by Kubernetes API Server
+
+The Kubernetes API server validates Gateway API resources against the OpenAPI schema embedded in the Gateway API CRDs.
+For example, if you create an HTTPRoute with an invalid hostname `cafe.!@#$%example.com`, the API server will reject it
+with the following error:
+
+```
+kubectl apply -f coffee-route.yaml
+The HTTPRoute "coffee" is invalid: spec.hostnames[0]: Invalid value: "cafe.!@#$%example.com": spec.hostnames[0] in body should match '^(\*\.)?[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
+```
+
+> While unlikely, this validation step can be bypassed if the Gateway API CRDs are modified to remove the validation.
+> If this happens, Step 4 will ensure that invalid values (from NGINX perspective) are rejected.
+
+### Step 2 - Webhook Validation by Gateway API Webhook
+
+The Gateway API comes with a validating webhook which is enabled by default in the Gateway API installation manifests.
+It validates Gateway API resources using advanced rules not available in the OpenAPI schema validation. For example, if
+you create a Gateway resource with a TCP listener that configures a hostname, the webhook will reject it with the
+following error:
+
+```
+kubectl apply -f gateway.yaml
+Error from server: error when creating "gateway.yaml": admission webhook "validate.gateway.networking.k8s.io" denied the request: spec.listeners[1].hostname: Forbidden: should be empty for protocol TCP
+```
+
+> This validation step can be bypassed if the webhook is not running in the cluster.
+> If this happens, Step 3 will ensure invalid values are rejected.
+
+### Step 3 - Webhook validation by NKG
+
+The previous step relies on the Gateway API webhook running in the cluster. To ensure that the resources are validated
+with the webhook validation rules even if the webhook is not running, NKG performs the same validation. However, NKG
+will perform the validation *after* a resource is already accepted by the Kubernetes API server.
+
+Below is an example of how NKG rejects an invalid resource (a Gateway resource with a TCP listener that configures a
+hostname) with a Kubernetes event:
+
+```
+kubectl describe gateway gateway
+. . .
+Events:
+  Type     Reason    Age   From                            Message
+  ----     ------    ----  ----                            -------
+  Warning  Rejected  6s    nginx-kubernetes-gateway-nginx  the resource failed webhook validation, however the Gateway API webhook failed to reject it with the error; make sure the webhook is installed and running correctly; validation error: spec.listeners[1].hostname: Forbidden: should be empty for protocol TCP; NKG will delete any existing NGINX configuration that corresponds to the resource
+```
+
+> This validation step always runs and cannot be bypassed.
+
+> NKG will ignore any resources that fail the webhook validation like in the example above.
+> If the resource previously existed, NKG will remove any existing NGINX configuration for that resource.
+
+### Step 4 - Validation by NKG
+
+This step catches the following cases of invalid values:
+
+* Values valid from the Gateway API perspective but not supported by NKG yet. For example, a certain filter in an
+  HTTPRoute routing rule.
+* Some values in Gateway API resources which are valid by the CRD and webhook validation, but invalid for NGINX. Such
+  values will cause NGINX to fail to reload or operate erroneously.
+* Invalid values in Gateway API resources that were not rejected because Step 1 was bypassed.
+* Malicious values that inject unrestricted NGINX config into the NGINX configuration (similar to an SQL injection
+  attack).
+
+Below is an example of how NKG rejects an invalid resource. The validation error is reported via the status:
+
+```
+kubectl describe httproutes.gateway.networking.k8s.io coffee
+. . .
+Status:
+  Parents:
+    Conditions:
+      Last Transition Time:  2023-03-30T22:37:53Z
+      Message:               All rules are invalid: spec.rules[0].matches[0].method: Unsupported value: "CONNECT": supported values: "DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"
+      Observed Generation:   1
+      Reason:                UnsupportedValue
+      Status:                False
+      Type:                  Accepted
+    Controller Name:         k8s-gateway.nginx.org/nginx-gateway-controller
+    Parent Ref:
+      Group:         gateway.networking.k8s.io
+      Kind:          Gateway
+      Name:          gateway
+      Namespace:     default
+      Section Name:  http
+```
+
+> This validation step always runs and cannot be bypassed.

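The acceptance check described in the Overview (an `Accepted` condition with status `True` whose observed generation matches the resource generation) could be scripted roughly as follows. This is a minimal sketch, not part of NKG: the function name `accepted_parents` is hypothetical, and the input is assumed to be the parsed JSON of an HTTPRoute, for example from `kubectl get httproute coffee -o json`.

```python
from typing import Any, Dict, List


def accepted_parents(route: Dict[str, Any]) -> List[bool]:
    """Report, for each entry in status.parents, whether it was accepted.

    A parent counts as accepted only if it has an Accepted condition
    with status "True" whose observedGeneration equals the resource's
    metadata.generation (hypothetical helper, for illustration only).
    """
    generation = route.get("metadata", {}).get("generation")
    results = []
    for parent in route.get("status", {}).get("parents", []):
        ok = any(
            cond.get("type") == "Accepted"
            and cond.get("status") == "True"
            and cond.get("observedGeneration") == generation
            for cond in parent.get("conditions", [])
        )
        results.append(ok)
    return results
```

Comparing `observedGeneration` with `metadata.generation` guards against reading a stale status that was written for an earlier version of the resource.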
From 62ced0ce81261ab571dbb227f49277445b6e4137 Mon Sep 17 00:00:00 2001
From: Michael Pleshakov
Date: Mon, 3 Apr 2023 17:18:11 -0700
Subject: [PATCH 2/8] Apply suggestions from code review

Co-authored-by: Kate Osborn <50597707+kate-osborn@users.noreply.github.com>
---
 docs/resource-validation.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/docs/resource-validation.md b/docs/resource-validation.md
index b1992c8a68..75002bbdf0 100644
--- a/docs/resource-validation.md
+++ b/docs/resource-validation.md
@@ -43,7 +43,7 @@ Status:
       Section Name:  http
 ```
 
-> Make sure that the reported observed generation is the same as the resource generation.
+> Make sure the reported observed generation is the same as the resource generation.
 
 The remaining part of this document describes each step in detail with examples of how validation errors are reported.
 
@@ -58,13 +58,13 @@ kubectl apply -f coffee-route.yaml
 The HTTPRoute "coffee" is invalid: spec.hostnames[0]: Invalid value: "cafe.!@#$%example.com": spec.hostnames[0] in body should match '^(\*\.)?[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
 ```
 
-> While unlikely, this validation step can be bypassed if the Gateway API CRDs are modified to remove the validation.
-> If this happens, Step 4 will ensure that invalid values (from NGINX perspective) are rejected.
+> While unlikely, bypassing this validation step is possible if the Gateway API CRDs are modified to remove the validation.
+> If this happens, Step 4 will reject any invalid values (from NGINX perspective).
 
 ### Step 2 - Webhook Validation by Gateway API Webhook
 
 The Gateway API comes with a validating webhook which is enabled by default in the Gateway API installation manifests.
-It validates Gateway API resources using advanced rules not available in the OpenAPI schema validation. For example, if
+It validates Gateway API resources using advanced rules unavailable in the OpenAPI schema validation. For example, if
 you create a Gateway resource with a TCP listener that configures a hostname, the webhook will reject it with the
 following error:
 
@@ -73,14 +73,14 @@ kubectl apply -f gateway.yaml
 Error from server: error when creating "gateway.yaml": admission webhook "validate.gateway.networking.k8s.io" denied the request: spec.listeners[1].hostname: Forbidden: should be empty for protocol TCP
 ```
 
-> This validation step can be bypassed if the webhook is not running in the cluster.
-> If this happens, Step 3 will ensure invalid values are rejected.
+> Bypassing this validation step is possible if the webhook is not running in the cluster.
+> If this happens, Step 3 will reject the invalid values.
 
 ### Step 3 - Webhook validation by NKG
 
 The previous step relies on the Gateway API webhook running in the cluster. To ensure that the resources are validated
-with the webhook validation rules even if the webhook is not running, NKG performs the same validation. However, NKG
-will perform the validation *after* a resource is already accepted by the Kubernetes API server.
+with the webhook validation rules, even if the webhook is not running, NKG performs the same validation. However, NKG
+performs the validation *after* the Kubernetes API server accepts the resource.
 
 Below is an example of how NKG rejects an invalid resource (a Gateway resource with a TCP listener that configures a
 hostname) with a Kubernetes event:
@@ -96,7 +96,7 @@ Events:
 
 > This validation step always runs and cannot be bypassed.
 
-> NKG will ignore any resources that fail the webhook validation like in the example above.
+> NKG will ignore any resources that fail the webhook validation, like in the example above.
 > If the resource previously existed, NKG will remove any existing NGINX configuration for that resource.
 
 ### Step 4 - Validation by NKG

From 280d7a8987938470eeabd3f99c20b2cbd3d5928a Mon Sep 17 00:00:00 2001
From: Michael Pleshakov
Date: Mon, 3 Apr 2023 17:19:38 -0700
Subject: [PATCH 3/8] Use ':'

---
 docs/resource-validation.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/resource-validation.md b/docs/resource-validation.md
index 75002bbdf0..25f03d4d86 100644
--- a/docs/resource-validation.md
+++ b/docs/resource-validation.md
@@ -6,9 +6,9 @@ This document describes how NGINX Kubernetes Gateway (NKG) validates Gateway API
 
 There are several reasons why NKG validates Gateway API resources:
 
-- *Robustness*, to gracefully handle invalid resources.
-- *Security*, to prevent malicious input from propagating to the NGINX configuration.
-- *Correctness*, to conform to the Gateway API specification for handling invalid resources.
+- *Robustness*: to gracefully handle invalid resources.
+- *Security*: to prevent malicious input from propagating to the NGINX configuration.
+- *Correctness*: to conform to the Gateway API specification for handling invalid resources.
 
 Ultimately, the goal is to ensure that NGINX continues to handle traffic even if invalid Gateway API resources were
 created.

From ffbc17987a89fb050950ebadf546e8ebfdfd780d Mon Sep 17 00:00:00 2001
From: Michael Pleshakov
Date: Mon, 3 Apr 2023 17:20:54 -0700
Subject: [PATCH 4/8] Remove "some of which are provided by the Gateway API, while others are done by NKG"

---
 docs/resource-validation.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/docs/resource-validation.md b/docs/resource-validation.md
index 25f03d4d86..253a36e1fc 100644
--- a/docs/resource-validation.md
+++ b/docs/resource-validation.md
@@ -13,8 +13,7 @@ There are several reasons why NKG validates Gateway API resources:
 Ultimately, the goal is to ensure that NGINX continues to handle traffic even if invalid Gateway API resources were
 created.
 
-A Gateway API resource (a new resource or an update for the existing one) is validated by the following steps, some of
-which are provided by the Gateway API, while others are done by NKG:
+A Gateway API resource (a new resource or an update for the existing one) is validated by the following steps:
 
 1. OpenAPI schema validation by the Kubernetes API server.
 2. Webhook validation by the Gateway API webhook.

From 25f0f8489f13a3a5b7cb42176c1b26397d6e4604 Mon Sep 17 00:00:00 2001
From: Michael Pleshakov
Date: Mon, 3 Apr 2023 17:22:38 -0700
Subject: [PATCH 5/8] Reword "check that the Accepted condition"

---
 docs/resource-validation.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/resource-validation.md b/docs/resource-validation.md
index 253a36e1fc..969ecfee2a 100644
--- a/docs/resource-validation.md
+++ b/docs/resource-validation.md
@@ -20,9 +20,9 @@ A Gateway API resource (a new resource or an update for the existing one) is val
 3. Webhook validation by NKG.
 4. Validation by NKG.
 
-To confirm that a resource is valid and accepted by NKG, check the `Accepted` condition in the resource status,
-which must have the Status field set to `True`. For example, in a status of a valid HTTPRoute, if NKG accepts a
-parentRef, the status of that parentRef will look like this:
+To confirm that a resource is valid and accepted by NKG, check that the `Accepted` condition in the resource status
+has the Status field set to `True`. For example, in a status of a valid HTTPRoute, if NKG accepts a parentRef,
+the status of that parentRef will look like this:
 ```
 Status:
   Parents:

From b7c93303587a101d373959ed4f6771e193c69e70 Mon Sep 17 00:00:00 2001
From: Michael Pleshakov
Date: Mon, 3 Apr 2023 17:51:29 -0700
Subject: [PATCH 6/8] Use prod-gateway instead of gateway

---
 docs/resource-validation.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/resource-validation.md b/docs/resource-validation.md
index 969ecfee2a..15cae1b4e7 100644
--- a/docs/resource-validation.md
+++ b/docs/resource-validation.md
@@ -68,8 +68,8 @@ you create a Gateway resource with a TCP listener that configures a hostname, th
 following error:
 
 ```
-kubectl apply -f gateway.yaml
-Error from server: error when creating "gateway.yaml": admission webhook "validate.gateway.networking.k8s.io" denied the request: spec.listeners[1].hostname: Forbidden: should be empty for protocol TCP
+kubectl apply -f prod-gateway.yaml
+Error from server: error when creating "prod-gateway.yaml": admission webhook "validate.gateway.networking.k8s.io" denied the request: spec.listeners[1].hostname: Forbidden: should be empty for protocol TCP
 ```
 
 > Bypassing this validation step is possible if the webhook is not running in the cluster.
@@ -85,7 +85,7 @@ Below is an example of how NKG rejects an invalid resource (a Gateway resource w
 hostname) with a Kubernetes event:
 
 ```
-kubectl describe gateway gateway
+kubectl describe gateway prod-gateway
 . . .
 Events:
   Type     Reason    Age   From                            Message
@@ -128,7 +128,7 @@ Status:
     Parent Ref:
       Group:         gateway.networking.k8s.io
       Kind:          Gateway
-      Name:          gateway
+      Name:          prod-gateway
       Namespace:     default
       Section Name:  http
 ```

From 44b82106b0edfd4e8917f22f498b4800be81da6e Mon Sep 17 00:00:00 2001
From: Michael Pleshakov
Date: Mon, 3 Apr 2023 17:56:30 -0700
Subject: [PATCH 7/8] Improve example of unsupported feature

---
 docs/resource-validation.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/resource-validation.md b/docs/resource-validation.md
index 15cae1b4e7..6f7f5c0ac3 100644
--- a/docs/resource-validation.md
+++ b/docs/resource-validation.md
@@ -102,8 +102,9 @@ Events:
 
 This step catches the following cases of invalid values:
 
-* Values valid from the Gateway API perspective but not supported by NKG yet. For example, a certain filter in an
-  HTTPRoute routing rule.
+* Values valid from the Gateway API perspective but not supported by NKG yet. For example, a feature in an
+  HTTPRoute routing rule. Note: for the list of supported features,
+  see [Gateway API Compatibility](gateway-api-compatibility.md) doc.
 * Some values in Gateway API resources which are valid by the CRD and webhook validation, but invalid for NGINX. Such
   values will cause NGINX to fail to reload or operate erroneously.
 * Invalid values in Gateway API resources that were not rejected because Step 1 was bypassed.

From d604a92caad4246f9e280bce5b4b0e5ff9e375b6 Mon Sep 17 00:00:00 2001
From: Michael Pleshakov
Date: Mon, 3 Apr 2023 18:05:37 -0700
Subject: [PATCH 8/8] Reworded validation cases

---
 docs/resource-validation.md | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/docs/resource-validation.md b/docs/resource-validation.md
index 6f7f5c0ac3..e5a0b0da19 100644
--- a/docs/resource-validation.md
+++ b/docs/resource-validation.md
@@ -102,12 +102,13 @@ Events:
 
 This step catches the following cases of invalid values:
 
-* Values valid from the Gateway API perspective but not supported by NKG yet. For example, a feature in an
+* Valid values from the Gateway API perspective but not supported by NKG yet. For example, a feature in an
   HTTPRoute routing rule. Note: for the list of supported features,
   see [Gateway API Compatibility](gateway-api-compatibility.md) doc.
-* Some values in Gateway API resources which are valid by the CRD and webhook validation, but invalid for NGINX. Such
-  values will cause NGINX to fail to reload or operate erroneously.
-* Invalid values in Gateway API resources that were not rejected because Step 1 was bypassed.
+* Valid values from the Gateway API perspective, but invalid for NGINX, because NGINX has stricter validation
+  requirements for certain fields. Such values will cause NGINX to fail to reload or operate erroneously.
+* Invalid values (both from the Gateway API and NGINX perspectives) that were not rejected because Step 1 was bypassed.
+  Similarly to the previous case, such values will cause NGINX to fail to reload or operate erroneously.
 * Malicious values that inject unrestricted NGINX config into the NGINX configuration (similar to an SQL injection
   attack).
 
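
To illustrate the kind of check that guards against the last two cases in the list above, the sketch below re-applies the hostname pattern quoted in the Step 1 error message. It is an assumption-laden illustration only: NKG's actual validation code is not shown in this patch series, and the function name `is_valid_hostname` is hypothetical.

```python
import re

# Pattern copied verbatim from the Step 1 error message for spec.hostnames.
HOSTNAME_PATTERN = re.compile(
    r"^(\*\.)?[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$"
)


def is_valid_hostname(value: str) -> bool:
    """Return True only if the whole value matches the hostname pattern.

    fullmatch is used so that a trailing newline or any other trailing
    characters cannot slip past the anchored pattern.
    """
    return HOSTNAME_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("cafe.example.com", "cafe.!@#$%example.com"):
        print(candidate, is_valid_hostname(candidate))
```

Because the pattern is an allowlist of lowercase letters, digits, hyphens, and dots, a value carrying characters such as `;`, `{`, or whitespace is rejected before it could reach a generated configuration.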