Skip to content

blog: updated E2E best practices #392

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 9, 2023

Conversation

pohly
Copy link
Contributor

@pohly pohly commented Apr 5, 2023

"Writing good E2E tests" was already updated a while ago in kubernetes/community#7021. As suggested there (kubernetes/community#7021 (comment)), we should bring this update to the attention of more contributors, hence this blog post.

/cc @aojea @jberkus

@k8s-ci-robot k8s-ci-robot requested review from aojea and jberkus April 5, 2023 13:08
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 5, 2023
@pohly
Copy link
Contributor Author

pohly commented Apr 5, 2023

It would be great to get this published before KubeCon EU 2023 because then readers have the opportunity to ask questions on-site in person.

@aojea: The intro, the architecture and the "next steps" are new. The rest is text that you already reviewed earlier for kubernetes/community#7021, just updated a bit to make it flow better in a blog post.

Copy link

@xmcqueen xmcqueen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excellent
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 6, 2023
@pohly
Copy link
Contributor Author

pohly commented Apr 6, 2023

/assign @mrbobbytables

For approval. Let's get this published before KubeCon, then folks can chat with me about it there.

`ginkgo.DeferCleanup` executes code in the more useful last-in-first-out order,
i.e. things that get set up first get removed last.

Objects created in the test namespace do not need to be deleted because
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe you should introduce this at the beginning of the section, that the framework creates a test namespace to avoid test pollution ... I don't know if this behavior of "test namespaces" is well known outside of kubernetes


Objects created in the test namespace do not need to be deleted because
deleting the namespace will also delete them. However, if deleting an object
may fail, then explicitly cleaning it up is better because then failures or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, if you create objects that are going to take some times to be completed delated, I had this mistakes with "terminating pods", I set a grace period of 300 seconds and the pod blocks the garbage collectors for that time

may fail, then explicitly cleaning it up is better because then failures or
timeouts related to it will be more obvious.

In cases where the test may have removed the object, `framework.IgnoreNotFound`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed this is a common mistake


## Polling and timeouts

When waiting for something to happen, use a reasonable timeout. Without it, a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When waiting for something to happen, use a reasonable timeout. Without it, a
When waiting for something to happen and you need to do asynchronous assertions, use a reasonable timeout. Without it, a

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

o asynchronous checks, I think people is familiar with this term

When waiting for something to happen, use a reasonable timeout. Without it, a
test might keep running until the entire test suite gets killed by the
CI. Beware that the CI under load may take a lot longer to complete some
operation compared to running the same test locally. On the other hand, a too
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

" On the other hand," here is misleading, I think that you may express the too main problems

  • short timeout, the test will flake, per example if the CI is slow
  • long timeout, the test may hide underline issues, per example, if there are some races with other components and eventually the condition pass

The thing with timeouts, is that you also should define what is the expected time you consider valid for a an operation to succeed, e2e are not only functiona, i.e. creating a pod and it takes more than 10 minutes to run should not pass because that environment is too busy, or Services can not take more than 1 minute in program the dataplance, ... timeouts are also important to set the upper limits for some behaviors

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the key is to obtain the right balance, a timeout that doesn't flake and that the time is considerable acceptable for that operation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two problems with too long timeouts:

  • a feature is broken and some expected state will never occur, but the test needs to run till it times out waiting for that state
  • a feature is normally supposed to work within a certain time frame and for some reason is taking too long

The problem is that we don't have good enough control over the performance of the clusters that we test against, nor do many features have any solid "must work within XYZ seconds". Solving that problem goes beyond what we can solve right now.

I'll clarify the first point and add a comment about the second.

Comment on lines 295 to 297
- informative during interactive use (i.e. intermediate reports, either
periodically or on demand)
- little to no output during a CI run except when it fails
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these 2 sentences can be confusing, since is difficult to be informative without giving output 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll explain that the amount of information should depend on how the E2E suite was invoked.

- extension mechanism for writing custom checks
- early abort when condition cannot be reached anymore

[`gomega.Eventually`](https://pkg.go.dev/github.com/onsi/gomega#Eventually)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it also accepts contexts :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is part of "all criteria", right? So no need to change anything in the text.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I was just a personal comment to show my +1 to this function

area, so beware that these APIs may
change at some point.

- Use `gomega.Consistently` to ensure that some condition is true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is important, unfortunately I'm afraid we are not doing it as much as we should

@aojea
Copy link
Member

aojea commented Apr 7, 2023

LGTM
overall good summary of the work done and the expectations for e2e tests, it will be good to promote this blog for reviewers of the different SIGs , so we can have consistency on all the e2e tests.

I left some comments that are not blockers and if necessary can be follow ups

/assign @sftim

you need docs people IIRC for approval

@pohly
Copy link
Contributor Author

pohly commented Apr 7, 2023

/hold

I'll follow up on some of the suggestions before this gets merged.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Apr 7, 2023
@pohly pohly force-pushed the e2e-testing-bkms-blog branch from c66c356 to 846fba3 Compare April 8, 2023 16:35
@pohly
Copy link
Contributor Author

pohly commented Apr 8, 2023

/hold cancel

PR updated, ready for another LGTM and approval.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 8, 2023
Copy link

@sftim sftim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm cancel

The publication date should be in the future at the time we merge it. However, it looks otherwise OK.

---
layout: blog
title: "E2E Testing Best Practices, Reloaded"
date: 2023-04-04
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a publication date, how about 2023-04-12?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sftim
Copy link

sftim commented Apr 8, 2023

BTW, approval is SIG ContribEx blog team.

@aojea
Copy link
Member

aojea commented Apr 9, 2023

/lgtm

Thanks

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 9, 2023
pohly added 2 commits April 9, 2023 18:06
"Writing good E2E tests" was already updated a while ago in
kubernetes/community#7021. As suggested
there (kubernetes/community#7021 (comment)),
we should bring this update to the attention of more contributors, hence this
blog post.
@pohly pohly force-pushed the e2e-testing-bkms-blog branch from 846fba3 to 7f1a7e9 Compare April 9, 2023 16:06
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 9, 2023
Copy link
Member

@mrbobbytables mrbobbytables left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 9, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrbobbytables, pohly, xmcqueen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 9, 2023
@k8s-ci-robot k8s-ci-robot merged commit fc8ecbf into kubernetes:master Apr 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants