Skip to content

Conversation

portante
Copy link
Member

@portante portante commented Jan 25, 2023

We are having problems with the stability of the functional tests (see #3209). As a result, we need a way to iteratively build a new container and test the changes with a much shorter feedback loop.

To that end, this PR provides a way to build and run the functional tests locally without involving a container registry.

Suggested PR commit message below for review:


Build and run functional tests all locally

The `jenkins/runlocal` script now builds a Pbench Server RPM from the
checked-out sources in the current working tree, builds the Pbench
Server CI container, and runs the functional tests against the CI
container image.

As a result, `jenkins/runlocal` no longer pushes images to a registry.

It was necessary to support alternate base images for server container
to allow for fully local builds.  Prior to this change, if one tried to
use `jenkins/runlocal` without setting `BASE_IMAGE`, the default of
`UBI:9` would result in an error if the system on which this script was
invoked did not have a RHEL entitlement.

This change simplifies things so that the RPM built is truly local to
the environment, and using the installed environment, derives the
`BASE_IMAGE` is to work with the built RPM.

For example, if running on a Fedora 37 box, a Fedora 37 RPM would be
built and installed using the `fedora:37` base image.

NOTE, we only support defaulting to a `ubi9:latest` or `centos:stream9`
image in those respective environments.

We now only perform "hard" cleanup on INT, QUIT, and TERM signals.  If a
functional test is successful, and `--cleanup` has been requested, we
will have already removed the POD and container.  There is no need to
also run the "hard" cleanup script.

Further, we emit a note when `--cleanup` is not requested after the
functional tests have finished to alert the user that the POD and
container are still running.  We also emit the print statements for the
clean up being performed.

All arguments to `jenkins/runlocal` are passed along to
`jenkins/run-server-unit-tests` as a simple way to support the
`--cleanup` option on `jenkins/runlocal`.

We also:

 * Fail running server functional tests if pod creation fails
 * Remove the need for the `jenkins/runner`
 * Remove the unnecessary use of `--format=docker` in the `buildah
   from` command
 * Drop specific linting references, add `isort` in `README.md`
 * Use local file to keep CI container image name
   This allows us to entirely remove all references to a specific
   container registry from the code so that developers can create a
   local file in which to store the registry want to use in parallel
   with what the CI jobs already do with Jenkins credentials.
 * Correct a comment in `jenkins/Pipeline.gy`
 * Correct `GITTOP` definition location
 * Use `/bin/bash -e` consistently, adding it to `run-pbench-in-a-can`

Copy link
Member

@dbutenhof dbutenhof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly, sure; this could be convenient. But a couple of questions...

dbutenhof
dbutenhof previously approved these changes Jan 25, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than go in the direction of trying to conditionalize the registry and repositories for the all the images, I would like to propose following the approach that I've laid out in #3215.

How we name the images is less important than how we manage the pull policy. If we use a pull policy of newer (or never) instead of always (and, if we always name the images to match the remote repository), then I believe that we will be able to build and run repeatedly without anything having to be pulled (once the remote images are resident locally), and pushing the Pbench Server image is unnecessary in the build/run/repeat local scenario.

@portante
Copy link
Member Author

How we name the images is less important than how we manage the pull policy. If we use a pull policy of newer (or never) instead of always (and, if we always name the images to match the remote repository), then I believe that we will be able to build and run repeatedly without anything having to be pulled (once the remote images are resident locally), and pushing the Pbench Server image is unnecessary in the build/run/repeat local scenario.

Kubernetes vs podman not withstanding (see https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy, and PR #3216), I'll try not building with localhost and see how that behaves.

dbutenhof
dbutenhof previously approved these changes Jan 31, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, with the possible exception of the Pbench Server container image pull policy in run-pbench-in-a-can. And, if that needs fixing, then there are a few other things which you might want to fix or change.

webbnh
webbnh previously approved these changes Jan 31, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you've incorporated almost of of #3215 -- thanks!

In noting that, I found two more things to consider.

@portante portante dismissed stale reviews from webbnh and dbutenhof via 563334c January 31, 2023 22:41
Copy link
Member Author

@portante portante left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased and ready for review.

dbutenhof
dbutenhof previously approved these changes Feb 1, 2023
Copy link
Member

@dbutenhof dbutenhof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One slightly messed up comment, but not a big deal unless you make other changes ...

dbutenhof
dbutenhof previously approved these changes Feb 1, 2023
Prior to this change, if one tried to use `jenkins/runlocal` without
setting `BASE_IMAGE`, the default of `UBI:9` would result in an error
if the system on which this script was invoked did not have a RHEL
entitlement.

This change simplifies things so that the RPM built is truly local to
the environment, and using the installed environment, the `BASE_IMAGE`
is to work with that RPM.

For example, if running on a Fedora 37 box, a Fedora 37 RPM would be
built and installed using the `fedora:37` base image.
If a functional test is successful, and `--cleanup` has been requested,
we will have already removed the POD and container.  There is not need
to also run the "hard" cleanup script.

Further, we emit a note when `--cleanup` is not requested after the
tests have finished to alert the user that the POD and container are
still running.

We also emit the print statements for the clean up being performed.
All arguments to `jenkins/runlocal` are passed along to
`jenkins/run-server-unit-tests` as a simple way to support the
`--cleanup` option on `jenkins/runlocal`

Further, we only support defaulting to a `ubi9:latest` or
`centos:stream9` image in those respective environments.

Lastely, we add support for `BASE_IMAGE` to be defined ahead of time.
@portante portante dismissed stale reviews from dbutenhof, ndokos, and webbnh via 09b36e1 February 7, 2023 15:39
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

Copy link
Member

@dbutenhof dbutenhof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The capitalization in "Support Services pod" looks a bit odd: I don't think I would capitalize either since it's not exactly a formal title. But that's trivial; let's get this in.

Copy link
Member

@npalaska npalaska left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worked for me locally, ship it 👍

@portante portante merged commit 34daca3 into distributed-system-analysis:main Feb 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants