-
Notifications
You must be signed in to change notification settings - Fork 64
✨ Add an internal package that implements a dynamic caching layer for ClusterExtension managed content #1001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ Add an internal package that implements a dynamic caching layer for ClusterExtension managed content #1001
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1001 +/- ##
==========================================
- Coverage 78.05% 77.00% -1.05%
==========================================
Files 18 19 +1
Lines 1253 1357 +104
==========================================
+ Hits 978 1045 +67
- Misses 193 222 +29
- Partials 82 90 +8
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
d202e6d
to
09a8f23
Compare
09a8f23
to
ed56a32
Compare
oclabels "github.com/operator-framework/operator-controller/internal/labels" | ||
) | ||
|
||
type ContentManager interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically I wouldn't define the interface where it is implemented, but we wanted to have a good starting point without having to wire everything together (scoped for #975) and created this interface based on what we think would be needed by the ClusterExtensionReconciler.
This interface and code will likely get moved around a bit when we wire everything up in #975
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to note that it hasn't been wired up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically I wouldn't define the interface where it is implemented
I know this was on advice from Andy, but I haven't gotten used to this pattern at all and I think we may be taking it to an extreme. Looking at the Go standard library, locations of interfaces are fairly intuitive (e.g. io.Reader
lives in the io
package where there are lots of methods that accept or return it. json.Marshaler
lives in the encoding/json
package, where JSON marshaling/unmarshaling implementations exist.
To me, it is super-unituitive for our controllers
package to be the place where the InstalledBundleGetter and Preflight interfaces are defined.
operator-controller/internal/controllers/clusterextension_controller.go
Lines 103 to 120 in 8947c3d
type InstalledBundleGetter interface { | |
GetInstalledBundle(ctx context.Context, ext *ocv1alpha1.ClusterExtension) (*ocv1alpha1.BundleMetadata, error) | |
} | |
// Preflight is a check that should be run before making any changes to the cluster | |
type Preflight interface { | |
// Install runs checks that should be successful prior | |
// to installing the Helm release. It is provided | |
// a Helm release and returns an error if the | |
// check is unsuccessful | |
Install(context.Context, *release.Release) error | |
// Upgrade runs checks that should be successful prior | |
// to upgrading the Helm release. It is provided | |
// a Helm release and returns an error if the | |
// check is unsuccessful | |
Upgrade(context.Context, *release.Release) error | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is more so that the interfaces are defined where they are used. That being said, this isn't a hill I'm willing to die on, but it is a pattern I've come to appreciate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a problem with it. It's not C/C++ where you want to hide the implementation behind a header file. The way golang is compiled, it really doesn't matter.
// TODO: Instead of stopping the existing cache and replacing it every time | ||
// we should stop the informers that are no longer required | ||
// and create any new ones as necessary. To keep the initial pass | ||
// simple, we are going to keep this as is and optimize in a follow-up. | ||
// Doing this in a follow-up gives us the opportunity to verify that this functions | ||
// as expected when wired up in the ClusterExtension reconciler before going too deep | ||
// in optimizations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be done in a follow-up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we hooked this up as is, what would end up triggering a cache teardown and re-creation? Would it be just when we call actionclient.Upgrade
? Or would we do it even after actionclient.Reconcile
?
I think if it is just for an upgrade, we can probably get by because that would only happen when the resolved bundle changes or (eventually when we support it) the user-supplied configuration changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hooked up as is, I would assume we only call this whenever the identified state is one of Install
or Upgrade
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, by identified state I mean the same state that is used in a few other places in the reconcile
function of the ClusterExtensionReconciler, so I imagine something like what we have for preflights here:
operator-controller/internal/controllers/clusterextension_controller.go
Lines 341 to 354 in db5a75c
switch state { | |
case stateNeedsInstall: | |
err := preflight.Install(ctx, desiredRel) | |
if err != nil { | |
setInstalledStatusConditionFailed(ext, fmt.Sprintf("%s:%v", ocv1alpha1.ReasonInstallationFailed, err)) | |
return ctrl.Result{}, err | |
} | |
case stateNeedsUpgrade: | |
err := preflight.Upgrade(ctx, desiredRel) | |
if err != nil { | |
setInstalledStatusConditionFailed(ext, fmt.Sprintf("%s:%v", ocv1alpha1.ReasonInstallationFailed, err)) | |
return ctrl.Result{}, err | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few comments.
} | ||
|
||
ctx, cancel := context.WithCancel(ctx) | ||
go c.Start(ctx) //nolint:errcheck |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No error check? What happens when Start()
returns an error and we don't check it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was scratching my head on how to resolve this problem and couldn't come up with anything because the Start()
method is blocking. The only thing I could think of is maybe something like:
errChan := make(chan error)
go func(){
err := c.Start(ctx)
if err != nil {
errChan <- err
}
}()
// handle errors down here
But I'm not sure what would happen if the function that created the channel had already exited.
I decided to go the route of not doing an error check here, but waiting for the cache to sync as the next step. If the cache could not sync we cancel the context and return an error. Otherwise the cache has started and we should be good to go.
I'm open to suggestions for better ways to handle this, so if you have some ideas here I'm all ears!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'll end up blocking on the channel, because you don't know exactly when the error will come in, if at all. You should have the error checking, but it should call something.Cancel()
on error. This may mean you need to put the Start()
below the addition to i.extensionCaches[]
below. The Cancel()
function will remove the item from the caches. This way, Cancel()
is called regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go c.Start(ctx) //nolint:errcheck | |
i.extensionCaches[ce.Name] := extensionCacheData{ | |
Cache: c, | |
Cancel: cancel, | |
} | |
go func(){ | |
err := c.Start(ctx) | |
if err != nil { | |
cancel() | |
} | |
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may need a synchronous chat to better understand how this should be handled. I haven't had a scenario like this before and my brain is failing me on understanding the approach you are suggesting @tmshort
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be addressed in the latest push. Thanks for the code example @tmshort !
a6df505
to
ef5558b
Compare
Signed-off-by: everettraven <[email protected]>
10516e7
to
0b03e11
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
30c61b5
…rator-framework#1001) Signed-off-by: everettraven <[email protected]>
Description
internal/contentmanager
that has an implementation of a dynamic caching layer for managing watches on content managed by aClusterExtension
Fixes #974
His contributions were lost due to squashing of commits, but a huge shoutout to @OchiengEd for collaborating with me on this PR. @OchiengEd drove while we pair programmed and wrote most of the logic here (I started the branch with a skeleton which is why I am creating the PR)
Reviewer Checklist