feat(api): add stable build ID override via spec.workerOptions.buildID #177
base: main
Conversation
Adds support for user-controlled build IDs via `spec.workerOptions.buildID`, enabling rolling updates for non-workflow code changes while preserving new deployment creation for workflow code changes.

Key changes:
- Add `BuildID` field to `WorkerOptions` struct in API types
- Update `ComputeBuildID` to use the spec field instead of an annotation
- Implement drift detection by comparing the deployed spec with the desired spec
- Only check for drift when `buildID` is explicitly set by the user

Drift detection currently monitors: replicas, minReadySeconds, container images, container resources (limits/requests), and init container images. Other fields (env vars, volumes, commands) are not monitored; this is documented in the `BuildID` field comment.

Note: CRD regeneration includes some unrelated changes from controller-gen (default values for name fields, x-kubernetes-map-type annotations). These are standard regeneration artifacts and don't affect functionality.

This solves deployment proliferation for the PINNED versioning strategy, where any pod spec change (image tag, env vars, resources) would generate a new build ID and create unnecessary deployments.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Force-pushed from 4f23d55 to 6362a09
carlydf left a comment:
Hi there, sorry for the late review, just getting back to this after the holidays. I'm committed to turning this PR around quickly though and I think it's really close! I'll be keeping an eye on this tomorrow so we can iterate quickly.
Replace per-field comparison with a SHA256 hash of the user-provided pod template spec. This detects ALL changes (env vars, commands, volumes, etc.) rather than a subset of fields.

Changes:
- Add `ComputePodTemplateSpecHash()` using spew for deterministic hashing
- Store the hash in a pod-template-spec-hash annotation on deployments
- Compare hashes instead of individual fields for drift detection
- Add backwards compatibility for legacy deployments without a hash

Tests:
- 6 tests for hash computation (images, env vars, commands, volumes)
- Updated drift detection tests, including an env var change case
- Backwards compatibility test for deployments without the hash annotation

Docs:
- Update `BuildID` field documentation to reflect hash-based detection
Force-pushed from 8ba2666 to 7aad356
carlydf left a comment:
So close to approving! The main decision I still want to make together is how to name the new field.
My review was delayed today because I wanted to write an integration test for this new functionality, which you can look at in this commit. Confirmed it works :) In the early phases of this project, features that only had unit tests and no integration tests experienced regressions, so I want to make sure that doesn't happen here. Feel free to cherry-pick my test commit onto your fork (if that can be done with a fork), or I can merge my integration test right after your PR merges.
```go
// updateDeploymentWithPodTemplateSpec updates an existing deployment with a new pod template spec
// from the TWD spec. This applies all the controller modifications that NewDeploymentWithOwnerRef does.
func updateDeploymentWithPodTemplateSpec(
```
can you re-use the pod-spec-modification code from NewDeploymentWithOwnerRef (if the whole thing can't be reused, maybe a shared helper function) so that we don't have to keep two separate functions in sync?
```go
// +optional
// +kubebuilder:validation:MaxLength=63
// +kubebuilder:validation:Pattern=`^[a-zA-Z0-9]([a-zA-Z0-9._-]*[a-zA-Z0-9])?$`
BuildID string `json:"buildID,omitempty"`
```
I'd prefer to name this in a way that makes it clear that providing your own Build ID is not required, since as we say in the warning, it's not safe unless you do work on your own side to ensure that Build ID is updated when incompatible changes happen.
I worry that people could get confused, not read the doc-string description of the field, and copy-pasta themselves into a situation where they're not updating the Build ID and get NDEs in their workflows despite "using versioning."
Would you be opposed to calling it something like CustomBuildID?
Summary
Adds support for user-controlled build IDs via `spec.workerOptions.buildID`, enabling rolling updates for non-workflow code changes while preserving new deployment creation for workflow code changes.

Problem
With PINNED versioning strategy and long-running workflows, any pod spec change (image tag, env vars, resources) generates a new build ID, causing deployment proliferation.
Result: 10-15 active deployments running simultaneously, causing resource waste and operational complexity.
Solution
Allow users to set a stable build ID via `spec.workerOptions.buildID`. When the build ID is stable but the pod spec changes, trigger a rolling update instead of creating a new deployment.

Key Changes
- Add `BuildID` field to `WorkerOptions` struct in API types
- Update `ComputeBuildID` to use the spec field instead of an annotation
- Store a hash in the `temporal.io/pod-template-spec-hash` annotation on deployments

Drift Detection
When `spec.workerOptions.buildID` is set, the controller detects spec drift by comparing a SHA256 hash of the user-provided pod template spec against the hash stored in the deployment annotation.

How it works:
This approach detects all pod template changes (env vars, commands, volumes, etc.) rather than a fixed subset of fields.
When drift is detected: Controller triggers a rolling update of the existing deployment (preserves same build ID).
Usage
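Setting the field might look like the fragment below. Only the `spec.workerOptions.buildID` path, the 63-character limit, and the validation pattern come from this PR; the apiVersion, kind, and all values are assumptions for illustration.

```yaml
apiVersion: temporal.io/v1alpha1   # assumed group/version
kind: TemporalWorkerDeployment     # assumed kind (the "TWD" in the review)
metadata:
  name: my-worker
spec:
  workerOptions:
    buildID: "my-worker-v1"        # <=63 chars, must match the validation pattern
```

Remember the caveat from the review: if you set this field, you are responsible for bumping it whenever you make workflow-incompatible changes.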
Behavior Matrix
Backwards Compatibility
- Drift detection only runs when `spec.workerOptions.buildID` is explicitly set
- Legacy deployments without the hash annotation remain supported

Test plan
- `go test ./internal/k8s/... ./internal/planner/... -v` passes