mupdate/update recovery flow should ensure that all deployment units are at known versions before proceeding with other operations #8726

As of #8456, the blueprint planner is able to set and clear the `remove_mupdate_override` field within blueprints. Part of that PR is determining when we have fully recovered from a MUPdate.

Currently, there are two conditions. If either one is false, we still have work to do to recover from a MUPdate:

1. The target release has been updated since the last time a MUPdate was detected.
2. The `remove_mupdate_override` field has been cleared from all sleds in the blueprint.

We need to add a third requirement here: all deployment units on all sleds must be at known versions before we're confident about proceeding with updates. "All deployment units" includes:

* zone image sources
* host phase 2 and phase 1 images
* SP and RoT images
* (other units?)
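
As a rough illustration, the two existing conditions plus the proposed third one could be combined into a single predicate. This is only a sketch: `UnitVersion`, `SledState`, and `mupdate_recovery_complete` are hypothetical names invented for this example and do not correspond to the planner's real types, and the set of fields simply mirrors the list above.

```rust
// Hypothetical types for illustration only; these are NOT the planner's real types.

/// Whether a deployment unit could be matched to a known version.
#[derive(Debug, Clone, PartialEq, Eq)]
enum UnitVersion {
    /// The unit matches an artifact at a known version.
    Known(String),
    /// The unit could not be matched to any known version.
    Unknown,
}

/// Assumed per-sled view of the state relevant to MUPdate recovery.
#[derive(Debug, Clone)]
struct SledState {
    /// True while `remove_mupdate_override` is still set for this sled.
    remove_mupdate_override_set: bool,
    zone_image_sources: Vec<UnitVersion>,
    host_phase_1: UnitVersion,
    host_phase_2: UnitVersion,
    sp: UnitVersion,
    rot: UnitVersion,
}

impl SledState {
    /// Proposed third condition, per sled: every deployment unit is at a known version.
    fn all_units_known(&self) -> bool {
        self.zone_image_sources
            .iter()
            .chain([&self.host_phase_1, &self.host_phase_2, &self.sp, &self.rot])
            .all(|v| matches!(v, UnitVersion::Known(_)))
    }
}

/// Recovery is complete only if all three conditions hold.
fn mupdate_recovery_complete(
    target_release_updated_since_mupdate: bool,
    sleds: &[SledState],
) -> bool {
    // Condition 1: the target release was updated after the last detected MUPdate.
    target_release_updated_since_mupdate
        // Condition 2: the override has been cleared on every sled.
        && sleds.iter().all(|s| !s.remove_mupdate_override_set)
        // Condition 3 (proposed): every unit on every sled is at a known version.
        && sleds.iter().all(|s| s.all_units_known())
}
```

With this shape, a single unit at an unknown version on any sled (for example an SP image that matches nothing) keeps the whole system in the "still recovering" state, even after the override has been cleared everywhere.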

Some questions:

* Do we want to block zone additions on this condition long-term? We have a chicken switch called `add_zones_with_mupdate_override` for this, set to true on customer systems for r16, but the current plan is for that switch to be set to false for r17. Is that desired?
* Currently, the only TUF repo we check deployment unit sources against is the current target release. This makes remediation tedious when sleds MUPdated to different versions are detected: the operator would have to set each MUPdated-to release as the target release, wait for a planner run, and repeat until done.
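
To make the second question concrete, here is a small sketch of why checking against only the current target release forces multiple rounds. It is an illustration under assumptions, not the planner's actual logic: `Repo` and `unmatched_units` are hypothetical, artifacts are reduced to plain hash strings, and matching against several repos at once is just one possible direction that this issue does not commit to.

```rust
use std::collections::BTreeSet;

// Hypothetical stand-in for a TUF repo: just the set of artifact hashes it contains.
struct Repo {
    artifact_hashes: BTreeSet<String>,
}

/// Returns the deployed artifact hashes that match none of the given repos.
///
/// Passing only the current target release models today's behavior; passing
/// several repos models a (hypothetical) check against multiple known releases.
fn unmatched_units<'a>(deployed: &'a [String], repos: &[&Repo]) -> Vec<&'a str> {
    deployed
        .iter()
        // Keep a hash only if no repo contains it.
        .filter(|hash| !repos.iter().any(|r| r.artifact_hashes.contains(*hash)))
        .map(|hash| hash.as_str())
        .collect()
}
```

If units on different sleds come from two different releases, calling `unmatched_units` with a single repo always leaves the other release's units unmatched, which is what forces the operator to cycle the target release. Calling it with both repos leaves only units that are unknown to every release.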

There's also the open question of how we present these conditions and remediation paths to the operator. That doesn't block this issue, but it would probably block r17.
