-
Couldn't load subscription status.
- Fork 60
[nexus] Update status API #8897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ff5dd1c to
0bf80f5
Compare
|
|
||
| let counts = status.components_by_release_version; | ||
| assert_eq!(counts.get("install dataset").unwrap(), &7); | ||
| assert_eq!(counts.get("unknown").unwrap(), &15); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will probably want to change the Display impl to make these snake case. I could just write my own helper that does that and leave the Display impl alone, but it's probably bad to have to two ways of doing this.
omicron/nexus/types/src/internal_api/views.rs
Lines 573 to 588 in 8571be3
| impl Display for TufRepoVersion { | |
| fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { | |
| match self { | |
| TufRepoVersion::Unknown => write!(f, "unknown"), | |
| TufRepoVersion::InstallDataset => { | |
| write!(f, "install dataset") | |
| } | |
| TufRepoVersion::Version(version) => { | |
| write!(f, "{}", version) | |
| } | |
| TufRepoVersion::Error(s) => { | |
| write!(f, "{}", s) | |
| } | |
| } | |
| } | |
| } |
4a75d91 to
3a017f1
Compare
3a017f1 to
5f8076c
Compare
| /// rack should eventually correspond to the release described here. | ||
| /// | ||
| /// Will only be null if a target release has never been set. | ||
| pub target_release: Option<TargetRelease>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've collapsed the TargetReleaseSource enum with its Unspecified and SystemVersion arms into Option<TargetRelease>, where TargetRelease is always a system version. This is based on my understanding that it's basically never going to be Unspecified after the first few weeks of update being released. By handling that with target_release: null, we ensure future values will be nice ones like target_release: { version: "2.0.0", time_requested: "..." } that don't distract the user with the possibility of other shapes. In the previous TargetReleaseSource enum, this was the distinction between { "type": "unspecified" } vs. { "type": "system_version", ... }. The option approach lets us make the unspecified arm nearly invisible.
| /// The source of the target release. | ||
| pub release_source: TargetReleaseSource, | ||
| /// The specified release of the rack's system software | ||
| pub version: Version, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the word "source" here very confusing. Now there is no source. There either is no target release or there is one, and when there is one, it always has a system version.
I also thought it was a little weird that you could have a target release like TargetRelease { time_requested: "...", release_source: Unspecified }, namely unspecified but still with a non-null time_requested. Based on the migration adding the unspecified value, time_requested is just whenever the migration ran, which to me is not very meaningful to the user. In this PR, time_requested is still inside TargetRelease, but since TargetRelease is now optional, you only have time_requested in the response when you have a release set.
omicron/schema/crdb/create-target-release/up3.sql
Lines 1 to 12 in c7673aa
| -- System software is by default from the `install` dataset. | |
| INSERT INTO omicron.public.target_release ( | |
| generation, | |
| time_requested, | |
| release_source, | |
| tuf_repo_id | |
| ) VALUES ( | |
| 1, | |
| NOW(), | |
| 'unspecified', | |
| NULL | |
| ) ON CONFLICT DO NOTHING; |
| #[derive(Clone, Debug, Deserialize, Eq, PartialEq, Serialize, JsonSchema)] | ||
| pub struct TargetRelease { | ||
| /// The target-release generation number. | ||
| pub generation: i64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been removed. I don't think it's meaningful to the user.
| .chain(rot_version) | ||
| .chain(std::iter::once(bootloader_version)) | ||
| .chain(host_version) | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to do this by hand here, but "make a flat list of all the versions in the system" feels like a good operation to have a canonical version of that lives somewhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not had a chance to look carefully at the changes to nexus/src/app/update.rs or the updates.rs integration test. The API changes like fine to me. We'll need to coordinate an update to the CLI as people are testing this.
| self.datastore() | ||
| .target_release_get_generation(opctx, Generation(prev_gen)) | ||
| .await | ||
| .internal_context("fetching previous target release")? | ||
| .and_then(|r| r.tuf_repo_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a little sketchy? I'm not sure.
Example: after a MUPdate, you might set the target release to the same value it was before. Then the previous row will not be the previous release.
Today, the PlanningInput has this information. Maybe PlanningInputFromDb has a better way to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, that is where I copied this logic from. The comment suggests that #8056 helps but I am not sure it does.
omicron/nexus/reconfigurator/preparation/src/lib.rs
Lines 163 to 199 in 4d5bdc6
| let tuf_repo = TufRepoPolicy { | |
| target_release_generation: target_release.generation.0, | |
| description: target_release_desc, | |
| }; | |
| // NOTE: We currently assume that only two generations are in play: the | |
| // target release generation and its previous one. This depends on us | |
| // not setting a new target release in the middle of an update: see | |
| // https://github.com/oxidecomputer/omicron/issues/8056. | |
| // | |
| // We may need to revisit this decision in the future. See that issue | |
| // for some discussion. | |
| let old_repo = if let Some(prev) = target_release.generation.prev() { | |
| let prev_release = datastore | |
| .target_release_get_generation(opctx, Generation(prev)) | |
| .await | |
| .internal_context("fetching previous target release")?; | |
| let description = if let Some(prev_release) = prev_release { | |
| if let Some(repo_id) = prev_release.tuf_repo_id { | |
| TargetReleaseDescription::TufRepo( | |
| datastore | |
| .tuf_repo_get_by_id(opctx, repo_id.into()) | |
| .await | |
| .internal_context( | |
| "fetching previous target release repo", | |
| )? | |
| .into_external(), | |
| ) | |
| } else { | |
| TargetReleaseDescription::Initial | |
| } | |
| } else { | |
| TargetReleaseDescription::Initial | |
| }; | |
| TufRepoPolicy { target_release_generation: prev, description } | |
| } else { | |
| TufRepoPolicy::initial() | |
| }; |
I certainly see your point, though, and it is a problem. In the internal API logic I'm borrowing, it seems like we will only turn up component versions that match either the old or the new one passed in, because we're matching against artifact hashes. In the MUPdate case where current and prev release are the same, components that are on the the MUPdate's version will all show up as TufRepoVersion::Unknown.
omicron/nexus/types/src/internal_api/views.rs
Lines 937 to 968 in 4d5bdc6
| fn zone_image_source_to_version( | |
| old: &TargetReleaseDescription, | |
| new: &TargetReleaseDescription, | |
| source: &OmicronZoneImageSource, | |
| res: &ConfigReconcilerInventoryResult, | |
| ) -> TufRepoVersion { | |
| if let ConfigReconcilerInventoryResult::Err { message } = res { | |
| return TufRepoVersion::Error(message.clone()); | |
| } | |
| let &OmicronZoneImageSource::Artifact { hash } = source else { | |
| return TufRepoVersion::InstallDataset; | |
| }; | |
| if let Some(new) = new.tuf_repo() { | |
| if new.artifacts.iter().any(|meta| meta.hash == hash) { | |
| return TufRepoVersion::Version( | |
| new.repo.system_version.clone(), | |
| ); | |
| } | |
| } | |
| if let Some(old) = old.tuf_repo() { | |
| if old.artifacts.iter().any(|meta| meta.hash == hash) { | |
| return TufRepoVersion::Version( | |
| old.repo.system_version.clone(), | |
| ); | |
| } | |
| } | |
| TufRepoVersion::Unknown | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jgallagher says to use this instead, and sort by generation number to figure out what are old and new releases (not sure I actually need old and new anyway, I just need all the versions on hand).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you decide not to do this now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think the upshot of the discussion in chat was that since the internal API logic doesn't do any better, it makes sense to do it this way for now and improve both later. We could use that linked method to fetch more releases, but the code I'm hooking into to match up versions with hashes on the components only takes old and new.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking pretty good. Just some minor suggestions here (one of which, the <= thing, is probably important).
| self.datastore() | ||
| .target_release_get_generation(opctx, Generation(prev_gen)) | ||
| .await | ||
| .internal_context("fetching previous target release")? | ||
| .and_then(|r| r.tuf_repo_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you decide not to do this now?
…#9106) Built on top of #8897 These are breaking API changes and I won't merge until I have a PR up on the CLI side that makes this all work. ### Add list endpoint and fix other ones Made the endpoints match the other ones. Not married to the verb `upload`, but `system_update_repository_update` felt a little silly. ```diff -system_update_get_repository GET /v1/system/update/repository/{system_version} -system_update_put_repository PUT /v1/system/update/repository +system_update_repository_list GET /v1/system/update/repositories +system_update_repository_upload PUT /v1/system/update/repositories +system_update_repository_view GET /v1/system/update/repositories/{system_version} ``` ### Standardize and simplify response types What's responsible for this PR being so big is that I reworked these endpoints to return a new `TufRepo` struct that is only the metadata and no artifacts. The main reason for this was that the list endpoint would have to pull artifacts for every repo in the list. The list is likely to be small, so it's probably not a big deal, but it also seems that the artifacts are not useful to the end user. Once the list endpoint was simplified, it makes sense to return the same thing from the view and update endpoints. The upload endpoint returns a `TufRepoUpload` which has the `TufRepo` together with an indicator saying whether the repo contents were new repo or already existed.
Closes #8869
update_statusendpoint at/v1/system/update/statusthat includes:target_release_view)time_last_progress, representing thetime_made_targetof the latestbp_target— meant to indicate the last time the update system Did Somethingcomponents_by_release_version, a map where the keys are to_string'dTufRepoVersions and the values are counts of components on that versiontarget_release_viewendpoint, which is fully redundant with update statustarget_release_updatethat you can use update status to check the current target releaseTargetReleasestruct that was previously being returned fromtarget_release_viewand is now part of the update status responsegenerationfromTargetReleasebecause it is not meaningful to the end userListing blockers or problems preventing further update progress is mentioned as a goal of #8869, but they are currently stored as JSON blobs in the DB for debug purposes. They will soon be stored in a more regimented way that should make it easy to stick a list of blockers in the response body. If this PR gets merged without that, I'll make a dedicated issue for it.
Example status response
{ "target_release": { "time_requested": "2025-09-24T22:37:43.266338Z", "version": "2.0.0" }, "components_by_release_version": { "install dataset": 7, "unknown": 15 }, "time_last_step_planned": "2025-09-24T22:37:41.556717Z", "suspended": false }