[10/n] [sled-agent] validate zone images as written in zone manifest #8190
Created using spr 1.3.6-beta.1 [skip ci]
fn from_directory(dir: &Utf8Path) -> Result<Self> {
    let control_plane_dir = dir.join("zones");
    // The install dataset goes into a directory called "install".
    let control_plane_dir = dir.join("install");
Was this just wrong before? (Hopefully only used by tests??)
Yeah it was only used by tests. Actually hooking sled-agent's zone logic to installinator found the issue.
pub boot_disk_override:
    Result<Option<MupdateOverrideInfo>, MupdateOverrideReadError>,

/// Status of the non-boot disks. This results in warnings.
Nit / question - this only results in warnings in bad cases, right? The typical case is `MatchesPresent`, which would not emit warnings?
Yep, updated the doc to make this clearer.
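The point settled here is that a non-boot disk status only produces a warning in the bad cases. A minimal sketch of that shape follows. Only the `MatchesPresent` name comes from the thread; the `NonBootStatus` type, the other variants, and both helper functions are hypothetical stand-ins, not the types in this PR.

```rust
/// Hypothetical, simplified stand-in for the status of one non-boot disk.
#[derive(Debug, Clone, PartialEq)]
enum NonBootStatus {
    /// The non-boot disk agrees with the boot disk (the typical case).
    MatchesPresent,
    /// The non-boot disk disagrees with the boot disk (assumed variant).
    Mismatch { detail: String },
    /// The non-boot disk could not be read (assumed variant).
    ReadError { message: String },
}

impl NonBootStatus {
    /// Returns a warning message for the bad cases, and `None` for the
    /// typical case where everything matches.
    fn warning(&self) -> Option<String> {
        match self {
            NonBootStatus::MatchesPresent => None,
            NonBootStatus::Mismatch { detail } => {
                Some(format!("mismatch with boot disk: {detail}"))
            }
            NonBootStatus::ReadError { message } => {
                Some(format!("read error: {message}"))
            }
        }
    }
}

/// Collects the warnings across all non-boot disks, skipping healthy ones.
fn collect_warnings(statuses: &[NonBootStatus]) -> Vec<String> {
    statuses.iter().filter_map(NonBootStatus::warning).collect()
}
```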
        )?;
    }
    ArtifactReadResult::Error(error) => {
        writeln!(f, " {}: error ({})", artifact.file_name, error)?;
Should this print the whole error chain, since `ArtifactReadResult` itself isn't an `Error`?
Possibly -- in practice there isn't going to be a source, but I've added InlineErrorChain::new just in case there is one in the future.
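For readers unfamiliar with the idea, printing "the whole error chain" means rendering an error together with every error reachable through `std::error::Error::source`. The sketch below shows that technique using only the standard library. It is not the `InlineErrorChain` implementation; `format_error_chain`, `Inner`, and `Outer` are hypothetical names used for illustration.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical helper: renders an error followed by each of its sources,
/// separated by ": ".
fn format_error_chain(err: &(dyn Error + 'static)) -> String {
    let mut out = err.to_string();
    let mut source = err.source();
    while let Some(cause) = source {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        source = cause.source();
    }
    out
}

/// Hypothetical leaf error with no source.
#[derive(Debug)]
struct Inner;

impl fmt::Display for Inner {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "checksum mismatch")
    }
}

impl Error for Inner {}

/// Hypothetical wrapping error that exposes `Inner` as its source.
#[derive(Debug)]
struct Outer {
    source: Inner,
}

impl fmt::Display for Outer {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to read artifact")
    }
}

impl Error for Outer {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}
```

With an error that has no source, the helper degrades to plain `Display` output, which matches the "no source in practice" situation described in the reply.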
let artifacts =
    MupdateOverrideArtifactsResult::new(dataset_dir, data);
if artifacts.is_valid() {
    // If there are errors, return them as appropriate.
Is this comment in the wrong branch? If `artifacts.is_valid()`, there are no errors, right?
Yeah. Though I ended up redoing most of this in the latest update since I had to split up the zone manifest and mupdate override into separate files.
#[derive(Clone, Debug, PartialEq)]
pub enum ZoneManifestNonBootResult {
    /// The manifest is present and matches the value on the boot disk.
    /// Information about individual zone hashes matching is stored in the
    /// `ZoneManifestArtifactsResult`.
    Matches(ZoneManifestArtifactsResult),

    /// The manifest is absent -- this is an error case because it is expected
    /// to be present.
    NotFound,

    /// A mismatch between the boot disk and the other disk was detected.
    Mismatch(ZoneManifestNonBootMismatch),

    /// An error occurred while reading the mupdate override info on this disk.
    ReadError(InstallMetadataReadError),
}
Many of these types are duplicated with some minor modifications -- deliberately so, since there are subtle differences (e.g. the underlying `install_dataset_metadata` doesn't error out if the file is missing, but here we do).
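The subtle difference described above (the lower layer treats a missing file as `Ok(None)`, while this layer treats it as an error case) can be sketched as a small classification function. Everything here is a simplified assumption: manifests are plain `String`s, read errors are `String`s, and `NonBootResult` and `classify` are hypothetical names that only mirror the shape of `ZoneManifestNonBootResult`.

```rust
/// Hypothetical, simplified mirror of the four outcomes for a non-boot disk.
#[derive(Debug, Clone, PartialEq)]
enum NonBootResult {
    /// Present and equal to the boot disk's manifest.
    Matches(String),
    /// Absent; treated as an error case at this layer.
    NotFound,
    /// Present but different from the boot disk's manifest.
    Mismatch { boot: String, non_boot: String },
    /// The read itself failed.
    ReadError(String),
}

/// Classifies the result of reading a non-boot disk's manifest.
///
/// `read` models a lower layer that returns `Ok(None)` for a missing file
/// instead of failing.
fn classify(boot: &str, read: Result<Option<String>, String>) -> NonBootResult {
    match read {
        Ok(Some(m)) if m == boot => NonBootResult::Matches(m),
        Ok(Some(m)) => NonBootResult::Mismatch {
            boot: boot.to_string(),
            non_boot: m,
        },
        // The lower layer did not fail, but this layer expects the file.
        Ok(None) => NonBootResult::NotFound,
        Err(e) => NonBootResult::ReadError(e),
    }
}
```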
/// Testing aid.
impl PartialEq for ArcSerdeJsonError {
Would it make sense to put this behind `#[cfg(test)]`? Maybe doesn't matter, since I wouldn't expect real code to be comparing errors anyway...
It would be ok for now, but in an upcoming PR I move this into a more central location and it would become more annoying to manage.
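As background on the pattern under discussion: wrapping an error in an `Arc` makes it cheap to clone, and a hand-written `PartialEq` lets tests compare values that contain it. The sketch below uses `std::io::Error` (which is neither `Clone` nor `PartialEq`) as a stand-in for the serde_json error. The `ArcIoError` name and the choice to compare by kind and display string are assumptions, not the comparison used in this PR.

```rust
use std::io;
use std::sync::Arc;

/// Hypothetical clonable wrapper around an error type that is neither
/// `Clone` nor `PartialEq`.
#[derive(Clone, Debug)]
struct ArcIoError(Arc<io::Error>);

/// Testing aid: two wrapped errors are considered equal when their kinds
/// and display strings match (an assumed notion of equality).
impl PartialEq for ArcIoError {
    fn eq(&self, other: &Self) -> bool {
        self.0.kind() == other.0.kind()
            && self.0.to_string() == other.0.to_string()
    }
}
```

Keeping the impl unconditional, rather than behind `#[cfg(test)]`, means other crates' tests can also derive `PartialEq` on types that contain the wrapper, which may be part of why a central location makes the gated version more awkward.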
        manifest,
    )),
    Ok(None) => {
        // The file is missing -- this is an error.
Discussed this some in chat - I think this will break a couple of manufacturing / dev workflows, because they produce sleds that start up without installinator having run (and therefore without having had a chance to get a `zones.json` file). We could change the workflows to write out a `zones.json` (with some amount of work / logistical pain, I think), but it also seems okay for us to hash the zones we find ourselves if there's no `zones.json` present at all.
Sorry I didn't think about this the first time around :(
Yep, working on this.
Done.
Thanks for all the back and forth on this - the new synthesized manifest LGTM.
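The fallback agreed on in this thread (hash the zones found on disk when no `zones.json` exists) can be sketched roughly as follows. This is an illustration under loud assumptions: zone files are modeled as an in-memory map rather than read from a dataset, `placeholder_digest` uses the standard library's `DefaultHasher` purely as a stand-in for a real cryptographic hash, and `Manifest`, `ManifestSource`, and `manifest_or_synthesize` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical marker for where a manifest came from.
#[derive(Debug, Clone, PartialEq)]
enum ManifestSource {
    /// Read from a manifest file written at install time.
    OnDisk,
    /// Computed locally because no manifest file was present.
    Synthesized,
}

/// Hypothetical, simplified manifest: zone file name -> digest.
#[derive(Debug, Clone, PartialEq)]
struct Manifest {
    source: ManifestSource,
    zones: BTreeMap<String, u64>,
}

/// Stand-in digest. NOT a cryptographic hash; a real implementation would
/// use a proper content hash instead.
fn placeholder_digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Returns the on-disk manifest if there is one; otherwise synthesizes a
/// manifest by hashing every zone file that was found.
fn manifest_or_synthesize(
    on_disk: Option<Manifest>,
    zone_files: &BTreeMap<String, Vec<u8>>,
) -> Manifest {
    on_disk.unwrap_or_else(|| Manifest {
        source: ManifestSource::Synthesized,
        zones: zone_files
            .iter()
            .map(|(name, bytes)| (name.clone(), placeholder_digest(bytes)))
            .collect(),
    })
}
```

A synthesized manifest trivially matches what is on disk, so its main value in this sketch is letting the rest of the validation path run unchanged on sleds where installinator never ran.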
    ]
    .into_iter()
    .collect(),
    mupdate_id: MupdateUuid::new_v4(),
I think this is causing the new test failures, maybe? The display impl now includes this ID, which is different each time the test runs?
This is what I get for writing code at midnight. Thanks.
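The failure mode described above is general: if a `Display` impl includes a randomly generated ID, any test that compares the rendered output will see a different string on every run. One common remedy is to build test fixtures from a fixed ID. The sketch below illustrates that with a hypothetical `FakeMupdateId` newtype over `u128` in place of the real UUID type; the thread does not say how the fix was actually made.

```rust
use std::fmt;

/// Hypothetical stand-in for a UUID-like identifier.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FakeMupdateId(u128);

/// Hypothetical, simplified override info whose display output includes
/// the ID.
struct OverrideInfo {
    mupdate_id: FakeMupdateId,
    zones: Vec<String>,
}

impl fmt::Display for OverrideInfo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "mupdate override {:032x}:", self.mupdate_id.0)?;
        for zone in &self.zones {
            writeln!(f, "  {zone}")?;
        }
        Ok(())
    }
}

/// Fixed ID for test fixtures, so rendered output is stable across runs.
const TEST_MUPDATE_ID: FakeMupdateId = FakeMupdateId(0x1234);

/// Builds a deterministic fixture instead of generating a fresh random ID.
fn test_override_info() -> OverrideInfo {
    OverrideInfo {
        mupdate_id: TEST_MUPDATE_ID,
        zones: vec!["example-zone.tar.gz".to_string()],
    }
}
```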
* Move `ZoneImageFileSource` to illumos-utils and use it in the zone image builder.
* Move default file name logic to sled-agent-zone-images, since it cares about that in the context of mupdate override logic.

Depends on:

* #8190
* everything before that
Included in this PR:
* `test_installinator_fetch` wicketd integration test to also ensure installinator + sled-agent have a coherent view of the world

The last point forced me to expose a bunch of the status via a public interface. We'll be able to reuse this information (though probably not in this full form) and send it up as part of the inventory.
Depends on: