Skip to content

Conversation

smklein
Copy link
Collaborator

@smklein smklein commented Sep 22, 2021

  • Make use of Selectable in more spots to embed identity structures.
  • Add a derive macro for Resource and Asset, which produces a table-bespoke structure.

Example:

#[derive(Selectable, Resource)]  // Generates `InstanceIdentity` structure...
struct Instance {
  #[diesel(embed)]
  identity: InstanceIdentity, // ... which can be embedded here. The type is verified by the proc macro.
  ...
}

Continues the process described in #209

@smklein smklein changed the title Identity metadata [db] De-duplicate IdentityMetadata Sep 22, 2021
@smklein smklein added the database Related to database access label Sep 22, 2021
pub fn name(&self) -> &omicron_common::api::external::Name {
&self.identity.name
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wonder if these are really necessary. I don't mind .identity.id and .identity.name — whenever I see the method versions it makes me wonder what's going on inside

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think it's worth making a common language abstraction, but my inclination is to push this in the direction of a trait, rather than a "magically implemented accessor". IMO this would make sense if we defined a trait - visible outside the macro - and just stated "using this proc macro means this trait is impl'd for you".

Copy link
Collaborator Author

@smklein smklein Oct 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, just implemented this with two traits, Resource and Asset, to describe the "full identity", and "the same thing, but for unnamed objects without soft deletes".

I kinda like the "access-through-trait" (or at least, "getter") version, it prevents most users from having writable access to these fields without doing an explicit database update.

@smklein smklein changed the title [db] De-duplicate IdentityMetadata [db] Implement derive macros for deriving "Identity" structures. Oct 6, 2021
@smklein smklein marked this pull request as ready for review October 6, 2021 19:11
@smklein smklein requested a review from davepacheco October 6, 2021 19:11
@smklein
Copy link
Collaborator Author

smklein commented Oct 6, 2021

A guide to this PR, for reviewers:

New stuff

  • omicron-nexus/src/db/identity.rs: Describes the identity-related traits which may be derived.
  • omicron-nexus/src/db/db-macros/src/lib.rs: Contains the new derive macros for Resource and Asset.

Old stuff that was updated to use the new stuff

  • omicron-nexus/src/db/model.rs: Where most of the usage has been updated - de-duplication here is a big goal of this PR.
  • omicron-common/src/sql/dbinit.sql: Rack added to DB, so we can make the "identity" structures consistent everywhere. Not actually stored in the DB yet, still floating in memory only.

//
// However, it likely will be in the future - for the single-rack
// case, however, it is synthesized.
#[derive(Queryable, Insertable, Debug, Clone, Selectable, Asset)]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an example of usage: "Rack" and "Server" implement the "Asset" trait, which is a subset of Identity without soft deletion (or names, or descriptions).


/// Describes a project within the database.
#[derive(Queryable, Identifiable, Insertable, Debug)]
#[derive(Selectable, Queryable, Insertable, Debug, Resource)]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... meanwhile, the user-visible stuff (Projects, Instances, Disks, etc) all implement "Resource".

If we want to replace these names ("Asset" and "Resource") with something that makes this more clear, I'm happy to make that change. I couldn't really come up with anything better, but this does get the job done.

@smklein smklein added the cleanup Code cleanliness label Oct 8, 2021
@smklein smklein force-pushed the identity-metadata branch from 1baac75 to f784bfe Compare October 8, 2021 21:26
) -> ListResultVec<db::model::Sled> {
use db::schema::sled::dsl;
paginated(dsl::sled, dsl::id, pagparams)
.filter(dsl::time_deleted.is_null())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own notes, there seem to be three patterns to the changes in this file:

  • Any place we were previously issuing a "select" query, we're now explicitly doing select(the_table::as_select()). Without this, because we're using nested structures, Diesel would incorrectly try to select the nested struct as a column. With this, Diesel uses a different process to construct the list of fields to select, and that process takes into account that there's a nested struct whose columns need to be selected.
  • Similarly, any place we were previously issuing an "insert" query, we're now explicitly adding returning(the_table::as_returning()) for presumably the same reason.
  • In cases like this one, we've dropped "time_deleted" from this table, so we're removing that filter. Presumably in the future sleds might have a time_decommissioned instead, and we'd presumably add this here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very helpful, I was wondering why we needed the as_selects

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is all accurate.

I didn't know if we'd do time_decommissioned or not - this was following up from our discussion a few weeks ago (pets vs cattle), where we mentioned that "some object may not need soft deletes" - tracking hardware like Sleds was one such example.

description: "Self-Identified Sled".to_string(),
};
let sled = db::model::Sled::new(id, address, create_params);
let sled = db::model::Sled::new(id, address);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a sled that already exists, is this going to wind up setting "time_created" to the current time instead of leaving it what it was?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so. sled_upsert is implemented as follows:

        diesel::insert_into(dsl::sled)
            .values(sled.clone())
            .on_conflict(dsl::id)
            .do_update()
            .set((
                dsl::time_modified.eq(Utc::now()),
                dsl::ip.eq(sled.ip),
                dsl::port.eq(sled.port),
            ))
            .returning(db::model::Sled::as_returning())
            .get_result_async(self.pool())
            .await
            ...

This should only update:

  • Time_modified
  • IP
  • Port

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might be. I got here because I thought the sled created at L181 has "time_created" set to the current time? It seems like this might work but it feels like surprising behavior.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It creates that field because if the upsert is an insert, it'll use it, and if it's an update, it will not.

Is there something you'd like me to do to make this more clear?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be clearer if we didn't create and use an object that has fields we know we're going to throw away. Elsewhere, we've used a separate type for that (e.g., the old UpdateParams pattern didn't contain fields that couldn't be updated). I haven't looked at the blast radius for doing that here. It might not be worth it. It's really up to you.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure how to do this - even if we delay creating the object until the sled_upsert command, we are calling diesel.insert_into(dsl::sled).values(sled) - we need to have the full sled object at this point, even if it already exists in the case of conflict.

The only viable way to avoid creating this struct would be to perform a different set of database operations, which seems significantly more expensive to me - we'd have to query for the existence of the sled, rather than call "UPDATE" + "ON CONFLICT", and only create the struct if we don't find anything from the query. But then we'd need to add either transactions or optimistic concurrency control, because one statement would be split into multiple...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right -- we definitely want to use INSERT ON CONFLICT here. Makes sense!

@smklein
Copy link
Collaborator Author

smklein commented Oct 15, 2021

Does anyone have any feedback for this PR? I normally wouldn't push on it, but this one has been kinda a pain to continually rebase with the other changes going on - I'm happy to make requested changes, I just wanna get this one in!

Copy link
Contributor

@david-crespo david-crespo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a little surprised that there were net more lines than before, but I guess a lot is overhead, so if we added maybe 10 more models we'd break even. Kinda sad each model has to get its own *Identity struct but it doesn't seem to leak out of the impl so it's fine.

///
/// Although these fields can be refactored into a common structure (to be used
/// within the context of Diesel) they must be uniquely identified for a single
/// table.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious what this is a consequence of — is it Selectable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah: https://docs.diesel.rs/master/diesel/prelude/derive.Selectable.html

It has a table_name attribute, and embeds table-specific information into the structs it generates.

If it didn't have this field, I could probably avoid the usage of a macro altogether, and use a regular struct - but that's why we're generating "per-model structures".

@smklein
Copy link
Collaborator Author

smklein commented Oct 15, 2021

I was a little surprised that there were net more lines than before, but I guess a lot is overhead, so if we added maybe 10 more models we'd break even. Kinda sad each model has to get its own *Identity struct but it doesn't seem to leak out of the impl so it's fine.

Yeah, admittedly, I also added a few tests and wasn't aiming for "smallest LoC" in the identity generation - but the lines removed from model.rs was sorta the metric that I was aiming for.

@smklein smklein merged commit e4f179a into main Oct 15, 2021
@smklein smklein deleted the identity-metadata branch October 15, 2021 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup Code cleanliness database Related to database access
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[diesel][cleanup] Make use of Diesel "Selectable" to cut down on repetition
3 participants