Commit bf49833
IP pools data model and API rework (#4261)
Closes #2148 Closes #4002 Closes #4003 Closes #4006

## Background

#3985 (and followups #3998 and #4007) made it possible to associate an IP pool with a silo so that instances created in that silo would get their ephemeral IPs from said pool by default (i.e., without the user having to say anything other than "I want an ephemeral IP"). An IP pool associated with a silo was not accessible for ephemeral IP allocation from other silos — if a disallowed pool was specified by name at instance create time, the request would 404.

However! That was the quick version, and the data model left much to be desired. The relation was modeled by adding a nullable `silo_id` and sort-of-not-really-nullable `is_default` column directly on the IP pool table, which has the following limitations (and there are probably more):

* A given IP pool could only be associated with at most one silo, could not be shared
* The concept of `default` was treated as a property of the pool itself, rather than a property of the _association_ with another resource, which is quite strange. Even if you could associate the pool with multiple silos, you could not have it be the default for one and not for the other
* There was no way to create an IP pool without associating it with either the fleet or a silo
* Extending this model to allow association at the project level would be inelegant — we'd have to add a `project_id` column (which I did in #3981 before removing it in #3985)

More broadly (and vaguely), the idea of an IP pool "knowing" about silos or projects doesn't really make sense. Entities aren't really supposed to know about each other unless they have a parent-child relationship.

## Changes in this PR

### No such thing as fleet-scoped pool, only silo

Thanks to @zephraph for encouraging me to make this change. It is dramatically easier to explain "link silo to IP pool" than it is to explain "link resource (fleet or silo) to IP pool".
The way to recreate the behavior of a single default pool for the fleet is to simply associate a pool with all silos. Data migrations ensure that existing fleet-scoped pools will be associated with all silos. There can only be one default pool for a silo, so in the rare case where pool A is the fleet default and pool B is the default on silo S, we associate both A and B with S, but only B is made the silo's default pool.

### API

These endpoints are added. They're pretty self-explanatory.

```
ip_pool_silo_link     POST   /v1/system/ip-pools/{pool}/silos
ip_pool_silo_list     GET    /v1/system/ip-pools/{pool}/silos
ip_pool_silo_unlink   DELETE /v1/system/ip-pools/{pool}/silos/{silo}
ip_pool_silo_update   PUT    /v1/system/ip-pools/{pool}/silos/{silo}
```

The `silo_id` and `is_default` fields are removed from the `IpPool` response as they are now a property of the `IpPoolLink`, not the pool itself.

I also fixed the silo-scoped IP pools list (`/v1/ip-pools`) and fetch (`/v1/ip-pools/{pool}`) endpoints, which a) did not actually filter for the current silo, allowing any user to fetch any pool, and b) took a spurious `project` query param that didn't do anything.

### DB

The association between IP pools and fleet or silo (or eventually projects, but not here) is now modeled through a polymorphic join table called `ip_pool_resource`:

ip_pool_id | resource_type | resource_id | is_default
-- | -- | -- | --
123 | silo | 23 | true
123 | silo | 4 | false
~~65~~ | ~~fleet~~ | ~~FLEET_ID~~ | ~~true~~

Now, instead of setting the association with a silo or fleet at IP pool create or update time, there are separate endpoints for adding and removing an association. A pool can be associated with any number of resources, but a unique index ensures that a given resource can only have one default pool.

### Default IP pool logic

If an instance ephemeral IP or a floating IP is created **with a pool specified**, we simply use that pool if it exists and is linked to the user's silo.
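The "one default pool per resource" constraint from the DB section above can be modeled with a partial unique index. The following is an illustrative sketch in Python/SQLite, not the actual CockroachDB schema; the index name and column types are assumptions:

```python
import sqlite3

# Toy version of the ip_pool_resource join table described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ip_pool_resource (
    ip_pool_id    TEXT NOT NULL,
    resource_type TEXT NOT NULL,
    resource_id   TEXT NOT NULL,
    is_default    INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (ip_pool_id, resource_type, resource_id)
);
-- Partial unique index: at most one default pool per (resource_type, resource_id).
CREATE UNIQUE INDEX one_default_per_resource
    ON ip_pool_resource (resource_type, resource_id)
    WHERE is_default = 1;
""")

# Pool A is the default for silo S; pool B is linked but not default -- both allowed.
conn.execute("INSERT INTO ip_pool_resource VALUES ('pool-a', 'silo', 'silo-s', 1)")
conn.execute("INSERT INTO ip_pool_resource VALUES ('pool-b', 'silo', 'silo-s', 0)")

# A second default for the same silo violates the partial unique index.
try:
    conn.execute("INSERT INTO ip_pool_resource VALUES ('pool-c', 'silo', 'silo-s', 1)")
    violated = False
except sqlite3.IntegrityError:
    violated = True
print(violated)  # True
```

The key design point is that `is_default` lives on the link row, so the same pool can be default for one silo and non-default for another.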
If an instance ephemeral IP or a floating IP is created **without a pool specified**, we look for a default pool for the current silo. If there is a pool linked with the current silo with `is_default=true`, use that. Otherwise, there is no default pool for the given scope and IP allocation will fail, which means the instance create or floating IP create request will fail. The difference introduced in this PR is that we do not fall back to a fleet default if there is no silo default, because we have removed the concept of a fleet-scoped pool.

### Tests and test helpers

This is the source of a lot of noise in this PR. Because there can no longer be a fleet default pool, we can no longer rely on one existing for tests. The test setup was really confusing: we assumed a default IP pool existed, but we still had to populate it (add a range) before we could do anything with it. Now we don't assume it exists; a single helper creates the pool, adds a range, and associates it with a silo.

## What do customers have to do when they upgrade?

They should not _have_ to do anything at upgrade time. If they were relying on a single fleet default pool to automatically be used by new silos, then when they create silos in the future they will have to manually associate each new silo with the desired pool. We are working on ways to make that easier or more automatic, but that's not in this change. It is less urgent because silo creation is an infrequent operation.

If they are _not_ using the previously fleet-default IP pool named `default` and do not want it to exist, they can simply delete any IP ranges it contains, unlink it from all silos, and delete it. If they are not using it, there should not be any IPs allocated from it (which means they can delete it).

---------

Co-authored-by: Justin Bennett <[email protected]>
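The pool-selection rules described above can be sketched as a small Python model. This is a toy illustration, not Omicron's actual Rust implementation; the `links` map and `select_pool` function are invented stand-ins for the `ip_pool_resource` table and the datastore lookup:

```python
class PoolNotFound(Exception):
    """Stand-in for the 404 returned when a named pool is not linked to the silo."""

class NoDefaultPool(Exception):
    """Stand-in for allocation failing when the silo has no default pool."""

# (pool_name, silo_id) -> is_default; models rows of the ip_pool_resource table.
links = {
    ("pool-a", "silo-1"): True,
    ("pool-b", "silo-1"): False,
    ("pool-b", "silo-2"): True,
}

def select_pool(silo_id, requested=None):
    if requested is not None:
        # An explicitly named pool must be linked to the caller's silo.
        if (requested, silo_id) not in links:
            raise PoolNotFound(requested)
        return requested
    # No pool specified: fall back to the silo's default, never a fleet default.
    for (pool, silo), is_default in links.items():
        if silo == silo_id and is_default:
            return pool
    raise NoDefaultPool(silo_id)

print(select_pool("silo-1", "pool-b"))  # pool-b (linked, even though not default)
print(select_pool("silo-2"))            # pool-b (silo-2's default)
```

Note that `select_pool("silo-2", "pool-a")` raises `PoolNotFound`: pool-a exists but is not linked to silo-2, matching the 404 behavior described above.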
1 parent 69733d8 commit bf49833


52 files changed: +2633 −1120 lines

common/src/api/external/mod.rs

Lines changed: 1 addition & 0 deletions

```diff
@@ -739,6 +739,7 @@ pub enum ResourceType {
     LoopbackAddress,
     SwitchPortSettings,
     IpPool,
+    IpPoolResource,
     InstanceNetworkInterface,
     PhysicalDisk,
     Rack,
```

end-to-end-tests/src/bin/bootstrap.rs

Lines changed: 21 additions & 2 deletions

```diff
@@ -4,7 +4,8 @@ use end_to_end_tests::helpers::{generate_name, get_system_ip_pool};
 use omicron_test_utils::dev::poll::{wait_for_condition, CondCheckError};
 use oxide_client::types::{
     ByteCount, DeviceAccessTokenRequest, DeviceAuthRequest, DeviceAuthVerify,
-    DiskCreate, DiskSource, IpRange, Ipv4Range, SiloQuotasUpdate,
+    DiskCreate, DiskSource, IpPoolCreate, IpPoolSiloLink, IpRange, Ipv4Range,
+    NameOrId, SiloQuotasUpdate,
 };
 use oxide_client::{
     ClientDisksExt, ClientHiddenExt, ClientProjectsExt,
@@ -38,9 +39,27 @@ async fn main() -> Result<()> {
 
     // ===== CREATE IP POOL ===== //
     eprintln!("creating IP pool... {:?} - {:?}", first, last);
+    let pool_name = "default";
+    client
+        .ip_pool_create()
+        .body(IpPoolCreate {
+            name: pool_name.parse().unwrap(),
+            description: "Default IP pool".to_string(),
+        })
+        .send()
+        .await?;
+    client
+        .ip_pool_silo_link()
+        .pool(pool_name)
+        .body(IpPoolSiloLink {
+            silo: NameOrId::Name(params.silo_name().parse().unwrap()),
+            is_default: true,
+        })
+        .send()
+        .await?;
     client
         .ip_pool_range_add()
-        .pool("default")
+        .pool(pool_name)
         .body(IpRange::V4(Ipv4Range { first, last }))
         .send()
         .await?;
```

end-to-end-tests/src/helpers/ctx.rs

Lines changed: 4 additions & 0 deletions

```diff
@@ -287,6 +287,10 @@ impl ClientParams {
             .build()?;
         Ok(Client::new_with_client(&base_url, reqwest_client))
     }
+
+    pub fn silo_name(&self) -> String {
+        self.rss_config.recovery_silo.silo_name.to_string()
+    }
 }
 
 async fn wait_for_records(
```

nexus/db-model/src/ip_pool.rs

Lines changed: 35 additions & 21 deletions

```diff
@@ -5,8 +5,10 @@
 //! Model types for IP Pools and the CIDR blocks therein.
 
 use crate::collection::DatastoreCollectionConfig;
+use crate::impl_enum_type;
 use crate::schema::ip_pool;
 use crate::schema::ip_pool_range;
+use crate::schema::ip_pool_resource;
 use crate::Name;
 use chrono::DateTime;
 use chrono::Utc;
@@ -35,42 +37,23 @@ pub struct IpPool {
     /// Child resource generation number, for optimistic concurrency control of
     /// the contained ranges.
     pub rcgen: i64,
-
-    /// Silo, if IP pool is associated with a particular silo. One special use
-    /// for this is associating a pool with the internal silo oxide-internal,
-    /// which is used for internal services. If there is no silo ID, the
-    /// pool is considered a fleet-wide pool and will be used for allocating
-    /// instance IPs in silos that don't have their own pool.
-    pub silo_id: Option<Uuid>,
-
-    pub is_default: bool,
 }
 
 impl IpPool {
-    pub fn new(
-        pool_identity: &external::IdentityMetadataCreateParams,
-        silo_id: Option<Uuid>,
-        is_default: bool,
-    ) -> Self {
+    pub fn new(pool_identity: &external::IdentityMetadataCreateParams) -> Self {
         Self {
             identity: IpPoolIdentity::new(
                 Uuid::new_v4(),
                 pool_identity.clone(),
             ),
             rcgen: 0,
-            silo_id,
-            is_default,
         }
     }
 }
 
 impl From<IpPool> for views::IpPool {
     fn from(pool: IpPool) -> Self {
-        Self {
-            identity: pool.identity(),
-            silo_id: pool.silo_id,
-            is_default: pool.is_default,
-        }
+        Self { identity: pool.identity() }
     }
 }
 
@@ -93,6 +76,37 @@ impl From<params::IpPoolUpdate> for IpPoolUpdate {
     }
 }
 
+impl_enum_type!(
+    #[derive(SqlType, Debug, Clone, Copy, QueryId)]
+    #[diesel(postgres_type(name = "ip_pool_resource_type"))]
+    pub struct IpPoolResourceTypeEnum;
+
+    #[derive(Clone, Copy, Debug, AsExpression, FromSqlRow, PartialEq)]
+    #[diesel(sql_type = IpPoolResourceTypeEnum)]
+    pub enum IpPoolResourceType;
+
+    Silo => b"silo"
+);
+
+#[derive(Queryable, Insertable, Selectable, Clone, Debug)]
+#[diesel(table_name = ip_pool_resource)]
+pub struct IpPoolResource {
+    pub ip_pool_id: Uuid,
+    pub resource_type: IpPoolResourceType,
+    pub resource_id: Uuid,
+    pub is_default: bool,
+}
+
+impl From<IpPoolResource> for views::IpPoolSilo {
+    fn from(assoc: IpPoolResource) -> Self {
+        Self {
+            ip_pool_id: assoc.ip_pool_id,
+            silo_id: assoc.resource_id,
+            is_default: assoc.is_default,
+        }
+    }
+}
+
 /// A range of IP addresses for an IP Pool.
 #[derive(Queryable, Insertable, Selectable, Clone, Debug)]
 #[diesel(table_name = ip_pool_range)]
```

nexus/db-model/src/schema.rs

Lines changed: 16 additions & 3 deletions

```diff
@@ -13,7 +13,7 @@ use omicron_common::api::external::SemverVersion;
 ///
 /// This should be updated whenever the schema is changed. For more details,
 /// refer to: schema/crdb/README.adoc
-pub const SCHEMA_VERSION: SemverVersion = SemverVersion::new(22, 0, 0);
+pub const SCHEMA_VERSION: SemverVersion = SemverVersion::new(23, 0, 1);
 
 table! {
     disk (id) {
@@ -504,7 +504,14 @@ table! {
         time_modified -> Timestamptz,
         time_deleted -> Nullable<Timestamptz>,
         rcgen -> Int8,
-        silo_id -> Nullable<Uuid>,
+    }
+}
+
+table! {
+    ip_pool_resource (ip_pool_id, resource_type, resource_id) {
+        ip_pool_id -> Uuid,
+        resource_type -> crate::IpPoolResourceTypeEnum,
+        resource_id -> Uuid,
         is_default -> Bool,
     }
 }
@@ -1426,8 +1433,9 @@ allow_tables_to_appear_in_same_query!(
 );
 joinable!(system_update_component_update -> component_update (component_update_id));
 
-allow_tables_to_appear_in_same_query!(ip_pool_range, ip_pool);
+allow_tables_to_appear_in_same_query!(ip_pool_range, ip_pool, ip_pool_resource);
 joinable!(ip_pool_range -> ip_pool (ip_pool_id));
+joinable!(ip_pool_resource -> ip_pool (ip_pool_id));
 
 allow_tables_to_appear_in_same_query!(inv_collection, inv_collection_error);
 joinable!(inv_collection_error -> inv_collection (inv_collection_id));
@@ -1478,6 +1486,11 @@ allow_tables_to_appear_in_same_query!(
 allow_tables_to_appear_in_same_query!(dns_zone, dns_version, dns_name);
 allow_tables_to_appear_in_same_query!(external_ip, service);
 
+// used for query to check whether an IP pool association has any allocated IPs before deleting
+allow_tables_to_appear_in_same_query!(external_ip, instance);
+allow_tables_to_appear_in_same_query!(external_ip, project);
+allow_tables_to_appear_in_same_query!(external_ip, ip_pool_resource);
+
 allow_tables_to_appear_in_same_query!(
     switch_port,
     switch_port_settings_route_config
```

nexus/db-queries/src/db/datastore/external_ip.rs

Lines changed: 27 additions & 38 deletions

```diff
@@ -76,22 +76,18 @@ impl DataStore {
                     .fetch_for(authz::Action::CreateChild)
                     .await?;
 
-                // If the named pool conflicts with user's current scope, i.e.,
-                // if it has a silo and it's different from the current silo,
-                // then as far as IP allocation is concerned, that pool doesn't
-                // exist. If the pool has no silo, it's fleet-scoped and can
-                // always be used.
-                let authz_silo_id = opctx.authn.silo_required()?.id();
-                if let Some(pool_silo_id) = pool.silo_id {
-                    if pool_silo_id != authz_silo_id {
-                        return Err(authz_pool.not_found());
-                    }
+                // If this pool is not linked to the current silo, 404
+                if self.ip_pool_fetch_link(opctx, pool.id()).await.is_err() {
+                    return Err(authz_pool.not_found());
                 }
 
                 pool
             }
             // If no name given, use the default logic
-            None => self.ip_pools_fetch_default(&opctx).await?,
+            None => {
+                let (.., pool) = self.ip_pools_fetch_default(&opctx).await?;
+                pool
+            }
         };
 
         let pool_id = pool.identity.id;
@@ -147,36 +143,29 @@ impl DataStore {
     ) -> CreateResult<ExternalIp> {
         let ip_id = Uuid::new_v4();
 
-        // See `allocate_instance_ephemeral_ip`: we're replicating
-        // its strucutre to prevent cross-silo pool access.
-        let pool_id = if let Some(name_or_id) = params.pool {
-            let (.., authz_pool, pool) = match name_or_id {
-                NameOrId::Name(name) => {
-                    LookupPath::new(opctx, self)
-                        .ip_pool_name(&Name(name))
-                        .fetch_for(authz::Action::CreateChild)
-                        .await?
-                }
-                NameOrId::Id(id) => {
-                    LookupPath::new(opctx, self)
-                        .ip_pool_id(id)
-                        .fetch_for(authz::Action::CreateChild)
-                        .await?
-                }
-            };
-
-            let authz_silo_id = opctx.authn.silo_required()?.id();
-            if let Some(pool_silo_id) = pool.silo_id {
-                if pool_silo_id != authz_silo_id {
-                    return Err(authz_pool.not_found());
-                }
+        // TODO: NameOrId resolution should happen a level higher, in the nexus function
+        let (.., authz_pool, pool) = match params.pool {
+            Some(NameOrId::Name(name)) => {
+                LookupPath::new(opctx, self)
+                    .ip_pool_name(&Name(name))
+                    .fetch_for(authz::Action::Read)
+                    .await?
             }
+            Some(NameOrId::Id(id)) => {
+                LookupPath::new(opctx, self)
+                    .ip_pool_id(id)
+                    .fetch_for(authz::Action::Read)
+                    .await?
+            }
+            None => self.ip_pools_fetch_default(opctx).await?,
+        };
 
-            pool
-        } else {
-            self.ip_pools_fetch_default(opctx).await?
+        let pool_id = pool.id();
+
+        // If this pool is not linked to the current silo, 404
+        if self.ip_pool_fetch_link(opctx, pool_id).await.is_err() {
+            return Err(authz_pool.not_found());
         }
-        .id();
 
         let data = if let Some(ip) = params.address {
             IncompleteExternalIp::for_floating_explicit(
```
