Skip to content

Conversation

@hawkw
Copy link
Member

@hawkw hawkw commented Nov 4, 2025

When a Nexus attempts to commit a new fault management situation report to the sitrep history but fails to do so because another sitrep with the same parent has already been inserted, that sitrep is said to be orphaned. Records pertaining to it are left behind in the database, but it will not be accessed by the rest of the system. Thus, we must occasionally garbage-collect such sitreps. This branch adds a background task for doing so.

Depends on #9320

@hawkw hawkw requested a review from smklein November 4, 2025 00:55
@hawkw hawkw self-assigned this Nov 4, 2025
@hawkw hawkw force-pushed the eliza/sitrep-gc branch 2 times, most recently from 80d8d00 to e186ab4 Compare November 4, 2025 18:57
Comment on lines 179 to 167
"failed to get fetch metadata for sitrep {} (v{}): {e}",
v.id, v.version
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be the expected case if we did a partial insert of sitrep data, but the "final sitrep" was not inserted?

Or are we saying: "if any records exist correlated with a sitrep, it is guaranteed to have an entry in the fm_sitrep table"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: Seems like you're relying on that ordering, as I'm reading below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bf33ebf added a quick comment describing this expectation.


paginator = p.found_batch(&versions, &|v| SqlU32::from(v.version));

for v in versions {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Related to my earlier comment) I think there's some complexity in identifying "orphans by version", since it makes it unsafe to delete any part of the version history while orphans might remain.

It might be possible to do orphan deletion in a more version-independent way:

  • Lookup all sitreps that are not referenced by fm_sitrep_history
  • Delete all these sitreps, unless parent_sitrep_id is the current active sitrep

Basically:

WITH current_sitrep_id AS (
    SELECT sitrep_id
    FROM omicron.public.fm_sitrep_history
    ORDER BY version DESC
    LIMIT 1
),
-- Scan through `fm_sitreps` to avoid full table scans
batch AS (
    SELECT s.id, s.parent_sitrep_id
    FROM omicron.public.fm_sitrep s
    WHERE s.id > $1
    ORDER BY s.id
    LIMIT $2
)
SELECT FROM omicron.public.fm_sitrep
WHERE id IN (
    SELECT b.id
    FROM batch b
    -- Lookup the sitrep in the history
    LEFT JOIN omicron.public.fm_sitrep_history h ON h.sitrep_id = b.id
    -- Find sitreps missing from history
    WHERE h.sitrep_id IS NULL
      -- ... where the sitrep cannot be made active
      AND (b.parent_sitrep_id IS NULL 
           OR b.parent_sitrep_id != (SELECT sitrep_id FROM current_sitrep_id))
)
RETURNING id;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah... I had done it based on entries in the version table as a way to avoid doing a full table scan, but I think this approach would also work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so, I was originally thinking I would do this using the current sitrep ID from the sitrep_loader watch channel, rather than as a CTE, but I realize that actually doesn't work. The current sitrep in the database may have moved, and if a "stale" current sitrep ID was used for this query, it would return children of the current sitrep which are in progress. I'm not sure if the CTE is safe either: would this ensure that the current sitrep is locked while we are executing the SELECT, or could it have moved forwards in the midst of this query?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I wrote this proposed CTE incorrectly, my intent was that:

  • Every sitrep in history
  • Every sitrep which could be added to history

Are never going to be returned from the SELECT. In fact, we'd only return the opposite: sitreps which are not in history, and which cannot be in history.

As a result: this should be safe from concurrent insertions - current / possible future sitreps shouldn't be returnable from it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I wasn't sure about was whether the fm_sitrep_history table would be locked as long as the query is executing. If it is, then this should be fine. But if a new current sitrep could be added after we observe the current ID, then I think we would incorrectly observe children of the new current sitrep as eligible for GC.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, cool, thank you. I wasn't totally sure what the locking semantics were here, so thanks for explaining. In that case, this seems like the right thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ta-dah: 642854f (man, writing CTEs in Diesel really sucks)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I still need to re-review the rest of this, but:)

I recommend looking at nexus/db-queries/src/db/raw_query_builder.rs -- there's a struct called QueryBuilder which I added, and which we're using in a few spots in Omicron.

Basically: I agree, the Diesel mechanism to write CTEs is incredibly painful. It sorta integrates with the existing Diesel type system, and allows passing named columns/tables, nested ASTs via walk_ast, but IMO the syntax is complex enough and already involves pushing raw SQL strings, so the extra complexity becomes a bit of a burden. I have the opinion that: "if you need to have a comment clarifying what your SQL query should be, right next to where you actually write your SQL query, something has gone wrong".

QueryBuilder tries to limit your input to:

  • Raw SQL as a string (with some validations to prevent SQL injection)
  • Bind parameters (which still use Diesel sql_types, but just that -- no "diesel query" involvement)

We also have expectorate_query_contents to make it easy to validate "the query is valid SQL" before any other testing.

If I'm writing a simple, non-CTE diesel query: I usually reach for Diesel's query system first. The types are validated, it's almost always valid SQL, no raw strings, etc.

However, if I'm writing CTEs: The diesel support is so bad (mostly: using diesel's types for columns, table names, subqueries, etc, ends up getting in the way more than it helps) - so I just go for the QueryBuilder approach.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's nice, I think I overlooked the existence of the more unchecked QueryBuilder API. I think that could definitely have helped me avoid a lot of pain from accidental SQL syntax errors from splitting the query up into a bunch of tiny push_sql calls. I'll see about rewriting this to use that which should hopefully make future modifications less painful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is much nicer (see 08ce99b). I'll probably give the other CTE in this module the same treatment in a follow-up PR.

@hawkw hawkw marked this pull request as ready for review November 8, 2025 00:14
@hawkw hawkw requested a review from smklein November 8, 2025 00:15
@hawkw hawkw changed the title [2/n][nexus] garbage collect orphaned FM sitreps [nexus] garbage collect orphaned FM sitreps Nov 8, 2025
Base automatically changed from eliza/fm-sitrep to main November 8, 2025 00:52
hawkw added a commit that referenced this pull request Nov 10, 2025
this was recommended by the `EXPLAIN` output for the list orphans query,
and does seem like a good idea.
Comment on lines 290 to 292
WHERE
h.sitrep_id IS NULL \
AND (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick; idk if you care about the linebreak after "WHERE" / "AND (".

The "expectorate" tests use crdb's sqlfmt, so it shouldn't make a difference in the output

WHERE
h.sitrep_id IS NULL
AND (
b.parent_sitrep_id IS NULL
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I'm looking at this query more closely, I think there miiiiight be one edge case here: the case where we have no history.

In this scenario:

  • h.sitrep_id IS NULL because a new sitrep won't appear in history
  • b.parent_sitrep IS NULL because the batch of sitrep(s) won't have a parent

So it's technically possible that:

  • If we create a new sitrep, but haven't yet inserted it into history...
  • ... and this background task runs concurrently...
  • ... we'll delete the first sitrep we're trying to insert.

This shouldn't happen afterwards. I'm not sure it even matters for this first sitrep. but figured I'd call it out?

We could probably avoid this by changing this check to:

(b.parent_sitrep_id IS NULL AND current_sitrep_id.sitrep_id IS NOT NULL)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch, I think you're right, we should handle that too. I'll make that change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this actually has to be:

b.parent_sitrep_id IS NULL AND (SELECT sitrep_id FROM current_sitrep_id) IS NOT NULL

but otherwise, yeah.

@hawkw hawkw enabled auto-merge (squash) November 10, 2025 20:38
Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! looks good to me.

@hawkw hawkw merged commit 017597c into main Nov 10, 2025
16 checks passed
@hawkw hawkw deleted the eliza/sitrep-gc branch November 10, 2025 21:51
hawkw added a commit that referenced this pull request Nov 11, 2025
Depends on #9335.

In [this comment][1] on PR #9335, @smklein suggested using the
`raw_query_builder` API for writing CTEs, rather than using Diesel's
`QueryFragment` API. This turns out to be a lot less unpleasant and,
despite performing less validation of the generated SQL, actually is a
lot *less* likely to introduce hard-to-debug SQL syntax errors. Since
most of the query can be written in a single string literal, you
actually see more or less what you're gonna get, which is much less
obvious with Diesel's API. The `InsertSitrepVersionQuery` CTE in the
same module still uses the `QueryFragment` API, rather than the
`raw_query_builder` module, since I didn't want to change the existing
code in that branch.

This PR refactors the CTE for inserting a new sitrep version to also use
`raw_query_builder`, which is much nicer and should be easier to modify
in the future. There should be no functional changes here, just a change
to use a different API to generate the query.

Incidentally, this was one of the first times I had real success getting
Claude to do a fairly trivial "transform this code from using one API to
using another" task --- it wrote most of the first commit, and I tweaked
some of its style decisions I didn't love.

[1]:
#9335 (comment)
@hawkw hawkw added the fault-management Everything related to the fault-management initiative (RFD480 and others) label Nov 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

fault-management Everything related to the fault-management initiative (RFD480 and others)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants