Skip to content

Add health checker #119

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions firecracker.go
Original file line number Diff line number Diff line change
Expand Up @@ -307,3 +307,12 @@ func (f *Client) PatchGuestDriveByID(ctx context.Context, driveID, pathOnHost st

return f.client.Operations.PatchGuestDriveByID(params)
}

// DescribeInstance is a wrapper for the swagger generated client to make
// calling of the API easier.
func (f *Client) DescribeInstance(ctx context.Context) (*ops.DescribeInstanceOK, error) {
params := ops.NewDescribeInstanceParams()
params.SetContext(ctx)

return f.client.Operations.DescribeInstance(params)
}
4 changes: 4 additions & 0 deletions firecracker_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -52,4 +52,8 @@ func TestClient(t *testing.T) {
if _, err := client.PutGuestDriveByID(ctx, "test", drive); err != nil {
t.Errorf("unexpected error on PutGuestDriveByID, %v", err)
}

if _, err := client.DescribeInstance(ctx); err != nil {
t.Errorf("unexpected error on DescribeInstance, %v", err)
}
}
7 changes: 7 additions & 0 deletions machine.go
Original file line number Diff line number Diff line change
Expand Up @@ -707,6 +707,13 @@ func (m *Machine) UpdateGuestDrive(ctx context.Context, driveID, pathOnHost stri
return nil
}

// IsAlive makes sure that a Firecracker instance is running and responds via API by
// calling `DescribeInstance` endpoint behind.
func (m *Machine) IsAlive(ctx context.Context) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like one of the valid states that can be returned by DescribeInstance is Halted: https://github.com/firecracker-microvm/firecracker/blob/35b73437641122865b1603bd74c144035e1790c9/api_server/swagger/firecracker.yaml#L453

There's also Uninitialized. That makes me feel like IsAlive is not really what one would expect. It seems more like it's checking whether the VM manager is alive rather than what we'd consider the Machine.

If we want to keep the behavior as is, I can't think of any great names, but maybe something like Exists would be better than IsAlive?

On the other hand, would it make sense to change the behavior and return an error here if the State is Halted or Uninitialized? What's the actual motivation behind this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I accidentally clicked the approve button, I think this should be addressed before merging)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(@sipsma - I've "dismissed" your review per your comment here.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions for names: IsVMMAlive, IsVMMResponsive, Reachable

_, err := m.client.DescribeInstance(ctx)
return err
}

// refreshMachineConfiguration synchronizes our cached representation of the machine configuration
// with that reported by the Firecracker API
func (m *Machine) refreshMachineConfiguration() error {
Expand Down
8 changes: 8 additions & 0 deletions machine_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -330,6 +330,7 @@ func TestMicroVMExecution(t *testing.T) {
t.Run("TestAttachSecondaryDrive", func(t *testing.T) { testAttachSecondaryDrive(ctx, t, m) })
t.Run("TestAttachVsock", func(t *testing.T) { testAttachVsock(ctx, t, m) })
t.Run("SetMetadata", func(t *testing.T) { testSetMetadata(ctx, t, m) })
t.Run("IsAlive", func(t *testing.T) { testIsAlive(ctx, t, m) })
t.Run("UpdateMetadata", func(t *testing.T) { testUpdateMetadata(ctx, t, m) })
t.Run("GetMetadata", func(t *testing.T) { testGetMetadata(ctx, t, m) }) // Should be after testSetMetadata and testUpdateMetadata
t.Run("TestUpdateGuestDrive", func(t *testing.T) { testUpdateGuestDrive(ctx, t, m) })
Expand Down Expand Up @@ -691,6 +692,13 @@ func testGetMetadata(ctx context.Context, t *testing.T, m *Machine) {
}
}

func testIsAlive(ctx context.Context, t *testing.T, m *Machine) {
err := m.IsAlive(ctx)
if err != nil {
t.Errorf("machine is expected to be alive: %v", err)
}
}

func TestLogFiles(t *testing.T) {
cfg := Config{
Debug: true,
Expand Down
1 change: 1 addition & 0 deletions machineiface.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,4 +30,5 @@ type MachineIface interface {
SetMetadata(context.Context, interface{}) error
UpdateGuestDrive(context.Context, string, string, ...PatchGuestDriveByIDOpt) error
UpdateGuestNetworkInterfaceRateLimit(context.Context, string, RateLimiterSet, ...PatchGuestNetworkInterfaceByIDOpt) error
IsAlive(ctx context.Context) error
}