Skip to content

Allow snapshot tap changes #4731

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

andrewla
Copy link
Contributor

@andrewla andrewla commented Aug 15, 2024

Changes

Allow renaming of tap devices on snapshot restore

Reason

In some scenarios it is not possible to use the jailer, especially in limited privilege environments where the security is external to firecracker itself. But in these cases a snapshot may have to use a different tap device than the one that it was using when it was snapshotted.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • If a specific issue led to this PR, this PR closes the issue.
  • The description of changes is clear and encompassing.
  • Any required documentation changes (code and docs) are included in this
    PR.
  • API changes follow the Runbook for Firecracker API changes.
  • User-facing changes are mentioned in CHANGELOG.md.
  • All added/changed functionality is tested.
  • New TODOs link to an issue.
  • Commits meet
    contribution quality standards.

  • This functionality cannot be added in rust-vmm.

@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch from 7991d9f to 8d1a0a9 Compare August 15, 2024 21:24
@pb8o
Copy link
Contributor

pb8o commented Aug 19, 2024

Hi @andrewla thank you for your contribution! We would like to understand the use case better in case it can be resolved through other means first. We recommend using a network namespace where you can create TAP devices with the same name, but that probably requires CAP_SYS_ADMIN, which I understand is what you mean with "limited privilege environments".

Could you elaborate on your use case? Is there a way you could create the namespace in a privileged setting and then use something like nsenter firecracker ...?

@andrewla
Copy link
Contributor Author

That assessment is correct -- basically to run the jailer in a network namespace you need the setns syscall which requires CAP_SYS_ADMIN. So nsenter is not an option.

Our particular case is running in a containerized environment where our privileges are limited by the nature of the general environment. Once we're in our particular container we have lost all relevant privileges.

@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch 5 times, most recently from 3415816 to 03c3be9 Compare August 26, 2024 20:43
@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch from 03c3be9 to 265ea94 Compare August 27, 2024 13:24
@andrewla andrewla marked this pull request as ready for review August 27, 2024 13:25
@pb8o
Copy link
Contributor

pb8o commented Aug 29, 2024

Hi again @andrewla, we have been talking internally about this PR and we may need to spend some time to decide on the API aspects of it to make sure it doesn't conflict with other efforts.

In the meantime, we thought of another workaround. The snapshot-editor could be enhanced to rename the tap devices in an snapshot file. That would be an easier decision for us, but we want to make sure it would handle your use case.

For example we imagine the tool would work like this:

snapshot-editor edit-vmstate rename-network eth0 tap1

Would this work within your environment?

@andrewla
Copy link
Contributor Author

andrewla commented Sep 3, 2024

This was our initial approach as it required minimal changes. But we found that the performance cost of making the copy (as opposed to hardlinking) during the operation (plus serde costs) were more expensive than we were willing to tolerate in our environment.

@andrewla
Copy link
Contributor Author

Hi @pb8o -- is there anything we can do to help move this forward?

@pb8o
Copy link
Contributor

pb8o commented Oct 16, 2024

Hi @andrewla I haven't had time to look at this, but this is next on my list now. Thanks for your patience!

@kanpov
Copy link
Contributor

kanpov commented Nov 1, 2024

On a related note, another reason why renaming the tap device is a better approach than namespaced NAT from the "Network for Clones" guide is that the namespaced NAT imposes measurable overhead onto the host kernel due to the addition of about 5 more iptables/nft rules, plus an RTNETLINK route for forwarding the guest IP out of the netns.

Even though I made an effort to support namespaced NAT in fcnet, it increased complexity by a factor of 4-5x in comparison to regular NAT only to support one usecase: two simultaneous microVM clones. So I'd be in favor of this change, or a snapshot-editor equivalent.

@pb8o
Copy link
Contributor

pb8o commented Jan 8, 2025

Hello @andrewla ! I apologize for the long time between updates, but some other stuff came up. So we have decided to go ahead with this. I gave a first initial review and I only have some minor comments, but mostly looks good to me. I just have a question if the network_overrides field also works when starting from a JSON config file.

@andrewla andrewla requested a review from Manciukic as a code owner January 8, 2025 19:28
@andrewla
Copy link
Contributor Author

andrewla commented Jan 9, 2025

Re: config -- currently there is no config support for snapshots (https://github.com/firecracker-microvm/firecracker/blob/main/src/vmm/src/resources.rs) -- the snapshot configuration and restore has to be done with a running firecracker instance

@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch 2 times, most recently from d8c5a44 to ea62e9a Compare January 9, 2025 19:00
Copy link
Contributor

@bchalios bchalios left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It generally looks good and thanks for the contribution Andrew.

A few comments/questions from me.

Also, I've commented this for the documentation changes, but could you please squash as well the commits for the test changes into a single commit?

@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch 4 times, most recently from 0be3ff7 to bf47436 Compare January 10, 2025 15:02
@bchalios bchalios force-pushed the allow-snapshot-tap-changes branch 2 times, most recently from e32f8b8 to 0818b99 Compare March 13, 2025 11:23
@andrewla andrewla force-pushed the allow-snapshot-tap-changes branch 2 times, most recently from 2f1902e to 893686d Compare March 14, 2025 13:55
@andrewla andrewla requested a review from bchalios March 14, 2025 13:56
bchalios
bchalios previously approved these changes Mar 14, 2025
@bchalios bchalios added Status: Awaiting review Indicates that a pull request is ready to be reviewed and removed Status: Awaiting author Indicates that an issue or pull request requires author action labels Mar 14, 2025
@bchalios bchalios force-pushed the allow-snapshot-tap-changes branch 3 times, most recently from a0e3945 to 289abc3 Compare March 14, 2025 18:34
@bchalios bchalios enabled auto-merge (rebase) March 17, 2025 10:12
@bchalios bchalios added Status: Awaiting review Indicates that a pull request is ready to be reviewed and removed Status: Awaiting review Indicates that a pull request is ready to be reviewed labels Mar 17, 2025
In some scenarios it is not possible to use the jailer, especially in
limited privilege environments where the security is external to
firecracker itself. But in these cases a snapshot may have to use a
different tap device than the one that it was using when it was
snapshotted.

Signed-off-by: Andrew Laucius <[email protected]>
Signed-off-by: Babis Chalios <[email protected]>
Test that we can correctly parse configuration and API calls in a
backwards compatible way.

Signed-off-by: Andrew Laucius <[email protected]>
Signed-off-by: Babis Chalios <[email protected]>
@bchalios bchalios force-pushed the allow-snapshot-tap-changes branch from 289abc3 to 550b886 Compare March 17, 2025 10:54
Documenting the ability to rename network interfaces on snapshot
restore.

Signed-off-by: Andrew Laucius <[email protected]>
Signed-off-by: Babis Chalios <[email protected]>
@bchalios bchalios force-pushed the allow-snapshot-tap-changes branch from 550b886 to adf9d4a Compare March 17, 2025 11:06
@bchalios bchalios merged commit 7c7884c into firecracker-microvm:main Mar 17, 2025
5 of 7 checks passed
@roypat roypat moved this to Coming Soon in Firecracker Roadmap Apr 16, 2025
@bchalios bchalios moved this from Coming Soon to Shipped in Firecracker Roadmap Apr 24, 2025
@bchalios bchalios moved this from Shipped to Coming Soon in Firecracker Roadmap Apr 24, 2025
@zulinx86 zulinx86 moved this from Coming Soon to Shipped in Firecracker Roadmap May 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Awaiting review Indicates that a pull request is ready to be reviewed
Projects
Status: Shipped
Development

Successfully merging this pull request may close these issues.

6 participants