Nodes should exit cleanly #188
Comments
As such, should I make it so that each context has its own signal handler? Or should I work on creating an
Adding a signal handler for every context seems tricky and a source of problems (what if we have more than one context, will all of them handle the exit signal?). Perhaps a
The https://github.com/Detegr/rust-ctrlc/blob/master/src/lib.rs#L60= And then https://github.com/Detegr/rust-ctrlc/blob/master/src/platform/unix/mod.rs#L76= https://github.com/Detegr/rust-ctrlc/blob/master/src/platform/unix/mod.rs#L14= In our case, it'd be akin to having a global default context where we'd store the signal handler
I had actually been looking at the It would allow us to handle more signals (if needed) than just SIGINT. I'm not sure if that's overkill or not, though. But if I go with that, that would trap us into needing a global default context, right?
Edit: Removed a part. There's no fundamental reason we couldn't install a signal handler the first time a context is initialized, by a
But then the question in my mind becomes: why didn't I'm concerned about the possibility of multiple signal handlers stepping on each others' toes. From everything I've read, these things are hard to implement correctly, and the risk of race conditions is high. Also, when are multiple contexts used? Since
I think I heard once that it was because of backwards compatibility with ROS 1, but not sure anymore. I very much agree it's important to understand why other packages do things a certain way.
I think we wouldn't want to have multiple signal handlers. Since the signal handling crates abstract away things for you, I think there is no more risk of race conditions and it's not hard to implement things correctly. E.g. from taking a quick glance at Good question about the use case of multiple contexts. But maybe this should be discussed in a new issue?
Perhaps, but it's tangentially related here - if we want to have a signal handler initialized for each context, wouldn't that make multiple signal handlers need to be created? The way the Or am I misunderstanding?
No, I'm suggesting to only have one global signal handler that gets installed when the first context gets created. We could also uninstall it when the last context gets destroyed. With that in mind, my previous comment hopefully makes more sense.
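To make the "install with the first context, uninstall with the last" idea concrete, here is a minimal sketch of the bookkeeping only. The `Context` type, the statics, and `handler_installed()` are hypothetical names made up for illustration, not the rclrs API, and the actual OS-level handler registration is replaced by a boolean flag.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

// Number of live contexts in this process (hypothetical bookkeeping).
// A mutex keeps "change the count" and "install/uninstall" as one step,
// so two threads creating and dropping contexts cannot interleave badly.
static LIVE_CONTEXTS: Mutex<usize> = Mutex::new(0);

// Stand-in for "the process-wide signal handler is currently installed".
static HANDLER_INSTALLED: AtomicBool = AtomicBool::new(false);

/// Hypothetical context type; not the real rclrs `Context`.
pub struct Context {
    _private: (),
}

impl Context {
    pub fn new() -> Self {
        let mut live = LIVE_CONTEXTS.lock().unwrap();
        if *live == 0 {
            // First context: a real implementation would register the
            // OS signal handler here.
            HANDLER_INSTALLED.store(true, Ordering::SeqCst);
        }
        *live += 1;
        Context { _private: () }
    }
}

impl Drop for Context {
    fn drop(&mut self) {
        let mut live = LIVE_CONTEXTS.lock().unwrap();
        *live -= 1;
        if *live == 0 {
            // Last context gone: a real implementation would restore the
            // previous signal disposition here.
            HANDLER_INSTALLED.store(false, Ordering::SeqCst);
        }
    }
}

/// Reports whether the (simulated) handler is installed.
pub fn handler_installed() -> bool {
    HANDLER_INSTALLED.load(Ordering::SeqCst)
}
```

With this arrangement, creating two contexts and dropping one leaves the handler installed; it is only removed once the second one is dropped. The statics are private to the module, so no global context object is exposed to users.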
And where would that live? In the other client libraries, they created a global context to store it.
rclcpp allows developers to create multiple ROS contexts, this is important when you partition the ROS network with multiple domain IDs, or run loggers from multiple nodes in the same process. The rclcpp global context is still a thing but you can use it indirectly through a local context.
The rclcpp Node / Executor / CallbackGroup model had a substantial refactor in Foxy, most of that was shaking off ROS1 baggage of one-process-one-node. I think if you can ditch globals in I'll be following this thread with interest because I have a race condition on sigint between rclcpp and GLib
All right. With that information, I feel more comfortable about us splitting off from how @nnmm, I will try to see what I can do to get your suggestion working. Let's see if we can get this done without globals!
@BrettRD Thanks for chiming in. Can you detail what you mean by "you can use it indirectly through a local context" (I thought the global context is an alternative to a local context, where one doesn't use the other) and "ditch globals" (do you mean the global default context?)? Also, of what nature is the signal handling race condition? Which signal handler is installed first? @jhdcs My guess is that you'll need at least one static variable. But that's okay imo.
I was thinking that, but the static variable would probably be the flag... Which would mean it'd have to be mutable... And accessing While the scope of the problem would be limited to when the program is shutting down, I'd rather the problem not exist at all... Trying to think of a way around it... Perhaps use
@jhdcs Atomics do not need to be mutable, they can be modified through a shared reference.
Oh. Right. Silly me!
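To illustrate the point about atomics: a shutdown flag can be an ordinary `static` rather than a `static mut`, because `AtomicBool::store` takes `&self`. The names `SHUTDOWN_REQUESTED`, `request_shutdown` and `shutdown_requested` below are made up for this sketch.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// A plain `static`, not `static mut`: no `unsafe` is needed to change it.
static SHUTDOWN_REQUESTED: AtomicBool = AtomicBool::new(false);

/// What a signal handler would do: set the flag through a shared reference.
pub fn request_shutdown() {
    SHUTDOWN_REQUESTED.store(true, Ordering::SeqCst);
}

/// What a spin loop would poll between iterations.
pub fn shutdown_requested() -> bool {
    SHUTDOWN_REQUESTED.load(Ordering::SeqCst)
}
```

A spin function would call `shutdown_requested()` on each iteration and return once it yields `true`, without needing mutable or owned access to the flag.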
The RCLCPP API has a couple of globals lurking around, and they make it unclear how much your node is exposed to other nodes in the same process. I would like to see rcl* libraries eliminate globals because I run into a lot of system integration decisions that need to break assumptions. In RCLCPP, there are a bunch of ways of creating a node with options and context. The old (default?) way is to rclcpp::init() the default context which installs the signal handler, default construct a node, and it automatically picks up the default context. This allows all nodes in the process to use a common signal handler which is held by the default context, but there's no longer much of a reason to have a fully-fledged rclcpp::context sitting around at global scope, except to hold the signal handler.
I have a hastily thrown-together threading arrangement with blocking waits etc, I should be doing things differently / safer.
So perhaps what we do is have an Though I'm still trying to figure out the proper way to have things like nodes self-terminate on a signal. Ownership is making it a bit tricky. I'm currently checking the signal each time a node spins, but that means that we'd have to pass ownership of both the context and the node being spun to the spin functions in order for it to kill them...
rclcpp also has bool install_signal_handlers() (returns false if the signal handlers have already been installed) and bool rclcpp::signal_handlers_installed() (checks if the signal handlers are installed). You can be a bit flexible in which structures do which task; in rclcpp, it got really modular very quickly:
rclcpp::spin(Node) and node.spin() are helper functions that create a default executor, load it with the default CallbackGroup, and spin the executor. I'm not familiar enough with the Rust ownership model, can the signal handler be a Static member of the context class?
@jhdcs I think such a function would be a good approach, though it should be named differently imo since it's not an analogue of Another way, like I described above, is to make contexts "share" the signal handler by enabling it when at least one context exists. That can be done with static variables. The drawback is that there's not a good way to make the signal handling optional in that case.
Have you checked out what rclcpp does? I think they have a guard condition that is added to the wait set which is triggered when the signal is received. Plus
@BrettRD Basically, yes. Rust is a little different, structs can't have static fields, but we'd have one or more static variables that are private to the
I'm triple-checking this, but if a guard condition is how
I'm not sure if the guard conditions are solely an optimization that reduce latency by not waiting for the current wait set to time out. The part that checks the atomic boolean in
I thought the guard conditions were making it so that the waitset could be stopped early, which is admittedly similar to what you said, but it makes it so that you're less likely to get the user spamming Maybe it's just a case of me over-thinking things, but I figured that this should be able to gracefully shut down even frozen nodes...
What do you mean by "frozen"?
A node not responding to commands or inputs. Though that might be a case where a graceful shutdown isn't warranted.
But why – is a subscription callback not returning? If so, adding a guard condition won't help, because the node is not currently waiting in a wait set, right?
Not sure. I think I need to do some more reading on wait sets, guard conditions, and signal handling...
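As a rough illustration of the "wake the wait set early" idea discussed above, the sketch below uses a `Mutex<bool>` plus a `Condvar` as a stand-in for a guard condition. This is only an analogy using the standard library: `ShutdownSignal` is a hypothetical type, it is not how rcl guard conditions are implemented, and `trigger()` must not be called from inside an actual signal handler (locking a mutex there is not async-signal-safe), so in practice a helper thread would call it.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical stand-in for a guard condition attached to a wait set.
#[derive(Default)]
pub struct ShutdownSignal {
    requested: Mutex<bool>,
    wake: Condvar,
}

impl ShutdownSignal {
    /// Marks shutdown as requested and wakes any thread blocked in `wait`.
    pub fn trigger(&self) {
        *self.requested.lock().unwrap() = true;
        self.wake.notify_all();
    }

    /// Blocks until shutdown is requested or `timeout` elapses.
    /// Returns `true` if shutdown was requested.
    pub fn wait(&self, timeout: Duration) -> bool {
        let guard = self.requested.lock().unwrap();
        // Keep waiting while shutdown has NOT been requested; this also
        // handles spurious wake-ups.
        let (guard, _timed_out) = self
            .wake
            .wait_timeout_while(guard, timeout, |requested| !*requested)
            .unwrap();
        *guard
    }
}
```

A thread blocked in `wait` with a long timeout returns as soon as another thread calls `trigger()`, which is the latency benefit mentioned above. As noted in the thread, this does not help if the thread is stuck inside a callback and never reaches the wait.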
Currently, sending Ctrl-C to a node makes it exit with code 130 (on a Linux machine). It should instead exit cleanly.
If we follow the rclcpp example, this would be achieved by a signal handler which calls rcl_shutdown().