Skip to content

Conversation

@stedolan
Copy link
Contributor

(This branch is for discussion, not immediate merging. It contains a breaking change)

If a signal arrives during a blocking section, the runtime will try to execute the signal handler (arbitrary OCaml code) directly inside the Unix signal handling context. This is unsafe, and I'd like to see it removed, but first I'd like to understand the uses of this feature.

The issue is that only a very small whitelisted set of functions can safely be called from a Unix signal handler. In particular, it is not safe to call malloc from a Unix signal handler (malloc can safely be called from multiple threads, but if a signal handler interrupts malloc and itself calls malloc, then per-thread datastructures will be corrupted).

It is essentially impossible to know that a piece of OCaml code does not call malloc. Obviously, code which allocates may call malloc. However, merely writing a reference may call malloc, if the ref_table overflows. Writing an immediate value like None into a reference may call malloc, if the value that was previously there needs to be darkened and the grey stack overflows. Even throwing an exception may call malloc to allocate space for the backtrace.

So, the current approach is fundamentally unsafe. Is it relied upon?

@mshinwell
Copy link
Contributor

mshinwell commented Mar 15, 2017

This behaviour does not happen when using the systhreads library, I think. See [caml_thread_try_leave_blocking_section].

I agree that we shouldn't be running OCaml code from signal handlers.

@xavierleroy
Copy link
Contributor

Before going on, please read some background:
https://www.gnu.org/software/libc/manual/html_node/Interrupted-Primitives.html

The original motivation for running nontrivial code from signal handlers (as opposed to just recording the pending signal and polling it later) was to implement Sys.catch_break on early BSD systems. On those systems, when e.g. blocked in read from the terminal, pressing ctrl-C would trigger a signal but not interrupt the read call. It's only when the user enters some data that read returns and the recorded signal gets polled and triggers the raising of the Break exception. That's not what users expect.

Hence the idea to run the Caml signal handler straight from the Unix signal handler if we know we are blocked in a system call. (That's the origin of the "blocking section" concept that you see in caml_enter_blocking_section and caml_leave_blocking_section.)

I knew about the reentrancy issues with signal handlers, but reasoned that if the program is really blocked inside a system call, it is not running C library code. This is fragile and less and less true, as we tend to run more and more code between caml_enter_blocking_section and caml_leave_blocking_section, not just plain system calls.

Nowadays, if we know for sure that the old BSD behavior is dead and all Unix variants implement the POSIX behavior, we could rely on signal handling to terminate blocking system calls with EINTR error. Then caml_leave_blocking_section polls for signals and runs the corresponding Caml code.

Note that in this approach we can't use the SA_RESTARTflag to automatically restart syscalls on handled signals, hiding the EINTR error from the user. Note also that we do not use SA_RESTART currently (except in the X11 graphics library, but the handler is not in Caml, and that's a different hack entirely).

@mshinwell
Copy link
Contributor

@xavierleroy Except there is always a race between any check for pending signals and the start of a system call such as read---when the signal arrives immediately before the call starts. The signal handler doesn't run and neither is EINTR returned...

One way of mitigating this is via the "pipe trick", which I think @stedolan is implementing at the moment.

@xavierleroy
Copy link
Contributor

xavierleroy commented Mar 24, 2017

There is a somewhat complementary suggestion to handle EINTR better: https://caml.inria.fr/mantis/view.php?id=7508, pointing to a Python proposal https://www.python.org/dev/peps/pep-0475/

@xavierleroy
Copy link
Contributor

@mshinwell: yes, the race condition exists, or maybe there are two races conditions. With proper atomic instructions to check and clear pending signals, the effect of the race is to delay the handling of the signal to after the syscall returned normally. Without atomic instructions there is also a possibility of losing the signal altogether.

I was thinking of special-casing signal handlers that just raise an exception, as in the Sys.catch_break case. I have a feeling that it is safe (or can be made safe) to raise a Caml exception from a C signal handler, so we could keep using the current implementation (with immediate processing of signals when we are in a blocking section). Signals that run arbitrary Caml code would always be delayed, risking the race condition you mention.

@xavierleroy
Copy link
Contributor

@mshinwell: any pointer to the "pipe trick" ?

@mrvn
Copy link
Contributor

mrvn commented Mar 25, 2017

@xavierleroy I think the pipe trick refers to creating a pipe and have the signal handler write the signal number to the pipe. Then adding it to any select/poll/epoll calls will abort the call early, the signal number can be read from the pipe and the ocaml signal handler can be called. Under linux ther is also signalfd() for this.

Not sure how that helps with read/write calls.

@mrvn
Copy link
Contributor

mrvn commented Mar 25, 2017

As a side note: I looked at the python source and it seems to simply ignore the race that a signal happens right before the syscall. The signal will be recorded and the signal handler gets called when the syscall returns normally.

Using a pipe or signalfd avoids this for select/poll/epoll but I have no idea how to avoid this race for e.g. open().

@DemiMarie
Copy link
Contributor

👍 for the pipe trick. That and Linux's signalfd seem to be two of the only reasonable ways to handle signals that could possibly interrupt code that you don't control. (The third is to write to a global of type volatile sig_atomic_t that is periodically read from).

@mrvn
Copy link
Contributor

mrvn commented Mar 25, 2017

You can't atomically read and compare the variable and then call a syscall like open() if unset. The signal could still arrive between the read and syscall. I think the only thing that can be used there is setjmp/longjump.

@xavierleroy
Copy link
Contributor

If the "pipe trick" means wrapping system calls with select, I think it's going to cause more problems than it solves, based on my experience with VM threads that does exactly this to implement user-level thread scheduling. For one thing, select doesn't help with a great many syscalls, e.g. wait, and library functions, e.g. gethostbyname. I'd rather live with the race condition!

@mrvn
Copy link
Contributor

mrvn commented Mar 27, 2017

I think you have to use setjmp/longjmp for the general case

@stedolan
Copy link
Contributor Author

Poking around Mantis was enlightening (particularly PR#3659). I think the simplest example worth discussing is raising an exception from a signal handler. I see three possible means of doing it:

  1. Using an exception / longjmp / setcontext directly from the signal handler.
  2. Causing the blocking section to exit quickly with EINTR, and checking for pending signals afterwards.
  3. The self-pipe trick

1 and 2 have essentially unfixable race conditions, although those of 2 are much less serious (delaying processing of a signal rather than ignoring it or corrupting data). 3 is the most robust, but requires more significant code changes and doesn't apply to arbitrary code.

The race in 2 is as @mshinwell pointed out, although as @xavierleroy noted it only causes a delay, not a loss of a signal. The race in 1 is worse: a signal may arrive just after read returns, which causes the return value of read to be lost, which means we don't know how much of the buffer has been filled. If we were reading from a pipe, there is no way to recover the lost data.

I think the best option is to go for option 2 by default, possibly in combination with 3 in the cases where it's easy. This means that blocking sections will stop being interruptible, and it will become the responsibility of C code to handle EINTR by leaving and re-entering the blocking section before retrying.

That means that blocking C code not written with care for signal handling will by default block pending signals until the next caml_leave_blocking_section. I think this is better than allowing signals to arbitrarily interrupt C code, as that can cause data corruption.

If that sounds like a sensible solution, I'll open a pull request to that effect.

I knew about the reentrancy issues with signal handlers, but reasoned that if the program is really blocked inside a system call, it is not running C library code. This is fragile and less and less true, as we tend to run more and more code between caml_enter_blocking_section and caml_leave_blocking_section, not just plain system calls.

Right, if only async signal-safe code runs inside blocking sections, then we're on firmer ground (in particular, the issue I report here about interrupting malloc doesn't arise). But as you say, that's not true these days (example), and besides, the race on return values of system calls still occurs.

If the "pipe trick" means wrapping system calls with select, I think it's going to cause more problems than it solves, based on my experience with VM threads that does exactly this to implement user-level thread scheduling. For one thing, select doesn't help with a great many syscalls, e.g. wait, and library functions, e.g. gethostbyname. I'd rather live with the race condition!

The self-pipe trick is indeed the trick of wrapping syscalls with select and installing a signal handler that writes a byte to a pipe. It can be made to work with most syscalls, with some effort. To handle wait, you use wait(WNOHANG) and install a signal handler writing to a pipe for SIGCHLD.

In principle, unlike read/write/wait (which block waiting for other processes to do something), the library call gethostbyname doesn't block for an unbounded amount of time, so delaying the syscall handling until after the call returns isn't so bad. In practice, misconfigured DNS causes gethostbyname to block for an uncomfortably long amount of time, certainly long enough for the user to start mashing ^C, so this trick is not satisfying in general.

@xavierleroy
Copy link
Contributor

@stedolan: in the specific case of a signal handler that just raises an exception, I don't see the race condition in your solution #1. I agree that it obviously doesn't work for signal handlers that return normally.

Concerning select, remember that it gives approximate results: select then read can block because no data is actually available. So you need to put file descriptors in nonblocking mode. Then a bunch of other code starts getting EAGAIN errors. I went through this madness for the OCaml VM threads and won't go through it again.

@mrvn
Copy link
Contributor

mrvn commented Mar 28, 2017

@xavierleroy Race in option #1: you setjmp() and check the result. If it's not the first time then there was a signal and you handle it. If it's the first time you run your normal code. The signal handler calls longjmp().

Now what happens if your code does "retval = read(fd, &buf, size);" and just when read returned a signal hits?

The result of read is still in the return register 0 and not yet saved in the retval variable. The signal handler gets called which then calls longjmp(), loosing the result from read in the process. How is the signal handler supposed to know it must not longjmp() right then?

@stedolan
Copy link
Contributor Author

@stedolan: in the specific case of a signal handler that just raises an exception, I don't see the race condition in your solution #1. I agree that it obviously doesn't work for signal handlers that return normally.

nread = read(fd, channel->buff, channel->end - channel->buff);
// signal handler runs here
channel->offset += nread;

The race is when read returns successfully, but the signal handler is processed before the return value can be used. The next time we try to use that channel, we have no idea how much data its buffer contains. If we're reading from a pipe, there's no way to recover.

Concerning select, remember that it gives approximate results: select then read can block because no data is actually available. So you need to put file descriptors in nonblocking mode. Then a bunch of other code starts getting EAGAIN errors. I went through this madness for the OCaml VM threads and won't go through it again.

You're right, this sounds unpleasant. I'd initially thought it would be easy enough to just change the channel code in io.c to use non-blocking mode, but I'd forgotten that Unix.{in,out}_channel_of_descr exposes that code to custom file descriptors. (It's very annoying that the Unix API specifies blocking vs. nonblocking as per-file-descriptor state rather than a flag on each possibly-blocking operation).

@xavierleroy
Copy link
Contributor

@mrvn : your scenario is not the one I asked about. (Raising a Caml exception from the signal handler.)

@stedolan : that's a good example indeed, but I suspect solution #2 is vulnerable as well if caml_leave_blocking_section() is called early, before all updates to the channel data structure are completed. This seems to be currently the case in byterun/io.c...

@stedolan
Copy link
Contributor Author

I suspect solution #2 is vulnerable as well if caml_leave_blocking_section() is called early, before all updates to the channel data structure are completed. This seems to be currently the case in byterun/io.c...

That is indeed the case, and it's the cause of MPR#7503. I'm working on a patch to change this, which should make option 2 viable.

@mrvn
Copy link
Contributor

mrvn commented Mar 28, 2017

In option #2 when you leave early you clean up to a consisten state first. So for the channel case you would never leave between the read call and adding the offset.

The problem with option #2 is that the signal can happen just before the syscall. In that case the signal is recorded but you still call the syscall and it does not return with EINTR right away. Most cases it will finish normaly and only then the ocaml signal handler can run. Worst case the syscall blocks forever.

I think for option #2 to be viable all syscalls must be changed so they use a variant that never blocks too long and then loop. E.g. for select that means always using a timeout even if -1 was specified. Unfortunatly I don't think all of them have one. For things like open() we should probably just ignore the verry small chance that it can get stuck with a pending signal recorded. If you do want open() to be aborted by signals then don't set the alarm so close to the call that it might be missed or use a repeating signal.

@stedolan stedolan mentioned this pull request Mar 29, 2017
4 tasks
@stedolan
Copy link
Contributor Author

Superseded by #1128

@stedolan stedolan closed this Mar 29, 2017
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Mar 21, 2023
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants