Cannot longjmp from signal handler to recover from SIG{SEGV,ILL,...} on macOS #44501

@alexcrichton

What version of Go are you using (go version)?

$ go version
go version go1.16 darwin/amd64

Does this issue reproduce with the latest release?

Yes (downloaded 1.16 from the website)

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/acrichton/Library/Caches/go-build"
GOENV="/Users/acrichton/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/acrichton/code/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/acrichton/code/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.16"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/acrichton/code/gorepro/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/b0/wd3mrtcj36l61jkqrzpdjds00000gn/T/go-build2517283257=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Given these files:

`go.mod`
module repro

go 1.16
`main.go`
package main

// #include "lib.h"
import "C"
import "fmt"

func main() {
        C.setup_signal_handler()
        for i := 0; ; i++ {
                if i%1000 == 0 {
                        fmt.Printf("%d\n", i)
                }
                C.do_trap()
        }
}
`lib.h`
#ifndef __lib_h
#define __lib_h  // complete the include guard
void setup_signal_handler(void);
void do_trap(void);
#endif
`lib.c`
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#if 0
#define platform_jmp_buf sigjmp_buf
#define platform_setjmp(a) sigsetjmp(a, 0)
#define platform_longjmp(a, v) siglongjmp(a, v)
#else
#define platform_jmp_buf jmp_buf
#define platform_setjmp(a) setjmp(a)
#define platform_longjmp(a, v) longjmp(a, v)
#endif

struct sigaction PREV;
static int MY_TRAP = 0;
static platform_jmp_buf JMP;

static void signal_handler(int signal, siginfo_t *info, void *data) {
  if (MY_TRAP) {
    MY_TRAP = 0;
    platform_longjmp(JMP, 1);
    assert(0);
  }

  // If we don't handle this delegate to the previous handler.
  if (PREV.sa_flags & SA_SIGINFO) {
    PREV.sa_sigaction(signal, info, data);
  } else if (PREV.sa_handler == SIG_DFL || PREV.sa_handler == SIG_IGN) {
    sigaction(signal, &PREV, NULL);
  } else {
    PREV.sa_handler(signal);
  }
}

void setup_signal_handler(void) {
  struct sigaction handler;
  handler.sa_sigaction = signal_handler;
  handler.sa_flags = SA_SIGINFO | SA_NODEFER | SA_ONSTACK;
  sigemptyset(&handler.sa_mask);
  int rc = sigaction(SIGSEGV, &handler, &PREV);
  assert(rc == 0);
}

void do_trap(void) {
  assert(MY_TRAP == 0);
  if (platform_setjmp(JMP)) {
    assert(MY_TRAP == 0);
  } else {
    MY_TRAP = 1;
    *(int*) 3 = 5;
    assert(0);
  }
}

I compiled this via go build. I then ran the resulting binary with ./gorepro. I changed the #if 0 to #if 1 as well and looked at the results.

What did you expect to see?

I expected both binaries to run without crashing, regardless of #if 0 or #if 1.

What did you see instead?

The Go binary would eventually crash in both configurations (sometimes taking longer than others). The #if 1 case, which uses sigsetjmp to recover, would often crash much faster than the #if 0 case. The #if 0 case, however, would still crash eventually.

Here's an example of the crashes:

`#if 0` - using `setjmp`
fatal: morestack on g0
SIGTRAP: trace trap
PC=0x4063b22 m=0 sigcode=1

goroutine 0 [idle]:
runtime: unexpected return pc for runtime.abort called from 0xc000009ee0
stack: frame={sp:0xc000009a68, fp:0xc000009a70} stack=[0x7ffeefb80758,0x7ffeefbff7c0)

runtime.abort()
        /usr/local/Cellar/go/1.15.8/libexec/src/runtime/asm_amd64.s:860 +0x2

goroutine 1 [syscall]:
runtime.cgocall(0x40aa030, 0xc00005bf10, 0xc000000001)
        /usr/local/Cellar/go/1.15.8/libexec/src/runtime/cgocall.go:133 +0x5b fp=0xc00005bee0 sp=0xc00005bea8 pc=0x400465b
main._Cfunc_do_trap()
        _cgo_gotypes.go:41 +0x45 fp=0xc00005bf10 sp=0xc00005bee0 pc=0x40a9ec5
main.main()
        /Users/acrichton/code/gorepro/main.go:13 +0x2f fp=0xc00005bf88 sp=0xc00005bf10 pc=0x40a9f6f
runtime.main()
        /usr/local/Cellar/go/1.15.8/libexec/src/runtime/proc.go:204 +0x209 fp=0xc00005bfe0 sp=0xc00005bf88 pc=0x4035329
runtime.goexit()
        /usr/local/Cellar/go/1.15.8/libexec/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00005bfe8 sp=0xc00005bfe0 pc=0x4063d01

rax    0x17
rbx    0xc000009a40
rcx    0x41723a0
rdx    0x0
rdi    0x2
rsi    0xc0000099e0
rbp    0xc000009aa0
rsp    0xc000009a68
r8     0x41723a0
r9     0x834d92209e0071b5
r10    0xc000009a40
r11    0x206
r12    0x834d92209e0071b5
r13    0xa
r14    0x200
r15    0x38
rip    0x4063b22
rflags 0x202
cs     0x2b
fs     0x0
gs     0x0
`#if 1` - using `sigsetjmp`
signal 16 received but handler not on signal stack
fatal error: non-Go code set up signal handler without SA_ONSTACK flag

runtime stack:
runtime: unexpected return pc for runtime.sigtramp called from 0xc000058d88
stack: frame={sp:0xc000058910, fp:0xc000058920} stack=[0xc000050840,0xc000058c40)
000000c000058810:  000000c000058830  000000000404bfc5 <runtime.sigNotOnStack+133>
000000c000058820:  00000000040d5839  0000000000000039
000000c000058830:  000000c000058888  000000000404afa5 <runtime.adjustSignalStack+645>
000000c000058840:  0000000000000010  000000c000058870
000000c000058850:  000000c000058898  0000000004469000
000000c000058860:  000000000465ffff  000000c0000588b0
000000c000058870:  000000c000002000  0000000000008000
000000c000058880:  0000000000000001  000000c000058900
000000c000058890:  000000000404ac11 <runtime.sigtrampgo+369>  000000c000000010
000000c0000588a0:  0000000004166540  000000c0000588c0
000000c0000588b0:  000000c000058900  000000000401a513 <runtime.(*mcentral).cacheSpan+403>
000000c0000588c0:  0000000000000000  0000000000000000
000000c0000588d0:  0000000000000000  0000000000000000
000000c0000588e0:  0000000000000000  000000c000000180
000000c0000588f0:  000000c000058d88  000000c000058df0
000000c000058900:  000000c000058950  000000000406a393 <runtime.sigtramp+51>
000000c000058910: <000000c000000010 !000000c000058d88
000000c000058920: >000000c000058df0  000000c000058df0
000000c000058930:  c68bf6b46e1b0a17  000000000000000a
000000c000058940:  0000000000000200  0000000000000000
000000c000058950:  000000c000058960  00007fff20365d7d
000000c000058960:  000000c000058ed0  000000c000058a10
000000c000058970:  0000000000000000  0000000000000000
000000c000058980:  00000000000000de  0000000000000003
000000c000058990:  0000000004166540  0000000000000000
000000c0000589a0:  00000000040a92b0  004189374bc6a7ee
000000c0000589b0:  0000000000000001  0000000000000000
000000c0000589c0:  000000c000058ed0  000000c000058ea8
000000c0000589d0:  00000000ffffffff  0000000000000000
000000c0000589e0:  00000000fffffffb  0000000000000246
000000c0000589f0:  000000c000018090  000000000000000a
000000c000058a00:  0000000000000200  0000000000000000
000000c000058a10:  0000000004007db6 <runtime.cgocall+54>  0000000000000202
runtime.throw(0x40d5839, 0x39)
        /usr/local/go/src/runtime/panic.go:1117 +0x72
runtime.sigNotOnStack(0x10)
        /usr/local/go/src/runtime/signal_unix.go:918 +0x85
runtime.adjustSignalStack(0xc000000010, 0x4166540, 0xc0000588c0, 0xc000058900)
        /usr/local/go/src/runtime/signal_unix.go:509 +0x285
runtime.sigtrampgo(0xc000000010, 0xc000058d88, 0xc000058df0)
        /usr/local/go/src/runtime/signal_unix.go:449 +0x171

In Go 1.15 I saw different crashes with sigsetjmp than with setjmp, but in Go 1.16 it looks like both crash in the same manner.

edit: turns out I was testing the wrong binary, sigsetjmp and setjmp do indeed have separate crash signatures.


For some background, this was originally reported as bytecodealliance/wasmtime-go#60. Wasmtime is a WebAssembly runtime where WebAssembly traps translate to ud2 on x86_64 platforms, raising a SIGILL. We were seeing trouble after recently switching from setjmp to sigsetjmp, but I've seen crashes for quite some time even using setjmp (as seen here). I've tried to reduce this so that it doesn't need a whole WebAssembly runtime and instead uses just one C file to poke around with. The C roughly mirrors what the runtime currently does.
