Skip to content

Stacktrace frame lookup accuracy #100

@lggomezml

Description

@lggomezml

Hi

I'm working on a custom gin recover middleware that prints the offending stacktrace whenever a panic goes uncatched. In this scenario, we noticed an application's code path in which the line number on the printed stacktrace is wrong, returning line 59 instead of line 61 (it took me a long time to realize that something was wrong with the stacktrace itself and not the reported line):

func (ls LocationsMutableService) getChildren(ctx context.Context, tree string, loc string) (LocationChildren, error) {
	url := ""
	var children LocationChildren

//line 59 - panic not possible here
	response, err := ls.Client.GET(url, &children)

//line 61 - actual place of panic (response == nil)
	if response.StatusCode == http.StatusNotFound { 
		return nil, errors.New("")
	}

	if err != nil {
		return nil, err
	}

	return children, err
}

We fixed this by writing a implementation based on debug.Stack() but, surprisingly, after deploying the fix to verify it I'm observing that the internal stacktrace used by txn.NoticeError has the same inaccuracy that our original middleware implementation had:

2019/08/30 14:33:09 /go/src/github.com/.../src/api/vendor/github.com/.../gingonic/mlhandlers/nr_stack.go 19 mlhandlers.GetStackTrace
/go/src/github.com/.../src/api/vendor/github.com/.../gingonic/mlhandlers/recovery.go 58 mlhandlers.RecoveryWithWriter.func1.1
/usr/local/go/src/runtime/panic.go 522 runtime.gopanic
/usr/local/go/src/runtime/panic.go 82 runtime.panicmem
/usr/local/go/src/runtime/signal_unix.go 390 runtime.sigpanic
/go/src/github.com/.../src/api/gateways/locations_mutable_service.go 59 gateways.LocationsMutableService.getChildren
/go/src/github.com/.../src/api/gateways/locations_mutable_service.go 51 gateways.LocationsMutableService.GetChildren.func1

After some research the main culprit seems to be a wrong PC being used when retrieving the stack frames, but I'm still not certain about the concrete cause. After checking the agent code I come across the frame lookup method which has the following comment about PC manipulation:

func lookupFrame(pc uintptr) stacktraceFrame {
// The Golang runtime package documentation says "To look up the file
// and line number of the call itself, use pc[i]-1. As an exception to
// this rule, if pc[i-1] corresponds to the function runtime.sigpanic,
// then pc[i] is the program counter of a faulting instruction and
// should be used without any subtraction."
//
// TODO: Fully understand when this subtraction is necessary.
place := pc - 1
f := runtime.FuncForPC(place)

As an experiment, I removed the substraction and it fixed the wrong line, although it messes the rest of the stack frames:

2019/09/02 14:15:19 /go/src/github.com/.../src/api/vendor/github.com/.../gingonic/mlhandlers/nr_stack.go 19 mlhandlers.GetStackTrace
/go/src/github.com/.../src/api/vendor/github.com/.../gingonic/mlhandlers/recovery.go 58 mlhandlers.RecoveryWithWriter.func1.1
/usr/local/go/src/runtime/panic.go 523 runtime.gopanic
/usr/local/go/src/runtime/panic.go 82 runtime.panicmem
/usr/local/go/src/runtime/panic.go 81 runtime.panicmem
/go/src/github.com/.../src/api/gateways/locations_mutable_service.go 61 gateways.LocationsMutableService.getChildren
/go/src/github.com/.../src/api/gateways/locations_mutable_service.go 51 gateways.LocationsMutableService.GetChildren.func1

As I see it, this issue is two-fold:

  • We don't know for certain whether this is a bug or unexpected behavior on runtime.Callers
  • We also don't know how to properly identify PCs that require substraction

The scenario I've been showing is on a specific branch of a web app we use at work (thus being unable to share it here), and unfortunately I've been trying to create a repro case on the stacktracetest package to no avail

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions