-
Notifications
You must be signed in to change notification settings - Fork 18.1k
cmd/link: SIGSEGV running 'openshift-install version' for release-4.8 using external linking on PPC64LE #45850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Does it work with a different version of Go, 1.16 or 1.16.1, or tip of the master branch? Issue #45564 is recently fixed (and not triggered in some earlier versions of Go). It is possible that you don't need to force external linking. |
Or different versions of openshift? |
FYI, I've tried 1.16 and 1.16.1 and they both fail. Hopefully David Benoit can comment. He also reproduced the problem and investigated a little more thoroughly. And the previous version of openshift installer had the trampoline problem. |
It should not have trampoline problem if you use Go 1.16, 1.16.1, or tip, per #45564 . Could you try that? Thanks. |
@cherrymui I've been able to reproduce this issue, and can also confirm that combining |
[root@mihawklp111 installer]# /bin/rm -rf /usr/local/go/; tar -C /usr/local -xzf /root/go1.16.1.linux-ppc64le.tar.gz [root@mihawklp111 installer]# go version go version go1.16.1 linux/ppc64le [root@mihawklp111 installer]# git checkout 911eb7f69a1316c48e883d08644cda0e7304a373 [root@mihawklp111 installer]# hack/build.sh ... + go build -mod=vendor -ldflags ' -X github.com/openshift/installer/pkg/version.Raw=unreleased-master-4518-g911eb7f69a1316c48e883d08644cda0e7304a373 -X github.com/openshift/installer/pkg/version.Commit=911eb7f69a1316c48e883d08644cda0e7304a373 -s -w' -tags ' release' -o bin/openshift-install ./cmd/openshift-install # github.com/openshift/installer/cmd/openshift-install github.com/ovirt/go-ovirt.(*IconServiceGetRequest).Send: unexpected trampoline for shared or dynamic linking ... /usr/local/go/pkg/tool/linux_ppc64le/link: too many errors [root@mihawklp111 installer]# git checkout origin/release-4.8 Previous HEAD position was 911eb7f69 Merge pull request #4868 from wking/stable-4.8 HEAD is now at 52fef9e1d Merge pull request #4875 from multi-arch/ppc64le-link [root@mihawklp111 installer]# hack/build.sh ... + go build -mod=vendor -ldflags ' -X github.com/openshift/installer/pkg/version.Raw=unreleased-master-4520-g52fef9e1d7b765470e493670177eefd73bc27c6f -X github.com/openshift/installer/pkg/version.Commit=52fef9e1d7b765470e493670177eefd73bc27c6f -s -w -linkmode external' -tags ' release' -o bin/openshift-install ./cmd/openshift-install [root@mihawklp111 installer]# bin/openshift-install version unexpected fault address 0x2fffffff4170 fatal error: fault [signal SIGSEGV: segmentation violation code=0x1 addr=0x2fffffff4170 pc=0x1a69e95c] goroutine 1 [running, locked to thread]: runtime.throw(0x217086d3, 0x5) /usr/local/go/src/runtime/panic.go:1117 +0x5c fp=0xc00149f4f8 sp=0xc00149f4b8 pc=0x152d1d6c runtime.sigpanic() /usr/local/go/src/runtime/signal_unix.go:741 +0x274 fp=0xc00149f540 sp=0xc00149f4f8 pc=0x152e9984 github.com/aws/aws-sdk-go/service/efs.init() /root/go/src/github.com/openshift/installer/vendor/github.com/aws/aws-sdk-go/service/efs/errors.go:194 +0x24 fp=0xc00149f5b0 sp=0xc00149f560 pc=0x18aa1764 runtime.doInit(0x29778280) /usr/local/go/src/runtime/proc.go:6265 +0xf0 fp=0xc00149f710 sp=0xc00149f5b0 pc=0x152e29a0 runtime.doInit(0x297c9440) /usr/local/go/src/runtime/proc.go:6242 +0x64 fp=0xc00149f870 sp=0xc00149f710 pc=0x152e2914 runtime.doInit(0x29796cc0) /usr/local/go/src/runtime/proc.go:6242 +0x64 fp=0xc00149f9d0 sp=0xc00149f870 pc=0x152e2914 runtime.doInit(0x2978a3a0) /usr/local/go/src/runtime/proc.go:6242 +0x64 fp=0xc00149fb30 sp=0xc00149f9d0 pc=0x152e2914 runtime.doInit(0x297a3ae0) /usr/local/go/src/runtime/proc.go:6242 +0x64 fp=0xc00149fc90 sp=0xc00149fb30 pc=0x152e2914 runtime.doInit(0x29780700) /usr/local/go/src/runtime/proc.go:6242 +0x64 fp=0xc00149fdf0 sp=0xc00149fc90 pc=0x152e2914 runtime.doInit(0x297a6900) /usr/local/go/src/runtime/proc.go:6242 +0x64 fp=0xc00149ff50 sp=0xc00149fdf0 pc=0x152e2914 runtime.main() /usr/local/go/src/runtime/proc.go:208 +0x214 fp=0xc00149ffc0 sp=0xc00149ff50 pc=0x152d43d4 runtime.goexit() /usr/local/go/src/runtime/asm_ppc64x.s:869 +0x4 fp=0xc00149ffc0 sp=0xc00149ffc0 pc=0x15307b74 ... |
Either a trap or a trampoline, you take your pick :) [root@mihawklp111 installer]# /bin/rm -rf /usr/local/go/; tar -C /usr/local -xzf /root/go1.15.11.linux-ppc64le.tar.gz [root@mihawklp111 installer]# go version go version go1.15.11 linux/ppc64le [root@mihawklp111 installer]# git status HEAD detached at origin/release-4.8 nothing to commit, working tree clean [root@mihawklp111 installer]# hack/build.sh ... + go build -mod=vendor -ldflags ' -X github.com/openshift/installer/pkg/version.Raw=unreleased-master-4520-g52fef9e1d7b765470e493670177eefd73bc27c6f -X github.com/openshift/installer/pkg/version.Commit=52fef9e1d7b765470e493670177eefd73bc27c6f -s -w -linkmode external' -tags ' release' -o bin/openshift-install ./cmd/openshift-install [root@mihawklp111 installer]# bin/openshift-install version unexpected fault address 0xffffffffffff3a60 fatal error: fault [signal SIGSEGV: segmentation violation code=0x3 addr=0xffffffffffff3a60 pc=0x1bbf3664] ... [root@mihawklp111 installer]# git checkout 911eb7f69a1316c48e883d08644cda0e7304a373 Previous HEAD position was 52fef9e1d Merge pull request #4875 from multi-arch/ppc64le-link HEAD is now at 911eb7f69 Merge pull request #4868 from wking/stable-4.8 [root@mihawklp111 installer]# hack/build.sh ... + go build -mod=vendor -ldflags ' -X github.com/openshift/installer/pkg/version.Raw=unreleased-master-4518-g911eb7f69a1316c48e883d08644cda0e7304a373 -X github.com/openshift/installer/pkg/version.Commit=911eb7f69a1316c48e883d08644cda0e7304a373 -s -w' -tags ' release' -o bin/openshift-install ./cmd/openshift-install # github.com/openshift/installer/cmd/openshift-install github.com/ovirt/go-ovirt.(*ClusterServiceRemoveRequest).Send: unexpected trampoline for shared or dynamic linking ... |
I am investigating. It looks like the external linker is inserted a TOC relative trampoline to |
This is the same problem we saw a while ago with some descriptions in #20497 and also a related discussion in #20492. We don't want ld to be inserting trampolines like this because of this issue -- by default Go doesn't maintain r2. The odd thing is that when I look at the openshift-installer binary I don't see any tramp symbols and the text section is not split unless I'm missing something. I'm not sure why we haven't hit this before either. |
Yes, the problem is that ld is inserting the tramplines and that is why we are getting the segv. I thought there were some cases where Go would still generate trampolines even with external linking according to the issues I mentioned above, but there are none in this case. It looks like the text section is not being split like it should, and that is why ld is inserting them. Not sure why that would be. Here is the check before splitting: if ctxt.Arch.InFamily(sys.PPC64) && ldr.OuterSym(s) == 0 && ctxt.IsExternal() Is OuterSym !=0 for some reason? |
I think openshift is interesting in that it uses plugins, which makes the Go linker to go with |
OuterSym != 0 is for symbols from C objects, for which we have a section as an "outer" symbol and individual symbols within that section as "sub" symbols. This is just to skip sub symbols, which just follow the outer symbol, because you cannot break an Outer symbol to smaller pieces. |
Yeah, -buildmode=pie works. (the builder machine is really too slow, it takes ~hour to build...) |
I think there are multiple issues here.
Are you saying that 2) is related to the use of DynlinkingGo? Or what did you mean when you said we shouldn't be checking DynlinkingGo? |
Yes, 2 is related to the use of DynlinkingGo. I think we want to change it to a more precise condition that we actually save/restore R2. |
Where is that check being done? |
I'm wondering if Openshift only recently added the use of plugins so that is why we haven't seen these problems before? |
1 is interesting. If Go code uses plugins it must be cgo and must be externally linked, because the plugin implementation internally uses C function dlopen etc.. At compile time, it doesn't know whether we are going to use plugins. Currently, it doesn't generate code that maintains R2. It sounds like plugins on PPC64 is just broken? If that's the case, maybe the only thing to make it work is that we always save/restore R2, at least at compile time. The linker replace them with NOPs if plugin is not used. Or we document that if you want to use plugin you must build PIE on PPC64. |
At least this one https://tip.golang.org/src/cmd/link/internal/ppc64/asm.go#L661 . There may be others. |
( @hamzy @dbenoit17 in case this helps you -- besides waiting for the fix of #45564 to be released and change to internal linking, another workaround is to use go build |
Thanks, I have tested and can confirm that -buildmode=pie does effectively work around #45564. |
In general plugins aren't broken but I agree this setting of DynlinkingGo is not right when there are plugins. But for some reason now we are hitting problems with plugins in large text sections where we haven't before. In the normal case, when the plugin is called it should go through C code and that will set up R2 before invoking the plugin code. With external linking, I believe the text section should be split if it is too large but in this case it is not and I don't see where that should be related to the setting of DynlinkingGo. If there is a call between text sections there should be a trampoline inserted by Go and that is not happening because of the value of DynlinkingGo. I have to do some experiments to verify my thinking. |
We've had enough issues with the value of R2 that this would be fine with me, but that means we'd have to generate R2 at the beginning of each function. However what about existing code from previous releases (or does that all get recompiled automatically). |
Could we always split text sections when external linking (even our code maitains R2)? Is there any problem with that?
Yes. We never link Go objects that are not built with the same version of the toolchains. |
Maybe we can teach the linker to skip those instructions and resize the symbol. I guess it doesn't really matter compared to replacing them with NOPs at link time. (We have alignment paddings between functions anyway) |
The way it works now (with -shared, etc.), is that when a call is made to a function it will know if it is a local or global call in terms of using the same R2, so it will skip those if it would be the same. Just extra code in the binary but only executed if needed. |
Okay, thanks. I think the way forward is either fix the DynlinkingGo condition or just drop the condition and always split sections. Up to you. |
Change https://golang.org/cl/315289 mentions this issue: |
@gopherbot this affects building openshift with go1.16, please consider a backport. |
Backport issue(s) opened: #45927 (for 1.16), #46002 (for 1.15). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases. |
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
What did you expect to see?
The version of the openshift installer.
What did you see instead?
A SIGSEGV
The text was updated successfully, but these errors were encountered: