Issue running kube-router on arm64 #736

Closed
dparalen opened this issue May 16, 2019 · 2 comments

Comments

@dparalen
Contributor

Folks,

I've been trying to run kube-router on arm64 and I finally made it work for myself. I'd like to discuss my findings here.

Running my arm64-master_arm64_build image, built with
make -B -e BUILD_IN_DOCKER=true -e IMG_NAMESPACE=dparalen container
in my master_arm64_build branch, leads to the following kernel dump, which points to a missing syscall:

$ sudo dmesg
# ------>%--------------

[518292.085500] kube-router[4128140]: syscall 1069
[518292.085504] Code: f94023e4 f94027e5 f9400fe8 d4000001 (b13ffc1f)
[518292.085507] CPU: 2 PID: 4128140 Comm: kube-router Not tainted 4.9.0-6-arm64-si #1 Debian 4.9.82-1+deb9u3.softiron3
[518292.085508] Hardware name: SoftIron SoftIron Platform Mainboard/SoftIron Barton Mainboard, BIOS 1.12 Feb 21 2019
[518292.085510] task: ffffcd3e6696a080 task.stack: ffffcd3e66800000
[518292.085512] PC is at 0x85f58
[518292.085513] LR is at 0x85f38
[518292.085515] pc : [<0000000000085f58>] lr : [<0000000000085f38>] pstate: 20000000
[518292.085516] sp : 000000442093b850
[518292.085517] x29: 0000000000000000 x28: 0000004420a55b00
[518292.085521] x27: 00000000020c42b8 x26: 000000442093b820
[518292.085524] x25: 0000000000dbee5c x24: fffea806fb5c4a60
[518292.085527] x23: 0000025e650d6f88 x22: 00000044207c04fc
[518292.085530] x21: 000000000000001d x20: 000000000000001d
[518292.085533] x19: 0000000000000061 x18: 0000000000000000
[518292.085536] x17: 000000442093b640 x16: 000000442093b6e8
[518292.085538] x15: 0000000000000000 x14: 0000000000000000
[518292.085541] x13: 000000441ffc8747 x12: 0000000000000013
[518292.085544] x11: 00000000000000d3 x10: 0000000000000200
[518292.085547] x9 : 00000044206f00e8 x8 : 000000000000042d
[518292.085549] x7 : 0000000000000018 x6 : 0000000000000001
[518292.085552] x5 : 0000000000000000 x4 : 0000000000000000
[518292.085555] x3 : 0000000000002328 x2 : 0000000000000001
[518292.085557] x1 : 000000442093b998 x0 : 000000000000000d

This happens once gobgp calls DialTCP, because the legacy epoll_wait syscall (number 1069, i.e. 0x42d in register x8 in the dump above) isn't implemented by the Linux kernel on the arm64 architecture.
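To double-check that reading of the dump: on arm64 the syscall number is passed in register x8, so the x8 value should decode to the number dmesg reports. A minimal sketch of that check follows; the helper name syscallFromDump is made up for illustration and is not part of kube-router or gobgp.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// x8Pattern matches the x8 register field as printed in the dmesg dump.
var x8Pattern = regexp.MustCompile(`x8\s*:\s*([0-9a-fA-F]+)`)

// syscallFromDump is a hypothetical helper: it pulls the x8 value out of a
// register dump line and returns it as a decimal syscall number.
func syscallFromDump(line string) (uint64, error) {
	m := x8Pattern.FindStringSubmatch(line)
	if m == nil {
		return 0, fmt.Errorf("no x8 register in %q", line)
	}
	return strconv.ParseUint(m[1], 16, 64)
}

func main() {
	// Line copied from the dump above (timestamp stripped).
	line := "x9 : 00000044206f00e8 x8 : 000000000000042d"
	n, err := syscallFromDump(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // 1069
}
```

The result matches the "syscall 1069" line at the top of the dump.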

The issue goes away the moment I patch gobgp/server/sockopt_linux.go to use the x/sys/unix package instead of syscall, as recommended in the discussion of the related issue golang/go#25813. This may eventually get fixed in the syscall package in some later Go version.

The fix can be seen running in my arm64-unix_epoll image, built with make -B -e BUILD_IN_DOCKER=true -e IMG_NAMESPACE=dparalen container in my unix_epoll kube-router branch.

My concern is where and how best to address this issue, since kube-router pins Go 1.10.8 and also pins gobgp in the vendor directory. I'm considering opening a pull request against gobgp too, but I wanted to have a kube-router issue to reference first...

Thanks!
milan

PS: I had to patch the Makefile to be able to compile gobgp in an image, as my system runs Debian.

@dparalen
Contributor Author

...interestingly, gobgp hasn't relied on the syscall.Epoll* interface since osrg/gobgp@598bba9, which appeared in a later gobgp release that in turn requires Go 1.11...

@aauren
Collaborator

aauren commented Jul 10, 2020

Closed via #737

aauren closed this as completed Jul 10, 2020