kubectl timeout causing goroutine errors and system freezes #1266


Closed
boehle opened this issue Aug 22, 2022 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

boehle commented Aug 22, 2022

What happened:
A timeout while connecting to the cluster triggers a chain reaction of Go errors in kubectl, which ultimately freezes the whole system and halts all network activity (no network I/O at all).
--> Apps won't start (they freeze on startup)
--> System Preferences becomes unresponsive, and the Network pane shows up empty

Only a reboot fixes the issue, until you try to reach the unreachable cluster via kubectl again.

Log output from running kubectl with -v=9

➜  sample-project git:(main) ✗ export KUBECONFIG=~/.kube/sample-kubeconfig.yaml
➜  sample-project git:(main) ✗ kubectl get pods -v=9
I0822 21:33:09.727414    2841 loader.go:372] Config loaded from file:  /Users/sample-user/.kube/sample-kubeconfig.yaml
I0822 21:33:09.728966    2841 round_trippers.go:435] curl -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.22.12 (darwin/arm64) kubernetes/fde0037" -H "Authorization: Bearer <masked>" 'https://sample-project-uuid.linodelke.net:443/api?timeout=32s'
I0822 21:33:39.730146    2841 round_trippers.go:454] GET https://sample-project-uuid.linodelke.net:443/api?timeout=32s  in 30000 milliseconds
I0822 21:33:39.730260    2841 round_trippers.go:460] Response Headers:
I0822 21:33:39.730555    2841 cached_discovery.go:121] skipped caching discovery info due to Get "https://sample-project-uuid.linodelke.net:443/api?timeout=32s": dial tcp: i/o timeout
I0822 21:33:39.732421    2841 round_trippers.go:435] curl -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.22.12 (darwin/arm64) kubernetes/fde0037" -H "Authorization: Bearer <masked>" 'https://sample-project-uuid.linodelke.net:443/api?timeout=32s'
I0822 21:34:09.734390    2841 round_trippers.go:454] GET https://sample-project-uuid.linodelke.net:443/api?timeout=32s  in 30001 milliseconds
I0822 21:34:09.734506    2841 round_trippers.go:460] Response Headers:
I0822 21:34:09.734763    2841 cached_discovery.go:121] skipped caching discovery info due to Get "https://sample-project-uuid.linodelke.net:443/api?timeout=32s": dial tcp: i/o timeout
I0822 21:34:09.734899    2841 shortcut.go:89] Error loading discovery information: Get "https://sample-project-uuid.linodelke.net:443/api?timeout=32s": dial tcp: i/o timeout
I0822 21:34:09.735440    2841 round_trippers.go:435] curl -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.22.12 (darwin/arm64) kubernetes/fde0037" -H "Authorization: Bearer <masked>" 'https://sample-project-uuid.linodelke.net:443/api?timeout=32s'
I0822 21:34:39.737300    2841 round_trippers.go:454] GET https://sample-project-uuid.linodelke.net:443/api?timeout=32s  in 30001 milliseconds
I0822 21:34:39.737410    2841 round_trippers.go:460] Response Headers:
I0822 21:34:39.737608    2841 cached_discovery.go:121] skipped caching discovery info due to Get "https://sample-project-uuid.linodelke.net:443/api?timeout=32s": dial tcp: i/o timeout
I0822 21:34:39.738170    2841 round_trippers.go:435] curl -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.22.12 (darwin/arm64) kubernetes/fde0037" -H "Authorization: Bearer <masked>" 'https://sample-project-uuid.linodelke.net:443/api?timeout=32s'
I0822 21:35:09.740431    2841 round_trippers.go:454] GET https://sample-project-uuid.linodelke.net:443/api?timeout=32s  in 30001 milliseconds
I0822 21:35:09.740553    2841 round_trippers.go:460] Response Headers:
I0822 21:35:09.740725    2841 cached_discovery.go:121] skipped caching discovery info due to Get "https://sample-project-uuid.linodelke.net:443/api?timeout=32s": dial tcp: i/o timeout
I0822 21:35:09.741292    2841 round_trippers.go:435] curl -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.22.12 (darwin/arm64) kubernetes/fde0037" -H "Authorization: Bearer <masked>" 'https://sample-project-uuid.linodelke.net:443/api?timeout=32s'
I0822 21:35:39.743602    2841 round_trippers.go:454] GET https://sample-project-uuid.linodelke.net:443/api?timeout=32s  in 30001 milliseconds
I0822 21:35:39.743706    2841 round_trippers.go:460] Response Headers:
I0822 21:35:39.743883    2841 cached_discovery.go:121] skipped caching discovery info due to Get "https://sample-project-uuid.linodelke.net:443/api?timeout=32s": dial tcp: i/o timeout
I0822 21:35:39.744039    2841 helpers.go:235] Connection error: Get https://sample-project-uuid.linodelke.net:443/api?timeout=32s: dial tcp: i/o timeout
F0822 21:35:39.744074    2841 helpers.go:116] Unable to connect to the server: dial tcp: i/o timeout
goroutine 1 [running]:
k8s.io/kubernetes/vendor/k8s.io/klog/v2.stacks(0x1400000e001, 0x140000e8700, 0x65, 0xf6)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1026 +0xac
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).output(0x10553ce60, 0x14000000003, 0x0, 0x0, 0x140000f0b60, 0x2, 0x104c730b7, 0xa, 0x74, 0x103c7ad00)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:975 +0x1a8
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).printDepth(0x10553ce60, 0x14000000003, 0x0, 0x0, 0x0, 0x0, 0x2, 0x14000480380, 0x1, 0x1)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:735 +0x128
k8s.io/kubernetes/vendor/k8s.io/klog/v2.FatalDepth(...)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1500
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util.fatal(0x140003d8480, 0x36, 0x1)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:94 +0x21c
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util.checkErr(0x10470a408, 0x14000512750, 0x1046e3de0)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:189 +0x74c
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util.CheckErr(...)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/util/helpers.go:116
k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/get.NewCmdGet.func2(0x14000461900, 0x140001b0140, 0x1, 0x2)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/cmd/get/get.go:180 +0x120
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).execute(0x14000461900, 0x140001b0120, 0x2, 0x2, 0x14000461900, 0x140001b0120)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:856 +0x1d0
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0x140004d2c80, 0x1, 0x104292614, 0x3)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:960 +0x258
k8s.io/kubernetes/vendor/github.com/spf13/cobra.(*Command).Execute(...)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/github.com/spf13/cobra/command.go:897
main.main()
        _output/dockerized/go/src/k8s.io/kubernetes/cmd/kubectl-sdk/kubectl.go:58 +0x240

goroutine 6 [chan receive]:
k8s.io/kubernetes/vendor/k8s.io/klog/v2.(*loggingT).flushDaemon(0x10553ce60)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:1169 +0x70
created by k8s.io/kubernetes/vendor/k8s.io/klog/v2.init.0
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/klog/v2/klog.go:420 +0x128

goroutine 10 [select]:
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0x1046e3ce8, 0x104708908, 0x140004420c0, 0x1, 0x1400004a120)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:167 +0xd8
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x1046e3ce8, 0x12a05f200, 0x0, 0x1, 0x1400004a120)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x88
k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0x1046e3ce8, 0x12a05f200, 0x1400004a120)
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x48
created by k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/util/logs.InitLogs
        /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/vendor/k8s.io/kubectl/pkg/util/logs/logs.go:51 +0x9c

goroutine 73 [runnable]:
net/http.setRequestCancel.func4(0x0, 0x140004cc1e0, 0x140003a0140, 0x140001fc158, 0x14000782120)
        /usr/local/go/src/net/http/client.go:397 +0x6c
created by net/http.setRequestCancel
        /usr/local/go/src/net/http/client.go:396 +0x25c

goroutine 75 [chan receive]:
net.(*Resolver).goLookupIPCNAMEOrder.func4(0x1400002a400, 0x40, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/net/dnsclient_unix.go:611 +0x68
net.(*Resolver).goLookupIPCNAMEOrder(0x10553b600, 0x104742000, 0x14000382080, 0x140005de280, 0x3f, 0x1, 0x0, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/net/dnsclient_unix.go:621 +0x5b4
net.(*Resolver).lookupIP(0x10553b600, 0x104742000, 0x14000382080, 0x103ebbad6, 0x3, 0x140005de280, 0x3f, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/net/lookup_unix.go:102 +0xb0
net.glob..func1(0x104742000, 0x14000382080, 0x14000700480, 0x103ebbad6, 0x3, 0x140005de280, 0x3f, 0x140005a97a8, 0x140005a97a0, 0x140005a97a0, ...)
        /usr/local/go/src/net/hook.go:23 +0x58
net.(*Resolver).lookupIPAddr.func1(0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/net/lookup.go:293 +0x9c
internal/singleflight.(*Group).doCall(0x10553b610, 0x140003a01e0, 0x140005de2d0, 0x43, 0x140003820c0)
        /usr/local/go/src/internal/singleflight/singleflight.go:95 +0x28
created by internal/singleflight.(*Group).DoChan
        /usr/local/go/src/internal/singleflight/singleflight.go:88 +0x304

goroutine 76 [syscall]:
syscall.syscall(0x102cbe1d0, 0x7, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/sys_darwin.go:19 +0x1c
syscall.Close(0x7, 0x0, 0x0)
        /usr/local/go/src/syscall/zsyscall_darwin_arm64.go:519 +0x44
internal/poll.(*FD).destroy(0x14000474080, 0x0, 0x12d0b0378)
        /usr/local/go/src/internal/poll/fd_unix.go:83 +0x44
internal/poll.(*FD).decref(0x14000474080, 0x1, 0x0)
        /usr/local/go/src/internal/poll/fd_mutex.go:213 +0x74
internal/poll.(*FD).Close(0x14000474080, 0x14000474080, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:106 +0x48
net.(*netFD).Close(0x14000474080, 0x0, 0x0)
        /usr/local/go/src/net/fd_posix.go:37 +0x48
net.(*conn).Close(0x1400000e008, 0x0, 0x0)
        /usr/local/go/src/net/net.go:207 +0x58
net.(*Resolver).exchange(0x10553b600, 0x104742000, 0x14000382080, 0x140005bb530, 0x22, 0x6666376163353764, 0x33342d303434312d, 0x2d343332392d6464, 0x6663613633393432, 0x2d75652e32383330, ...)
        /usr/local/go/src/net/dnsclient_unix.go:173 +0x374
net.(*Resolver).tryOneName(0x10553b600, 0x104742000, 0x14000382080, 0x1400038ce60, 0x1400002a400, 0x40, 0x1, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/net/dnsclient_unix.go:259 +0x24c
net.(*Resolver).goLookupIPCNAMEOrder.func3.1(0x10553b600, 0x104742000, 0x14000382080, 0x1400038ce60, 0x1400002a400, 0x40, 0x1400039a8a0, 0x1)
        /usr/local/go/src/net/dnsclient_unix.go:605 +0x78
created by net.(*Resolver).goLookupIPCNAMEOrder.func3
        /usr/local/go/src/net/dnsclient_unix.go:604 +0xb4

goroutine 77 [syscall]:
syscall.syscall(0x102cbe1d0, 0x8, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/sys_darwin.go:19 +0x1c
syscall.Close(0x8, 0x0, 0x0)
        /usr/local/go/src/syscall/zsyscall_darwin_arm64.go:519 +0x44
internal/poll.(*FD).destroy(0x14000544200, 0x0, 0x12d0b0460)
        /usr/local/go/src/internal/poll/fd_unix.go:83 +0x44
internal/poll.(*FD).decref(0x14000544200, 0x1, 0x0)
        /usr/local/go/src/internal/poll/fd_mutex.go:213 +0x74
internal/poll.(*FD).Close(0x14000544200, 0x14000544200, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:106 +0x48
net.(*netFD).Close(0x14000544200, 0x0, 0x0)
        /usr/local/go/src/net/fd_posix.go:37 +0x48
net.(*conn).Close(0x1400059c058, 0x0, 0x0)
        /usr/local/go/src/net/net.go:207 +0x58
net.(*Resolver).exchange(0x10553b600, 0x104742000, 0x14000382080, 0x140005bb530, 0x22, 0x6666376163353764, 0x33342d303434312d, 0x2d343332392d6464, 0x6663613633393432, 0x2d75652e32383330, ...)
        /usr/local/go/src/net/dnsclient_unix.go:173 +0x374
net.(*Resolver).tryOneName(0x10553b600, 0x104742000, 0x14000382080, 0x1400038ce60, 0x1400002a400, 0x40, 0x1400000001c, 0x0, 0x0, 0x0, ...)
        /usr/local/go/src/net/dnsclient_unix.go:259 +0x24c
net.(*Resolver).goLookupIPCNAMEOrder.func3.1(0x10553b600, 0x104742000, 0x14000382080, 0x1400038ce60, 0x1400002a400, 0x40, 0x1400039a8a0, 0x1c)
        /usr/local/go/src/net/dnsclient_unix.go:605 +0x78
created by net.(*Resolver).goLookupIPCNAMEOrder.func3
        /usr/local/go/src/net/dnsclient_unix.go:604 +0xb4

What you expected to happen:
A timeout error message, with the system staying responsive and working.

How to reproduce it (as minimally and precisely as possible):

Most likely: have an unreachable cluster and run any kubectl command against it with the same setup I'm running.
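The failing call itself is an ordinary TCP dial timeout during API discovery (the repeated `dial tcp: i/o timeout` lines in the log above). A minimal sketch of the reproduction, plus a cluster-free way to observe the same class of error (the kubeconfig path is this issue's placeholder; `203.0.113.1` is a reserved TEST-NET-3 address that is never routable):

```shell
# With a kubeconfig whose API server is unreachable, any kubectl command
# triggers the discovery retries shown in the log above:
#   export KUBECONFIG=~/.kube/sample-kubeconfig.yaml
#   kubectl get pods -v=9
#
# The same failure class without a cluster: dialing an unroutable
# TEST-NET-3 address never completes, so curl gives up at --max-time.
curl --max-time 3 https://203.0.113.1/ || echo "curl failed with exit code $?"
```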

Anything else we need to know?:
I mainly encounter the issue when using macOS' built-in tethering (Personal Hotspot). For some reason the cluster becomes unreachable, and afterwards, running any kubectl command against that cluster makes the whole system completely unresponsive.

Environment:

  • Kubernetes client and server versions (use kubectl version):

    Output when the connection isn't working

    Client Version: version.Info{Major:"1", Minor:"22+", GitVersion:"v1.22.12-dispatcher-dirty", 
    GitCommit:"fde00375407ad0afadd681a3505054ec83f935ec", GitTreeState:"dirty", BuildDate:"2022-07-19T18:58:44Z", 
    GoVersion:"go1.16.15", Compiler:"gc", Platform:"darwin/arm64"}
    Unable to connect to the server: dial tcp: i/o timeout
    

    Output when the connection is working

    Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.9", 
    GitCommit:"c1de2d70269039fe55efb98e737d9a29f9155246", GitTreeState:"clean", BuildDate:"2022-07-13T14:26:51Z", 
    GoVersion:"go1.17.11", Compiler:"gc", Platform:"darwin/arm64"}
    Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.8", 
    GitCommit:"a12b886b1da059e0190c54d09c5eab5219dd7acf", GitTreeState:"clean", BuildDate:"2022-06-16T05:51:36Z", 
    GoVersion:"go1.17.11", Compiler:"gc", Platform:"linux/amd64"}
    
  • Cloud provider or hardware configuration:
    Linode Kubernetes Engine (LKE)

  • OS (e.g: cat /etc/os-release):
    macOS Monterey 12.5 running on an M1 Max

@boehle boehle added the kind/bug Categorizes issue or PR as related to a bug. label Aug 22, 2022
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 22, 2022
@k8s-ci-robot
Contributor

@boehle: This issue is currently awaiting triage.

SIG CLI takes a lead on issue triage for this repo, but any Kubernetes member can accept issues by applying the triage/accepted label.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@PedroAlbamonte

@boehle did you find a solution for this issue? I'm getting the same error.
My OS is macOS Monterey 12.6, running on an Intel Mac.

@eddiezane
Member

I think this is the same as #1299.

Are you still experiencing this?

/close

@k8s-ci-robot
Contributor

@eddiezane: Closing this issue.

In response to this:

I think this is the same as #1299.

Are you still experiencing this?

/close


@boehle
Author

boehle commented Feb 8, 2023

@PedroAlbamonte the only workaround so far has been to use a connection without IPv6; in my case, my VPN does the trick.
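Since the workaround amounts to avoiding IPv6 on the affected interface, macOS' built-in `networksetup` can do this directly. A sketch, assuming the tethering service is called "Wi-Fi" — check `networksetup -listallnetworkservices` for the actual name on your machine:

```shell
# List network services to find the one used for tethering (names vary):
networksetup -listallnetworkservices

# Disable IPv6 on that service (assuming it is called "Wi-Fi"):
sudo networksetup -setv6off "Wi-Fi"

# Re-enable automatic IPv6 configuration later:
sudo networksetup -setv6automatic "Wi-Fi"
```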

@eddiezane gotta check next week, as it mainly happens when I'm on the road using a mobile hotspot.

4 participants