Reflector seems to hang for long periods of time #239

Closed · scarby opened this issue Dec 3, 2021 · 4 comments

Comments

@scarby commented Dec 3, 2021

We noticed recently that our certificates sometimes take up to 30 minutes to reflect into new namespaces, yet at other times it happens instantly.
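
For context, the source secret is annotated for auto-reflection roughly along these lines (a sketch; the annotation keys are the ones documented by Reflector, but the exact values in our cluster may differ):

$ kubectl -n trsy-certificates annotate secret trsy-cert \
    reflector.v1.k8s.emberstack.com/reflection-allowed="true" \
    reflector.v1.k8s.emberstack.com/reflection-auto-enabled="true"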

As an example of the happy path:

$ date; kubectl create ns adam-cert-test
Thu Dec  2 17:32:34 PST 2021
namespace/adam-cert-test created

logs:

2021-12-03 01:31:02.678 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Requesting V1Secret resources
2021-12-03 01:31:02.934 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretMirror) Auto-reflected trsy-certificates/trsy-cert where permitted. Created 1 - Updated 0 - Deleted 0 - Validated 36.
2021-12-03 01:32:35.089 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretMirror) Created adam-cert-test/trsy-cert as a reflection of trsy-certificates/trsy-cert 

This one took around a second.

The unhappy path:

$ date; kubectl create ns adam-cert-test3
Thu Dec  2 17:43:11 PST 2021
namespace/adam-cert-test3 created

logs:

2021-12-03 01:41:39.885 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Session closed. Duration: 00:45:25.8514941. Faulted: False.
2021-12-03 01:41:39.885 +00:00 [INF] (ES.Kubernetes.Reflector.Core.ConfigMapWatcher) Requesting V1ConfigMap resources
2021-12-03 01:44:51.286 +00:00 [INF] (ES.Kubernetes.Reflector.Core.NamespaceWatcher) Session closed. Duration: 00:36:53.8147881. Faulted: False.
2021-12-03 01:44:51.286 +00:00 [INF] (ES.Kubernetes.Reflector.Core.NamespaceWatcher) Requesting V1Namespace resources

Still going as of Fri Dec 3 02:08:12 UTC 2021.

Note: all cert reflection stalls at this point. It then catches up, with log entries like:

2021-12-03 01:44:51.286 +00:00 [INF] (ES.Kubernetes.Reflector.Core.NamespaceWatcher) Requesting V1Namespace resources
2021-12-03 02:09:37.293 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Session closed. Duration: 00:38:34.6152188. Faulted: False.
2021-12-03 02:09:37.294 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretWatcher) Requesting V1Secret resources
2021-12-03 02:09:37.555 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretMirror) Auto-reflected trsy-certificates/trsy-cert where permitted. Created 1 - Updated 0 - Deleted 0 - Validated 39.
2021-12-03 02:09:37.563 +00:00 [INF] (ES.Kubernetes.Reflector.Core.SecretMirror) Created adam-cert-test3/trsy-cert as a reflection of trsy-certificates/trsy-cert

So this one took about 26 minutes. Unfortunately I'm not proficient in C#, so I haven't gotten close to working out why this could happen.

cluster info:

Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.12-gke.1500
@Tomasz-Kluczkowski

Any ideas about this one? If this happens all the time, it pretty much means I cannot use reflector, which would be really sad :(.

@winromulus (Contributor)

Hi,
As mentioned in previous issues, this is not a Reflector issue but a k8s one.
Basically, older (not that old, but older) versions of k8s had a bug where events were not pushed by the API server. It was fixed in 1.21+.
Reflector relies on those events being pushed (it does not scrape). What you're seeing as "hanging" is the API server not sending anything; eventually the idle connection closes, and on reconnect everything gets sent at once. There is no way for Reflector to detect whether the API server is not sending events or there simply are no events.
Have a look at #228
My suggestion is upgrading your version of k8s to the latest supported by your platform.
BTW, this is not an issue that affects Reflector only; there are a ton of extensions that rely on those events and do not get them. Most of them have switched from subscribing to events to scraping the data (querying k8s), but that is problematic because, depending on the size of the cluster and the number of resources, it can become a serious performance issue. (Reflector is also installed on clusters with hundreds of namespaces and thousands of configmaps and secrets, so querying all of those every... 1 minute?... would kill the API server.)
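
If you want to check from your side whether the API server is actually delivering watch events, plain kubectl can show the same mechanism (nothing Reflector-specific, just a rough diagnostic):

$ kubectl get namespaces --watch --output-watch-events

then, in a second terminal:

$ kubectl create ns watch-test

If the ADDED event for watch-test does not appear promptly in the first terminal, the API server is not pushing watch events, and Reflector will only catch up once its watch connection is closed and re-established.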

I'll keep this issue open for a while in the hope that others facing this problem can share before-and-after k8s upgrade insights.

@Tomasz-Kluczkowski commented Dec 6, 2021 via email

@winromulus (Contributor)

Closing this as related to #246
