Conversation

foxish (Contributor) commented Feb 7, 2017

Falling back to using a persistent HTTP GET.
The code could be DRYer; I'm putting this out there to get some initial opinions.

@iocanel @jimmidyson @iyanuobidele

@foxish foxish force-pushed the add-hanging-get branch 2 times, most recently from 8ac44d0 to bf34108 Compare February 7, 2017 19:06
foxish (Contributor, Author) commented Feb 7, 2017

Will fix tests shortly.

iocanel (Member) commented Feb 7, 2017

ok to test

@iocanel iocanel requested review from iocanel and jimmidyson February 7, 2017 22:03
jimmidyson (Contributor) left a comment

Can we get a test in for this too? @iocanel Thoughts on using the mock test server for this?

return watch;
} catch (MalformedURLException e) {
throw KubernetesClientException.launderThrowable(e);
} catch (KubernetesClientException ke) {
Contributor:

Should we check the type of the exception? There would be no point in falling back to the HTTP watch on, say, an auth error. Perhaps we should only fall back if we receive a 200 instead of a 101?

Contributor (Author):

Done, added a check for ke.getCode() to be 200.
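The fallback condition discussed here can be sketched in plain Java. This is a minimal illustration, not the actual client code: `ClientException` is a hypothetical stand-in for `KubernetesClientException`, assuming only that the exception exposes the HTTP status via `getCode()`. The idea is that a 200 on the upgrade request means the server spoke plain HTTP instead of completing the websocket handshake (101), which is the one case where an HTTP watch is worth trying; other codes (401, 403, 500, ...) would fail over HTTP too.

```java
public class WatchFallbackSketch {
    // Hypothetical stand-in for KubernetesClientException.
    static class ClientException extends RuntimeException {
        private final int code;
        ClientException(int code, String message) {
            super(message);
            this.code = code;
        }
        int getCode() { return code; }
    }

    // Fall back to the persistent HTTP GET watch only when the server
    // answered the websocket upgrade with a plain HTTP 200 instead of
    // a 101 Switching Protocols.
    static boolean shouldFallBackToHttpWatch(ClientException ke) {
        return ke.getCode() == 200;
    }

    public static void main(String[] args) {
        if (!shouldFallBackToHttpWatch(new ClientException(200, "plain HTTP response"))) {
            throw new AssertionError("expected fallback on 200");
        }
        if (shouldFallBackToHttpWatch(new ClientException(401, "unauthorized"))) {
            throw new AssertionError("must not fall back on auth error");
        }
        System.out.println("fallback check ok");
    }
}
```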

} catch (KubernetesClientException ke) {
WatchHTTPManager watch = null;
try {
watch = new WatchHTTPManager(
Contributor:

Simply return new WatchHTTP....

response = clonedClient.newCall(request).execute();
BufferedSource source = response.body().source();
while (!source.exhausted()) {
String message = source.readUtf8LineStrict();
Contributor:

Is each event newline-separated then? Not a chunked response? I haven't looked to be sure, but I'll trust you :)

Contributor (Author):

Yes, the chunk size is sent first, but chunk boundaries do not correspond to event boundaries. The newlines are reliable separators.
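The parsing approach described here can be sketched with only the standard library. This is an illustration, not the PR's `WatchHTTPManager` code: it assumes the server writes one JSON event per line, and uses `BufferedReader` in place of OkHttp's `BufferedSource` to show why splitting on newlines, rather than on chunk boundaries, recovers whole events.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class LineDelimitedEvents {
    // Each watch event arrives as one JSON object per line. HTTP chunk
    // boundaries are arbitrary, so we split on newlines: a BufferedReader
    // reassembles lines regardless of how the body was chunked.
    static List<String> readEvents(BufferedReader source) {
        List<String> events = new ArrayList<>();
        try {
            String line;
            while ((line = source.readLine()) != null) {
                if (!line.isEmpty()) {
                    events.add(line);
                }
            }
        } catch (IOException e) {
            throw new RuntimeException("watch stream failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        // Two events, as a server might deliver them across chunks.
        String body = "{\"type\":\"ADDED\"}\n{\"type\":\"MODIFIED\"}\n";
        List<String> events = readEvents(new BufferedReader(new StringReader(body)));
        if (events.size() != 2) {
            throw new AssertionError("expected 2 events, got " + events.size());
        }
        System.out.println("parsed " + events.size() + " events");
    }
}
```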

jimmidyson (Contributor):

Thanks for this PR. It's looking good and I like the approach; just some minor questions, and we need a test too if possible.

jimmidyson (Contributor):

Oh, and this breaks some tests too; see the build logs, please.

@foxish foxish force-pushed the add-hanging-get branch 2 times, most recently from af183ee to 9262d47 Compare February 8, 2017 01:52
foxish (Contributor, Author) commented Feb 8, 2017

@jimmidyson @iocanel PTAL. Addressed comments, added tests.

jimmidyson (Contributor) left a comment

LGTM! Thanks for the tests!

jimmidyson (Contributor):

When I try this locally, I get a 5-second delay between the 200 response to the websocket watcher request and the HTTP watcher kicking in:

09:15:48.719 [main] DEBUG io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager - Connecting websocket ... io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager@e5041ef
09:15:53.830 [OkHttp http://localhost:8001/...] WARN  io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager - Exec Failure: HTTP 200, Status: 200 -

Any ideas why?

iocanel (Member) commented Feb 8, 2017

[merge]

jimmidyson (Contributor):

@iocanel I was waiting for feedback on #652 (comment)...

iocanel (Member) commented Feb 8, 2017

@jimmidyson: you've approved the request though.

iocanel (Member) commented Feb 8, 2017

I aborted the job on Jenkins until you get feedback.

jimmidyson (Contributor):

@iocanel That is a good point. I had approved the code review but was still testing locally before merging. From now on I'll make sure that review approval means a complete yes, not just code-review approval. Sorry for the misunderstanding!

foxish (Contributor, Author) commented Feb 8, 2017

Thanks for the quick review. A 5-second delay seems quite substantial, and I don't remember running into it. I'll look into it shortly.

jimmidyson (Contributor):

@foxish Yeah, it was unexpected; it could be environmental, but I'm not sure how. Let me know if you have any ideas.

foxish (Contributor, Author) commented Feb 8, 2017

@jimmidyson It seems the websocket call takes ~5 seconds to fail. The delay comes from DEFAULT_WEBSOCKET_TIMEOUT, which is set to 5 seconds. Specifying it explicitly, for example:

```java
Config config = new ConfigBuilder()
    .withMasterUrl("http://127.0.0.1:8001")
    .withWebsocketTimeout(1000)
    .build();
```

brings the delay down to 1s.

jimmidyson (Contributor):

OK, so it sounds like we need to address the websocket timeout separately: getting back a 200 should stop the websocket from trying to connect and make it return immediately.

@foxish Would you have time to look at that fix? I'm happy to merge this in & get a fix in for that in a separate PR.

foxish (Contributor, Author) commented Feb 8, 2017

That sounds good. I'll send a follow-up PR for the fix. Would cutting a release be possible sometime early next week? We'd like to use these changes in our Spark integration efforts.

jimmidyson (Contributor):

We can do the release as soon as we have the fix for the delay.

jimmidyson (Contributor):

[merge] thanks @foxish!
