most requests fail with curl code 77 error #530
Comments
side-note: you don't need libuuid anymore. Can you send me a trace output log that contains a request succeeding, and some failing? |
Also, this just started happening? Was this perchance S3 in region SA-EAST-1 ? |
Since which SDK version?
Here: aws_sdk_2017-05-16-20.zip Here you can see multiple threads executing PutObject. Most of them fail. All of them share the same S3Client. These requests are parts of bigger tasks that execute undo actions (DeleteObject requests) if one of the subtasks fails. Setting verifySSL to false makes the problems go away.
No, it is a Webscale demo account. I highly doubt it has anything to do with the server side, since I switched to static linking over the course of a few hours (it was working before the switch). |
doesn't look like it starts failing until we make our 7th tcp connection. Can you try setting your client config to only have 6 connections? (This is for debugging, not for the solution). |
6 / 30 connections == 20% success rate. That is curious indeed. |
We removed libuuid ages ago and moved to just hitting /dev/urandom. Unfortunately, we didn't remove it from the build dependencies on linux until this week. |
hmmm max file handles on your CA file? |
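One way to test the file-handle theory is to compare the process's descriptor limit with how many descriptors are actually in use. A minimal sketch for Linux, using POSIX `getrlimit` and `/proc/self/fd`; the function names are made up for illustration and are not part of the SDK or libcurl.

```cpp
#include <dirent.h>
#include <sys/resource.h>

// Reads the soft limit on open file descriptors into `limit`.
// Returns false if getrlimit fails.
bool fd_soft_limit(unsigned long long& limit) {
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) return false;
    limit = static_cast<unsigned long long>(lim.rlim_cur);
    return true;
}

// Counts entries in /proc/self/fd (Linux only). The count includes the
// descriptor opendir() holds while scanning. Returns -1 if /proc is unavailable.
long open_fd_count() {
    DIR* dir = opendir("/proc/self/fd");
    if (dir == nullptr) return -1;
    long count = 0;
    while (const struct dirent* entry = readdir(dir)) {
        if (entry->d_name[0] != '.') ++count;  // skip "." and ".."
    }
    closedir(dir);
    return count;
}
```

If `open_fd_count()` is nowhere near the limit when the failures start, descriptor exhaustion can probably be ruled out.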
Log is probably a concatenation of multiple runs. Sometimes entire run is successful. Each test run -- 3 files to upload (3 tasks --> 6 PutObject requests). Each upload is executed (typically) on a new thread. Another thing wanted to mention -- on linker line I ended up with both -lcrypt and -lcrypto. Probably unrelated (not sure exactly what these libs are doing). |
can you elaborate? Not sure what you mean. Also, why it happens only when static linking SDK? |
It fails when it opens the 7th tcp connection and all other connections will fail after that. So you keep succeeding 20% of the time, but fail on all others, because the original 6 are left open. Error code 77 indicates libcurl wasn't able to read the CA_FILE... just brainstorming here |
I should note, we open a new connection for each concurrent request until we reach the maximum number of connections in the pool. Once a thread finishes, the connection is returned (still open) to the pool. |
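To make the 6 / 30 arithmetic concrete, here is a toy model of that theory. It is not the SDK's actual pool code: it simply assumes requests are spread evenly over a pool of 30 connections, of which only the first 6 ever connected successfully. The function name and the round-robin assignment are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Toy model, not the SDK's real pool: requests are spread evenly over
// `pool_size` pooled connections, of which only the first `healthy` work.
double success_rate(std::size_t pool_size, std::size_t healthy, std::size_t requests) {
    assert(pool_size > 0 && requests > 0 && healthy <= pool_size);
    std::size_t ok = 0;
    for (std::size_t i = 0; i < requests; ++i) {
        const std::size_t slot = i % pool_size;  // round-robin choice of connection
        if (slot < healthy) ++ok;                // slots 0..healthy-1 connected fine
    }
    return static_cast<double>(ok) / static_cast<double>(requests);
}
```

With `success_rate(30, 6, 300)` the model gives 60 successes out of 300, i.e. the 20% discussed above. It says nothing about why the 7th connection would fail in the first place.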
Pretty sure I've seen it failing on first connection. As well as not failing at all |
You should be able to use:
|
Thanks. I'll give it a try tomorrow. |
ok, i'll think on this some more. Maybe something will come to me. My best guess, is you are linking to something that has different behavior than before. Like maybe libnss ? -lcrypto wouldn't be the culprit since that isn't where TLS resides. LibCurl is likely the culprit. My guess is libcurl.a is linked against libnss (which is the spawn of satan) while libcurl.so is linked against openssl. |
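A quick way to check which TLS library a given libcurl build uses is to look at the string returned by `curl_version()` (the first line of `curl --version` has the same shape). The sketch below only parses such a string, so it does not need libcurl to build; the helper name and the list of backend names are assumptions for illustration, and the sample strings in the note are illustrative, not taken from this thread.

```cpp
#include <sstream>
#include <string>

// Returns the TLS library named in a curl_version()-style string,
// e.g. "NSS" or "OpenSSL", or "" if none of the known names appears.
std::string tls_backend(const std::string& version_line) {
    static const char* const known[] = {"OpenSSL", "NSS", "GnuTLS", "LibreSSL", "BoringSSL"};
    std::istringstream tokens(version_line);
    std::string token;
    while (tokens >> token) {
        // Tokens look like "name/version"; keep only the name part.
        const std::string name = token.substr(0, token.find('/'));
        for (const char* candidate : known) {
            if (name == candidate) return name;
        }
    }
    return "";
}
```

For example, `tls_backend("libcurl/7.29.0 NSS/3.21 zlib/1.2.7")` returns `"NSS"`, while a string containing `OpenSSL/1.0.2k` returns `"OpenSSL"`.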
I was thinking maybe MINIMIZE_SIZE could cause change in behaviour -- it combines all cpp files into one to compile. C++ doesn't like stuff like that.
I am pretty sure I link against libcurl.so. Will check tmrw.
:-) Once I had to write some logic that was calling into openssl -- after few days I was thinking smth along the same lines. P.S. pretty sure I don't have -lnss on my linker line. |
Spent entire day today dealing with this. It looks like a race condition in libnss.so. Here is what I've found:
I am running out of ideas here, tbh... Google is full of pages mentioning race conditions in libnss, but they all seem to be already fixed and my packages seem fresh enough:
Help! P.S. Checked all this against AWS (instead of Webscale) -- result is the same. |
Huh... I might have found it: If this is the same bug -- updating libcurl to v7.54+ should fix it (see curl patch here: curl/curl@3a5d5de9) |
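If the fix really is tied to v7.54, an app could at least warn at startup when it finds an older libcurl. A rough sketch that parses the `libcurl/X.Y.Z` token from a `curl_version()`-style string; the function names are hypothetical. Distributions sometimes backport fixes without bumping the version, so treat the result as a hint only.

```cpp
#include <cstdio>
#include <string>
#include <tuple>

// Parses the "libcurl/X.Y.Z" token from a curl_version()-style string.
// Returns false if the token is missing or malformed.
bool parse_libcurl_version(const std::string& line, int& major, int& minor, int& patch) {
    const std::string prefix = "libcurl/";
    const std::size_t pos = line.find(prefix);
    if (pos == std::string::npos) return false;
    return std::sscanf(line.c_str() + pos + prefix.size(),
                       "%d.%d.%d", &major, &minor, &patch) == 3;
}

// True if the reported libcurl version is at least 7.54.0.
bool at_least_7_54(const std::string& line) {
    int major = 0, minor = 0, patch = 0;
    if (!parse_libcurl_version(line, major, minor, patch)) return false;
    return std::make_tuple(major, minor, patch) >= std::make_tuple(7, 54, 0);
}
```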
My recommendation is to use libcurl linked against openssl. Libnss has too many issues.
On May 17, 2017, at 7:27 PM, crusader-mike ***@***.***> wrote:
Huh... I might have found it:
https://curl.haxx.se/mail/lib-2016-08/0119.html
https://bugzilla.mozilla.org/show_bug.cgi?id=1297397
Tmrw will try to find if there is a workaround...
|
I am not particularly familiar with the art of deployment on Linux. What do I need to do to ensure my app ends up using libcurl+openssl on the client's machine? Linking against them statically is likely possible but isn't smth I am looking forward to -- this means security patches/etc won't apply to my app... Also, any ideas on these mysterious 6-second delays I observe? |
Ended up building latest libcurl (v7.54) from sources (with OpenSSL -- libNSS refused to compile due to some missing NSPR headers :D). Linked both AWS SDK and libcurl statically (this app is not going to talk to Internet, so security patches are not particularly necessary). Problem is gone! Turned out there is another one which was largely obscured before. :-\ I'll submit it as another issue. P.S. portion of my linker's cmdline: |
My code was working just fine until the decision was made to start linking the AWS SDK statically. Now most requests (~80%) fail with (error 99) 'Can connect to endpoint' and the log shows this:
[ERROR] 2017-05-16 20:05:18 CurlHttpClient [140198444779264] Curl returned error code 77
I spent about half a day trying to figure it out, but so far no luck. Google is full of 'curl code 77' problems related to certificate storage access, but that can't explain why about 20% of requests work just fine (or why static linking could cause this).
I suspect it might be related to linker options I am using (order of some libs is wrong?). As of now end of that line looks like this:
-laws-cpp-sdk-s3 -laws-cpp-sdk-core -lcurl -lcrypto
I had to add curl and crypto after switching to static linking.
Here is how I built SDK (note we use v1.0.59):
OS: CentOS 7 (minimal installation + few packages like 'Developers Tools', etc)
Help!