Getting Random SSL Errors with upload_from_file function #992
Comments
Please share your code. |
Transferring to the python-storage repo for triage |
Hi shubham07507@, could you please elaborate on your use case and share a code snippet for investigation? Could you also share which versions of google-cloud-storage and google-resumable-media you're using? |
Closing this issue for now. Happy to reopen if you have more information or questions. |
I am also getting random errors. They are completely spurious; I cannot reproduce the issue deterministically. I'll give you as much info as I can, so please ask for more if you need it. I will also keep adding logging to give more clues as to why this is happening.
Execution environment: Cloud Run Job, 8 CPUs, 16 GB memory, 10 tasks in parallel (1 / 10 fails spuriously).
System info:
Stack traces below (Note that I have anonymized sensitive names, such as
Stack trace (level 1)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 715, in urlopen
httplib_response = self._make_request(
File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 416, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/local/lib/python3.10/dist-packages/urllib3/connection.py", line 244, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/usr/lib/python3.10/http/client.py", line 1283, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1329, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1278, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1077, in _send_output
self.send(chunk)
File "/usr/lib/python3.10/http/client.py", line 999, in send
self.sock.sendall(data)
File "/usr/lib/python3.10/ssl.py", line 1266, in sendall
v = self.send(byte_view[count:])
File "/usr/lib/python3.10/ssl.py", line 1235, in send
return self._sslobj.write(data)
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2426)
Stack trace (level 2)
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 799, in urlopen
retries = retries.increment(
File "/usr/local/lib/python3.10/dist-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /upload/storage/v1/b/SECRET_BUCKET_NAME/o?uploadType=multipart (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:2426)')))
Stack trace (level 3)
Traceback (most recent call last):
File "/app/tasks/task.py", line 74, in <module>
main()
File "/app/tasks/task.py", line 71, in main
run(workspace, args)
File "/app/tasks/task.py", line 50, in run
upload(output_bucket, local_output_path, output_file)
File "/app/taskutil.py", line 63, in upload
blob.upload_from_filename(local_path)
File "/usr/local/lib/python3.10/dist-packages/google/cloud/storage/blob.py", line 2959, in upload_from_filename
self._handle_filename_and_upload(
File "/usr/local/lib/python3.10/dist-packages/google/cloud/storage/blob.py", line 2829, in _handle_filename_and_upload
self._prep_and_do_upload(
File "/usr/local/lib/python3.10/dist-packages/google/cloud/storage/blob.py", line 2637, in _prep_and_do_upload
created_json = self._do_upload(
File "/usr/local/lib/python3.10/dist-packages/google/cloud/storage/blob.py", line 2443, in _do_upload
response = self._do_multipart_upload(
File "/usr/local/lib/python3.10/dist-packages/google/cloud/storage/blob.py", line 1956, in _do_multipart_upload
response = upload.transmit(
File "/usr/local/lib/python3.10/dist-packages/google/resumable_media/requests/upload.py", line 153, in transmit
return _request_helpers.wait_and_retry(
File "/usr/local/lib/python3.10/dist-packages/google/resumable_media/requests/_request_helpers.py", line 178, in wait_and_retry
raise error
File "/usr/local/lib/python3.10/dist-packages/google/resumable_media/requests/_request_helpers.py", line 155, in wait_and_retry
response = func()
File "/usr/local/lib/python3.10/dist-packages/google/resumable_media/requests/upload.py", line 145, in retriable_request
result = transport.request(
File "/usr/local/lib/python3.10/dist-packages/google/auth/transport/requests.py", line 541, in request
response = super(AuthorizedSession, self).request(
File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/requests/adapters.py", line 563, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /upload/storage/v1/b/SECRET_BUCKET_NAME/o?uploadType=multipart (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:2426)')))
Worth noting is that the job runs quite a heavy workload and pushes performance with |
Hi cdeln@, thanks for reporting. This seems to be caused by issues in the underlying cpython and urllib3 packages. The urllib3 request has not been fulfilled yet, so I'd suggest trying a few workarounds in the meantime.
[cpython]
[urllib3]
While |
@cojenco Thanks for your support. I do not see the need for a retry policy as my service runs in a completely managed solution provided by Google in the cloud. I see why retries become important if you are accessing cloud storage from something like a mobile device. Two example scenarios
But my use case runs entirely in the cloud, and there is nothing to disturb the network! I think adding a retry policy will only hide the problem; it will not solve it. Please correct me if I have misunderstood the retry policy.
With that said, I am reading up on the links to the cpython and urllib3 threads. These are both quite technical, so again, please correct me if I am wrong.
According to the cpython thread, there is a regression that changes some instances of OSError into SSLEOFError, which breaks some downstream libraries, including urllib3. See this (unresolved) issue: urllib3/urllib3#3382 . This thread does not address the underlying connection error (at least as far as I can see).
According to the urllib3 thread, it seems like the issue is related to the SSL handshake, which then somehow surfaces as an SSLEOFError. If I read the thread correctly, they indicate that the issue is related to a protocol violation on the server side (the client offers a weak cipher configuration, the server requires a stricter one, and it abruptly closes the connection in violation of the protocol).
The weirdest part of this is still that it's random, which feels like a stability issue with the GCS backend, but this is impossible for me to verify. |
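Since the error passes through several wrappers on its way up (ssl, urllib3, requests), it can help to log whether a given failure really bottoms out in an SSLEOFError. The following is a minimal sketch, not part of any library: the helper names `iter_exception_chain` and `caused_by_ssl_eof` are made up for illustration, and it assumes that wrapped errors are reachable through `__cause__`, `__context__`, `args`, or a `reason` attribute (which is where urllib3's MaxRetryError keeps the underlying error).

```python
import ssl


def iter_exception_chain(exc):
    """Yield exc and every exception reachable from it.

    Follows __cause__, __context__, exception instances stored in args,
    and a `reason` attribute when it holds an exception.
    """
    seen = set()
    stack = [exc]
    while stack:
        current = stack.pop()
        if id(current) in seen:
            continue
        seen.add(id(current))
        yield current
        linked = [current.__cause__, current.__context__, *current.args]
        linked.append(getattr(current, "reason", None))
        for item in linked:
            if isinstance(item, BaseException):
                stack.append(item)


def caused_by_ssl_eof(exc):
    """Return True if an ssl.SSLEOFError appears anywhere in the chain."""
    return any(isinstance(e, ssl.SSLEOFError) for e in iter_exception_chain(exc))
```

In an `except` block around the upload call, `caused_by_ssl_eof(err)` could be logged alongside the traceback to separate this failure mode from other connection errors.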
There are various reasons (broken connection, network congestion, etc.) that can cause connection issues inside the Google network boundaries. When the connection is interrupted during SSL communication, it is not surprising to see transient errors such as As for the |
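If such errors are treated as transient, one workaround is to wrap the upload call in a retry with exponential backoff. The sketch below uses only the standard library and is an illustration under stated assumptions, not the client library's retry mechanism: `call_with_backoff` and its parameters are hypothetical names, and the default `retryable` tuple would need to include `requests.exceptions.SSLError` to match the exception shown in the traces above. Whether the library's own retry configuration covers this upload path depends on the version and on request preconditions, so no library-specific calls are used here.

```python
import random
import ssl
import time


def call_with_backoff(func, retryable=(ssl.SSLError, ConnectionError),
                      attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call func(), retrying on `retryable` errors with backoff and jitter.

    Re-raises the last error once `attempts` calls have failed. Errors
    that are not in `retryable` propagate immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retryable:
            if attempt == attempts:
                raise
            # Exponential backoff capped at max_delay, scaled by random jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
```

A call site might look like `call_with_backoff(lambda: blob.upload_from_filename(local_path))`. Retrying a whole-object upload is only safe if overwriting the object on a repeated attempt is acceptable.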
Hi @cdeln, following up to see if you had the chance to try running on Python 3.9. I haven't been able to reproduce the issue. Is there a code snippet you could share? |
Not yet, I haven't had time, but I am thinking about a debugging strategy. It's unfortunate that I discovered this so late into my development process, I don't want to use my complex workflow as a basis for debugging.
There are some variables that need to be considered, such as compute and storage region. I am using europe-north1 atm. Any other variables to consider? This is just a proposed debugging strategy, so please improve it with any ideas you may have. Here is a Dockerfile + Python script (main.py) and utility bash scripts for starters:

FROM ubuntu:22.04
WORKDIR /app/
ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=True
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 \
    python3-pip \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir \
    google-cloud-storage==2.16.0
COPY main.py .
ENTRYPOINT ["python3", "main.py"]

#!/usr/bin/env python3
import argparse
import os
import uuid
import google.cloud.storage
parser = argparse.ArgumentParser()
parser.add_argument('bucket')
parser.add_argument('--filename', default=str(uuid.uuid4()))
parser.add_argument('--filesize', type=int, default=1_000_000_000)
args = parser.parse_args()
storage = google.cloud.storage.Client()
bucket = storage.bucket(args.bucket)
blob = bucket.blob(args.filename)
payload = b'A' * args.filesize
print(f'Uploading {len(payload)} bytes to gs://{args.bucket}/{args.filename}')
blob.upload_from_string(payload)

I build the Docker image with this script:

#!/usr/bin/env bash
docker build -t google-cloud-storage-ssl-error:latest .

and run it locally with this script (no need to mount credentials when deploying the job, of course):

#!/usr/bin/env bash
docker run \
    -v ~/.config/gcloud/application_default_credentials.json:/app/credentials.json \
    -e GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json \
    -e GCLOUD_PROJECT="$(gcloud config get project)" \
    google-cloud-storage-ssl-error:latest "$@"

The script is parameterized with bucket name, bucket object filename (defaults to a unique ID on every invocation) and payload size. Can we set sensible defaults for
I set up a repo with the code as well: https://github.com/cdeln/google-cloud-storage-ssl-bug |
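To gather more clues from a reproduction script like main.py, verbose HTTP logging can be switched on before the upload. This is a hedged sketch: `enable_http_debug_logging` is a hypothetical helper name, and it relies on the commonly used approach of raising the `urllib3` logger to DEBUG and setting `http.client.HTTPConnection.debuglevel`, which prints request and response headers to stdout. Those headers can include authorization tokens, so the output should not be shared unredacted.

```python
import http.client
import logging


def enable_http_debug_logging():
    """Turn on verbose logging for urllib3 and http.client."""
    logging.basicConfig(
        level=logging.DEBUG,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    # urllib3 logs connection setup and retry decisions at DEBUG level.
    logging.getLogger("urllib3").setLevel(logging.DEBUG)
    # http.client prints raw request/response headers when debuglevel > 0.
    http.client.HTTPConnection.debuglevel = 1
```

Calling `enable_http_debug_logging()` at the top of the script, before the storage client is created, should be enough for a first look at when connections are opened and reused.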
Hi, I've added a workflow also, and tweaked the existing scripts a bit (see repo).

main:
  params: []
  steps:
    - init:
        assign:
          - project_id: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - job_location: europe-north1
          - job_namespace: ${"namespaces/" + project_id + "/jobs/"}
          - bucket: ${sys.get_env("BUCKET")}
    - loop:
        for:
          value: i
          range: ${[0, 100]}
          steps:
            - upload:
                call: googleapis.run.v1.namespaces.jobs.run
                args:
                  name: ${job_namespace + "google-cloud-storage-ssl-error"}
                  location: ${job_location}
                  body:
                    overrides:
                      containerOverrides:
                        args:
                          - ${bucket}
                          - --folder
                          - testfolder

I configure the Run Job with 1 vCPU and 4GB mem, with 40 tasks in parallel. |
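Before deploying a workflow, a rough local approximation of the parallel load can be run with a thread pool that tallies outcomes by exception type. This is only a sketch under assumptions: `run_parallel` and the `upload` callable are hypothetical names, and threads in one process share a machine and network path, so this does not reproduce the isolation of 40 separate Cloud Run tasks.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_parallel(upload, n_tasks=40, max_workers=40):
    """Run upload(i) for i in range(n_tasks) concurrently.

    Returns a Counter mapping "ok" or the exception class name to the
    number of tasks that ended that way.
    """
    outcomes = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(upload, i) for i in range(n_tasks)]
        for future in as_completed(futures):
            error = future.exception()
            outcomes[type(error).__name__ if error else "ok"] += 1
    return outcomes
```

Here `upload` would be a function that takes the task index and performs one upload, for example to a uniquely named object per index. The resulting counts give a failure rate that can be compared across regions, payload sizes, or Python versions.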
requests.exceptions.SSLError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /upload/storage/v1/b/athena-samples-prod/o?uploadType=multipart (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1056)')))
at .send ( /usr/local/lib/python3.7/site-packages/requests/adapters.py:563 )
at .send ( /usr/local/lib/python3.7/site-packages/requests/sessions.py:701 )
at .request ( /usr/local/lib/python3.7/site-packages/requests/sessions.py:587 )
at .request ( /usr/local/lib/python3.7/site-packages/google/auth/transport/requests.py:555 )
at .retriable_request ( /usr/local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py:146 )
at .wait_and_retry ( /usr/local/lib/python3.7/site-packages/google/resumable_media/requests/_request_helpers.py:148 )
at .wait_and_retry ( /usr/local/lib/python3.7/site-packages/google/resumable_media/requests/_request_helpers.py:171 )
at .transmit ( /usr/local/lib/python3.7/site-packages/google/resumable_media/requests/upload.py:154 )
at ._do_multipart_upload ( /usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py:1890 )
at ._do_upload ( /usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py:2367 )
at .upload_from_file ( /usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py:2552 )
at .upload_from_filename ( /usr/local/lib/python3.7/site-packages/google/cloud/storage/blob.py:2696 )