read1 and readline of http.client.HTTPResponse do not raise IncompleteRead #115997
Comments
I am not sure that they should raise AFAIK, only |
Illia described the proposal as “Make [read1 and readline] raise IncompleteRead instead of returning zero bytes if a connection is closed before an expected number of bytes has been read.” I’m in favour of raising an exception. If the transport is shut down before the end of a non-chunked HTTP/1.1 message with Content-Length specified, it looks like read1 and readline will eventually return an empty byte string, which normally indicates EOF or end-of-stream. Since the HTTP protocol has broken down and the end of the HTTP message has not been received, it is better to raise an exception. There are at least two more cases that should raise an exception. (There may be others; I haven’t looked at every case.)
I believe all these cases already raise an exception when reading a chunked message, so it is inconsistent that an exception cannot also be relied on to validate a non-chunked response. Especially annoying since I presume HTTPSConnection is still vulnerable to TLS truncation attack (#72002). |
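Until the library raises on its own, a caller can approximate the check by comparing the number of bytes received with the declared Content-Length. The following is only a minimal sketch under stated assumptions: `copy_checked`, `FakeSocket` and `response_from` are hypothetical names that are not part of http.client, the response is assumed to be a non-chunked, non-HEAD response, and the fake socket exists only so the example runs without a network.

```python
import io
from http.client import HTTPResponse, IncompleteRead


def copy_checked(resp, dst, chunk_size=64 * 1024):
    """Hypothetical helper: copy resp into dst, then verify Content-Length."""
    declared = resp.getheader("Content-Length")
    received = 0
    while True:
        chunk = resp.read(chunk_size)
        if not chunk:  # b"" is all the caller sees on a premature close
            break
        dst.write(chunk)
        received += len(chunk)
    if declared is not None and received < int(declared):
        # Same exception type that HTTPResponse.read() raises; the partial
        # data has already been written to dst, so none is attached here.
        raise IncompleteRead(b"", int(declared) - received)
    return received


class FakeSocket:
    """Test double (assumption): HTTPResponse only calls makefile() on it."""

    def __init__(self, payload):
        self._payload = payload

    def makefile(self, mode, *args, **kwargs):
        return io.BytesIO(self._payload)


def response_from(raw):
    """Parse the status line and headers of a canned raw response."""
    resp = HTTPResponse(FakeSocket(raw), method="GET")
    resp.begin()
    return resp


if __name__ == "__main__":
    truncated = b"HTTP/1.1 200 OK\r\nContent-Length: 10\r\n\r\nabc"
    try:
        copy_checked(response_from(truncated), io.BytesIO())
    except IncompleteRead as exc:
        print("truncated response detected:", exc)
```

This does nothing for responses that carry no Content-Length, and it does not replace a fix in the library; it only keeps a short body from being mistaken for a complete one.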
I'd like to chime in with my two pence. I was alerted to this issue by @vadmium yesterday when I was troubleshooting issue #129264. What brought me here was Ansible Ultimately I found that I've got problems with my ISP; some sort of connection optimisation platform is sitting between me and the server and is closing connections before all of the data has been transmitted. Ultimately I need to work with the ISP to get these issues resolved (not convinced that I'll have much luck there!) but I do fundamentally hold that a truncated HTTP response should not be taken as good and should throw some sort of exception. |
Thank you for the feedback! The case with
Somebody considered that this could break compatibility; I wonder whether such a change is acceptable for a feature release. (Lines 480 to 483 in d89a5f6)
|
FYI, I gave this PR a test, and it doesn't help in the code path Ansible takes to get a truncated read that doesn't result in IncompleteRead. Having never submitted an Issue or PR to Python before, what would you suggest I do here? Should I propose a further change to this PR or should I submit my own? (I'm not sure of the efficacy of either since this PR is nearly a year old.) |
I have experienced a related issue today, similar to #129264, which is now unfortunately closed. I too am trying to download a file through HTTP POST, by doing If I instead do I can reliably reproduce this error on both my home network (optical until Wi-Fi AP) and broadband mobile connection (5G, tethered via my cell phone), corresponding to different ISPs/operators. Tested with Python 3.12.8 and 3.13.0

Minimal reproducer below:

import json
import shutil
from urllib.request import Request, urlopen
from tempfile import NamedTemporaryFile
from zipfile import ZipFile

USE_COPYFILEOBJ_BUG = True  # set to False to see the difference

headers = {
    'Accept': 'application/zip',
    'Content-Type': 'application/json; charset=UTF-8',
}
body = {
    "accessions": ["GCF_000010285.1", "GCF_007677595.1"],
    "include_annotation_type": ["GENOME_FASTA"]
}
data = json.dumps(body, indent=2).encode("UTF-8")
req = Request("https://api.ncbi.nlm.nih.gov/datasets/v2/genome/download", headers=headers, data=data)

with NamedTemporaryFile() as tempfile, urlopen(req) as urlstream, open(tempfile.name, "wb") as filestream:
    if USE_COPYFILEOBJ_BUG:
        shutil.copyfileobj(urlstream, filestream)
    else:
        filestream.write(urlstream.read())
    with open(tempfile.name, 'rb') as tempstream:
        print(f'{len(tempstream.read())=}')  # prints either 1572864 (wrong) or 1573406 (correct)
    with ZipFile(tempfile.name) as zipfile:  # this raises when using copyfileobj
        pass
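One thing that may be worth ruling out with this reproducer, independent of http.client: if the size check runs while filestream is still open, any data still held in the file object's write buffer is not yet on disk. The two reported sizes differ by 542 bytes, and 1572864 is exactly 24 × 65536. The sketch below is an assumption-laden illustration, not a diagnosis. It uses no network, the names `CHUNK`, `TOTAL` and `sizes_before_and_after_flush` are made up for the example, and the 64 KiB chunk size and the explicit 8192-byte buffer are assumptions chosen to mimic a chunked copy.

```python
import os
import tempfile

CHUNK = 64 * 1024          # assumed copy chunk size
TOTAL = 24 * CHUNK + 542   # 1573406, the "correct" size from the reproducer


def sizes_before_and_after_flush():
    """Write TOTAL bytes in chunks and measure the file before and after flush()."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "payload.bin")
        # Fixed buffer size so the result does not depend on the filesystem.
        with open(path, "wb", buffering=8192) as out:
            remaining = TOTAL
            while remaining:
                n = min(CHUNK, remaining)
                out.write(b"\0" * n)
                remaining -= n
            before = os.path.getsize(path)  # small final write may still be buffered
            out.flush()
            after = os.path.getsize(path)
    return before, after


if __name__ == "__main__":
    before, after = sizes_before_and_after_flush()
    print(f"{before=} {after=}")
```

If the on-disk size is short only before the flush, calling filestream.flush() (or closing filestream) before reopening the file would separate a buffering effect from a genuinely truncated response.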
Bug report
Bug description:
Unlike http.client.HTTPResponse.read, read1 and readline do not raise IncompleteRead when the content length is known and a connection is closed before everything has been read.

FYI, these two methods have had a common issue in the past: #113199.
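The difference can be observed without a network. This is a minimal sketch, not the reporter's own test case: `FakeSocket`, `TRUNCATED` and `drain` are hypothetical names introduced for illustration, and the fake socket relies on the assumption that HTTPResponse only needs a makefile() method. The canned response declares 10 body bytes but delivers only 3.

```python
import io
from http.client import HTTPResponse, IncompleteRead

# Headers promise 10 body bytes, but only 3 arrive before the stream ends.
TRUNCATED = b"HTTP/1.1 200 OK\r\nContent-Length: 10\r\n\r\nabc"


class FakeSocket:
    """Hypothetical stand-in for a socket; only makefile() is provided."""

    def __init__(self, payload):
        self._payload = payload

    def makefile(self, mode, *args, **kwargs):
        return io.BytesIO(self._payload)


def drain(method_name):
    """Call the named read method until EOF.

    Returns (data_seen, raised), where raised says whether IncompleteRead
    was raised at any point.
    """
    resp = HTTPResponse(FakeSocket(TRUNCATED), method="GET")
    resp.begin()
    reader = getattr(resp, method_name)
    chunks = []
    try:
        while True:
            chunk = reader()
            if not chunk:
                return b"".join(chunks), False  # plain EOF, no exception
            chunks.append(chunk)
    except IncompleteRead as exc:
        chunks.append(exc.partial)
        return b"".join(chunks), True


if __name__ == "__main__":
    for name in ("read", "read1", "readline"):
        data, raised = drain(name)
        print(f"{name}: data={data!r} IncompleteRead raised={raised}")
```

On the versions listed below, read should report the exception while read1 and readline end with an ordinary empty result; the outcome for the latter two would change on any version that includes a fix.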
CPython versions tested on:
3.8, 3.9, 3.10, 3.11, 3.12, 3.13, CPython main branch
Operating systems tested on:
Linux, macOS, Windows
Linked PRs
HTTPResponse.read1 and readline raise IncompleteRead #115998