Skip to content

Cache final file from resume retry process #13441

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

notatallshaw
Copy link
Member

Fixes #13440

Not sure how elegant this fix is, I don't have a lot of expertise here, but it was fairly easy to put together.

@notatallshaw notatallshaw force-pushed the cache-final-file-from-resume-retry branch from d81fb05 to 17f3703 Compare June 23, 2025 23:35
@ichard26
Copy link
Member

Ah, this doesn't look that bad. Thanks for working on this!

The main limitation with this approach is that we load the entire file in memory before writing it to the cache. This used to be an annoying issue (I would link examples, but I'm on a phone), but a few years ago, the caching implementation was rewritten to stream the response body.

It'd be preferable to avoid reintroducing the memory usage issue, which means the synthetic response needs to be a streaming response.

I can take a look at retooling this in a few days if you don't get to it.

@notatallshaw
Copy link
Member Author

I can take a look at retooling this in a few days if you don't get to it.

I'll take another look, my initial naive approach didn't work, which is why I rewrote it to read the whole file at once. But if I don't respond with anything in the next couple of days feel free to take a go.

@notatallshaw notatallshaw force-pushed the cache-final-file-from-resume-retry branch from 5076580 to ba1ee52 Compare June 24, 2025 01:48
@notatallshaw
Copy link
Member Author

Okay, I added streaming support by extending the functionality of SafeFileCache, not at all sure if this is the right approach or a hack, but works without significant addition of complexity.

Copy link
Member

@ichard26 ichard26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking quite good! Thank you so much for working on this, especially with such a quick turn-around!

I have some suggestions to harden the implementation among other changes.

Comment on lines 374 to 384
# Mock the session's adapters to have a cache controller
mock_adapter = MagicMock(spec=CacheControlAdapter)
mock_controller = MagicMock()
mock_adapter.controller = mock_controller
mock_controller.cache_url = MagicMock(return_value="cache_key")
mock_controller.serializer = MagicMock()
mock_controller.serializer.dumps = MagicMock(return_value=b"serialized_data")

# Mock the cache to be a SafeFileCache
mock_cache = MagicMock(spec=SafeFileCache)
mock_adapter.cache = mock_cache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you just not create a pip session with caching already enabled? PipSession(cache=str(tmpdir)) I get this is an unit test, but practically, I'd prefer if the we tested the actual implementation to avoid mismatches that hide issues down the line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm now using a PipSession object to test here.

@ichard26 ichard26 added this to the 25.2 milestone Jun 24, 2025
Copy link
Member

@ichard26 ichard26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@notatallshaw
Copy link
Member Author

@ichard26 Thanks for your quick review!

I ran some more verification scenarios locally (such as starting with no internet connection and then losing it intermitently during the download), and I couldn't find any edge case it failed on.

Going to merge it now rather than waiting any longer.

@notatallshaw notatallshaw merged commit f68e7c9 into pypa:main Jun 25, 2025
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Download is not cached when using --resume-retries option
2 participants