@CyMule CyMule commented Jul 30, 2025

No description provided.

CyMule added a commit that referenced this pull request Jul 31, 2025
## Problem

S3 downloads were sometimes failing with `NotADirectoryError` and
`FileExistsError` when S3 buckets contained objects with conflicting
naming patterns that cannot be represented in traditional filesystem
hierarchies.

**Example conflict:**
- S3 object: `foo` (file)
- S3 object: `foo/documents` (file requiring `foo` to be a directory)

This created a race condition in which the download order determined
success or failure.
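
The conflict can be reproduced without S3 at all. The following is a minimal sketch, assuming each object key is mirrored directly as a relative path under one download root; `write_object` and `try_order` are hypothetical helpers written for illustration, not code from this PR.

```python
import os
import tempfile


def write_object(root, key, data=b""):
    """Write one object under root, mirroring its key as a relative path."""
    path = os.path.join(root, key)
    # Create parent directories for keys such as "foo/documents".
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)


def try_order(keys):
    """Write keys in order into a fresh directory.

    Returns the name of the first OSError subclass raised, or None on success.
    """
    with tempfile.TemporaryDirectory() as root:
        try:
            for key in keys:
                write_object(root, key)
        except OSError as exc:
            return type(exc).__name__
    return None


if __name__ == "__main__":
    print(try_order(["foo", "foo/documents"]))
    print(try_order(["foo/documents", "foo"]))
```

In this sketch, writing `foo` first makes the later `os.makedirs` call for `foo/documents` raise `FileExistsError`, because `foo` already exists as a regular file. Writing in the opposite order fails at a different step and with a different exception, so which object fails, and how, depends on ordering.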

## Solution

Use `tempfile` to create a unique download path for each S3 object:

**Before:**
```
S3: "foo" → Local: /downloads/foo
S3: "foo/documents" → Local: /downloads/foo/documents
Conflict: foo cannot be both file and directory
```

**After:**
```
S3: "foo" → Local: /downloads/a1b2c3d4e5f6/foo
S3: "foo/documents" → Local: /downloads/9g8h7i6j5k4l/documents
No conflicts: Each file gets unique directory
```
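
One way to get the layout shown above is `tempfile.mkdtemp`, which creates a uniquely named directory under a chosen parent. This is a sketch under assumptions rather than the code from the PR: `unique_download_path` is a hypothetical helper, and it assumes the local file name is the last path component of the key.

```python
import os
import tempfile


def unique_download_path(downloads_root, key):
    """Return a local path for key inside its own freshly created directory."""
    os.makedirs(downloads_root, exist_ok=True)
    # mkdtemp creates a new, uniquely named directory under downloads_root.
    unique_dir = tempfile.mkdtemp(dir=downloads_root)
    # Assumption: keep only the final component of the key as the file name.
    return os.path.join(unique_dir, os.path.basename(key))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for key in ["foo", "foo/documents"]:
            path = unique_download_path(root, key)
            with open(path, "wb") as f:
                f.write(b"")
            print(key, "->", path)
```

Because every object lands in its own directory, `foo` and `foo/documents` never share a parent, so the write order no longer matters. A key ending in `/` would yield an empty file name here and would need separate handling.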

## Future Work

This PR targets only S3 downloads. I think it would make sense to
use tempfiles for all downloads (as in [PR
#571](#571)),
but that requires more extensive changes to implement cleanly. This fix
provides immediate relief from the path conflict issues while we work on
the more comprehensive tempfile solution.