@prantogg prantogg commented Jan 3, 2026

This PR addresses #70. It enables direct S3 streaming for generated data, so large-scale dataset generation no longer needs local storage. When --output-dir is an S3 URI, it is detected automatically and the output is routed to a streaming multipart upload implementation.

spatialbench-cli --scale-factor 1000 --output-dir s3://my-bucket/spatialbench/sf1000
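
For reviewers, the URI detection could look roughly like the sketch below. This is an illustration only, not the code in this PR: `OutputTarget` and `parse_output_dir` are hypothetical names, and the sketch assumes that anything not starting with `s3://` is treated as a local path.

```rust
/// Where generated output should go (hypothetical type, not taken from the PR).
#[derive(Debug, PartialEq)]
enum OutputTarget {
    /// A directory on the local filesystem.
    Local(String),
    /// An S3 bucket plus a key prefix inside it.
    S3 { bucket: String, prefix: String },
}

/// Classify an --output-dir value by its scheme.
/// Assumption: a bare "s3://" with no bucket falls back to the local variant.
fn parse_output_dir(dir: &str) -> OutputTarget {
    match dir.strip_prefix("s3://") {
        Some(rest) if !rest.is_empty() => {
            // Everything before the first '/' is the bucket, the rest is the prefix.
            let (bucket, prefix) = rest.split_once('/').unwrap_or((rest, ""));
            OutputTarget::S3 {
                bucket: bucket.to_string(),
                prefix: prefix.trim_end_matches('/').to_string(),
            }
        }
        _ => OutputTarget::Local(dir.to_string()),
    }
}
```

With the command above, `parse_output_dir("s3://my-bucket/spatialbench/sf1000")` would yield the bucket `my-bucket` and the prefix `spatialbench/sf1000`.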
  • Implements streaming S3 writer with Write trait compatibility.
  • Uses S3 multipart upload API with 32MB parts for efficient streaming. Small files (<5MB) use direct PUT requests.
  • Multipart completion is handled asynchronously via the Tokio runtime.
  • Built on the object_store crate's AmazonS3Builder.
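
To make the buffering behavior in the bullets above easier to picture, here is a minimal synchronous sketch of a part-buffering writer. It is not the implementation in this PR: `PartSink`, `StreamingWriter`, and `finish` are hypothetical names, the sink trait stands in for the real S3 client, and the Tokio-based asynchronous completion and the initial "create multipart upload" call are left out. The part size and small-file threshold are parameters so the 32MB and 5MB values from the description can be passed in.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for an S3 client; a real one would issue HTTP requests.
trait PartSink {
    fn put_object(&mut self, data: &[u8]) -> io::Result<()>;
    fn upload_part(&mut self, part_number: usize, data: &[u8]) -> io::Result<()>;
    fn complete_multipart(&mut self, parts: usize) -> io::Result<()>;
}

/// Buffers written bytes and hands them to the sink in fixed-size parts.
struct StreamingWriter<S: PartSink> {
    sink: S,
    buf: Vec<u8>,
    part_size: usize,       // 32MB in the PR description
    small_threshold: usize, // 5MB in the PR description
    parts_sent: usize,
}

impl<S: PartSink> StreamingWriter<S> {
    fn new(sink: S, part_size: usize, small_threshold: usize) -> Self {
        Self { sink, buf: Vec::new(), part_size, small_threshold, parts_sent: 0 }
    }

    /// Finish the upload and return the sink.
    /// Small files go out as a single PUT; otherwise the remaining
    /// bytes are sent as a last part and the upload is completed.
    fn finish(mut self) -> io::Result<S> {
        if self.parts_sent == 0 && self.buf.len() < self.small_threshold {
            self.sink.put_object(&self.buf)?;
        } else {
            if !self.buf.is_empty() {
                self.parts_sent += 1;
                self.sink.upload_part(self.parts_sent, &self.buf)?;
            }
            self.sink.complete_multipart(self.parts_sent)?;
        }
        Ok(self.sink)
    }
}

impl<S: PartSink> Write for StreamingWriter<S> {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.buf.extend_from_slice(data);
        // Ship every full part as soon as it is available.
        while self.buf.len() >= self.part_size {
            let part: Vec<u8> = self.buf.drain(..self.part_size).collect();
            self.parts_sent += 1;
            self.sink.upload_part(self.parts_sent, &part)?;
        }
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Nothing to do here: partial parts are only sent in finish().
        Ok(())
    }
}
```

Because the type implements `Write`, existing code that writes to a local file could write to it unchanged; the only extra step in this sketch is the explicit `finish()` call at the end, which is where the small-file PUT or the multipart completion happens.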
