Description
What happens?
When executing a SELECT statement in DuckDB-Wasm on a data source accessed via a pre-signed URL (especially those created for GET requests), the operation fails due to CORS errors. This prevents querying data stored in locations that require pre-signed URLs for access.
To Reproduce
- Use the following pre-signed URL for a Parquet file (valid for 7 days from 2024-09-14):
https://91ff95bcb91fbfa1b1c5c356262b1fe4.r2.cloudflarestorage.com/techtalk/world_populations.parquet?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=0d9126cf0fed3ae3c00f20ceb2bb97c3%2F20240914%2Fauto%2Fs3%2Faws4_request&X-Amz-Date=20240914T091120Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=1bddf8fcc77e83aa20ffa827e771cea7310af373354af06c5ac58f2e181f0182
- In DuckDB-Wasm or at shell.duckdb.org, attempt to execute a SELECT statement on this data source using the pre-signed URL.
Example SQL query:
SELECT * FROM parquet_scan('https://91ff95bcb91fbfa1b1c5c356262b1fe4.r2.cloudflarestorage.com/techtalk/world_populations.parquet?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=0d9126cf0fed3ae3c00f20ceb2bb97c3%2F20240914%2Fauto%2Fs3%2Faws4_request&X-Amz-Date=20240914T091120Z&X-Amz-Expires=604800&X-Amz-SignedHeaders=host&X-Amz-Signature=1bddf8fcc77e83aa20ffa827e771cea7310af373354af06c5ac58f2e181f0182') LIMIT 10;
- Observe that the query fails due to a CORS error, and the data is not accessible.
Note: I tried this query on shell.duckdb.org, and it failed to access the data.
Additional context:
The current behavior seems to be:
- DuckDB-Wasm attempts a HEAD request on the pre-signed URL.
- The HEAD request fails with a CORS error.
- An exception is thrown by xhr.send(null), which is not caught.
- The code for performing a range GET request is never reached.
- The SELECT statement fails, unable to access the data.
This behavior was observed both in a local DuckDB-Wasm implementation and on shell.duckdb.org.
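The sequence listed above can be modelled in a few lines. This is only an illustration of the control flow as I understand it, not DuckDB-Wasm's actual code: `Send`, `openRemoteFile`, and `corsBlockedHead` are hypothetical names, and the transport is simulated so the snippet runs without a network.

```typescript
// Hypothetical stand-in for a synchronous request; not DuckDB-Wasm's real API.
type Send = (method: "HEAD" | "GET", url: string, range?: string) => number;

// Models the reported order of operations: HEAD first, range GET second.
function openRemoteFile(url: string, send: Send, log: string[]): void {
  log.push("HEAD");
  send("HEAD", url); // a throw here propagates straight to the caller...
  log.push("GET"); // ...so the range GET below is never attempted
  send("GET", url, "bytes=0-0");
}

// Simulated transport: HEAD fails by throwing, as a blocked synchronous XHR would.
const corsBlockedHead: Send = (method) => {
  if (method === "HEAD") throw new Error("NetworkError: blocked by CORS");
  return 206;
};

const steps: string[] = [];
let failure = "";
try {
  openRemoteFile("https://example.invalid/data.parquet", corsBlockedHead, steps);
} catch (e) {
  failure = (e as Error).message;
}
console.log(steps, failure);
```

Only the HEAD step is recorded before the error surfaces, which matches the observation that the range GET code path is never reached.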
Importantly, the bucket's CORS policy is set according to the documentation:
[
  {
    "AllowedOrigins": [
      "*"
    ],
    "AllowedMethods": [
      "GET",
      "HEAD"
    ],
    "AllowedHeaders": [
      "*"
    ],
    "ExposeHeaders": [
      "*"
    ],
    "MaxAgeSeconds": 3000
  }
]
Despite this CORS policy allowing both GET and HEAD methods from any origin, the issue persists. This suggests that the problem might be related to how DuckDB-Wasm handles the pre-signed URLs rather than the bucket's CORS configuration.
A possible solution might be to skip the HEAD request for pre-signed URLs or implement exception handling to proceed with the range GET request even if the HEAD request fails.
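A rough sketch of both ideas together, assuming the file size can be recovered from the Content-Range header of a one-byte range GET. All names here (`Transport`, `ProbeResponse`, `isPresignedUrl`, `probeFileSize`) are hypothetical and the transport is injected, so this does not reflect DuckDB-Wasm's real internals. Detecting a pre-signed URL by its X-Amz-Signature query parameter is also an assumption that only covers SigV4-style URLs.

```typescript
// Minimal, hypothetical transport types; DuckDB-Wasm's internals differ.
interface ProbeResponse {
  status: number;
  header(name: string): string | null;
}
type Transport = (
  method: "HEAD" | "GET",
  url: string,
  headers: Record<string, string>,
) => ProbeResponse;

// Assumption: a URL counts as pre-signed when its query carries a SigV4 signature.
function isPresignedUrl(url: string): boolean {
  const query = url.split("?")[1] ?? "";
  return query.split("&").some((pair) => pair.toLowerCase().startsWith("x-amz-signature="));
}

// Parse the total size from e.g. "bytes 0-0/12345"; null when absent or unknown.
function totalFromContentRange(value: string | null): number | null {
  const match = value === null ? null : /^bytes \d+-\d+\/(\d+)$/.exec(value.trim());
  return match ? Number(match[1]) : null;
}

// Ask for a single byte and read the full size from Content-Range.
function rangeProbe(url: string, send: Transport): number {
  const res = send("GET", url, { Range: "bytes=0-0" });
  const total = res.status === 206 ? totalFromContentRange(res.header("Content-Range")) : null;
  if (total === null) throw new Error(`range probe failed with status ${res.status}`);
  return total;
}

function probeFileSize(url: string, send: Transport): number {
  if (!isPresignedUrl(url)) {
    try {
      const head = send("HEAD", url, {});
      const length = head.header("Content-Length");
      if (head.status === 200 && length !== null) return Number(length);
    } catch {
      // HEAD was blocked (for example by CORS); fall through to the range GET.
    }
  }
  return rangeProbe(url, send);
}

// Simulated transport: HEAD always throws, the range GET succeeds.
const fake: Transport = (method) => {
  if (method === "HEAD") throw new Error("NetworkError");
  return {
    status: 206,
    header: (name) => (name.toLowerCase() === "content-range" ? "bytes 0-0/12345" : null),
  };
};

const plainSize = probeFileSize("https://example.invalid/data.parquet", fake);
const presignedSize = probeFileSize("https://example.invalid/data.parquet?X-Amz-Signature=abc", fake);
console.log(plainSize, presignedSize);
```

In a browser, reading Content-Range from a cross-origin response requires the bucket to expose that header, which the ExposeHeaders entry in the policy above appears to cover. Skipping HEAD entirely for pre-signed URLs avoids one failing round trip, while the try/catch keeps other hosts that reject HEAD working as well.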
Browser/Environment:
Chrome 128.0.6613.138
Device:
M2 Macbook Air
DuckDB-Wasm Version:
1.28.1-dev278.0
DuckDB-Wasm Deployment:
shell.duckdb.org
Full Name:
Koji Mizoguchi
Affiliation:
TechTalk Inc.