
Conversation

@steveloughran
Contributor

@steveloughran steveloughran commented Aug 18, 2025

How was this patch tested?

Testing in progress; still trying to get the ITests working.

JUnit5 update complicates things here, as it highlights that minicluster tests aren't working.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@pan3793
Member

pan3793 commented Aug 19, 2025

JUnit5 update complicates things here, as it highlights that minicluster tests aren't working.

I found hadoop-client-runtime and hadoop-client-minicluster broken during integration with Spark; HADOOP-19652 plus YARN-11824 recover that. Is it the same issue?

@steveloughran
Contributor Author

@pan3793 maybe.

What is unrelated: out of the box, the SDK doesn't do bulk delete with third-party stores which support it (Dell ECS).

org.apache.hadoop.fs.s3a.AWSBadRequestException: bulkDelete on job-00-fork-0001/test/org.apache.hadoop.fs.contract.s3a.ITestS3AContractBulkDelete: software.amazon.awssdk.services.s3.model.InvalidRequestException: Missing required header for this request: Content-MD5 (Service: S3, Status Code: 400, Request ID: 0c07c87d:196d43d824a:d5329:91d, Extended Request ID: 85e1d41b57b608d4e58222b552dea52902e93b05a12f63f54730ae77769df8d1) (SDK Attempt Count: 1):InvalidRequest: Missing required header for this request: Content-MD5 (Service: S3, Status Code: 400, Request ID: 0c07c87d:196d43d824a:d5329:91d, Extended Request ID: 85e1d41b57b608d4e58222b552dea52902e93b05a12f63f54730ae77769df8d1) (SDK Attempt Count: 1)
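
For context, a minimal sketch of the workaround this thread later converges on: restoring the old Content-MD5 behaviour with the SDK's legacy MD5 plugin when building the client (the plugin appears later in this PR; its package name is the one assumption here):

```java
// Re-add the Content-MD5 headers that third-party stores such as
// Dell ECS require on DeleteObjects requests.
S3Client s3 = S3Client.builder()
    .region(Region.US_WEST_2)
    .addPlugin(LegacyMd5Plugin.create())
    .build();
```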

@steveloughran
Contributor Author

@pan3793 no, it's lifecycle related. The test needs to set up that minicluster before the test cases, and that's somehow not happening.
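
A plausible shape of the failure, sketched under the assumption that the JUnit 5 migration left a JUnit 4-style fixture behind (class and method names illustrative):

```java
// Under JUnit 5, suite-level setup must be a static @BeforeAll method;
// a leftover JUnit 4 @BeforeClass is silently ignored, so the
// minicluster is never started before the test cases run.
private static MiniDFSCluster cluster;

@BeforeAll  // was @BeforeClass under JUnit 4
public static void setupCluster() throws IOException {
  cluster = new MiniDFSCluster.Builder(new Configuration()).build();
  cluster.waitActive();
}
```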

@steveloughran steveloughran force-pushed the s3/HADOOP-19654-aws-sdk-2.32 branch from 5b9a7e3 to efd34a0 Compare August 25, 2025 21:36
@steveloughran steveloughran marked this pull request as draft August 25, 2025 21:37
@steveloughran
Contributor Author

regressions

everywhere

No logging. Instead we get

SLF4J: Failed to load class "org.slf4j.impl.StaticMDCBinder".
SLF4J: Defaulting to no-operation MDCAdapter implementation.
SLF4J: See http://www.slf4j.org/codes.html#no_static_mdc_binder for further details.

ITestS3AContractAnalyticsStreamVectoredRead failures: stream closed.

More on this once I've looked at it. If it is an SDK issue, it's a major regression, though it may be something needing changes in the AAL library.

s3 express

[ERROR]   ITestTreewalkProblems.testDistCp:319->lambda$testDistCp$3:320 [Exit code of distcp -useiterator -update -delete -direct s3a://stevel--usw2-az1--x-s3/job-00-fork-0005/test/testDistCp/src s3a://stevel--usw2-az1--x-s3/job-00-fork-0005/test/testDistCp/dest]    

Assumption: now that the store has lifecycle rules, you don't get prefix listings when there's an in-progress upload.

Fix: change the test, but also add a path capability warning of the inconsistency. This is good.
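
A sketch of the test-side change (hasPathCapability() is the real FileSystem probe; the capability name below is hypothetical, standing in for whatever this PR adds):

```java
// Relax or skip the listing assertion on stores which hide the
// prefixes of in-progress multipart uploads.
if (fs.hasPathCapability(path, "fs.s3a.capability.incomplete.uploads.hidden")) {
  // lifecycle-managed store: LIST omits prefixes of pending uploads
  skip("store hides in-progress upload prefixes");
}
```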

Operation costs/auditing count an extra HTTP request, so cost tests fail. I suspect it is always calling CreateSession, but without logging I can't be sure.

@steveloughran steveloughran force-pushed the s3/HADOOP-19654-aws-sdk-2.32 branch from efd34a0 to 6a7e6d9 Compare August 26, 2025 20:52
@steveloughran steveloughran force-pushed the s3/HADOOP-19654-aws-sdk-2.32 branch from 6a7e6d9 to cc31e5b Compare September 15, 2025 17:47
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+0 🆗 shelldocs 0m 1s Shelldocs was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 10 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 10m 31s Maven dependency ordering for branch
+1 💚 mvninstall 24m 17s trunk passed
+1 💚 compile 9m 23s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 7m 50s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 2m 0s trunk passed
+1 💚 mvnsite 19m 57s trunk passed
+1 💚 javadoc 5m 17s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 4m 37s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+0 🆗 spotbugs 0m 11s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 40m 37s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 40s Maven dependency ordering for patch
+1 💚 mvninstall 23m 52s the patch passed
+1 💚 compile 8m 3s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 8m 3s the patch passed
+1 💚 compile 7m 24s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 7m 24s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 1m 54s /results-checkstyle-root.txt root: The patch generated 9 new + 42 unchanged - 5 fixed = 51 total (was 47)
+1 💚 mvnsite 11m 32s the patch passed
+1 💚 shellcheck 0m 0s No new issues.
+1 💚 javadoc 5m 26s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 5m 7s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+0 🆗 spotbugs 0m 15s hadoop-project has no data from spotbugs
+1 💚 shadedclient 39m 23s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 678m 20s /patch-unit-root.txt root in the patch passed.
+1 💚 asflicense 1m 8s The patch does not generate ASF License warnings.
913m 40s
Reason Tests
Failed junit tests hadoop.yarn.server.router.subcluster.fair.TestYarnFederationWithFairScheduler
hadoop.yarn.server.router.webapp.TestFederationWebApp
hadoop.yarn.server.router.webapp.TestRouterWebServicesREST
hadoop.mapreduce.v2.TestUberAM
hadoop.yarn.sls.appmaster.TestAMSimulator
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7882/6/artifact/out/Dockerfile
GITHUB PR #7882
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint shellcheck shelldocs
uname Linux 113d355d9ed2 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / cc31e5b
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7882/6/testReport/
Max. process+thread count 4200 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws . U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7882/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2 shellcheck=0.7.0
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@ahmarsuhail
Contributor

Thanks @steveloughran, PR looks good overall.

Are the failures in ITestS3AContractAnalyticsStreamVectoredRead intermittent? I've not been able to reproduce them; I am running the test on this SDK upgrade branch.

// disable create session so there's no need to
// add a role policy for it.
disableCreateSession(conf);
//disableCreateSession(conf);
Contributor

nit: can just cut this instead of commenting it out, since we're skipping these tests if S3 Express is enabled

// close the stream, should throw RemoteFileChangedException
RemoteFileChangedException exception = intercept(RemoteFileChangedException.class, stream::close);
assertS3ExceptionStatusCode(SC_412_PRECONDITION_FAILED, exception);
verifyS3ExceptionStatusCode(SC_412_PRECONDITION_FAILED, exception);
Contributor

Do you know what the difference is with the other tests here?

As in, why with S3 Express is it OK to assert that we'll get a 412, whereas the other tests will return a 200?

Contributor Author

Hey, it's your server code. Go see.

Contributor

Checked: the answer is that it's MPU that has the divergence; put object, which these tests do, will return a 412.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+0 🆗 shelldocs 0m 0s Shelldocs was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 10 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 10m 32s Maven dependency ordering for branch
+1 💚 mvninstall 23m 50s trunk passed
+1 💚 compile 8m 32s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 7m 30s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 1m 58s trunk passed
+1 💚 mvnsite 14m 30s trunk passed
+1 💚 javadoc 5m 33s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 5m 5s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+0 🆗 spotbugs 0m 15s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 38m 32s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 36s Maven dependency ordering for patch
+1 💚 mvninstall 23m 26s the patch passed
+1 💚 compile 8m 17s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 8m 17s the patch passed
+1 💚 compile 7m 15s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 7m 15s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 1m 58s /results-checkstyle-root.txt root: The patch generated 9 new + 42 unchanged - 5 fixed = 51 total (was 47)
+1 💚 mvnsite 12m 9s the patch passed
+1 💚 shellcheck 0m 0s No new issues.
+1 💚 javadoc 5m 27s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 5m 4s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+0 🆗 spotbugs 0m 15s hadoop-project has no data from spotbugs
+1 💚 shadedclient 38m 17s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 678m 56s /patch-unit-root.txt root in the patch passed.
+1 💚 asflicense 1m 11s The patch does not generate ASF License warnings.
905m 0s
Reason Tests
Failed junit tests hadoop.yarn.server.router.subcluster.fair.TestYarnFederationWithFairScheduler
hadoop.yarn.server.router.webapp.TestFederationWebApp
hadoop.yarn.server.router.webapp.TestRouterWebServicesREST
hadoop.mapreduce.v2.TestUberAM
hadoop.yarn.sls.appmaster.TestAMSimulator
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7882/7/artifact/out/Dockerfile
GITHUB PR #7882
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint shellcheck shelldocs
uname Linux 3b890eb50412 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 3351e41
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7882/7/testReport/
Max. process+thread count 4379 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws . U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7882/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2 shellcheck=0.7.0
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@apache apache deleted a comment from hadoop-yetus Sep 17, 2025
@apache apache deleted a comment from hadoop-yetus Sep 17, 2025
@steveloughran
Contributor Author

I've attached a log of a test run against an S3 Express bucket where the test ITestAWSStatisticCollection.testSDKMetricsCostOfGetFileStatusOnFile() is failing because the AWS SDK stats report 2 HTTP requests for the probe. I'd thought it was create-session related but it isn't: it looks like somehow the stream is broken. This happens reliably on every test run.

The relevant stuff is at line 564, where a HEAD request fails because the stream is broken: "end of stream".

2025-09-17 18:43:49,313 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "HEAD /test/testSDKMetricsCostOfGetFileStatusOnFile HTTP/1.1[\r][\n]"
2025-09-17 18:43:49,313 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "Host: stevel--usw2-az1--x-s3.s3express-usw2-az1.us-west-2.amazonaws.com[\r][\n]"
2025-09-17 18:43:49,313 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "amz-sdk-invocation-id: 1804bbcd-04de-cba8-8055-6a09917ca20d[\r][\n]"
2025-09-17 18:43:49,313 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "amz-sdk-request: attempt=1; max=3[\r][\n]"
2025-09-17 18:43:49,313 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "Authorization: AWS4-HMAC-SHA256 Credential=AKIA/20250917/us-west-2/s3express/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;host;referer;x-amz-content-sha256;x-amz-date, Signature=228a46bb1d008468d38afd0da0ed7b4c354ab12631a63bf4283cb23dc02527a3[\r][\n]"
2025-09-17 18:43:49,313 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "Referer: https://audit.example.org/hadoop/1/op_get_file_status/cf739331-1f2e-42dd-a5d9-f564d6023a23-00000008/?op=op_get_file_status&p1=test/testSDKMetricsCostOfGetFileStatusOnFile&pr=stevel&ps=282e3c5d-c1bd-4859-94b9-82e77ff225d1&id=cf739331-1f2e-42dd-a5d9-f564d6023a23-00000008&t0=1&fs=cf739331-1f2e-42dd-a5d9-f564d6023a23&t1=1&ts=1758131029311[\r][\n]"
2025-09-17 18:43:49,313 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "User-Agent: Hadoop 3.5.0-SNAPSHOT aws-sdk-java/2.33.8 md/io#sync md/http#Apache ua/2.1 api/S3#2.33.x os/Mac_OS_X#15.6.1 lang/java#17.0.8 md/OpenJDK_64-Bit_Server_VM#17.0.8+7-LTS md/vendor#Amazon.com_Inc. md/en_GB m/F,G hll/cross-region[\r][\n]"
2025-09-17 18:43:49,313 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "x-amz-content-sha256: UNSIGNED-PAYLOAD[\r][\n]"
2025-09-17 18:43:49,314 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "X-Amz-Date: 20250917T174349Z[\r][\n]"
2025-09-17 18:43:49,314 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "Connection: Keep-Alive[\r][\n]"
2025-09-17 18:43:49,314 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-1 >> "[\r][\n]"
2025-09-17 18:43:49,314 [setup] DEBUG http.wire (Wire.java:wire(87)) - http-outgoing-1 << "end of stream"
2025-09-17 18:43:49,314 [setup] DEBUG awssdk.request (LoggerAdapter.java:debug(125)) - Retryable error detected. Will retry in 51ms. Request attempt number 1
software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: The target server failed to respond
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:130)
	at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)

The second request always works.

2025-09-17 18:43:49,672 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "HEAD /test/testSDKMetricsCostOfGetFileStatusOnFile HTTP/1.1[\r][\n]"
2025-09-17 18:43:49,673 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "Host: stevel--usw2-az1--x-s3.s3express-usw2-az1.us-west-2.amazonaws.com[\r][\n]"
2025-09-17 18:43:49,673 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "amz-sdk-invocation-id: 1804bbcd-04de-cba8-8055-6a09917ca20d[\r][\n]"
2025-09-17 18:43:49,673 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "amz-sdk-request: attempt=2; max=3[\r][\n]"
2025-09-17 18:43:49,673 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "Authorization: AWS4-HMAC-SHA256 Credential=AKIA/20250917/us-west-2/s3express/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;host;referer;x-amz-content-sha256;x-amz-date, Signature=920d981fad319228c969f5df7f5c1a3c7e4d3c0e2f45ff53bba73e6cf47c5871[\r][\n]"
2025-09-17 18:43:49,673 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "Referer: https://audit.example.org/hadoop/1/op_get_file_status/cf739331-1f2e-42dd-a5d9-f564d6023a23-00000008/?op=op_get_file_status&p1=test/testSDKMetricsCostOfGetFileStatusOnFile&pr=stevel&ps=282e3c5d-c1bd-4859-94b9-82e77ff225d1&id=cf739331-1f2e-42dd-a5d9-f564d6023a23-00000008&t0=1&fs=cf739331-1f2e-42dd-a5d9-f564d6023a23&t1=1&ts=1758131029311[\r][\n]"
2025-09-17 18:43:49,673 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "User-Agent: Hadoop 3.5.0-SNAPSHOT aws-sdk-java/2.33.8 md/io#sync md/http#Apache ua/2.1 api/S3#2.33.x os/Mac_OS_X#15.6.1 lang/java#17.0.8 md/OpenJDK_64-Bit_Server_VM#17.0.8+7-LTS md/vendor#Amazon.com_Inc. md/en_GB m/F,G hll/cross-region[\r][\n]"
2025-09-17 18:43:49,673 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "x-amz-content-sha256: UNSIGNED-PAYLOAD[\r][\n]"
2025-09-17 18:43:49,673 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "X-Amz-Date: 20250917T174349Z[\r][\n]"
2025-09-17 18:43:49,674 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "Connection: Keep-Alive[\r][\n]"
2025-09-17 18:43:49,674 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 >> "[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "HTTP/1.1 200 OK[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "server: AmazonS3[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "x-amz-request-id: 01869434dd00019958c6871b05090b3f875a3c90[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "x-amz-id-2: 9GqfbNyMyUs6[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "etag: "6036aaaf62444466bf0a21cc7518f738"[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "accept-ranges: bytes[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "last-modified: Wed, 17 Sep 2025 17:43:49 GMT[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "x-amz-storage-class: EXPRESS_ONEZONE[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "content-type: application/octet-stream[\r][\n]"
2025-09-17 18:43:49,859 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "x-amz-server-side-encryption: AES256[\r][\n]"
2025-09-17 18:43:49,860 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "content-length: 0[\r][\n]"
2025-09-17 18:43:49,860 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "x-amz-expiration: NotImplemented[\r][\n]"
2025-09-17 18:43:49,860 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "date: Wed, 17 Sep 2025 17:43:48 GMT[\r][\n]"
2025-09-17 18:43:49,860 [setup] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-2 << "[\r][\n]"
2025-09-17 18:43:49,860 [setup] DEBUG awssdk.request (LoggerAdapter.java:debug(105)) - Received successful response: 200, Request ID: 

Either the request is being rejected (why?) or the connection has gone stale. But why should it happen at exactly the same place on every single test run?

org.apache.hadoop.fs.s3a.statistics.ITestAWSStatisticCollection-output.txt

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 34s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+0 🆗 shelldocs 0m 1s Shelldocs was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 11 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 12m 12s Maven dependency ordering for branch
+1 💚 mvninstall 40m 29s trunk passed
+1 💚 compile 15m 48s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 compile 14m 0s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 checkstyle 4m 18s trunk passed
+1 💚 mvnsite 21m 27s trunk passed
+1 💚 javadoc 9m 42s trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 7m 58s trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+0 🆗 spotbugs 0m 21s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 66m 25s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 1m 3s Maven dependency ordering for patch
+1 💚 mvninstall 40m 59s the patch passed
+1 💚 compile 15m 18s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javac 15m 18s the patch passed
+1 💚 compile 13m 50s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+1 💚 javac 13m 50s the patch passed
-1 ❌ blanks 0m 1s /blanks-eol.txt The patch has 1 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 4m 10s /results-checkstyle-root.txt root: The patch generated 7 new + 42 unchanged - 5 fixed = 49 total (was 47)
+1 💚 mvnsite 19m 25s the patch passed
+1 💚 shellcheck 0m 0s No new issues.
+1 💚 javadoc 9m 38s the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 7m 50s the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
+0 🆗 spotbugs 0m 21s hadoop-project has no data from spotbugs
+1 💚 shadedclient 66m 26s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 450m 14s /patch-unit-root.txt root in the patch failed.
+1 💚 asflicense 1m 21s The patch does not generate ASF License warnings.
832m 28s
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7882/8/artifact/out/Dockerfile
GITHUB PR #7882
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint shellcheck shelldocs
uname Linux 40fa101aa5ab 5.15.0-143-generic #153-Ubuntu SMP Fri Jun 13 19:10:45 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 661dc6e
Default Java Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7882/8/testReport/
Max. process+thread count 3559 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws . U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7882/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2 shellcheck=0.7.0
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@ahmarsuhail
Contributor

@steveloughran discovered completely by accident, but it's something to do with the checksumming code.

If you comment out these lines:

   //  builder.addPlugin(LegacyMd5Plugin.create());

    // do not do request checksums as this causes third-party store problems.
  //  builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED);

    // response checksum validation. Slow, even with CRC32 checksums.
//    if (parameters.isChecksumValidationEnabled()) {
//      builder.responseChecksumValidation(ResponseChecksumValidation.WHEN_SUPPORTED);
//    }

the test will pass. Could be something to do with S3 Express not supporting MD5; will look into it.

@ahmarsuhail
Contributor

Specifically, it's this line: builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED); that causes this.

Comment that out, or change it to builder.requestChecksumCalculation(RequestChecksumCalculation.WHEN_SUPPORTED), and it passes.

My guess is it's something to do with S3 Express not supporting MD5, but for operations where RequestChecksumCalculation.WHEN_REQUIRED applies, the SDK calculates the MD5 and then S3 Express rejects it.

Have asked the SDK team.

@steveloughran
Contributor Author

OK, so maybe if for S3 Express stores we don't do the legacy MD5 plugin stuff, all is good?

  1. Does imply the far end is breaking the connection when it is unhappy; at least our unit tests found this stuff before the cost of every HEAD doubled.
  2. Maybe we should make the choice of checksums an enum with md5 the default, so it is something that can be turned off/changed in future.

While on the topic of S3 Express: is it now the case that, because there are lifecycle rules for cleanup, LIST calls don't return prefixes of paths with incomplete uploads? If so I will need to change production code and the test, with a separate JIRA for that for completeness.

@ahmarsuhail
Contributor

for s3express stores we don't do legacy MD5 plugin stuff all is good

@steveloughran confirming with the SDK team; since the MD5 plugin is supposed to restore previous behaviour, the server rejecting the first request seems wrong. Let's see what they have to say.

LIST calls don't return prefixes of paths with incomplete uploads

Will check with S3 express team on this

@steveloughran steveloughran force-pushed the s3/HADOOP-19654-aws-sdk-2.32 branch from 661dc6e to aa8e814 Compare September 19, 2025 13:23
@steveloughran
Contributor Author

Thanks. I don't see it on tests against S3 with the 2.29.52 release, so something is changing with the requests made with the new SDK + MD5 stuff.

@ahmarsuhail
Contributor

@steveloughran not able to narrow this error down just yet; it looks like it's a combination of S3A's configuration of the S3 client plus these new MD5 changes.

  @Test
  public void testHead() throws Throwable {
   // S3Client s3Client = getFileSystem().getS3AInternals().getAmazonS3Client("test instance");

    S3Client s3Client = S3Client.builder().region(Region.US_EAST_1)
            .addPlugin(LegacyMd5Plugin.create())
            .requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED)
            .responseChecksumValidation(ResponseChecksumValidation.WHEN_SUPPORTED)
            .overrideConfiguration(o -> o.retryStrategy(b -> b.maxAttempts(1)))
            .build();

    s3Client.headObject(HeadObjectRequest.builder().bucket("<>")
            .key("<>").build());
  }

I see the failure when using the S3A client, and don't see it when I use a newly created client. So it's not just because of requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED).

Looking into it some more.

The S3 Express team said there have been no changes in LIST behaviour.

@ahmarsuhail
Contributor

Able to reproduce the issue outside of S3A. Basically did what would happen when you run a test in S3A:

  • a probe for the test/ directory, then create the test/ directory, then do the headObject() call.

The head fails, but if you comment out requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED) it works again.

No idea what's going on, but I have shared this local reproduction with the SDK team. And it rules out that it's something in the S3A code.

import software.amazon.awssdk.core.checksums.RequestChecksumCalculation;
import software.amazon.awssdk.core.checksums.ResponseChecksumValidation;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
// plus the import for LegacyMd5Plugin, whose package varies by SDK release

public class TestClass {

    S3Client s3Client;

    public TestClass() {
        this.s3Client = S3Client.builder().region(Region.US_EAST_1)
                .addPlugin(LegacyMd5Plugin.create())
                .requestChecksumCalculation(RequestChecksumCalculation.WHEN_REQUIRED)
                .responseChecksumValidation(ResponseChecksumValidation.WHEN_SUPPORTED)
                .overrideConfiguration(o -> o.retryStrategy(b -> b.maxAttempts(1)))
                .build();
    }

    public void testS3Express(String bucket, String key) {
        // LIST probe for the test/ directory
        s3Client.listObjectsV2(ListObjectsV2Request.builder()
                .bucket("<>")
                .maxKeys(2)
                .prefix("test/")
                .build());

        // HEAD of a path which does not exist yet; with WHEN_REQUIRED
        // enabled, this is where the failure shows up
        try {
            s3Client.headObject(HeadObjectRequest.builder().bucket("<>")
                    .key("test")
                    .build());
        } catch (Exception e) {
            System.out.println("Exception thrown: " + e.getMessage());
        }

        // create the test/ directory marker
        s3Client.putObject(PutObjectRequest
                .builder()
                .bucket("<>")
                .key("test/").build(), RequestBody.empty());

        // the final HEAD call
        s3Client.headObject(HeadObjectRequest.builder().bucket("<>")
                .key("<>")
                .build());
    }
}

* Now need to explicitly turn off checksum validation on downloads
  (slow)
* Default fs.s3a.create.checksum.algorithm is "" again: nothing.

Docs updated to try and explain this.
@steveloughran steveloughran force-pushed the s3/HADOOP-19654-aws-sdk-2.32 branch from 149e982 to 6416c20 Compare November 4, 2025 13:12
@steveloughran
Contributor Author

Seeing

[ERROR] Failures: 
[ERROR] org.apache.hadoop.fs.s3a.commit.integration.ITestS3ACommitterMRJob.test_200_execute
[INFO]   Run 1: PASS
[ERROR]   Run 2: ITestS3ACommitterMRJob.test_200_execute:342 [Files found in s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned] 
Expecting:                                                                                                                                                                  
 <["s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00001"]>                                                                    
to be equal to:                                                                                                                                                             
 <["s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00000",                                                                     
    "s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00001",                                                                    
    "s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00002",                                                                    
    "s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00003",                                                                    
    "s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00004",                                                                    
    "s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00005",                                                                    
    "s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00006",                                                                    
    "s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00007",                                                                    
    "s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00008",                                                                    
    "s3a://stevel-london/job-00-fork-0004/test/ITestS3ACommitterMRJob-execute-partitioned/part-m-00009"]>                                                                   
but was not.                                                                                                                                                                
[INFO]   Run 3: PASS

This doesn't happen standalone.

In the IDE I get Java 8 errors; probably need to log out and log in again now I've switched my default JVM to 17.

I'm not worrying about this.

* fs.s3a.ext.multipart.commit.consumes.upload.id =>
  fs.s3a.ext.test.multipart.commit.consumes.upload.id

  Makes clear it is for testing and not relevant in production.

* remove some whitespace
* declare "auto", "sdk" and "ec2" as reserved regions.
@steveloughran
Contributor Author

Test failure with ITestConnectionTimeouts and the store set to use the analytics stream. Fix: make sure we only use the classic stream here.

[ERROR] Failures: 
[ERROR]   ITestConnectionTimeouts.testObjectUploadTimeouts:254 Expected a java.lang.Exception to be thrown, but got the result: : "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123"                                                                                                               
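
The shape of that fix, as a sketch (fs.s3a.input.stream.type is the stream-selection option on trunk; applying it in this test's configuration is an assumption about where the change lands):

```java
// Pin the classic blocking input stream for the timeout tests, so the
// analytics stream's different read/close behaviour cannot change the timing.
Configuration conf = super.createConfiguration();
removeBaseAndBucketOverrides(conf, "fs.s3a.input.stream.type");
conf.set("fs.s3a.input.stream.type", "classic");
```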

@steveloughran
Contributor Author

I think I'm done here, @mukund-thakur and @ahmarsuhail .... Testing against corner-case deployments is finding corners of test configurations, not actual code failures.

I am getting failures of some MR jobs since the JDK/JUnit updates, with no obvious cause.
They work standalone, just not in batch runs. Going to see if this surfaces on
trunk and, if so, declare it unrelated.

@steveloughran steveloughran force-pushed the s3/HADOOP-19654-aws-sdk-2.32 branch from 5b4114a to ff3fade Compare November 4, 2025 18:43
@steveloughran
Contributor Author

Everything running an MR job can't spawn the process properly, as the launched JVM is always Java 8.
I've mostly fixed this, but some of the spawned MR jobs still play up in batch runs, though not standalone.

Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/hadoop/mapreduce/v2/app/MRAppMaster has been compiled by a more recent version of the Java Runtime (class file version 61.0), this version of the Java Runtime only recognizes class file versions up to 52.0
	at java.lang.ClassLoader.defineClass1(Native Method)
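
One way to pin the launched containers to the client's JVM, sketched with standard MR environment options (the JVM path is illustrative):

```xml
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64</value>
</property>
```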

One regression is a JUnit 5 regression: the configurable timeouts of scale tests are no longer being picked up, so slow tests are timing out.

[ERROR] org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.testThreadPoolCoolDown -- Time elapsed: 183.3 s <<< ERROR!
java.util.concurrent.TimeoutException: testThreadPoolCoolDown() timed out after 180 seconds
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        Suppressed: java.lang.InterruptedException: sleep interrupted
                at java.base/java.lang.Thread.sleep(Native Method)
                at org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.testThreadPoolCoolDown(ITestS3AConcurrentOps.java:218)
                at java.base/java.lang.reflect.Method.invoke(Method.java:568)
                ... 2 more
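
If the configurable timeouts are to reach JUnit 5, the platform's default-timeout parameter is the likely hook (a sketch: junit.jupiter.execution.timeout.default is a real JUnit 5 configuration parameter; wiring it to fs.s3a.scale.test.timeout via surefire is an assumption):

```xml
<systemPropertyVariables>
  <!-- raise JUnit 5's default declarative timeout so the configured
       scale-test timeout is not undercut ("s" = seconds) -->
  <junit.jupiter.execution.timeout.default>${fs.s3a.scale.test.timeout}s</junit.jupiter.execution.timeout.default>
</systemPropertyVariables>
```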

@steveloughran
Contributor Author

The intermittent test failures happen on trunk when running with java17; it's related to the parallel test runner. I am not investigating it here.

Contributor

@ahmarsuhail ahmarsuhail left a comment

+1, LGTM

Thanks @steveloughran! good to see the test stabilisation changes


<property>
<name>fs.s3a.request.md5.header</name>
<value>false</value>
Contributor

Confused - I thought this should be true, otherwise the SDK won't generate the MD5s, which was causing the compatibility issues with third-party stores?

Contributor Author

Yes, you are right; somehow got that wrong in my port. Will fix.
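
Presumably the corrected snippet then reads (matching the defaults in the final commit message below):

```xml
<property>
  <name>fs.s3a.request.md5.header</name>
  <value>true</value>
</property>
```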

<property>
<name>fs.s3a.region</name>
Contributor Author

cut this

+ added that "null" is also a choice of region name to avoid.
@steveloughran steveloughran merged commit 3695db2 into apache:trunk Nov 6, 2025
1 of 2 checks passed
Contributor

@mukund-thakur mukund-thakur left a comment

I was running tests locally. All good other than 2 failures.

//when to calculate request checksums.
final RequestChecksumCalculation checksumCalculation =
parameters.isChecksumCalculationEnabled()
? RequestChecksumCalculation.WHEN_SUPPORTED
Contributor

I was thinking we could have some docs around WHEN_SUPPORTED and WHEN_REQUIRED.

Contributor

It's confusing. What happens if it is required but not supported?

Contributor Author

Some operations require checksums (bulk delete?) and everything which implemented them has had to expect checksums. This new generation option, "when supported", is what broke things, as it really means "generate checksums on all requests". There are only two values in the enum, so the SDK always has to choose one.

when_supported

  • doesn't work for most third party stores
  • seems to break MPUs if you don't set a content checksum for put/posted data.

I think having a generation "true/false" is simpler for people to understand than the nuances of when_supported vs when_required.
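
A sketch of that mapping, following the diff hunk above (isChecksumCalculationEnabled() is from the diff; the surrounding wiring is assumed):

```java
// Map a boolean "generate checksums" switch onto the SDK's two-valued
// enum: WHEN_SUPPORTED checksums every request, WHEN_REQUIRED only the
// operations which mandate one.
final RequestChecksumCalculation checksumCalculation =
    parameters.isChecksumCalculationEnabled()
        ? RequestChecksumCalculation.WHEN_SUPPORTED
        : RequestChecksumCalculation.WHEN_REQUIRED;
builder.requestChecksumCalculation(checksumCalculation);
```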

Contributor

Yes, it should be just true/false. @ahmarsuhail could you please talk to the SDK team about this? Why did they do it this way?

@mukund-thakur
Contributor

[ERROR] ITestConnectionTimeouts.testObjectUploadTimeouts:247 [Duration of write] Expecting: <PT16.579S> to be less than: <PT10S>

This can be ignored.

[ERROR] ITestBucketTool.testRecreateTestBucketNonS3Express:145 Expected to find 'BucketAlreadyOwnedByYouException' but got unexpected exception: org.apache.hadoop.fs.s3a.AWSBadRequestException: create on s3a://mthakur-us-west-1: software.amazon.awssdk.services.s3.model.S3Exception: The us-east-2 location constraint is incompatible for the region specific endpoint this request was sent to. (Service: S3, Status Code: 400, Request ID: 305KXAWSAP63FKGA, Extended Request ID: BmoZx1BgwGVhyQB6vYFjEFHCkXA4UbowUMEMJhPeNs0SU8q2KPtIbaWg5sZ1gg2XB5LVJfBXQHw=) (SDK Attempt Count: 1):IllegalLocationConstraintException: The us-east-2 location constraint is incompatible for the region specific endpoint this request was sent to. (Service: S3, Status Code: 400, Request ID: 305KXAWSAP63FKGA, Extended Request ID: BmoZx1BgwGVhyQB6vYFjEFHCkXA4UbowUMEMJhPeNs0SU8q2KPtIbaWg5sZ1gg2XB5LVJfBXQHw=) (SDK Attempt Count: 1) at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:271) at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) at org.apache.hadoop.fs.s3a.tools.BucketTool.run(BucketTool.java:267) at org.apache.hadoop.fs.s3a.tools.BucketTool.exec(BucketTool.java:154) at org.apache.hadoop.fs.s3a.tools.ITestBucketTool.lambda$testRecreateTestBucketNonS3Express$1(ITestBucketTool.java:146)

This one I'm still trying to work out what is going on.

steveloughran added a commit to steveloughran/hadoop that referenced this pull request Nov 6, 2025
AWS SDK upgraded to 2.35.4.

This SDK has changed checksum/checksum headers handling significantly,
causing problems with third party stores, and, in some combinations
AWS S3 itself.

The S3A connector has retained old behavior; options to change
these settings are now available.

The default settings are chosen for maximum compatibility and performance.

fs.s3a.request.md5.header:       true
fs.s3a.checksum.generation:      false
fs.s3a.create.checksum.algorithm: ""

Consult the documentation for more details.

Contributed by Steve Loughran
@steveloughran
Contributor Author

ITestBucketTool.testRecreateTestBucketNonS3Express

was looking at this in the regions patch as it fails for sdk and ec2 regions.

We are trying to issue a create command and need to know the bucket region for the call. The test will have to explicitly ask for it via a HEAD call.
We are expecting an error FWIW; we could just look for two different error texts and accept them both: "BucketExists" and "IllegalLocationConstraint". That might be easiest.

(My PR currently skips the test if the region is sdk or ec2, as well as for the existing non-AWS/non-S3-Express options.)

@mukund-thakur
Contributor

Reading the stack trace, the reason for the failure is that both the s3client and the create bucket request should have the same configured region.
Debugging more, I found we should be setting the region here as well: https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/tools/ITestBucketTool.java#L115
Once set, it will get propagated as the location constraint in the create bucket request.

But even after setting it and verifying that the propagation is happening correctly, it fails for the same reason.

Yes, just accepting both errors will make the test fine, but I am still wondering what is going on.
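
The constraint being debugged, as a sketch (the request shapes are the SDK's real ones; bucket and region values are illustrative):

```java
// CreateBucket succeeds only when the client's configured region agrees
// with the request's location constraint; a mismatch yields the
// IllegalLocationConstraintException quoted above.
S3Client usWest1 = S3Client.builder().region(Region.US_WEST_1).build();
usWest1.createBucket(CreateBucketRequest.builder()
    .bucket("example-bucket")
    .createBucketConfiguration(CreateBucketConfiguration.builder()
        .locationConstraint(BucketLocationConstraint.US_WEST_1)
        .build())
    .build());
```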

steveloughran added a commit to steveloughran/hadoop that referenced this pull request Nov 10, 2025
AWS SDK upgraded to 2.35.4.
steveloughran added a commit that referenced this pull request Nov 11, 2025
AWS SDK upgraded to 2.35.4.