Skip to content

Conversation

@steveloughran
Copy link
Contributor

  • Pull out region resolution to new class RegionResolution for isolation and testing through TestRegionResolution.
  • Add new "sdk" and "ec2" mappings to hand down to SDK
  • Test in ITestS3AEndpointRegion to verify that either the SDK fails to resolve with an SDK message or it does resolve to something and the test assert fails instead.

How was this patch tested?

New tests.

  • running regression test with configured endpoint.
  • planning to run a test with a bucket set to "sdk" to see that the sdk handles it through my .aws/config setting.

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 34m 10s trunk passed
+1 💚 compile 0m 45s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 46s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 checkstyle 0m 35s trunk passed
+1 💚 mvnsite 0m 53s trunk passed
+1 💚 javadoc 0m 44s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 39s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
-1 ❌ spotbugs 1m 25s /branch-spotbugs-hadoop-tools_hadoop-aws-warnings.html hadoop-tools/hadoop-aws in trunk has 188 extant spotbugs warnings.
+1 💚 shadedclient 26m 9s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 38s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 37s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 37s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 23s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 23 new + 16 unchanged - 4 fixed = 39 total (was 20)
+1 💚 mvnsite 0m 43s the patch passed
-1 ❌ javadoc 0m 30s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 17 new + 928 unchanged - 0 fixed = 945 total (was 928)
-1 ❌ javadoc 0m 28s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 15 new + 850 unchanged - 0 fixed = 865 total (was 850)
+1 💚 spotbugs 1m 21s the patch passed
+1 💚 shadedclient 26m 6s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 18s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
103m 5s
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8058/1/artifact/out/Dockerfile
GITHUB PR #8058
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux be91d7e026c0 5.15.0-160-generic #170-Ubuntu SMP Wed Oct 1 10:06:56 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / ec5ae6b
Default Java Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8058/1/testReport/
Max. process+thread count 613 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8058/1/console
versions git=2.25.1 maven=3.9.11 spotbugs=4.9.7
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Copy link
Contributor Author

running full test suites with java17 and the test region set to sdk. as usual, it's that forked ITestS3ACommitterMRJob playing up, specifically with "null" somehow ending up as the region/endpoint.

[ERROR] org.apache.hadoop.fs.s3a.commit.integration.ITestS3ACommitterMRJob.test_200_execute -- Time elapsed: 9.958 s <<< FAILURE!
org.opentest4j.AssertionFailedError: 
Job job_1762435181334_0003 failed in state FAILED with cause Job setup failed : java.net.UnknownHostException: getFileStatus on s3a://stevel-london/test/ITestS3ACommitterMRJob-execute-magic/_SUCCESS: software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.:    software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.: stevel-london.s3.null.amazonaws.com
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
        at org.apache.hadoop.fs.s3a.impl.ErrorTranslation.wrapWithInnerIOE(ErrorTranslation.java:237)
        at org.apache.hadoop.fs.s3a.impl.ErrorTranslation.maybeExtractIOException(ErrorTranslation.java:207)
        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:212)
        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4061)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3971)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.deleteWithoutCloseCheck(S3AFileSystem.java:3570)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.delete(S3AFileSystem.java:3542)
        at org.apache.hadoop.fs.s3a.commit.impl.CommitOperations.deleteSuccessMarker(CommitOperations.java:467)
        at org.apache.hadoop.fs.s3a.commit.AbstractS3ACommitter.setupJob(AbstractS3ACommitter.java:611)
        at org.apache.hadoop.fs.s3a.commit.magic.MagicS3GuardCommitter.setupJob(MagicS3GuardCommitter.java:106)
        at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobSetup(CommitterEventHandler.java:255)
        at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:235)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.
        at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
        at software.amazon.awssdk.awscore.interceptor.HelpfulUnknownHostExceptionInterceptor.modifyException(HelpfulUnknownHostExceptionInterceptor.java:59)
        at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.modifyException(ExecutionInterceptorChain.java:181)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.ExceptionReportingUtils.runModifyException(ExceptionReportingUtils.java:54)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.ExceptionReportingUtils.reportFailureToInterceptors(ExceptionReportingUtils.java:38)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:39)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
        at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:210)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
        at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
        at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
        at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
        at software.amazon.awssdk.services.s3.DefaultS3Client.headObject(DefaultS3Client.java:7029)
        at software.amazon.awssdk.services.s3.DelegatingS3Client.lambda$headObject$56(DelegatingS3Client.java:5678)
        at software.amazon.awssdk.services.s3.internal.crossregion.S3CrossRegionSyncClient.invokeOperation(S3CrossRegionSyncClient.java:67)
        at software.amazon.awssdk.services.s3.DelegatingS3Client.headObject(DelegatingS3Client.java:5678)
        at org.apache.hadoop.fs.s3a.impl.S3AStoreImpl.lambda$headObject$3(S3AStoreImpl.java:688)
        at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
        at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:431)
        at org.apache.hadoop.fs.s3a.impl.S3AStoreImpl.headObject(S3AStoreImpl.java:675)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:3048)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:3028)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4043)
        ... 11 more
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: stevel-london.s3.null.amazonaws.com
        at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
        at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper2.setLastException(RetryableStageHelper2.java:226)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:65)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:36)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:53)
        at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:35)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:82)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:43)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
        ... 31 more
        Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 1 failure: Unable to execute HTTP request: stevel-london.s3.null.amazonaws.com: nodename nor servname provided, or not known
        Suppressed: software.amazon.awssdk.core.exception.SdkClientException: Request attempt 2 failure: Unable to execute HTTP request: stevel-london.s3.null.amazonaws.com
Caused by: java.net.UnknownHostException: stevel-london.s3.null.amazonaws.com
        at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:801)
        at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
        at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1385)
        at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1306)
        at software.amazon.awssdk.thirdparty.org.apache.http.impl.conn.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:45)
        at software.amazon.awssdk.thirdparty.org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:112)
        at software.amazon.awssdk.thirdparty.org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
        at software.amazon.awssdk.http.apache.internal.conn.ClientConnectionManagerFactory$DelegatingHttpClientConnectionManager.connect(ClientConnectionManagerFactory.java:86)
        at software.amazon.awssdk.thirdparty.org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
        at software.amazon.awssdk.thirdparty.org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
        at software.amazon.awssdk.thirdparty.org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
        at software.amazon.awssdk.thirdparty.org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
        at software.amazon.awssdk.thirdparty.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
        at software.amazon.awssdk.thirdparty.org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
        at software.amazon.awssdk.http.apache.internal.impl.ApacheSdkHttpClient.execute(ApacheSdkHttpClient.java:72)
        at software.amazon.awssdk.http.apache.ApacheHttpClient.execute(ApacheHttpClient.java:254)
        at software.amazon.awssdk.http.apache.ApacheHttpClient.access$500(ApacheHttpClient.java:104)
        at software.amazon.awssdk.http.apache.ApacheHttpClient$1.call(ApacheHttpClient.java:231)
        at software.amazon.awssdk.http.apache.ApacheHttpClient$1.call(ApacheHttpClient.java:228)
        at software.amazon.awssdk.core.internal.util.MetricUtils.measureDurationUnsafe(MetricUtils.java:102)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.executeHttpRequest(MakeHttpRequestStage.java:79)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:57)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:40)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:74)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:43)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:79)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:41)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.executeRequest(RetryableStage2.java:93)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage2.execute(RetryableStage2.java:56)
        ... 43 more

.
Consult logs under /Users/stevel/Projects/hadoop-trunk/hadoop-tools/hadoop-aws/target/test-dir/yarn-2025-11-06-13.19.38.98/yarn-166883501
        at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:38)
        at org.junit.jupiter.api.Assertions.fail(Assertions.java:138)
        at org.apache.hadoop.fs.s3a.commit.integration.ITestS3ACommitterMRJob.test_200_execute(ITestS3ACommitterMRJob.java:311)
        at java.base/java.lang.reflect.Method.invoke(Method.java:568)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
        at java.base/java.util.Optional.ifPresent(Optional.java:178)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:992)
        at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:762)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:276)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:992)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:276)
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:276)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)

…solution.

* Pull out region resolution to new class RegionResolution for isolation
  and testing through TestRegionResolution.
* Add new "sdk" and "ec2" mappings to hand down to SDK
* Test in ITestS3AEndpointRegion to verify that either the SDK fails to resolve
  with an SDK message or it does resolve to *something* and the test assert
  fails instead.
IF the region == ec2 then we don't handle off to the SDK chain, instead
only use the bit of the SDK for metadata retrieval.

Nominally private/unstable, but cloudstore has used it in the past...

If it is removed after an SDK update, we will have to handle it.
@steveloughran steveloughran force-pushed the s3/HADOOP-19740-regions branch from ec5ae6b to 6f662bf Compare November 6, 2025 19:25
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 32s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
-1 ❌ mvninstall 3m 34s /branch-mvninstall-root.txt root in trunk failed.
-1 ❌ compile 0m 54s /branch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in trunk failed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 24s /branch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in trunk failed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04.
-0 ⚠️ checkstyle 0m 22s /buildtool-branch-checkstyle-hadoop-tools_hadoop-aws.txt The patch fails to run checkstyle in hadoop-aws
-1 ❌ mvnsite 0m 24s /branch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in trunk failed.
-1 ❌ javadoc 0m 24s /branch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in trunk failed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04.
-1 ❌ javadoc 0m 24s /branch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in trunk failed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04.
-1 ❌ spotbugs 0m 25s /branch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in trunk failed.
+1 💚 shadedclient 2m 52s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 24s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ compile 0m 23s /patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04.
-1 ❌ javac 0m 23s /patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 15s /patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04.
-1 ❌ javac 0m 15s /patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 21s /buildtool-patch-checkstyle-hadoop-tools_hadoop-aws.txt The patch fails to run checkstyle in hadoop-aws
-1 ❌ mvnsite 0m 23s /patch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ javadoc 0m 23s /patch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04.
-1 ❌ javadoc 0m 23s /patch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04.
-1 ❌ spotbugs 0m 24s /patch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 shadedclient 4m 35s patch has no errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 0m 24s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+0 🆗 asflicense 0m 16s ASF License check generated no output?
16m 19s
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8058/2/artifact/out/Dockerfile
GITHUB PR #8058
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 72c378f32934 5.15.0-160-generic #170-Ubuntu SMP Wed Oct 1 10:06:56 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 6f662bf
Default Java Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8058/2/testReport/
Max. process+thread count 79 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8058/2/console
versions git=2.25.1 maven=3.9.11
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

...and no attempt to fall back to central-1 here.

You no longer need to declare a region for a third party store;
endpoint is enough.
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 35s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 34m 27s trunk passed
+1 💚 compile 0m 46s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 46s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 checkstyle 0m 36s trunk passed
+1 💚 mvnsite 0m 54s trunk passed
+1 💚 javadoc 0m 44s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 40s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
-1 ❌ spotbugs 1m 26s /branch-spotbugs-hadoop-tools_hadoop-aws-warnings.html hadoop-tools/hadoop-aws in trunk has 188 extant spotbugs warnings.
+1 💚 shadedclient 26m 14s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 40s the patch passed
+1 💚 compile 0m 38s the patch passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 38s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 36s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 23s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 9 new + 15 unchanged - 5 fixed = 24 total (was 20)
+1 💚 mvnsite 0m 42s the patch passed
-1 ❌ javadoc 0m 31s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 18 new + 926 unchanged - 0 fixed = 944 total (was 926)
-1 ❌ javadoc 0m 29s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 17 new + 848 unchanged - 0 fixed = 865 total (was 848)
+1 💚 spotbugs 1m 24s the patch passed
+1 💚 shadedclient 26m 19s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 18s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
103m 36s
Subsystem Report/Notes
Docker ClientAPI=1.52 ServerAPI=1.52 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8058/3/artifact/out/Dockerfile
GITHUB PR #8058
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 1aef8133aa53 5.15.0-160-generic #170-Ubuntu SMP Wed Oct 1 10:06:56 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 41e32a2
Default Java Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8058/3/testReport/
Max. process+thread count 644 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8058/3/console
versions git=2.25.1 maven=3.9.11 spotbugs=4.9.7
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Copy link
Contributor Author

also, I think we may want to add an enum of fallbacks

fs.s3a.endpoint.region.fallback =  <ec2, sdk, central>

this would define the policy for calculating the region of a bucket which isn't set explicitly, or inferred from the endpoint (vpce, aws, external).

today's behaviour is "central", which doesn't work when the network is locked down in a VPC in a different region.

if you set ec2 or sdk, then unless the region is declared (or set to those explicit region names) then the fallback will be whichever of the defaults.

There's one more thing to consider here: should we always do an EC2/IAM probe before falling back to central?
pro: works great in EC2 with local regions
con: change in behavior when working with remote buckts.

I think the fallback option has better compatibility.

now, given we are doing ec2 region resolution in our own code, we have a choice of what to do when the resolution fails. so what does
endopoint.region=ec2
endpoint.fallback=ec2

mean?

so maybe we only need two fallbacks
central: today
sdk: do the chain of env vars, sysprops and iam info.

@steveloughran
Copy link
Contributor Author

@mukund-thakur @ahmarsuhail thoughts on what I'm doing here? I'm coming round to

  1. explict sdk and ec2 regions
  2. automatically use region "external" if third party endpoint url of any kind
  3. fs.s3a.endpoint.region.fallback enum, currently central and sdk
  4. could have fail where we just fail immediately.

// allow this if people really want it; it is OK to rely on this
// when deployed in EC2.
WARN_OF_DEFAULT_REGION_CHAIN.warn(SDK_REGION_CHAIN_IN_USE);
DEFAULT_REGION_CHAIN.info(SDK_REGION_CHAIN_IN_USE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we tell sdk to determine the region by not setting anything in the builder.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we cut the log on line 324, we're logging the same thing twice


// If the region was configured, set it.
// this includes special handling of the sdk, ec2 and "" regions.
if (configuredRegion != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the not configured else part?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay from line 451 I guess.
If fs.s3a.endpoint.region is not set we fall back to fs.s3a.endpoint.. old stuff.

* @param endpoint non-null endpoint URL
* @return true if this is amazonaws or amazonaws china
*/
public static boolean isAwsEndpoint(final String endpoint) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can there be other cases. How do we test this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move this into utils, as it's a commonly required method. can see it also exists in NetworkBinding

}

final Region r = resolution.getRegion();
if (r != null && Region.regions().contains(r)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be a !? doesn't contain?

Copy link
Contributor

@ahmarsuhail ahmarsuhail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have done a high level review..this code always gives me a headache.

will take another look tomorrow


/**
* Parses the endpoint to get the region.
* If endpoint is the central one, use US_EAST_2.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to update the java doc here, as we are no longer falling back to US_EAST_2

* @param endpoint non-null endpoint URL
* @return true if this is amazonaws or amazonaws china
*/
public static boolean isAwsEndpoint(final String endpoint) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's move this into utils, as it's a commonly required method. can see it also exists in NetworkBinding

final String endpointStr = parameters.getEndpoint();
boolean endpointDeclared = endpointStr != null && !endpointStr.isEmpty();
// will be null if endpointStr is null/empty
final URI endpoint = buildEndpointUri(endpointStr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can just do

if (endpointDeclared) {
buildEndpointUri(
}

Comment on lines +436 to +439
} else if (isEc2Region(configuredRegion)) {
// special EC2 handling
final Resolution r = getS3RegionFromEc2IAM();
resolution.withRegion(r.getRegion(), r.getMechanism());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need this? the SDK will call IMDS in it's region resolution chain anyway.

the InstanceProfileRegionProvider is marked SdkProtectedApi, so i'm worried if it changes across SDK versions, it's just going to make upgrading more of a pain

if (!endpointDeclared || isAwsEndpoint(endpointStr)) {
// still failing to resolve the region
// fall back to central
resolution.withRegion(US_EAST_2, RegionResolutionMechanism.FallbackToCentral);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we change the fallback to US_EAST_1? We had to do US_EAST_2 during the initial upgrade because there was no cross-region client then, but now US_EAST_1 works and restores consistent behaviour with the 3.3.x versions.

For context, we recently had a customer who was trying to upgrade 3.4.x, and they had access to US_EAST_2, and couldn't figure out why there requests were failing

// allow this if people really want it; it is OK to rely on this
// when deployed in EC2.
WARN_OF_DEFAULT_REGION_CHAIN.warn(SDK_REGION_CHAIN_IN_USE);
DEFAULT_REGION_CHAIN.info(SDK_REGION_CHAIN_IN_USE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we cut the log on line 324, we're logging the same thing twice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants