-
Notifications
You must be signed in to change notification settings - Fork 9.2k
HADOOP-19740. S3A: add explicit "sdk" and "ec2" regions for region resolution. #8058
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
|
💔 -1 overall
This message was automatically generated. |
|
running full test suites with java17 and the test region set to sdk. as usual, it's that forked |
…solution. * Pull out region resolution to new class RegionResolution for isolation and testing through TestRegionResolution. * Add new "sdk" and "ec2" mappings to hand down to SDK * Test in ITestS3AEndpointRegion to verify that either the SDK fails to resolve with an SDK message or it does resolve to *something* and the test assert fails instead.
IF the region == ec2 then we don't handle off to the SDK chain, instead only use the bit of the SDK for metadata retrieval. Nominally private/unstable, but cloudstore has used it in the past... If it is removed after an SDK update, we will have to handle it.
ec5ae6b to
6f662bf
Compare
|
💔 -1 overall
This message was automatically generated. |
...and no attempt to fall back to central-1 here. You no longer need to declare a region for a third party store; endpoint is enough.
|
💔 -1 overall
This message was automatically generated. |
|
also, I think we may want to add an enum of fallbacks this would define the policy for calculating the region of a bucket which isn't set explicitly, or inferred from the endpoint (vpce, aws, external). today's behaviour is "central", which doesn't work when the network is locked down in a VPC in a different region. if you set ec2 or sdk, then unless the region is declared (or set to those explicit region names) then the fallback will be whichever of the defaults. There's one more thing to consider here: should we always do an EC2/IAM probe before falling back to central? I think the fallback option has better compatibility. now, given we are doing ec2 region resolution in our own code, we have a choice of what to do when the resolution fails. so what does mean? so maybe we only need two fallbacks |
|
@mukund-thakur @ahmarsuhail thoughts on what I'm doing here? I'm coming round to
|
| // allow this if people really want it; it is OK to rely on this | ||
| // when deployed in EC2. | ||
| WARN_OF_DEFAULT_REGION_CHAIN.warn(SDK_REGION_CHAIN_IN_USE); | ||
| DEFAULT_REGION_CHAIN.info(SDK_REGION_CHAIN_IN_USE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we tell sdk to determine the region by not setting anything in the builder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we cut the log on line 324, we're logging the same thing twice
|
|
||
| // If the region was configured, set it. | ||
| // this includes special handling of the sdk, ec2 and "" regions. | ||
| if (configuredRegion != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the not configured else part?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay from line 451 I guess.
If fs.s3a.endpoint.region is not set we fall back to fs.s3a.endpoint.. old stuff.
| * @param endpoint non-null endpoint URL | ||
| * @return true if this is amazonaws or amazonaws china | ||
| */ | ||
| public static boolean isAwsEndpoint(final String endpoint) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can there be other cases. How do we test this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move this into utils, as it's a commonly required method. can see it also exists in NetworkBinding
| } | ||
|
|
||
| final Region r = resolution.getRegion(); | ||
| if (r != null && Region.regions().contains(r)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this be a !? doesn't contain?
ahmarsuhail
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have done a high level review..this code always gives me a headache.
will take another look tomorrow
|
|
||
| /** | ||
| * Parses the endpoint to get the region. | ||
| * If endpoint is the central one, use US_EAST_2. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to update the java doc here, as we are no longer falling back to US_EAST_2
| * @param endpoint non-null endpoint URL | ||
| * @return true if this is amazonaws or amazonaws china | ||
| */ | ||
| public static boolean isAwsEndpoint(final String endpoint) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move this into utils, as it's a commonly required method. can see it also exists in NetworkBinding
| final String endpointStr = parameters.getEndpoint(); | ||
| boolean endpointDeclared = endpointStr != null && !endpointStr.isEmpty(); | ||
| // will be null if endpointStr is null/empty | ||
| final URI endpoint = buildEndpointUri(endpointStr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can just do
if (endpointDeclared) {
buildEndpointUri(
}
| } else if (isEc2Region(configuredRegion)) { | ||
| // special EC2 handling | ||
| final Resolution r = getS3RegionFromEc2IAM(); | ||
| resolution.withRegion(r.getRegion(), r.getMechanism()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need this? the SDK will call IMDS in it's region resolution chain anyway.
the InstanceProfileRegionProvider is marked SdkProtectedApi, so i'm worried if it changes across SDK versions, it's just going to make upgrading more of a pain
| if (!endpointDeclared || isAwsEndpoint(endpointStr)) { | ||
| // still failing to resolve the region | ||
| // fall back to central | ||
| resolution.withRegion(US_EAST_2, RegionResolutionMechanism.FallbackToCentral); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we change the fallback to US_EAST_1? We had to do US_EAST_2 during the initial upgrade because there was no cross-region client then, but now US_EAST_1 works and restores consistent behaviour with the 3.3.x versions.
For context, we recently had a customer who was trying to upgrade 3.4.x, and they had access to US_EAST_2, and couldn't figure out why there requests were failing
| // allow this if people really want it; it is OK to rely on this | ||
| // when deployed in EC2. | ||
| WARN_OF_DEFAULT_REGION_CHAIN.warn(SDK_REGION_CHAIN_IN_USE); | ||
| DEFAULT_REGION_CHAIN.info(SDK_REGION_CHAIN_IN_USE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we cut the log on line 324, we're logging the same thing twice
How was this patch tested?
New tests.
For code changes:
LICENSE,LICENSE-binary,NOTICE-binaryfiles?