DX-110907: Backport OAuth2 Token Provider support for Azure Workload Identity #1
+622
−18
This PR backports HADOOP-18610 from Apache Hadoop branch-3.4 to Dremio's branch-3.3.6.
Original PR: apache#6881
Original Commit: 468b7e5
Author: Anuj Modi (@anujmodi2021)
Adds support for Azure Active Directory (Azure AD) workload identities which integrate with Kubernetes's native capabilities to federate with any external identity provider. This enables ABFS to authenticate using workload identity tokens from files mounted in Kubernetes pods.
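For reviewers unfamiliar with the mechanism, the token-file step can be sketched roughly as below. This is an illustrative sketch only, not the backported code: the class TokenFileReader and its methods are hypothetical names, and it assumes the mounted file holds a single JWT that is re-read on every refresh because the platform may rotate the file in place.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of the backported code. */
public class TokenFileReader {

    /** Trims surrounding whitespace and rejects empty content. */
    static String parseAssertion(String rawContents) {
        String token = rawContents == null ? "" : rawContents.trim();
        if (token.isEmpty()) {
            throw new IllegalStateException("Token file content is empty");
        }
        return token;
    }

    /** Reads the file on every call instead of caching its content (assumed rotation in place). */
    static String readClientAssertion(Path tokenFile) {
        try {
            byte[] bytes = Files.readAllBytes(tokenFile);
            return parseAssertion(new String(bytes, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read token file: " + tokenFile, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a mounted token file with a temporary file.
        Path tmp = Files.createTempFile("federated-token", ".jwt");
        Files.write(tmp, "header.payload.signature\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readClientAssertion(tmp)); // prints "header.payload.signature"
        Files.delete(tmp);
    }
}
```

In the real provider the file path would come from configuration (see the FS_AZURE_ACCOUNT_OAUTH_TOKEN_FILE constant below) rather than a temporary file, and the returned string would be passed on as the client assertion.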
Changes Made for branch-3.3.6 Compatibility
1. Merge Conflict Resolutions
AbstractAbfsIntegrationTest.java
- assumeValidTestConfigPresent() - validates test configuration presence
- assumeValidAuthConfigsPresent() - validates authentication configurations
- isAppendBlobEnabled() - checks if append blob is enabled
TestAccountConfiguration.java
- testConfigPropNotFound()
- CONFIG_KEYS to configKeys parameter
- tokenProviderClassName to setAuthConfig() call
- abfsConf.unset(key); before existing abfsConf.unset(key + "." + accountName);
2. Compilation Fixes for API Differences
AbfsConfiguration.java
- FS_AZURE_ACCOUNT_OAUTH_TOKEN_FILE constant
- import static org.apache.hadoop.fs.azurebfs.constants.ConfigurationKeys.*; while branch-3.3.6 uses 74 individual explicit imports
WorkloadIdentityTokenProvider.java
Issue 1: Invalid @Override annotation on isTokenAboutToExpire() method
Root Cause: In branch-3.4, AccessTokenProvider has a protected isTokenAboutToExpire() instance method that can be overridden. In branch-3.3.6, this method doesn't exist; only a static method, AzureADAuthenticator.isTokenAboutToExpire(token), exists.
Fix: Removed the @Override annotation and changed super.isTokenAboutToExpire() to AzureADAuthenticator.isTokenAboutToExpire(cachedToken).
Issue 2: Cannot access parent class's private token field
Root Cause: The parent AccessTokenProvider class has a private token field that cannot be accessed from child classes.
Fix:
- cachedToken field: private AzureADToken cachedToken;
- refreshToken() to cache the token: cachedToken = getTokenUsingJWTAssertion(clientAssertion);
- isTokenAboutToExpire() to use cachedToken instead of parent's private field
3. Architectural Differences and Limitations
Clock Skew Detection Not Functional:
- In branch-3.4, AccessTokenProvider.getToken() calls isTokenAboutToExpire() polymorphically, allowing WorkloadIdentityTokenProvider to override it with clock skew detection logic.
- In branch-3.3.6, AccessTokenProvider.getToken() directly calls AzureADAuthenticator.isTokenAboutToExpire(this.token) (a static method), bypassing any child class overrides.
- As a result, the isTokenAboutToExpire() method in WorkloadIdentityTokenProvider is dead code: it exists for compatibility but is never executed.
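The limitation can be reproduced in isolation with a small stand-alone sketch. Everything below is a simplified stand-in written for illustration: ParentProvider, ChildProvider, Token, the counters, and the five-minute expiry threshold are hypothetical and are not taken from the Hadoop sources. It only mirrors the shape described above, where the parent keeps a private token and checks expiry through a static helper.

```java
/** Simplified stand-ins for illustration; not the Hadoop classes. */
public class StaticDispatchDemo {

    static class Token {
        final long expiryMillis;
        Token(long expiryMillis) { this.expiryMillis = expiryMillis; }
    }

    /** Stand-in for the static expiry check; the 5-minute window is an assumed value. */
    static boolean staticIsTokenAboutToExpire(Token token) {
        return token == null
            || token.expiryMillis - System.currentTimeMillis() < 5 * 60 * 1000L;
    }

    /** Parent shaped like the branch-3.3.6 description: private token, static expiry check. */
    static abstract class ParentProvider {
        private Token token;          // not visible to subclasses
        int refreshCount = 0;

        Token getToken() {
            // Static call: a subclass method with the same name is never consulted.
            if (staticIsTokenAboutToExpire(this.token)) {
                this.token = refreshToken();
                refreshCount++;
            }
            return this.token;
        }

        protected abstract Token refreshToken();
    }

    static class ChildProvider extends ParentProvider {
        private Token cachedToken;    // own copy, since the parent's field is private
        int expiryCheckCalls = 0;

        @Override
        protected Token refreshToken() {
            cachedToken = new Token(System.currentTimeMillis() + 60 * 60 * 1000L);
            return cachedToken;
        }

        // No @Override here: the parent declares no such instance method.
        protected boolean isTokenAboutToExpire() {
            expiryCheckCalls++;
            return staticIsTokenAboutToExpire(cachedToken);
        }
    }

    public static void main(String[] args) {
        ChildProvider provider = new ChildProvider();
        provider.getToken();
        provider.getToken();
        System.out.println("refreshes=" + provider.refreshCount);               // refreshes=1
        System.out.println("child expiry checks=" + provider.expiryCheckCalls); // child expiry checks=0
    }
}
```

The child's expiry check is never reached through getToken(), so any extra logic placed there, such as clock skew detection, has no effect unless the parent is changed to call an overridable instance method.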