[SPARK-26254][CORE] Extract Hive + Kafka dependencies from Core. #23499
Conversation
Test build #100969 has finished for PR 23499 at commit
Force-pushed from cd7db2f to 47ea157
Test build #100970 has finished for PR 23499 at commit
Test build #100975 has finished for PR 23499 at commit
Test build #100983 has finished for PR 23499 at commit
Test build #100984 has finished for PR 23499 at commit
retest this please
Test build #101021 has finished for PR 23499 at commit
In Spark 3.0, shading all libraries that Spark depends on is the direction we discussed on the mailing list. @gaborgsomogyi Thus, could you change your PR based on our direction? Thanks!
This is a refactoring so that spark-core doesn't depend on Hive and Kafka at compile time. It's not really related to shading, and it would be distracting to mix both in this PR. If a decision has been made about shading everything (I don't remember a vote), then there will be a lot of work that is unrelated to this PR to get there.
(Agree that shading is a big, important, but separate change to be considered)
@gatorsmile happy to handle this problem in the streaming/kafka area in separate PRs. I would appreciate it if you could point me to the details and the vote so I can pick up the topic.
(I’d like to know about shading too)
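For readers following the thread, "shading" here means bundling dependency classes into Spark's own jars and relocating their packages so they cannot conflict with user-provided versions. A minimal maven-shade-plugin relocation sketch, purely illustrative (the Guava pattern mirrors relocations Spark has historically used, but this is not the actual Spark build configuration):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <!-- Rewrite com.google.common.* bytecode references so the
               bundled Guava cannot clash with a user's Guava. -->
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.spark_project.guava</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

This is why shading is a build-level concern, separate from the compile-time dependency refactoring this PR performs.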
...rc/test/scala/org/apache/spark/sql/hive/security/HiveHadoopDelegationTokenManagerSuite.scala
def obtainDelegationTokens(
    hadoopConf: Configuration,
    sparkConf: SparkConf,
    fileSystems: Set[FileSystem],
This is a bit odd, and ultimately I think that YarnSparkHadoopUtil.hadoopFSsToAccess should be used for everybody (a.k.a. moved to HadoopFSDelegationTokenProvider), but that should be a separate change.
Felt the same when I changed it, but didn't have a good solution. Good suggestion; I will file a JIRA and resolve it.
core/src/main/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManager.scala
Test build #101480 has finished for PR 23499 at commit
cc @wangyum since he is working on upgrading the Hive dependency.
Test build #101597 has finished for PR 23499 at commit
core/src/test/scala/org/apache/spark/deploy/security/HadoopDelegationTokenManagerSuite.scala
...rc/test/scala/org/apache/spark/sql/hive/security/HiveHadoopDelegationTokenManagerSuite.scala
Test build #101638 has finished for PR 23499 at commit
Test build #101673 has finished for PR 23499 at commit
Test build #101676 has finished for PR 23499 at commit
Merging to master.
What changes were proposed in this pull request?

There are ugly provided dependencies inside core for the following:
* Hive
* Kafka

In this PR I've extracted them out. This PR contains the following:
* Token providers are now loaded with service loader
* Hive token provider moved to hive project
* Kafka token provider extracted into a new project

How was this patch tested?

Existing + newly added unit tests. Additionally tested on cluster.

Closes apache#23499 from gaborgsomogyi/SPARK-26254.
Authored-by: Gabor Somogyi <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
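The key mechanism in this PR is that token providers are discovered with the JDK service loader instead of being hard-wired into core. Here is a minimal, self-contained Java sketch of that pattern; the `TokenProvider` interface and class names are hypothetical stand-ins, not Spark's actual `HadoopDelegationTokenProvider` API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical provider interface; in Spark the real contract lives in core,
// while the hive and kafka modules ship their own implementations.
interface TokenProvider {
    String serviceName();
}

public class ServiceLoaderSketch {
    // Discover every provider visible on the classpath. A module that wants
    // to contribute one ships a META-INF/services/TokenProvider resource
    // listing its implementation class; core never names it at compile time.
    static List<String> discoveredProviders() {
        List<String> names = new ArrayList<>();
        for (TokenProvider p : ServiceLoader.load(TokenProvider.class)) {
            names.add(p.serviceName());
        }
        return names;
    }

    public static void main(String[] args) {
        // No META-INF/services entry exists in this standalone sketch, so the
        // list is empty; dropping a provider jar on the classpath would
        // populate it without any code change here.
        System.out.println(discoveredProviders());
    }
}
```

This is what lets the Hive and Kafka providers move out of core: core only knows the interface, and each extracted project registers its implementation via its own service file.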