[SPARK-31464][BUILD][SS] Upgrade Kafka to 2.5.0 #28235
Test build #121386 has finished for PR 28235 at commit
Retest this please.
Just curious, did you hit KAFKA-9241 in production or somewhere similar? If KAFKA-9241 is worth treating as an outstanding issue, we probably want to get this into Spark 3.0.0 as well. If the affected version on KAFKA-9241 is correct (KIP-368 seems to have been introduced in Kafka 2.2, so it likely is), then it would affect Spark starting from 3.0.0, not Spark 2.x. (Yeah, I also waited for Kafka 2.5.0 to remove the usage of
cc. @gaborgsomogyi as the author of #25135 (SPARK-28367)
retest this please
gaborgsomogyi
left a comment
LGTM (pending tests)
@dongjoon-hyun BTW, don't we need to modify this? spark/external/kafka-0-10-sql/pom.xml Line 111 in a7fb330
Kafka 2.5.0 uses ZooKeeper 3.5.7...
Thank you, @HeartSaVioR , @HyukjinKwon , @gaborgsomogyi .
Retest this please.
The failure in the above SBT Jenkins run seems to be a known flaky test,
It seems that our Maven Jenkins build time is growing again. The master branch Maven Java/Scala-only build takes There is another timeout failure in another PR, too. I will re-trigger this PR after midnight (PST).
Retest this please.
Retest this please.
Retest this please.
Retest this please.
Retest this please.
Test build #121457 has finished for PR 28235 at commit
Test build #121455 has finished for PR 28235 at commit
Test build #121456 has finished for PR 28235 at commit
Retest this please.
Retest this please.
Test build #121462 has finished for PR 28235 at commit
Test build #121463 has finished for PR 28235 at commit
The above job passed all Scala/Java/Python tests. It timed out during R testing, which is unrelated to this change.
Retest this please.
Test build #121472 has finished for PR 28235 at commit
As I wrote in the PR description, the following is the current status.
Hi, @viirya , @HyukjinKwon , @maropu .
Thank you so much, @viirya .
Thank you all again!
Thanks for pinging me, @dongjoon-hyun, late LGTM.
### What changes were proposed in this pull request?

This PR aims to upgrade Kafka library to 2.5.0 for Apache Spark 3.1.0.

### Why are the changes needed?

Apache Kafka 2.5.0 client has improvements and bug fixes like [KAFKA-9241](https://issues.apache.org/jira/browse/KAFKA-9241)
- https://downloads.apache.org/kafka/2.5.0/RELEASE_NOTES.html

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins with the existing tests.
- [x] SBT apache#28235 (comment)
- [x] Maven apache#28235 (comment) (All Scala/Java/Python/R UT tests passed. It's timeout during R installation testing which is already covered by SBT.)

Closes apache#28235 from dongjoon-hyun/SPARK-KAFKA-2.5.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
…m the `streaming-kafka-0-10` and `sql-kafka-0-10` modules

### What changes were proposed in this pull request?

SPARK-31464 | #28235 specifies that the Zookeeper version used for testing in the `streaming-kafka-0-10` and `sql-kafka-0-10` modules is 3.5.7, while at that time, the default Zookeeper version used by Spark was 3.4.14. In SPARK-45956, the default Zookeeper version used by Spark was upgraded to 3.9.1, which is higher than both 3.5.7 and 3.6.4, the latter being used by the Kafka 3.4.1 embedded server. This PR tries to make the `streaming-kafka-0-10` and `sql-kafka-0-10` modules also use the project's default Zookeeper version (3.9.1) for testing, rather than relying on a special version.

### Why are the changes needed?

Make the `streaming-kafka-0-10` and `sql-kafka-0-10` modules use the default Zookeeper version of Spark for testing.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #44230 from LuciferYang/kafka-zk.

Lead-authored-by: yangjie01 <[email protected]>
Co-authored-by: YangJie <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This PR upgrades the Kafka library to 2.5.0 for Apache Spark 3.1.0.
Why are the changes needed?
The Apache Kafka 2.5.0 client includes improvements and bug fixes such as KAFKA-9241.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Passed Jenkins with the existing tests.