Contributor

@LuciferYang LuciferYang commented Dec 7, 2023

What changes were proposed in this pull request?

SPARK-31464 | #28235 specifies that the Zookeeper version used for testing in the streaming-kafka-0-10 and sql-kafka-0-10 modules is 3.5.7, while at that time, the default Zookeeper version used by Spark was 3.4.14.

In SPARK-45956, the default Zookeeper version used by Spark was upgraded to 3.9.1, which is higher than both 3.5.7 and 3.6.4, the latter being the version used by the Kafka 3.4.1 embedded server. This PR makes the streaming-kafka-0-10 and sql-kafka-0-10 modules also use the project's default Zookeeper version (3.9.1) for testing, rather than relying on a special version.
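
For illustration, the override being removed is a test-scoped dependency of roughly the following shape in each of the two modules' pom.xml files. This is a sketch reconstructed from the description, not copied from the diff: the Maven coordinates and the `test` scope are assumptions, and only the comment line and the 3.5.7 version come from this conversation.

```xml
<!-- Sketch only: assumed shape of the per-module override that this PR removes. -->
<!-- Kafka embedded server uses Zookeeper 3.5.7 API -->
<dependency>
  <!-- groupId/artifactId are the standard Zookeeper coordinates (assumed here) -->
  <groupId>org.apache.zookeeper</groupId>
  <artifactId>zookeeper</artifactId>
  <!-- pinned version that differs from the project-wide zookeeper.version -->
  <version>3.5.7</version>
  <!-- assumed: the pin applied to tests only -->
  <scope>test</scope>
</dependency>
```

With a block like this removed, the modules would pick up whatever Zookeeper version the parent pom manages through `zookeeper.version`.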

Why are the changes needed?

Make the streaming-kafka-0-10 and sql-kafka-0-10 modules use the default Zookeeper version of Spark for testing.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Pass GitHub Actions

Was this patch authored or co-authored using generative AI tooling?

No

@LuciferYang LuciferYang changed the title from "Make streaming-kafka-0-10 and sql-kafka-0-10 test with the same zookeeper version as the project" to "[WIP] Make streaming-kafka-0-10 and sql-kafka-0-10 test with the same zookeeper version as the project" Dec 7, 2023
@LuciferYang LuciferYang marked this pull request as draft December 7, 2023 07:05
@LuciferYang
Contributor Author

Test first

<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-minikdc</artifactId>
</dependency>
<!-- Kafka embedded server uses Zookeeper 3.5.7 API -->
Contributor Author

When this configuration was added in #28235, Spark defaulted to using Zookeeper 3.4.14.

spark/pom.xml

Line 125 in 7f47570

<zookeeper.version>3.4.14</zookeeper.version>

Currently, Spark has upgraded the Zookeeper version to 3.9.1. Let's see if streaming-kafka-0-10 and sql-kafka-0-10 can also successfully test with 3.9.1.

Contributor Author

cc @dongjoon-hyun

Currently, Spark uses Kafka 3.4.1, and its embedded server uses Zookeeper 3.6.4. If we wish to continue using a special version of Zookeeper in these two modules, please let me know.

@LuciferYang LuciferYang changed the title from "[WIP] Make streaming-kafka-0-10 and sql-kafka-0-10 test with the same zookeeper version as the project" to "[WIP] Make streaming-kafka-0-10 and sql-kafka-0-10 test with the same Zookeeper version as the project" Dec 7, 2023
@LuciferYang LuciferYang changed the title from "[WIP] Make streaming-kafka-0-10 and sql-kafka-0-10 test with the same Zookeeper version as the project" to "[WIP] Remove the special Zookeeper version in the streaming-kafka-0-10 and sql-kafka-0-10 modules" Dec 7, 2023
@LuciferYang LuciferYang changed the title from "[WIP] Remove the special Zookeeper version in the streaming-kafka-0-10 and sql-kafka-0-10 modules" to "[SPARK-46305][BUILD][SS] Remove the special Zookeeper version in the streaming-kafka-0-10 and sql-kafka-0-10 modules" Dec 7, 2023
@LuciferYang LuciferYang marked this pull request as ready for review December 7, 2023 11:35
@dongjoon-hyun dongjoon-hyun changed the title from "[SPARK-46305][BUILD][SS] Remove the special Zookeeper version in the streaming-kafka-0-10 and sql-kafka-0-10 modules" to "[SPARK-46305][BUILD][SS] Remove Zookeeper 3.5.7 test dependency from the streaming-kafka-0-10 and sql-kafka-0-10 modules" Dec 7, 2023
Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Nice catch!

dbatomic pushed a commit to dbatomic/spark that referenced this pull request Dec 11, 2023
…m the `streaming-kafka-0-10` and `sql-kafka-0-10` modules

### What changes were proposed in this pull request?
SPARK-31464 | apache#28235 specifies that the Zookeeper version used for testing in the `streaming-kafka-0-10` and `sql-kafka-0-10` modules is 3.5.7, while at that time, the default Zookeeper version used by Spark was 3.4.14.

In SPARK-45956, the default Zookeeper version used by Spark was upgraded to 3.9.1, which is higher than both 3.5.7 and 3.6.4, the latter being the version used by the Kafka 3.4.1 embedded server. This PR makes the `streaming-kafka-0-10` and `sql-kafka-0-10` modules also use the project's default Zookeeper version (3.9.1) for testing, rather than relying on a special version.

### Why are the changes needed?
Make the `streaming-kafka-0-10` and `sql-kafka-0-10` modules use the default Zookeeper version of Spark for testing.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?
No

Closes apache#44230 from LuciferYang/kafka-zk.

Lead-authored-by: yangjie01 <[email protected]>
Co-authored-by: YangJie <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>