-
Notifications
You must be signed in to change notification settings - Fork 50
[SPARK-49045] Add docker image build for operator #28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, this is a broken code due to the missing SparkOperator class, isn't it?
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build-tools/docker/Dockerfile
Outdated
| COPY . . | ||
| RUN ./gradlew clean build -x test | ||
|
|
||
| FROM eclipse-temurin:17-jre-jammy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your intention, but it seems that we need to avoid this kind of confusion in the community.
Specifically, what makes you think in this way? This is not aligned with Apache Spark 4.0.0 in three ways.
We use the latest gradle 8.9 image as builder, with eclipse-temurin:17-jre-jammy - the latter still uses base image version 22.04 which is inline with version used in core Spark.
I guess you are confused the following three things.
-
JDK is the minimum requirement instead of JRE to support the full Spark feature.
-
We use
azul/zulu-openjdkin Spark 4 instead ofeclipse-temurin. -
Lastly, SPARK-49005 is for only Spark 3.x JIRA. Please see the JIRA information.
In other words, Apache Spark 4 is using azul/zulu-openjdk:21.
-
Apache Spark 4.0.0-preview2
https://github.com/apache/spark/blob/7a7a8bc4bab591ac8b98b2630b38c57adf619b82/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile#L17-L19 -
Apache Spark master
https://github.com/apache/spark/blob/6706c41cb42cbd270a6580385be67b2a2313df27/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile#L17-L19
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification!
Yes, now I'm +1 for updating the base image for operator to match 4.x.
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for update. Could you fix CI failure, @jiangzho ?
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, it seems to fail still.
Could you test it in your repository first?
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @jiangzho .
Merged to main.
…tead of `test` ### What changes were proposed in this pull request? This PR aims to speed up docker image building and reducing the required memory by excluding `check` instead of `test`. ### Why are the changes needed? Initially, we exclude only `test` task. However, we can exclude the whole `check` lifecycle completely. - #28 This is required in three ways. - Speeding up docker image building. - Reducing required memory resources during image building. - Enabling `K8s integration tests` test pipeline later in the limited `GitHub Action CI` test pipeline. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual review. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #46 from dongjoon-hyun/SPARK-49237. Authored-by: Dongjoon Hyun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
…harts (apache#28) * Add Rio pipelines to publish Helm charts to Apple Helm artifactory, Update CRD short name to sparkappv2 to avoid conflict with AIML's kubeflow Spark operator * cleanup, and disable git trigger
What changes were proposed in this pull request?
This PR adds
Dockerfileanddocker-entrypoint.shfor building operator docker image, also adds a CI task to build the image.We use the latest gradle 8.9 image as builder, with
eclipse-temurin:17-jre-jammy- the latter still uses base image version22.04which is inline with version used in core Spark.Why are the changes needed?
This is needed to package and deploy Spark operator.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing & proposed new CI tasks.
Was this patch authored or co-authored using generative AI tooling?
No