[SPARK-9270] [PySpark] allow --name option in pyspark #7610
Conversation
That seems fine if it still sets the app name as desired and improves consistency.
Test build #38189 has finished for PR 7610 at commit
retest this please.
1 similar comment
retest this please.
This is not critical, but not great either; it points at some inconsistency in the code somewhere. From looking at SparkSubmit.scala, it should work, but I haven't explicitly tested it.
Is that true?
Test build #38224 has finished for PR 7610 at commit
@vanzin thank you for your comments!
So if you run the following cmd:
[...]
It invokes the following cmd:
[...]
b/c [...]
Since [...]
In summary, [...]
So the Python part is OK but now does [...]
@srowen In short, [...]
Test build #38232 has finished for PR 7610 at commit
Hmm, Jenkins unstable?
Hmm, it may not work, but I don't think that's the cause. With your changes, that line should never be reached when starting the shell. What I think is happening is:
[...]
So it seems like an ordering issue in SparkSubmit.scala. In any case, it doesn't seem important enough to change just for this particular edge case. The change LGTM.
Test build #38256 has finished for PR 7610 at commit
Should we support `bin/pyspark --name MyName`? spark-shell supports that.
`--name MyName` works. So the final command will be like this:
spark-submit pyspark-shell-main --name "PySparkShell" --name "MyName" <other args>
Then, `MyName` takes effect since it comes later than `PySparkShell`.
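The last-occurrence-wins behavior described above is standard for most option parsers. A minimal sketch using Python's argparse (for illustration only; Spark's actual launcher is Scala/Java) shows the same effect:

```python
import argparse

# A repeated option keeps its last value, mirroring how a user-supplied
# "--name MyName" overrides the earlier default "--name PySparkShell".
parser = argparse.ArgumentParser()
parser.add_argument("--name", default="Spark shell")

# Simulate the final command line built by the pyspark launch script:
args = parser.parse_args(["--name", "PySparkShell", "--name", "MyName"])
print(args.name)  # later occurrence wins -> MyName
```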
Just to be clear, this is how --name is supported in spark-shell too (#7512). I am following the pattern introduced by that patch.
Merged to master. Thanks!
This is a continuation of #7512, which added the `--name` option to spark-shell. This PR adds the same option to pyspark.

Note that `--conf spark.app.name` on the command line has no effect in spark-shell and pyspark; instead, `--name` must be used. This is in fact an inconsistency with spark-sql, which doesn't accept the `--name` option while it does accept `--conf spark.app.name`. I am not fixing that inconsistency in this PR. IMO, one of `--name` and `--conf spark.app.name` is needed, not both. But since I cannot decide which to choose, I am not making any change here.
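The pattern this PR follows (from #7512) can be sketched in plain Python. The helper names below are hypothetical and this is not Spark's actual launcher code; it only illustrates how prepending a default `--name` lets a later user-supplied `--name` override it:

```python
def build_submit_args(user_args):
    """Hypothetical sketch: prepend the shell's default --name so any
    user-supplied --name, appearing later, takes precedence."""
    return ["--name", "PySparkShell"] + list(user_args)

def effective_name(argv):
    """Return the value of the last --name occurrence, mimicking
    last-wins option parsing."""
    name = None
    for i, arg in enumerate(argv):
        if arg == "--name" and i + 1 < len(argv):
            name = argv[i + 1]
    return name

print(effective_name(build_submit_args([])))                    # default name
print(effective_name(build_submit_args(["--name", "MyName"])))  # user override
```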