
Conversation

@gaborgsomogyi
Contributor

What changes were proposed in this pull request?

This PR adds two new config properties: spark.driver.defaultJavaOptions and spark.executor.defaultJavaOptions. These are intended to be set by administrators in a file of defaults for options like JVM garbage collection algorithm. Users will still set extraJavaOptions properties, and both sets of JVM options will be added to start a JVM (default options are prepended to extra options).
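The combination rule described above can be sketched in plain Java (the class and method names below are illustrative, not Spark's actual internals):

```java
// Illustrative sketch of how the two option sets combine: defaults
// (admin-set, e.g. spark.driver.defaultJavaOptions) are prepended to
// extras (user-set, e.g. spark.driver.extraJavaOptions).
import java.util.ArrayList;
import java.util.List;

public class JavaOptionsMerge {
    /** Joins defaults and extras with a space, skipping nulls; null if both are null. */
    public static String merge(String defaultOpts, String extraOpts) {
        List<String> parts = new ArrayList<>();
        if (defaultOpts != null) parts.add(defaultOpts);
        if (extraOpts != null) parts.add(extraOpts);
        return parts.isEmpty() ? null : String.join(" ", parts);
    }
}
```

With this rule, an admin-supplied `-XX:+UseG1GC` ends up in front of a user-supplied `-Dfoo=bar` on the JVM command line, so user options can still override defaults where the JVM honors the last occurrence.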

How was this patch tested?

Existing + additional unit tests.

```
cd docs/
SKIP_API=1 jekyll build
```

Manual webpage check.

@gaborgsomogyi
Contributor Author

cc @vanzin @squito @rdblue since you were involved in the previous PR.

I've considered many options, but this looks the best from my perspective.
I personally don't like that it's not possible to concatenate options (just like Imran), but I think introducing that would create more problems than it solves. Setting options in one (now two) places requires attention on the user side (more work to merge options properly), but it also makes it clearer exactly what will be set.

@SparkQA

SparkQA commented Jun 5, 2019

Test build #106196 has finished for PR 24804 at commit 7402291.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Jun 5, 2019

You'll need to add similar logic to SparkSubmitCommandBuilder.java so that the client mode driver also picks up this new setting. Bonus points for making it generic so that it can be easily applied to future default opts we add that affect the driver.

@gaborgsomogyi
Contributor Author

Somehow I had the feeling this code path works with SparkConf, where the default is fetched together with extra (which was a wrong assumption). Not sure how this can be made more generic.

@SparkQA

SparkQA commented Jun 6, 2019

Test build #106250 has finished for PR 24804 at commit 9f41dca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 6, 2019

Test build #106251 has finished for PR 24804 at commit ea50aef.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@vanzin vanzin left a comment


Looks ok.

@vanzin
Contributor

vanzin commented Jun 24, 2019

BTW it would be good to double check the places where cluster-mode drivers are started and make sure they also pick up the new config.

e.g. StandaloneSubmitRequestServlet seems to create a driver command line and it does this:

```
val driverExtraJavaOptions = sparkProperties.get(config.DRIVER_JAVA_OPTIONS.key)
```

Which I think would completely ignore the stuff you're adding here...
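A minimal sketch of the fix this comment implies, in plain Java rather than Spark's actual REST-server code: because the servlet reads raw `sparkProperties` instead of going through `SparkConf`, the new default-options key has to be looked up and prepended explicitly. The key names follow the PR; the helper class is hypothetical.

```java
// Hypothetical helper: resolve the driver JVM options from raw properties,
// prepending spark.driver.defaultJavaOptions to spark.driver.extraJavaOptions.
import java.util.Map;

public class DriverJavaOpts {
    public static String resolve(Map<String, String> sparkProperties) {
        String defaults = sparkProperties.get("spark.driver.defaultJavaOptions");
        String extras = sparkProperties.get("spark.driver.extraJavaOptions");
        if (defaults == null) return extras;
        if (extras == null) return defaults;
        return defaults + " " + extras; // defaults are prepended
    }
}
```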

 * Removed unused config entries
 * Fallback test added
 * Fixed Mesos and Rest server property handling
@SparkQA

SparkQA commented Jun 26, 2019

Test build #106934 has finished for PR 24804 at commit aeb7cce.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 26, 2019

Test build #106941 has finished for PR 24804 at commit a7d7327.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@squito squito left a comment


Looks ok, though I haven't yet checked whether this is used everywhere it needs to be (Marcelo's earlier comment).

@SparkQA

SparkQA commented Jun 28, 2019

Test build #106993 has finished for PR 24804 at commit a5224fd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gaborgsomogyi
Contributor Author

retest this please

@SparkQA

SparkQA commented Jun 28, 2019

Test build #106998 has finished for PR 24804 at commit a5224fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 28, 2019

Test build #107017 has finished for PR 24804 at commit f08c2c6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@vanzin vanzin left a comment


Small doc nit; there's also MesosCoarseGrainedSchedulerBackend.scala which seems easy to fix to add support for this. There might be another spot in standalone that may need a change, but I'm having a hard time following that code... if someone cares about it they can figure it out, I guess.

@gaborgsomogyi
Contributor Author

there's also MesosCoarseGrainedSchedulerBackend.scala which seems easy to fix to add support for this.

I've double-checked the mentioned file, but as far as I can see it uses SparkConf to get the extra Java options:

    val extraJavaOpts = conf.get(config.EXECUTOR_JAVA_OPTIONS).map {
      Utils.substituteAppNExecIds(_, appId, taskId)
    }.getOrElse("")

and SparkConf gives back the prepended value, so I don't yet see what needs to change there.
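The behavior being relied on here can be modeled with a simplified stand-in, assuming a config entry whose read prepends a fallback key's value. This mirrors the "SparkConf gives back the prepended value" behavior described above; it is not Spark's actual `ConfigEntry` API, and the class name is illustrative.

```java
// Simplified model of a config entry that prepends another entry's value,
// showing why conf.get(EXECUTOR_JAVA_OPTIONS) already returns defaults + extras.
import java.util.Map;

public class PrependedEntry {
    private final String key;
    private final String prependedKey;

    public PrependedEntry(String key, String prependedKey) {
        this.key = key;
        this.prependedKey = prependedKey;
    }

    /** Reads this entry's value with the prepended key's value in front. */
    public String read(Map<String, String> conf) {
        String prepended = conf.get(prependedKey);
        String value = conf.get(key);
        if (prepended == null) return value;
        if (value == null) return prepended;
        return prepended + " " + value;
    }
}
```

Because the prepending happens at read time, call sites like the Mesos backend need no change: they keep reading a single key and transparently see the merged value.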

@SparkQA

SparkQA commented Jul 1, 2019

Test build #107067 has finished for PR 24804 at commit f35eee3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Jul 10, 2019

as I see it uses SparkConf to get the extra java options

You're right, I must have misread it.

@vanzin
Contributor

vanzin commented Jul 10, 2019

retest this please

@SparkQA

SparkQA commented Jul 11, 2019

Test build #107489 has finished for PR 24804 at commit f35eee3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin
Contributor

vanzin commented Jul 11, 2019

Merging to master.

@vanzin vanzin closed this in f830005 Jul 11, 2019
@rdblue
Contributor

rdblue commented Jul 11, 2019

This is great! Thank you for fixing this, @gaborgsomogyi, and for reviewing, @vanzin and @squito!

vinodkc pushed a commit to vinodkc/spark that referenced this pull request Jul 18, 2019

Closes apache#24804 from gaborgsomogyi/SPARK-23472.

Authored-by: Gabor Somogyi <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
otterc pushed a commit to linkedin/spark that referenced this pull request Mar 22, 2023

Closes apache#24804 from gaborgsomogyi/SPARK-23472.

Authored-by: Gabor Somogyi <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>

Ref: LIHADOOP-55812

RB=2325692
BUG=LIHADOOP-55812
G=spark-reviewers
A=ekrogen