[SPARK-17279][SQL] better error message for exceptions during ScalaUDF execution #14850
cc @yhuai
Test build #64528 has finished for PR 14850 at commit
    f(input)
  } catch {
    case e: NullPointerException =>
      throw new RuntimeException(npeErrorMessage, e)
Can we still use NullPointerException? NullPointerException can have a specified message. Then, you can use setStackTrace to set the original stacktrace.
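For readers following the suggestion above, here is a minimal sketch of what keeping `NullPointerException` with a custom message and the original stack trace could look like. It is not the PR's actual code: the wrapper object and the `evalWithMessage` helper are hypothetical, and the message text is the one quoted later in this review.

```scala
object NpeRethrowSketch {
  // Message text as quoted in the review; the name mirrors the diff's npeErrorMessage.
  val npeErrorMessage: String =
    "Given UDF throws NPE during execution, please check the UDF " +
      "to make sure it handles null parameters correctly"

  // Hypothetical helper: runs f(input) and, on a NullPointerException,
  // rethrows a new NPE that carries the message but reports the
  // stack frames of the original failure.
  def evalWithMessage[A, B](f: A => B, input: A): B =
    try {
      f(input)
    } catch {
      case e: NullPointerException =>
        val npe = new NullPointerException(npeErrorMessage)
        npe.setStackTrace(e.getStackTrace)
        throw npe
    }
}
```

The exception type stays the same for callers that match on it, at the cost of a stack trace that was not produced at the point where the new exception was constructed.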
Test build #64562 has finished for PR 14850 at commit
Test build #64652 has finished for PR 14850 at commit
  val result = try {
    f(input)
  } catch {
    case e: NullPointerException =>
It is a bit hacky to set the stack trace like this:
`npe.setStackTrace(e.getStackTrace)`
If a user searches for the code line reported in the stack trace, they may not be able to find code that matches the error message.
- For this code branch, `eval(input: InternalRow)`, the existing NPE message should be clear enough if there is a full stack trace and the stack contains the UDF's method.
- The error message you provided can be totally wrong: "Given UDF throws NPE during execution, please check the UDF to make sure it handles null parameters correctly". What if the NPE is not caused by a null parameter? Showing this message would be misleading.
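To make the second point concrete, here is a small sketch of a UDF-like function that throws a `NullPointerException` even though its argument is not null. The object, the lookup table, and the function name are all hypothetical and exist only for illustration.

```scala
object MisleadingNpeSketch {
  // Hypothetical lookup table; the key "de" is deliberately missing.
  private val capitals = new java.util.HashMap[String, String]()
  capitals.put("fr", "Paris")

  // The argument is non-null, but java.util.HashMap.get returns null
  // for an absent key, so calling .length on the result throws a
  // NullPointerException inside the function body.
  val capitalLength: String => Int = code => capitals.get(code).length
}
```

Calling `MisleadingNpeSketch.capitalLength("de")` fails with an NPE, and a message telling the user to handle null parameters would point them at the wrong cause.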
There are two branches to execute a UDF.
    s".apply($funcTerm.apply(${funcArguments.mkString(", ")}));"
  val getFuncResult = s"$funcTerm.apply(${funcArguments.mkString(", ")})"
  val rethrowException = "throw new org.apache.spark.SparkException" +
    """("Exception happens when execute user code in Scala UDF.", e);"""
`Exception happens when executing user defined function (className: input argument type => output argument type)`
Or
`Failed to execute user defined function (className: input argument type => output argument type)`
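As a rough illustration of the second wording, the sketch below builds such a message and wraps the original failure as the cause instead of rewriting its stack trace. It runs without Spark, so `UdfExecutionException` is a hypothetical stand-in for `org.apache.spark.SparkException`, and `udfErrorMessage` and `evalUdf` are assumed helper names, not part of the PR.

```scala
object UdfErrorSketch {
  // Hypothetical stand-in for org.apache.spark.SparkException.
  class UdfExecutionException(message: String, cause: Throwable)
    extends RuntimeException(message, cause)

  // Builds a message in the suggested shape:
  // "Failed to execute user defined function (className: (inputs) => output)"
  def udfErrorMessage(className: String, inputTypes: Seq[String], outputType: String): String =
    "Failed to execute user defined function " +
      s"($className: (${inputTypes.mkString(", ")}) => $outputType)"

  // Runs the user function and wraps any exception, keeping the
  // original exception (and its untouched stack trace) as the cause.
  def evalUdf[A, B](f: A => B, input: A, inputType: String, outputType: String): B =
    try {
      f(input)
    } catch {
      case e: Exception =>
        val msg = udfErrorMessage(f.getClass.getName, Seq(inputType), outputType)
        throw new UdfExecutionException(msg, e)
    }
}
```

Because the message only states where the failure happened and what the function's signature is, it makes no claim about why the user code failed; the cause chain carries that information.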
Test build #64829 has finished for PR 14850 at commit
Test build #64833 has finished for PR 14850 at commit
Test build #64840 has finished for PR 14850 at commit
Test build #64845 has finished for PR 14850 at commit
Test build #64889 has finished for PR 14850 at commit
+1
thanks for the review, merging to master!
also backport it to 2.0
…F execution

## What changes were proposed in this pull request?

If `ScalaUDF` throws exceptions during executing user code, sometimes it's hard for users to figure out what's wrong, especially when they use Spark shell. An example

```
org.apache.spark.SparkException: Job aborted due to stage failure: Task 12 in stage 325.0 failed 4 times, most recent failure: Lost task 12.3 in stage 325.0 (TID 35622, 10.0.207.202): java.lang.NullPointerException
	at line8414e872fb8b42aba390efc153d1611a12.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$2.apply(<console>:40)
	at line8414e872fb8b42aba390efc153d1611a12.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$2.apply(<console>:40)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
	...
```

We should catch these exceptions and rethrow them with better error message, to say that the exception is happened in scala udf.

This PR also does some clean up for `ScalaUDF` and add a unit test suite for it.

## How was this patch tested?

the new test suite

Author: Wenchen Fan <[email protected]>

Closes #14850 from cloud-fan/npe.

(cherry picked from commit 8d08f43)
Signed-off-by: Wenchen Fan <[email protected]>
What changes were proposed in this pull request?
If `ScalaUDF` throws an exception while executing user code, it is sometimes hard for users to figure out what went wrong, especially when they use the Spark shell. An example is the bare `java.lang.NullPointerException` stack trace quoted in the commit message above. We should catch these exceptions and rethrow them with a better error message that says the exception happened in a Scala UDF.
This PR also does some cleanup for `ScalaUDF` and adds a unit test suite for it.
How was this patch tested?
The new test suite.