@southernriver
Contributor

What changes were proposed in this pull request?

When we start spark-shell and use a UDF in the first statement, it works. For subsequent statements, however, the jar fails to load into the current classpath and a ClassNotFoundException is thrown. The class loader used for the first statement appears to differ from the one used for the later statements. In spark-shell, the maintained class loader is always IMainsTranslatingClassLoader, while for the addJar operation the current class loader is NonClosableMutuableclassLoader. For the first statement, the jar is loaded into the right class loader. For later statements, the UDF has already been registered in functionRegistry, so its jar is not reloaded into NonClosableMutuableclassLoader. We therefore need to reset the class loader to the active SparkSession's.

The difference between the first statement and the second is shown here.
For the first statement, the class loader is NonClosableMutuableclassLoader:
[image]

For the second statement, the class loader is IMainsTranslatingClassLoader:
[image]
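
The mismatch described above can be illustrated with a minimal, Spark-free sketch. It is not the patch itself: `MutableLoader`, `replLoader`, `withContextLoader`, and the `Add.marker` file are hypothetical stand-ins (a temp directory plays the role of the UDF jar, and a resource lookup plays the role of class loading). It only shows that something added to one loader is invisible through a sibling loader, and that temporarily switching the thread context class loader to the session's loader makes it visible again.

```scala
import java.net.{URL, URLClassLoader}
import java.nio.file.Files

// Hypothetical stand-in for a loader that accepts new URLs at runtime,
// which is what adding a jar to a running session requires.
class MutableLoader(parent: ClassLoader)
    extends URLClassLoader(Array.empty[URL], parent) {
  // Widen the protected addURL so callers can register URLs.
  override def addURL(url: URL): Unit = super.addURL(url)
}

object ClassLoaderMismatchSketch {
  // Runs `body` with `loader` as the thread context class loader,
  // then restores whatever loader was set before.
  def withContextLoader[T](loader: ClassLoader)(body: => T): T = {
    val thread = Thread.currentThread()
    val previous = thread.getContextClassLoader
    thread.setContextClassLoader(loader)
    try body finally thread.setContextClassLoader(previous)
  }

  // Looks a resource up through the loader currently set on this thread.
  def visible(name: String): Boolean =
    Thread.currentThread().getContextClassLoader.getResource(name) != null

  def main(args: Array[String]): Unit = {
    // A temp directory with one marker file stands in for the UDF jar.
    val dir = Files.createTempDirectory("udf-jar-stand-in")
    Files.write(dir.resolve("Add.marker"), Array.emptyByteArray)

    val base = getClass.getClassLoader
    // "Session" loader: the one the jar was actually added to.
    val sessionLoader = new MutableLoader(base)
    sessionLoader.addURL(dir.toUri.toURL)
    // "REPL" loader: a sibling that never saw the jar.
    val replLoader = new URLClassLoader(Array.empty[URL], base)

    println(withContextLoader(sessionLoader)(visible("Add.marker")))
    println(withContextLoader(replLoader)(visible("Add.marker")))
  }
}
```

Under these assumptions, the first lookup succeeds and the second does not, which mirrors the first-statement versus later-statement behaviour reported in this PR.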

Why are the changes needed?

The problem can be reproduced as follows.

scala> val res = spark.sql("select  bigdata_test.Add(1,2)").show()
+----------------------+
|bigdata_test.Add(1, 2)|
+----------------------+
|                     3|
+----------------------+

scala> val res = spark.sql("select  bigdata_test.Add(1,2)").show()
org.apache.spark.sql.AnalysisException: No handler for UDF/UDAF/UDTF 'scala.didi.udf.Add': java.lang.ClassNotFoundException: scala.didi.udf.Add; line 1 pos 8
   at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   at org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.createFunction(HiveShim.scala:251)
   at org.apache.spark.sql.hive.HiveSimpleUDF.function$lzycompute(hiveUDFs.scala:56)
   at org.apache.spark.sql.hive.HiveSimpleUDF.function(hiveUDFs.scala:56)
   at org.apache.spark.sql.hive.HiveSimpleUDF.method$lzycompute(hiveUDFs.scala:60)
   at org.apache.spark.sql.hive.HiveSimpleUDF.method(hiveUDFs.scala:59)
   at org.apache.spark.sql.hive.HiveSimpleUDF.dataType$lzycompute(hiveUDFs.scala:77)
   at org.apache.spark.sql.hive.HiveSimpleUDF.dataType(hiveUDFs.scala:77)
   at org.apache.spark.sql.hive.HiveSessionCatalog$$anonfun$makeFunctionExpression$3.apply(HiveSessionCatalog.scala:79)
   at org.apache.spark.sql.hive.HiveSessionCatalog$$anonfun$makeFunctionExpression$3.apply(HiveSessionCatalog.scala:71)
   at scala.util.Try.getOrElse(Try.scala:79)
   at org.apache.spark.sql.hive.HiveSessionCatalog.makeFunctionExpression(HiveSessionCatalog.scala:71)
   at org.apache.spark.sql.catalyst.catalog.SessionCatalog$$anonfun$org$apache$spark$sql$catalyst$catalog$SessionCatalog$$makeFunctionBuilder$1.apply(SessionCatalog.scala:1133)

After fix:

scala> val res = spark.sql("select  bigdata_test.Add(1,2)").show()
+----------------------+                                                       
|bigdata_test.Add(1, 2)|
+----------------------+
|                     3|
+----------------------+
 
scala> val res = spark.sql("select  bigdata_test.Add(1,2)").show()
+----------------------+
|bigdata_test.Add(1, 2)|
+----------------------+
|                     3|
+----------------------+

We should resolve this bug.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Manually.

@southernriver
Contributor Author

cc @cloud-fan @maropu @dongjoon-hyun

@southernriver
Contributor Author

@AmplabJenkins

@dongjoon-hyun
Member

ok to test

@maropu
Member

maropu commented Dec 15, 2019

Is this related to #23921? It seems #23921 takes a different approach from this one. cc: @HyukjinKwon

@SparkQA

SparkQA commented Dec 15, 2019

Test build #115349 has finished for PR 26888 at commit 7b24673.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

Yeah, this seems to be a duplicate.

@HyukjinKwon
Member

@southernriver can you explain why and when the class loaders are changed?

@cloud-fan
Contributor

Should IMainTranslatingClassLoader fall back to the Spark context class loader?

@HeartSaVioR
Contributor

This duplicates #23921, and IMHO #23921 is the clearer fix. As #23921 is quite old and has no test, I've taken it over and raised #27025.

@dongjoon-hyun
Member

According to the above discussion, I'll close this PR and SPARK-30260. Thank you, @southernriver and all!


7 participants