This repository was archived by the owner on Nov 15, 2024. It is now read-only.

Commit a758fd6

sujith71955 authored and MatthewRBruce committed
[SPARK-22601][SQL] Data load is getting displayed successful on providing non existing nonlocal file path
## What changes were proposed in this pull request?

When a user tries to load data from a non-existent HDFS file path, the system does not validate the path and the LOAD command reports success. This is misleading to the user. Validation already exists for a non-existent local file path. This PR adds the same validation for a non-existent HDFS file path.

## How was this patch tested?

A unit test has been added to verify the issue. Snapshots were also added after verifying the fix on a Spark YARN cluster.

Author: sujith71955 <[email protected]>

Closes apache#19823 from sujith71955/master_LoadComand_Issue.

(cherry picked from commit 16adaf6)
Signed-off-by: gatorsmile <[email protected]>
1 parent 32f1a61 commit a758fd6

2 files changed: 17 additions, 1 deletion

sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala

Lines changed: 8 additions & 1 deletion
@@ -333,7 +333,7 @@ case class LoadDataCommand(
         uri
       } else {
         val uri = new URI(path)
-        if (uri.getScheme() != null && uri.getAuthority() != null) {
+        val hdfsUri = if (uri.getScheme() != null && uri.getAuthority() != null) {
           uri
         } else {
           // Follow Hive's behavior:
@@ -373,6 +373,13 @@ case class LoadDataCommand(
           }
           new URI(scheme, authority, absolutePath, uri.getQuery(), uri.getFragment())
         }
+        val hadoopConf = sparkSession.sessionState.newHadoopConf()
+        val srcPath = new Path(hdfsUri)
+        val fs = srcPath.getFileSystem(hadoopConf)
+        if (!fs.exists(srcPath)) {
+          throw new AnalysisException(s"LOAD DATA input path does not exist: $path")
+        }
+        hdfsUri
       }

     if (partition.nonEmpty) {
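
The existence check added in the second hunk can be tried outside Spark. The following is a minimal sketch, assuming `hadoop-common` is on the classpath. The object `LoadPathExistenceSketch` and the helper `requireExists` are hypothetical names used only for illustration, and the sketch throws `java.io.FileNotFoundException` where the patch throws Spark's `AnalysisException`.

```scala
import java.io.FileNotFoundException
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical standalone sketch; not part of Spark.
object LoadPathExistenceSketch {

  // Returns the URI unchanged when the path exists on its file system,
  // otherwise fails with a message in the same style as the patch.
  def requireExists(uri: URI, hadoopConf: Configuration): URI = {
    val srcPath = new Path(uri)
    // Picks the FileSystem implementation from the URI scheme (or the default FS).
    val fs = srcPath.getFileSystem(hadoopConf)
    if (!fs.exists(srcPath)) {
      // Assumption: a plain FileNotFoundException stands in for AnalysisException here.
      throw new FileNotFoundException(s"LOAD DATA input path does not exist: $uri")
    }
    uri
  }

  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    try {
      requireExists(new URI("file:///doesnotexist.csv"), conf)
    } catch {
      case e: FileNotFoundException => println(e.getMessage)
    }
  }
}
```

Returning the URI from the helper mirrors the patch, where the `if`/`else` result is first bound to `hdfsUri`, validated, and only then returned from the enclosing branch.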

sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala

Lines changed: 9 additions & 0 deletions
@@ -1983,4 +1983,13 @@ class HiveDDLSuite
       }
     }
   }
+
+  test("load command for non local invalid path validation") {
+    withTable("tbl") {
+      sql("CREATE TABLE tbl(i INT, j STRING)")
+      val e = intercept[AnalysisException](
+        sql("load data inpath '/doesnotexist.csv' into table tbl"))
+      assert(e.message.contains("LOAD DATA input path does not exist"))
+    }
+  }
 }
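
The test passes `'/doesnotexist.csv'`, a path with neither a scheme nor an authority, so it appears to exercise the `else` branch of the `uri.getScheme() != null && uri.getAuthority() != null` condition in `LoadDataCommand`. The sketch below uses only `java.net.URI` to show how that condition classifies a few paths. The object `LoadPathUriSketch`, the helper `isFullyQualified`, and the host `namenode:8020` are hypothetical and exist only for this illustration.

```scala
import java.net.URI

// Hypothetical standalone sketch; not part of Spark.
object LoadPathUriSketch {

  // Mirrors the condition used in the patched branch: a path counts as
  // fully qualified only when both the scheme and the authority are present.
  def isFullyQualified(path: String): Boolean = {
    val uri = new URI(path)
    uri.getScheme() != null && uri.getAuthority() != null
  }

  def main(args: Array[String]): Unit = {
    val examples = Seq(
      "/doesnotexist.csv",                   // no scheme, no authority
      "hdfs://namenode:8020/data/input.csv", // scheme and authority
      "hdfs:/data/input.csv"                 // scheme only
    )
    examples.foreach { p =>
      println(s"$p -> fully qualified: ${isFullyQualified(p)}")
    }
  }
}
```

With the patch applied, both kinds of path reach the same existence check, because the check runs on the resulting `hdfsUri` after either branch.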
