Closed
Description
Given the following JSON file:
$ cat /tmp/sample.json
{ "id": 1, "name": "Alice" }
{ "id": 2, "name": "Bob" }
{ "id": 3, "name": "Carol" }
{ "id": 4, "name": "Dave" }
Using to-avro on the master branch to convert this file into Avro fails with an NPE:
$ git branch -v
* master 47398be7 PARQUET-1375: Upgrade to Jackson 2.9.9 (#616)
$ mvn clean install -DskipTests
(snip)
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ parquet-cli ---
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.jar
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/pom.xml to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.pom
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-tests.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-tests.jar
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-runtime.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-runtime.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 14.769 s
[INFO] Finished at: 2019-06-12T23:52:57+09:00
[INFO] ------------------------------------------------------------------------
$ mvn dependency:copy-dependencies
(snip)
$ java -cp 'target/*:target/dependency/*' org.apache.parquet.cli.Main to-avro /tmp/sample.json -o /tmp/sample.avro
Unknown error
java.lang.RuntimeException: Failed on record 0
at org.apache.parquet.cli.commands.ToAvroCommand.run(ToAvroCommand.java:120)
at org.apache.parquet.cli.Main.run(Main.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.parquet.cli.Main.main(Main.java:177)
Caused by: java.lang.NullPointerException
at org.apache.avro.file.DataFileWriter.create(DataFileWriter.java:153)
at org.apache.avro.file.DataFileWriter.create(DataFileWriter.java:145)
at org.apache.parquet.cli.commands.ToAvroCommand.run(ToAvroCommand.java:112)
... 3 more
$ echo $?
1
But with the previous revision, it succeeds:
$ git checkout HEAD^
HEAD is now at 9d6fb45e PARQUET-1576 Bump Apache Avro to 1.9.0 (#638)
$ mvn clean install -DskipTests
(snip)
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ parquet-cli ---
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.jar
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/pom.xml to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT.pom
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-tests.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-tests.jar
[INFO] Installing /home/sekikn/repo/parquet-mr/parquet-cli/target/parquet-cli-1.12.0-SNAPSHOT-runtime.jar to /home/sekikn/.m2/repository/org/apache/parquet/parquet-cli/1.12.0-SNAPSHOT/parquet-cli-1.12.0-SNAPSHOT-runtime.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 15.822 s
[INFO] Finished at: 2019-06-12T23:57:04+09:00
[INFO] ------------------------------------------------------------------------
$ mvn dependency:copy-dependencies
(snip)
$ java -cp 'target/*:target/dependency/*' org.apache.parquet.cli.Main to-avro /tmp/sample.json -o /tmp/sample.avro
$ echo $?
0
$ java -cp 'target/*:target/dependency/*' org.apache.parquet.cli.Main head /tmp/sample.avro
{"id": 1, "name": "Alice"}
{"id": 2, "name": "Bob"}
{"id": 3, "name": "Carol"}
{"id": 4, "name": "Dave"}
Reverting the following code
public static Iterator<JsonNode> parser(final InputStream stream) {
  try(JsonParser parser = FACTORY.createParser(stream)) {
to
public static Iterator<JsonNode> parser(final InputStream stream) {
  try {
    JsonParser parser = FACTORY.createParser(stream);
seems to work.
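One possible explanation, not confirmed in this report, is that the try-with-resources form closes the parser as soon as parser() returns, before the caller has consumed the returned iterator. The sketch below illustrates that general pitfall using only the JDK. The class LazyCloseDemo and its methods are hypothetical stand-ins, not parquet-cli code, and a BufferedReader takes the place of the Jackson JsonParser. A closed BufferedReader throws on read, so the symptom here differs from the NPE above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class LazyCloseDemo {

  // Lazy iterator over lines; hypothetical stand-in for an iterator of parsed records.
  static Iterator<String> lines(final BufferedReader reader) {
    return new Iterator<String>() {
      private String buffered = null;
      private boolean done = false;

      @Override
      public boolean hasNext() {
        if (buffered == null && !done) {
          try {
            buffered = reader.readLine(); // the read happens here, not when the iterator is created
          } catch (IOException e) {
            throw new UncheckedIOException(e);
          }
          done = (buffered == null);
        }
        return buffered != null;
      }

      @Override
      public String next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        String result = buffered;
        buffered = null;
        return result;
      }
    };
  }

  // Mirrors the try-with-resources form: the reader is closed before the caller iterates.
  static Iterator<String> closedTooEarly(String text) {
    try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
      return lines(reader);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Mirrors the reverted form: the reader stays open, so the caller can iterate.
  static Iterator<String> leftOpen(String text) {
    BufferedReader reader = new BufferedReader(new StringReader(text));
    return lines(reader);
  }

  public static void main(String[] args) {
    Iterator<String> ok = leftOpen("a\nb\n");
    while (ok.hasNext()) {
      System.out.println("left open: " + ok.next());
    }
    try {
      closedTooEarly("a\nb\n").hasNext();
    } catch (UncheckedIOException e) {
      System.out.println("closed too early: " + e.getCause().getMessage());
    }
  }
}
```

In this sketch the left-open variant leaves closing the underlying resource to the caller, which is the trade-off the reverted code would also accept.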
cc [~Fokko] :)
Reporter: Kengo Seki / @sekikn
Assignee: Fokko Driesprong / @Fokko
Note: This issue was originally created as PARQUET-1596. Please see the migration documentation for further details.