osiegmar/JavaCsvComparison

Interpretation of the CSV Standard by Java Libraries

Unfortunately, there is no real industry standard for CSV files. The closest thing we currently have (since 2005) is RFC 4180. Working with non-standardized data often comes with surprises, but what exactly happens when a library parses CSV data that doesn't conform to this RFC?

This project sets out to find that out.
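To make "data that doesn't match this RFC" more concrete, here is a minimal sketch of a strict single-record parser that follows the RFC 4180 field grammar and rejects the most common deviations. The class and method names (`StrictRecordParser`, `parseRecord`) are hypothetical and not part of this project or of any compared library. The sketch also ignores record separators and control-character checks for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from this project or any compared library.
final class StrictRecordParser {

    // Parses one record (without its trailing line break) and throws
    // IllegalArgumentException when a field violates the RFC 4180 field grammar.
    static List<String> parseRecord(String record) {
        List<String> fields = new ArrayList<>();
        int n = record.length();
        int i = 0;
        while (true) {
            StringBuilder field = new StringBuilder();
            if (i < n && record.charAt(i) == '"') {
                i++; // skip the opening quote
                boolean closed = false;
                while (i < n) {
                    char c = record.charAt(i++);
                    if (c != '"') {
                        field.append(c);
                    } else if (i < n && record.charAt(i) == '"') {
                        field.append('"'); // "" inside a quoted field is an escaped quote
                        i++;
                    } else {
                        closed = true;
                        break;
                    }
                }
                if (!closed) {
                    throw new IllegalArgumentException("unclosed quoted field");
                }
                if (i < n && record.charAt(i) != ',') {
                    throw new IllegalArgumentException("data after closing quote at index " + i);
                }
            } else {
                while (i < n && record.charAt(i) != ',') {
                    char c = record.charAt(i++);
                    if (c == '"') {
                        throw new IllegalArgumentException("quote in unquoted field at index " + (i - 1));
                    }
                    field.append(c);
                }
            }
            fields.add(field.toString());
            if (i >= n) {
                return fields;
            }
            i++; // skip the field separator
        }
    }
}
```

With this sketch, `a,"b""c",` parses into three fields (the last one empty), while inputs such as `a,"b"c`, `a,b"c`, or `"abc` are rejected. Real libraries differ in precisely these cases: some throw, some silently repair the data, and some return something else entirely.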

Note

This project was created during the development of FastCSV. Since this comparison uses the result of FastCSV as a reference value (expected result), the comparison is highly biased.

Test results

The test suite currently consists of 60 tests. To pass a test, the library must return the expected result or, if the input is invalid, throw an exception. If the library lacks a feature required by a test (like skipping empty lines or comments), the test is marked as skipped. If a test is neither passed nor skipped, it is marked as failed.
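The classification rule above can be sketched as follows. This is only an illustration of the rule as described here, and all names (`CheckOutcome`, `Status`, `classify`) are hypothetical. The project's actual test harness may be structured differently. An empty `Optional` stands for "exception expected" on the expected side and "exception thrown" on the actual side.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of the pass/skip/fail rule; not the project's real harness.
final class CheckOutcome {

    enum Status { PASSED, SKIPPED, FAILED }

    // featureSupported: whether the library offers the feature the test relies on
    // expected: the reference records, or empty if the input is invalid
    // actual: the records the library returned, or empty if it threw an exception
    static Status classify(boolean featureSupported,
                           Optional<List<List<String>>> expected,
                           Optional<List<List<String>>> actual) {
        if (!featureSupported) {
            return Status.SKIPPED;
        }
        // Optional.equals is true when both are empty or both hold equal values.
        return expected.equals(actual) ? Status.PASSED : Status.FAILED;
    }
}
```

Under this rule, a library that returns data for invalid input fails the test, just like a library that throws on input the reference considers valid.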

Library Checks passed Checks skipped Checks failed
Commons CSV 48 12
CSVeed 28 7 25
FastCSV 60
Jackson CSV 49 7 4
Java CSV 51 7 2
Opencsv 38 17 5
picocsv 60
sesseltjonna-csv 30 17 13
SimpleFlatMapper 41 17 2
Super CSV 49 7 4
Univocity 51 7 2

A detailed report can be found at: https://osiegmar.github.io/JavaCsvComparison

The tests clearly show that there are significant differences between the libraries – especially when it comes to non-standardized data.

Execute

To run the tests and generate the report locally, execute the following command:

./gradlew test allureAggregateServe

Library Implementor?

The tests are written so that they can be easily adapted to other libraries and languages. They can also be used as unit tests for your own library, as is done in FastCSV.

Feel free to use the tests for your own project.
