Invalid UTF-8 character in non-UTF-8 file is detected too early, so parsing cannot be continued #132

The following code example can be tested with the attached file (test8.csv). The file is ISO-8859 encoded and contains a UTF-8 character: é

``` java
// Open the attached ISO-8859 encoded file as a raw byte stream.
File file = new File("test8.csv");
InputStream in = Files.newInputStream(file.toPath(), StandardOpenOption.READ);

// Use the first row as the header and map each following row to a Map.
CsvSchema schema = CsvSchema.emptySchema().withHeader();
CsvMapper mapper = new CsvMapper();
ObjectReader reader = mapper.readerFor(Map.class).with(schema);
MappingIterator<Map<String, String>> mappingIterator = reader.readValues(in);

while (mappingIterator.hasNextValue()) {
    Map<String, String> line = mappingIterator.nextValue(); // the exception is thrown here
    System.out.println(line);
}
mappingIterator.close();
```

Parsing fails at line _152_, in the call to `nextValue()`, but the problematic UTF-8 character is in line _185_. So the parser does not fail at the position of the problematic character but much earlier (presumably because of buffering?).

I am asking because, if the parser failed at the exact position of the UTF-8 character, we could simply ignore that line and continue with the next one. As it is, the parser fails earlier and cannot recover or continue.
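Until the parser itself can report the error at the right line, one possible workaround is to validate the raw bytes line by line before handing them to the CSV parser. The following is only a sketch using the JDK, not Jackson: the class name `Utf8LineFilter`, its `Result` holder and the `filter` method are made up for illustration. It assumes rows are separated by `\n` (a trailing `\r` is kept as is) and that no quoted field contains an embedded line break.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Jackson: splits raw bytes on '\n' and
// strictly decodes each line as UTF-8, so one bad line cannot affect others.
public class Utf8LineFilter {

    /** Lines that decoded cleanly, plus the 1-based numbers of rejected lines. */
    public static final class Result {
        public final List<String> validLines = new ArrayList<>();
        public final List<Integer> rejectedLineNumbers = new ArrayList<>();
    }

    public static Result filter(byte[] data) {
        Result result = new Result();
        int lineNumber = 0;
        int start = 0;
        while (start < data.length) {
            // Find the end of the current line. The byte 0x0A never occurs
            // inside a UTF-8 multi-byte sequence, so this split is safe.
            int end = start;
            while (end < data.length && data[end] != '\n') {
                end++;
            }
            lineNumber++;
            try {
                // A fresh decoder that reports malformed input instead of replacing it.
                CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT);
                String line = decoder.decode(ByteBuffer.wrap(data, start, end - start)).toString();
                result.validLines.add(line);
            } catch (CharacterCodingException e) {
                // Only this line is rejected; decoding continues with the next one.
                result.rejectedLineNumbers.add(lineNumber);
            }
            start = end + 1;
        }
        return result;
    }

    public static void main(String[] args) {
        // Line 2 contains 0xE9 followed by 'e', which is not valid UTF-8.
        byte[] data = { 'a', ',', 'b', '\n', 'x', (byte) 0xE9, 'e', '\n', 'c', ',', 'd', '\n' };
        Result r = filter(data);
        System.out.println("valid lines: " + r.validLines);
        System.out.println("rejected line numbers: " + r.rejectedLineNumbers);
    }
}
```

The surviving lines could then be joined again and passed to the CSV parser through a `StringReader`, at the cost of reading the file twice in memory.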

The following parse exception is thrown:

``` java
java.io.CharConversionException: Invalid UTF-8 middle byte 0x65 (at char #4861, byte #3999): check content encoding, does not look like UTF-8
```
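Since the message asks to check the content encoding, another option (if the file really is ISO-8859-1 throughout) would be to decode the bytes explicitly instead of letting the parser treat the stream as UTF-8. The sketch below uses only JDK classes; `Latin1ReaderSketch`, `openAsLatin1` and `readFirstLine` are illustrative names. In the snippet above, such a `Reader` could presumably be passed to the `readValues(Reader)` overload of `ObjectReader` in place of the `InputStream`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: wraps a byte stream in a Reader with a fixed charset.
public class Latin1ReaderSketch {

    /** Decodes the stream as ISO-8859-1, where every byte maps to one character. */
    static Reader openAsLatin1(InputStream in) {
        return new InputStreamReader(in, StandardCharsets.ISO_8859_1);
    }

    /** Small demo helper: returns the first line of the data, decoded as ISO-8859-1. */
    static String readFirstLine(byte[] data) {
        try (BufferedReader r = new BufferedReader(openAsLatin1(new ByteArrayInputStream(data)))) {
            return r.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 0xE9 is 'é' in ISO-8859-1; followed by 'e' it would be malformed UTF-8.
        byte[] data = { 'c', 'a', 'f', (byte) 0xE9, 'e', '\n' };
        System.out.println(readFirstLine(data));
    }
}
```

This only helps when the whole file uses one known encoding; a file that mixes encodings would decode without errors but could still contain wrong characters.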

The problematic character in test8.csv can be found in the vi editor with `:goto 4861`.

[test8.csv.zip](https://github.com/FasterXML/jackson-dataformat-csv/files/434551/test8.csv.zip)

