Description
We probably need to change one or two behaviors when reading invalid data or data in an unknown encoding. There are two cases: reading a Char, and things like readuntil/readline.
readuntil can be reasonably defined in terms of bytes: just read everything until a certain value. This is good because then you can at least get the data without explicit support for every encoding. Currently we might return an invalid UTF8String, from which you can get the (unaltered) data. I don't know whether that is the best approach. Maybe there should be a lower-level routine that returns a byte array. We also need functions that do the same for different fixed-width encodings (16-bit, 32-bit).
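To make the byte-level idea concrete, here is a minimal, language-neutral sketch in Python (not Julia's API). The names `read_until_bytes` and `read_until_units` are hypothetical, and the sketch assumes a buffered stream whose `read(n)` returns fewer than `n` bytes only at end of input. The first function returns the raw bytes up to and including the delimiter without decoding anything; the second does the same for fixed-width 16-bit or 32-bit code units.

```python
import io

def read_until_bytes(stream, delim: int) -> bytes:
    """Read raw bytes up to and including `delim`, or until EOF.

    No decoding happens, so invalid or unknown encodings pass through unaltered.
    """
    out = bytearray()
    while True:
        b = stream.read(1)
        if not b:          # EOF: return whatever was collected
            break
        out += b
        if b[0] == delim:
            break
    return bytes(out)

def read_until_units(stream, delim: int, width: int, byteorder: str = "little") -> bytes:
    """Same idea for fixed-width code units (width=2 for 16-bit, 4 for 32-bit).

    The delimiter is compared against whole units, so a delimiter byte that
    appears inside a wider unit does not end the read.
    """
    out = bytearray()
    while True:
        unit = stream.read(width)
        out += unit        # a trailing partial unit is kept unaltered
        if len(unit) < width:
            break
        if int.from_bytes(unit, byteorder) == delim:
            break
    return bytes(out)
```

For example, `read_until_bytes(io.BytesIO(b"ab\xff\ncd"), 0x0A)` hands back the line with the invalid byte `0xff` intact, leaving it to the caller to decide how (or whether) to decode it.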
I don't think reading a Char can be done reasonably without knowing the encoding. The best immediate change I can think of is to raise an error on invalid data when trying to read a UTF-8-encoded Char.
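A rough sketch of what "give an error for invalid data" could look like, again in Python rather than Julia and with a hypothetical name, `read_utf8_char`. It assumes the encoding is known to be UTF-8, works out the sequence length from the lead byte, and relies on Python's strict UTF-8 decoder to reject bad continuation bytes, overlong forms, and surrogates.

```python
import io

def read_utf8_char(stream) -> str:
    """Read exactly one UTF-8-encoded character, raising on invalid data."""
    lead = stream.read(1)
    if not lead:
        raise EOFError("no data left to read a character from")
    b0 = lead[0]
    # Number of continuation bytes implied by the lead byte.
    if b0 < 0x80:
        n = 0
    elif 0xC2 <= b0 <= 0xDF:
        n = 1
    elif 0xE0 <= b0 <= 0xEF:
        n = 2
    elif 0xF0 <= b0 <= 0xF4:
        n = 3
    else:
        raise ValueError(f"invalid UTF-8 lead byte 0x{b0:02x}")
    rest = stream.read(n)
    if len(rest) < n:
        raise ValueError("truncated UTF-8 sequence")
    # Strict decoding raises UnicodeDecodeError (a ValueError subclass)
    # on bad continuation bytes, overlong encodings, and surrogates.
    return (lead + rest).decode("utf-8")
```

One design question this sketch leaves open is how much input to consume on failure: here the continuation bytes have already been read when the error is raised, which may or may not be what a caller wants when trying to resynchronize.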