Sep 8 2017
Mar 24 2017
I'm now even more convinced that the problem is with the code that replaces 0x85 (incorrectly treated as NEL) with 0x0D+0x0A (CR+LF).
Indeed, 0xD1 0x0D (or 0xD1 0x0A), 0xD3 0x0D (or 0xD3 0x0A), etc. are malformed UTF-8 sequences: a lead byte such as 0xD1 or 0xD3 must be followed by a continuation byte in the range 0x80-0xBF.
Could it be related to the fact that the Unicode NEL (Next Line) character is U+0085?
Hence, in UTF-8 it should be encoded as the two bytes 0xC2 0x85, but some code that checks for newlines might mistakenly test for the single byte 0x85 instead, which also occurs as a continuation byte in other characters.
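
To illustrate the suspicion, here is a minimal Python sketch. It is not the actual code in question: both function names are hypothetical and only model the behavior I'm guessing at. The Cyrillic letter "х" (U+0445) is encoded as 0xD1 0x85, so a byte-level replacement of 0x85 turns it into exactly the malformed 0xD1 0x0D 0x0A described above.

```python
NEL_UTF8 = b"\xc2\x85"  # UTF-8 encoding of U+0085 (NEL)


def naive_normalize(data: bytes) -> bytes:
    """Hypothetical buggy version: treats every 0x85 byte as NEL."""
    return data.replace(b"\x85", b"\r\n")


def safe_normalize(data: bytes) -> bytes:
    """Hypothetical fix: replace only the full two-byte NEL sequence.

    Assumes the input is valid UTF-8, where 0xC2 is always a lead byte,
    so the pair 0xC2 0x85 can only ever be NEL.
    """
    return data.replace(NEL_UTF8, b"\r\n")


sample = "\u0445".encode("utf-8")  # Cyrillic "х", bytes D1 85

broken = naive_normalize(sample)   # D1 0D 0A: lead byte without continuation
intact = safe_normalize(sample)    # unchanged, still D1 85

try:
    broken.decode("utf-8")
    broken_is_valid = True
except UnicodeDecodeError:
    broken_is_valid = False
```

With the two-byte check, a real NEL such as the one in `"a\u0085b"` is still converted to CR+LF, while characters that merely contain 0x85 as a continuation byte are left alone.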