Steps to reproduce:
- Run this command from the CLI: echo -e "bullet\x95bullet" | php edit.php unicode
- Go to the page named unicode
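The byte 0x95 is a bullet in Windows-1252 but a lone continuation byte in UTF-8, so the input above is not well-formed UTF-8. A minimal sketch for confirming that before filing or triaging, assuming iconv is installed; the helper name is_valid_utf8 is made up for illustration and is not part of MediaWiki:

```shell
# is_valid_utf8 (hypothetical helper): reads stdin and succeeds only if
# the bytes are well-formed UTF-8. Relies on iconv exiting non-zero when
# it meets an illegal input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# \225 is the octal form of byte 0x95. Octal escapes in the printf format
# string are portable, unlike the \x escapes of echo -e.
if printf 'bullet\225bullet' | is_valid_utf8; then
    echo "valid UTF-8"
else
    echo "invalid UTF-8"
fi
```

The same helper can be put in front of edit.php to reject bad input early, which sidesteps the bug but does not fix it.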
Expected behaviour:
One of the following: an exception (my preference), the page output with the invalid Unicode left as-is, or the page output with the invalid Unicode replaced by the replacement character (U+FFFD).
Actual behaviour:
The page content is treated as an empty string. (The skin and everything else render fine; only the page content is blank.)
Perhaps there is a Unicode regex somewhere that is returning false, which then gets cast to the empty string.
In my opinion, while we don't have to behave entirely sensibly in this situation, outputting nothing is very confusing, and the fact that we output nothing probably means we're ignoring an error somewhere. We should not ignore errors; we should throw a loud exception instead.
Originally reported at https://www.mediawiki.org/wiki/Topic:Vi3gbx8m6gjq6v1k
According to the report, this behaviour is new in 1.34 and was not present in 1.33.