The character with UTF-8 encoding 0xCE 0x87 (U+0387) causes a "The supplied MD5 hash was incorrect" error when it is posted through https://www.wikidata.org/w/api.php?action=edit.
The warning in the response says "NFC-normalized Unicode without C0 control characters other than...", but the character looks valid, see: https://unicode-table.com/en/0387/
The issue can be reproduced with a 2-byte document containing only the bytes 0xCE 0x87.
Request:
POST https://www.wikidata.org/w/api.php?action=edit&bot=1&assert=bot&format=json&utf8=true&md5=b80d5a5d9193d69ce0b1009c31587da5&notminor&nocreate&basetimestamp=2017-04-24T17:57:35Z&starttimestamp=2017-04-24T17:57:35Z&title=Wikidata:Database%20reports%2FConstraint%20violations%2FP274 HTTP/1.1
Content-Type: multipart/form-data; boundary=---------------------------15841t4258657059076
Content-Length: 550
User-Agent: C++ WikiAPI
Host: www.wikidata.org
Connection: Keep-Alive
Cache-Control: no-cache
Cookie: WMF-Last-Access=24-Apr-2017; wikidatawikiUserName=KrBot; wikidatawikiSession=<cut>; forceHTTPS=true; wikidatawikiUserID=<cut>; centralauth_User=KrBot; centralauth_Token=<cut>; centralauth_Session=<cut>; WMF-Last-Access-Global=24-Apr-2017; GeoIP=<cut>

-----------------------------15841t4258657059076
Content-Disposition: form-data; name="text"
Content-Type: application/x-www-form-urlencoded

·
-----------------------------15841t4258657059076
Content-Disposition: form-data; name="summary"
Content-Type: application/x-www-form-urlencoded

update
-----------------------------15841t4258657059076
Content-Disposition: form-data; name="token"
Content-Type: application/x-www-form-urlencoded

<cut>+\
-----------------------------15841t4258657059076--
Response:
HTTP/1.1 200 OK
Date: Mon, 24 Apr 2017 17:57:36 GMT
Content-Type: application/json; charset=utf-8
Connection: keep-alive
Server: mw2221.codfw.wmnet
X-Powered-By: HHVM/3.12.14
X-Content-Type-Options: nosniff
Cache-control: private, must-revalidate, max-age=0
MediaWiki-API-Error: badmd5
X-Frame-Options: DENY
Vary: Accept-Encoding
Backend-Timing: D=<cut> t=<cut>
X-Varnish: <cut>, <cut>, <cut>
Via: 1.1 varnish-v4, 1.1 varnish-v4, 1.1 varnish-v4
Accept-Ranges: bytes
Age: 0
X-Cache: cp2019 pass, cp3033 pass, cp3041 pass
X-Cache-Status: pass
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Analytics: WMF-Last-Access=24-Apr-2017;WMF-Last-Access-Global=24-Apr-2017;https=1
X-Client-IP: <cut>
Content-Length: 565

{"error":{"code":"badmd5","info":"The supplied MD5 hash was incorrect.","*":"See https://www.wikidata.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes."},"warnings":{"edit":{"*":"The value passed for \"text\" contains invalid or non-normalized data. Textual data should be valid, NFC-normalized Unicode without C0 control characters other than HT (\\t), LF (\\n), and CR (\\r)."}},"servedby":"mw2221"}
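
The warning about non-normalized data hints at a possible cause. U+0387 (GREEK ANO TELEIA) is canonically equivalent to U+00B7 (MIDDLE DOT), so NFC normalization replaces the bytes 0xCE 0x87 with 0xC2 0xB7. If the server normalizes the text before verifying the md5 parameter (an assumption, not something the response confirms), a hash computed by the client over the original bytes can no longer match. A minimal Python sketch to check the mismatch locally (the helper name md5_hex is only illustrative):

```python
import hashlib
import unicodedata

# The 2-byte document from the report: UTF-8 encoding of U+0387.
raw = bytes([0xCE, 0x87])
text = raw.decode("utf-8")

# NFC replaces U+0387 with its canonical equivalent U+00B7.
normalized = unicodedata.normalize("NFC", text)
normalized_raw = normalized.encode("utf-8")


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


# Assumption: the server hashes the normalized text, the client the raw bytes.
print("sent bytes:      ", raw.hex(), md5_hex(raw))
print("normalized bytes:", normalized_raw.hex(), md5_hex(normalized_raw))
print("hashes match:    ", md5_hex(raw) == md5_hex(normalized_raw))
```

If this is indeed what happens, a client-side workaround would be to NFC-normalize the text before both hashing and sending it.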