
testreduce server fails inserting results in db in some cases
Closed, Resolved · Public

Description

About 150 tests fail in a test run (noticed this over the last month in rt-testing, where only 99.99% of tests succeed whereas it used to be 100% earlier). This has probably been a problem from the beginning of testreduce and something we haven't noticed. After about 5 failing test runs, these test titles are ignored and the successful test result % goes back to 100% again (and we forget about the failed titles).

Here is an example of a failing insert.

INSERT INTO results ( page_id, commit_hash, result ) VALUES ( 27797, '509155d5', '<testsuites>\n<testsuite name=\"Roundtrip article enwiki:Talk:List of awards and nominations received by Blackpink\">\n<testcase name=\"enwiki:Talk:List of awards and nominations received by Blackpink character 817\">\n<skipped type=\"insignificantWikitextDiff\">\n------\n== 2 new Apan awards ==\n\n++++++\n==2 new Apan awards==\n\n</skipped>\n</testcase>\n<testcase name=\"enwiki:Talk:List of awards and nominations received by Blackpink character 2440\">\n<skipped type=\"insignificantWikitextDiff\">\n------\n:::{{done}} &apos;&apos;&apos;&lt;span style=&quot;color:#f535aa&quot;&gt;—&lt;/span&gt; [[User:Paper9oll|&lt;span style=&quot;background:#f535aa; color:#fff; padding: 2px&quot;&gt;Paper9oll&lt;/span&gt;]] &lt;span style=&quot;color: #f535aa&quot;&gt;([[User_talk:Paper9oll|📣]] • [[Special:Contributions/Paper9oll|📝]])&lt;/span&gt;&apos;&apos;&apos; 15:29, 24 January 2021 (UTC)\n++++++\n:::{{done}} &apos;&apos;&apos;&lt;span style=&quot;color:#f535aa&quot;&gt;—&lt;/span&gt; [[User:Paper9oll|&lt;span style=&quot;background:#f535aa; color:#fff; padding: 2px&quot;&gt;Paper9oll&lt;/span&gt;]] &lt;span style=&quot;color: #f535aa&quot;&gt;([[User_talk:Paper9oll|📣]] • [[Special:Contributions/Paper9oll|📝]])&lt;/span&gt;&apos;&apos;&apos; 15:29, 24 January 2021 (UTC)\n\n</skipped>\n</testcase>\n</testsuite>\n<perfstats>\n<perfstat type=\"time:total\">1388.9099119901657</perfstat>\n<perfstat type=\"size:wt:raw\">43990</perfstat>\n<perfstat type=\"size:wt:gzip\">7607</perfstat>\n<perfstat type=\"size:html:raw\">2723</perfstat>\n<perfstat type=\"size:html:gzip\">1098</perfstat>\n</perfstats>\n</testsuites>' ) ON DUPLICATE KEY UPDATE id = LAST_INSERT_ID( id ), result = VALUES( result )

Event Timeline

Restricted Application added a subscriber: Aklapper.
ssastry triaged this task as Medium priority. Sep 11 2020, 11:27 PM

Three options:

  1. The field is declared as text, and maybe declaring as binary would fix this w/o further hard work
  2. We're truncating test results in a UTF-8-unsafe manner, like w/ JS surrogates
  3. The core parser is actually giving us bad UTF-8 due to some preexisting bug (like the formatNum bug I recently fixed) and so even though we're not doing anything "wrong" we still end up with bad UTF-8 in our output.
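
To make option 2 concrete, here is a minimal sketch of how truncating by UTF-16 code units (the way JS string indexing counts) can cut an emoji in half and leave a lone surrogate that is not valid UTF-8. It is written in Python purely for illustration; the helper names (`utf16_units`, `truncation_splits_pair`, `safe_truncate`) are hypothetical and not taken from testreduce, and whether testreduce truncates results this way at all is an assumption.

```python
def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text, i.e. what JS sees as string indices."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def truncation_splits_pair(text: str, max_units: int) -> bool:
    """True if cutting after max_units code units leaves a dangling high surrogate."""
    units = utf16_units(text)[:max_units]
    return bool(units) and 0xD800 <= units[-1] <= 0xDBFF


def safe_truncate(text: str, max_units: int) -> str:
    """Truncate to at most max_units code units without splitting a surrogate pair."""
    units = utf16_units(text)[:max_units]
    if units and 0xD800 <= units[-1] <= 0xDBFF:
        units = units[:-1]  # drop the orphaned high surrogate
    data = b"".join(u.to_bytes(2, "little") for u in units)
    return data.decode("utf-16-le")


if __name__ == "__main__":
    sample = "ab📣"  # 'a', 'b', then one non-BMP char stored as two code units
    print(len(utf16_units(sample)))
    print(truncation_splits_pair(sample, 3))
    print(repr(safe_truncate(sample, 3)))
```

If a check like this never fires on the failing titles, option 2 is probably not the cause.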

Looks like this may be an issue with the encoding we use for the database / field. Apparently, MySQL's utf8 only supports the BMP (characters of up to 3 bytes), so if emojis and other non-BMP characters are involved, as in the example in the description, the inserts fail. One option is to switch the encoding to utf8mb4 or, as @cscott says in option #1, maybe make it a binary field.
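
A quick way to check this theory against a failing result is to look for characters that need 4 bytes in UTF-8. The sketch below is illustrative Python, not testreduce code; `non_bmp_chars` and `fits_three_byte_utf8` are hypothetical helper names, and the sample string is a fragment of the result in the description.

```python
def non_bmp_chars(text: str) -> list[str]:
    """Return the characters in text that lie outside the Basic Multilingual Plane."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


def fits_three_byte_utf8(text: str) -> bool:
    """True if every character encodes to at most 3 bytes of UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


if __name__ == "__main__":
    fragment = "([[User_talk:Paper9oll|📣]] • [[Special:Contributions/Paper9oll|📝]])"
    for ch in non_bmp_chars(fragment):
        # Each of these needs 4 bytes, which a 3-byte-max charset cannot store.
        print(f"U+{ord(ch):04X}", len(ch.encode("utf-8")), "bytes")
    print(fits_three_byte_utf8(fragment))
```

The bullet (U+2022) in the fragment is inside the BMP and is not flagged; only the two emoji are.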

Change 663021 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid/testreduce@master] Switch encoding from utf8 to utf8mb4

https://gerrit.wikimedia.org/r/663021

Change 663021 merged by Arlolra:
[mediawiki/services/parsoid/testreduce@master] Switch encoding from utf8 to utf8mb4

https://gerrit.wikimedia.org/r/663021

ssastry claimed this task.