corrupted rev_content
Closed, Resolved · Public

Description

Our wiki was running MW 1.26 with unknown versions of Parsoid and StructuredDiscussions (SD) before I upgraded to MW 1.30, Flow REL1_30 9193b03, and Parsoid (0.9.0all) jessie-mediawiki 5fe6923ab.

The MW upgrade went fine. I then tested SD by entering a comment. The submission seemed to succeed, but after returning to the talk page I got:

An error has occurred while processing HTML/wikitext conversion.
Flow\Exception\WikitextException from line 374 of /var/www/arabdigitalexpression.org/wiki/mediawiki-1.30.0/extensions/Flow/includes/Conversion/Utils.php: Misplaced DOCTYPE declaration

htmlParseStartTag: misplaced <html> tag

htmlParseStartTag: misplaced <head> tag

htmlParseStartTag: misplaced <body> tag

For some reason this also affected the recent changes page, which kept displaying the same error.

The relevant record in table flow_revision had the following in rev_content:

<!DOCTYPE html>
<html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/"><head prefix="mwr: https://arabdigitalexpression.org/wiki/Special:Redirect/"><meta charset="utf-8"/><meta property="mw:pageNamespace" content="2600"/><meta property="mw:html:version" content="1.6.1"/><link rel="dc:isVersionOf" href="https://arabdigitalexpression.org/wiki/%D9%85%D9%88%D8%B6%D9%88%D8%B9%3AUacb1cpxtaerw7qw"/><title></title><base href="https://arabdigitalexpression.org/wiki/"/><link rel="stylesheet" href="//arabdigitalexpression.org/w/mw/load.php?modules=mediawiki.legacy.commonPrint%2Cshared%7Cmediawiki.skinning.content.parsoid%7Cmediawiki.skinning.interface%7Cskins.vector.styles%7Csite.styles%7Cext.cite.style%7Cext.cite.styles%7Cmediawiki.page.gallery.styles&amp;only=styles&amp;skin=vector"/><!--[if lt IE 9]><script src="//arabdigitalexpression.org/w/mw/load.php?modules=html5shiv&amp;only=scripts&amp;skin=vector&amp;sync=1"></script><script>html5.addElements('figure-inline');</script><![endif]--></head><body data-parsoid='{"dsr":[0,1181,0,0]}' lang="ar" class="mw-content-rtl sitedir-rtl rtl mw-body-content parsoid-body mediawiki mw-parser-output" dir="rtl"><section data-mw-section-id="0" data-parsoid="{}"><p data-parsoid='{"dsr":[0,102,0,0]}'>يا <span about="#mwt1" typeof="mw:Transclusion" data-parsoid='{"pi":[[{"k":"1"}]],"dsr":[3,30,null,null]}' data-mw='{"parts":[{"template":{"target":{"wt":"FlowMention","href":"./قالب:FlowMention"},"params":{"1":{"wt":"Lamya Magdy"}},"i":0}}]}'>@</span><a rel="mw:WikiLink" href="./مستخدم:Lamya_Magdy" title="مستخدم:Lamya Magdy" about="#mwt1" data-parsoid='{"stx":"piped","a":{"href":"./مستخدم:Lamya_Magdy"},"sa":{"href":"مستخدم:Lamya Magdy"}}'>Lamya Magdy</a> و <span about="#mwt2" typeof="mw:Transclusion" data-parsoid='{"pi":[[{"k":"1"}]],"dsr":[33,61,null,null]}' data-mw='{"parts":[{"template":{"target":{"wt":"FlowMention","href":"./قالب:FlowMention"},"params":{"1":{"wt":"أحمد السروجي"}},"i":0}}]}'>@</span><a rel="mw:WikiLink" 
href="./مستخدم:أحمد_السروجي" title="مستخدم:أحمد السروجي" about="#mwt2" data-parsoid='{"stx":"piped","a":{"href":"./مستخدم:أحمد_السروجي"},"sa":{"href":"مستخدم:أحمد السروجي"}}'>أحمد السروجي</a> شكرا على شغلكم في استكمال توثيق المناهج.</p>
</section></body></html>

After I manually edited the field by removing the <!DOCTYPE>, <html>, <head>, <body> and <section> tags, the discussion page and recent changes page returned to normal. However, commenting anywhere fails with messages like:

[94aa215d19afdd79f5d8eda7] Exception caught: Request to parsoid for "wikitext" to "html" conversion of content connected to title "Topic:Uaiwn3m09miqfsxk" failed: 406

Is "406" here the HTTP status code "Not Acceptable"? Does it have to do with my configuration?
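For reference, the trailing number in the exception message can be checked against the standard HTTP status codes. The following is a minimal sketch in Python, assuming the message always ends in "failed: " followed by a three-digit code, as in the example above; the function name `describe_failure` is hypothetical and not part of Flow or Parsoid.

```python
import re
from http import HTTPStatus


def describe_failure(message: str) -> str:
    """Extract a trailing HTTP status code from an error message and name it."""
    # Assumption: the message ends with "failed: NNN", as in the report above.
    match = re.search(r"failed: (\d{3})\s*$", message)
    if match is None:
        return "no HTTP status found"
    code = int(match.group(1))
    try:
        status = HTTPStatus(code)
    except ValueError:
        # The number is not a status code known to the standard library.
        return f"{code} (unknown status)"
    return f"{status.value} {status.phrase}"


message = (
    'Request to parsoid for "wikitext" to "html" conversion of content '
    'connected to title "Topic:Uaiwn3m09miqfsxk" failed: 406'
)
print(describe_failure(message))
```

In HTTP, 406 means the server could not produce a response matching what the client asked for in its Accept headers.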

It is worth mentioning that while writing the problematic comment I was switching back and forth between the rich-text and wikitext modes. Each time I returned from wikitext to rich-text mode, the text block I had previously entered in rich-text mode had become uneditable in the comment entry field.
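The manual cleanup of rev_content described above (removing the DOCTYPE and the html, head, body and section wrappers) can be sketched roughly as follows. This is only an illustration in Python, not what Flow does internally: the function name `strip_document_wrapper` is hypothetical, the whole head element is dropped along with its contents, and the regular expressions assume that no attribute value contains a ">" character, which holds for the record shown above but not for HTML in general.

```python
import re


def strip_document_wrapper(html: str) -> str:
    """Reduce a full HTML document to the content inside its body/section."""
    # Drop the DOCTYPE declaration.
    html = re.sub(r"<!DOCTYPE[^>]*>", "", html, flags=re.IGNORECASE)
    # Drop the whole <head> element, including everything inside it.
    html = re.sub(
        r"<head\b.*?</head\s*>", "", html, flags=re.IGNORECASE | re.DOTALL
    )
    # Drop the opening and closing html, body and section tags themselves,
    # keeping whatever they wrap.
    html = re.sub(
        r"</?(?:html|body|section)\b[^>]*>", "", html, flags=re.IGNORECASE
    )
    return html.strip()


document = (
    "<!DOCTYPE html>\n"
    '<html prefix="dc: http://purl.org/dc/terms/"><head><title></title></head>'
    '<body lang="ar" dir="rtl"><section data-mw-section-id="0">'
    "<p>text</p>\n</section></body></html>"
)
print(strip_document_wrapper(document))
```

Take a backup of the flow_revision row before writing any cleaned value back to the database.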

Event Timeline

Change 423820 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/Flow@master] Update Parsoid version in Accept header to 1.6.0

https://gerrit.wikimedia.org/r/423820

This patch should fix the 406 problem, but I'm not sure why you got a full document as opposed to just a body. @ssastry did Parsoid remove bodyonly support by any chance?

No, we still have body only in place (although we want to get rid of it -- T181657).

Change 423820 merged by jenkins-bot:
[mediawiki/extensions/Flow@master] Update Parsoid version in Accept header to 1.6.1

https://gerrit.wikimedia.org/r/423820
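
As an illustration of what the merged patch changes, the sketch below builds an Accept header that names an HTML content version and reads the version back out of it. The profile URL prefix, the header layout and both function names are assumptions made for this example; they are not taken from this task or verified against the Flow source.

```python
import re
from typing import Optional

# Assumed profile URL prefix; not confirmed by this task.
PROFILE_PREFIX = "https://www.mediawiki.org/wiki/Specs/HTML/"


def build_accept_header(version: str) -> str:
    """Build an Accept header value asking for a given HTML content version."""
    return f'text/html; charset=utf-8; profile="{PROFILE_PREFIX}{version}"'


def requested_version(accept_header: str) -> Optional[str]:
    """Return the x.y.z version named in the profile parameter, if any."""
    match = re.search(r'profile="[^"]*/(\d+\.\d+\.\d+)"', accept_header)
    return match.group(1) if match else None


header = build_accept_header("1.6.1")
print(header)
print(requested_version(header))
```

If the version a client requests this way cannot be served, a server may answer with 406, which would be consistent with the failure reported above.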