
Set $wgWellFormedXml = false by default
Closed, Declined (Public)

Description

The main reason this setting was still true was screen-scraping bots (the same concern applied to $wgHtml5).

For some reason, when $wgHtml5 was finally enabled, we forgot to also get rid of $wgWellFormedXml = true.

We should disable it by default for the same reason we now enable $wgHtml5 by default.

This is mainly annoying because of the "&&" operator that recently appeared in the default HTML output, which, because of $wgWellFormedXml, forces an ugly CDATA section.
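
To illustrate why "&&" triggers the CDATA section: in well-formed XML a raw "&" or "<" is markup, so inline script text containing either has to be wrapped. The following is only a minimal sketch of that idea; inlineScript and its wellFormedXml parameter are hypothetical names made up for this illustration and are not taken from MediaWiki's code.

```javascript
// Hypothetical helper (an assumption for illustration, not MediaWiki's API):
// build an inline <script> element, optionally keeping it well-formed XML.
function inlineScript(contents, wellFormedXml) {
  // Raw "<" and "&" are markup characters in XML, so script text that
  // contains them must sit inside a CDATA section to stay well-formed.
  if (wellFormedXml && /[<&]/.test(contents)) {
    // The CDATA markers are placed inside JS comments so that an HTML
    // parser, which treats them as plain script text, still sees valid JS.
    contents = '/*<![CDATA[*/' + contents + '/*]]>*/';
  }
  return '<script>' + contents + '</script>';
}

const snippet = 'window.jQuery && jQuery.ready();';
console.log(inlineScript(snippet, true));  // wrapped in CDATA
console.log(inlineScript(snippet, false)); // emitted as-is
```

Under this sketch, a script without "&" or "<" is emitted unchanged even when the flag is true, which is why rewriting the snippet to avoid "&&" also avoids the CDATA wrapper.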


Version: 1.22.0
Severity: normal

Details

Reference
bz50040

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 1:53 AM
bzimport set Reference to bz50040.
bzimport added a subscriber: Unknown Object (MLST).

Related URL: https://gerrit.wikimedia.org/r/70036 (Gerrit Change I4155c74042d22527dc5c9460a4af8b7b1adb36cc)

You could just use <script>if(jQuery)jQuery.ready()</script>, it would be shorter than what you have currently: <script>window.jQuery && jQuery.ready();</script>

In any case, I don't think that CDATA is a particularly inefficient or ugly part of our HTML output, compared to, say,

mw.loader.implement("user.options",function(){mw.user.options.set({"ccmeonemails":0,"cols":80,"date":"default","diffonly":0,"disablemail":0,"disablesuggest":0,"editfont":"default","editondblclick":0,"editsection":1,"editsectiononrightclick":0,"enotifminoredits":0,"enotifrevealaddr":0,"enotifusertalkpages":1,"enotifwatchlistpages":0,"extendwatchlist":0,"fancysig":0,"forceeditsummary":0,"gender":"unknown","hideminor":0,"hidepatrolled":0,"imagesize":2,"justify":0,"math":0,"minordefault":0,"newpageshidepatrolled":0,"nocache":0,"noconvertlink":0,"norollbackdiff":0,"numberheadings":0,"previewonfirst":0,"previewontop":1,"rcdays":7,"rclimit":50,"rememberpassword":0,"rows":25,"searchlimit":20,"showhiddencats":false,"showjumplinks":1,"shownumberswatching":1,"showtoc":1,"showtoolbar":1,"skin":"vector","stubthreshold":0,"thumbsize":4,"underline":2,"uselivepreview":0,"usenewrc":0,"watchcreations":1,"watchdefault":0,"watchdeletion":0,"watchlistdays":3,"watchlisthideanons":0,"watchlisthidebots":0,
"watchlisthideliu":0,"watchlisthideminor":0,"watchlisthideown":0,"watchlisthidepatrolled":0,"watchmoves":0,"wllimit":250,"useeditwarning":1,"flaggedrevssimpleui":1,"flaggedrevsstable":0,"flaggedrevseditdiffs":true,"flaggedrevsviewdiffs":false,"vector-simplesearch":1,"vector-collapsiblenav":1,"usebetatoolbar":1,"usebetatoolbar-cgd":1,"aftv5-last-filter":null,"visualeditor-enable":1,"wikilove-enabled":1,"echo-subscriptions-web-page-review":true,"echo-subscriptions-email-page-review":false,"ep_showtoplink":false,"ep_bulkdelorgs":false,"ep_bulkdelcourses":true,"ep_showdyk":true,"echo-notify-show-link":true,"echo-show-alert":true,"echo-email-frequency":0,"echo-subscriptions-email-system":true,"echo-subscriptions-web-system":true,"echo-subscriptions-email-other":false,"echo-subscriptions-web-other":true,"echo-subscriptions-email-edit-user-talk":false,"echo-subscriptions-web-edit-user-talk":true,"echo-subscriptions-email-reverted":false,"echo-subscriptions-web-reverted":true,
"echo-subscriptions-email-article-linked":false,"echo-subscriptions-web-article-linked":false,"echo-subscriptions-email-mention":false,"echo-subscriptions-web-mention":true,"echo-subscriptions-web-edit-thank":true,"echo-subscriptions-email-edit-thank":false,"gettingstarted-task-toolbar-show-intro":true,"variant":"en","language":"en","searchNs0":true,"searchNs1":false,"searchNs2":false,"searchNs3":false,"searchNs4":false,"searchNs5":false,"searchNs6":false,"searchNs7":false,"searchNs8":false,"searchNs9":false,"searchNs10":false,"searchNs11":false,"searchNs12":false,"searchNs13":false,"searchNs14":false,"searchNs15":false,"searchNs100":false,"searchNs101":false,"searchNs108":false,"searchNs109":false,"searchNs446":false,"searchNs447":false,"searchNs710":false,"searchNs711":false,"searchNs828":false,"searchNs829":false,"gadget-teahouse":1,"gadget-ReferenceTooltips":1,"gadget-DRN-wizard":1,"gadget-charinsert":1,"gadget-mySandbox":1});},{},{});mw.loader.implement("user.tokens",function(){mw
.user.tokens.set({"editToken":"+\\","patrolToken":false,"watchToken":false});},{},{});

which is sent out with every anonymous page view.

(In reply to comment #2)

You could just use <script>if(jQuery)jQuery.ready()</script>, it would be
shorter than what you have currently: <script>window.jQuery &&
jQuery.ready();</script>

<script>if(window.jQuery)jQuery.ready()</script>, rather. The "window." is important here, as we want to keep going even if jQuery is not loaded and thus its variable is not defined.
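
The difference between the two forms can be sketched as follows. Reading an undeclared identifier throws a ReferenceError, while reading a missing property of the global object just yields undefined. The function names are made up for this illustration, and globalThis is used in place of window (an assumption, so the sketch also runs outside a browser; in a browser page both refer to the same global object).

```javascript
// Bare identifier: throws if jQuery was never declared anywhere.
function bareCheck() {
  try {
    if (jQuery) {
      return 'loaded';
    }
    return 'falsy';
  } catch (e) {
    // An undeclared variable surfaces here as a ReferenceError.
    return e.name;
  }
}

// Property access on the global object: a missing property is simply
// undefined, so execution continues without an exception.
function propertyCheck() {
  if (globalThis.jQuery) {
    return 'loaded';
  }
  return 'not loaded';
}

console.log(bareCheck());
console.log(propertyCheck());
```

In an inline script there is no try/catch around the bare form, so the exception would abort that script block, which is the failure the "window." prefix avoids.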

I've decided that I don't think this should be done, now or in the future. I think the general idea behind HTML's original SGML -> XML migration, i.e. to make parsers easier to write at the expense of brevity, continues to be sound and useful. The benefits of a reversion to SGML are small, and the costs can be expected to mirror bug 52253.

I note btw that the tidy config is also set to produce XHTML.

Change 70036 abandoned by Krinkle:
Set $wgWellFormedXml to false by default

Reason:
Agreed, per bug.

https://gerrit.wikimedia.org/r/70036