$wgWellFormedXml = false; breaks our EditPage broken bot protection in edittoken
Closed, Resolved (Public)

Description

When $wgWellFormedXml = false; is set, our edit token field changes to:
<input type=hidden value=+\ name=wpEditToken>

The purpose of the trailing \ is to protect against broken bots that mishandle the \" sequence, treating it as a character escape.

We probably need to update Html.php so that it always double-quotes attribute values that end in a \.
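
To make the proposal concrete, here is a rough sketch of the quoting rule in JavaScript. It is not the actual Html.php code; `formatAttribute`, `escapeAttr` and `NEEDS_QUOTES` are hypothetical names, and the set of characters that force quoting is an assumption based on the HTML rules for unquoted attribute values.

```javascript
// Hypothetical sketch, not MediaWiki's Html.php.
// Characters that cannot appear in an unquoted HTML attribute value.
const NEEDS_QUOTES = /[\s"'=<>`]/;

// Minimal escaping for a double-quoted attribute value.
function escapeAttr(value) {
  return value.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

// Emit name=value without quotes when that is safe, but always
// double-quote values ending in a backslash so the output keeps
// the \" sequence the edit token relies on.
function formatAttribute(name, value) {
  const mustQuote =
    value === '' || NEEDS_QUOTES.test(value) || value.endsWith('\\');
  const escaped = escapeAttr(value);
  return mustQuote ? `${name}="${escaped}"` : `${name}=${escaped}`;
}

console.log(formatAttribute('type', 'hidden'));   // unquoted
console.log(formatAttribute('value', 'c02a+\\')); // quoted, ends in \"
```

With a rule like this, only values ending in \ pay the cost of the extra quotes; everything else keeps the short output.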


Version: unspecified
Severity: normal

Details

Reference
bz49232
bzimport raised the priority of this task from to Normal.
bzimport set Reference to bz49232.
bzimport added a subscriber: Unknown Object (MLST).

What is the actual issue here?

(In reply to comment #1)
> What is the actual issue here?

The \ at the end of our edit token is intended to be output into the page as \". It's a protection against badly written proxies: they strip the \, turning \" into " (which could break the content), so the token they send back no longer matches.

However, when $wgWellFormedXml = false; is set, the attribute changes to value=+\. The " is then no longer present, so the token no longer trips up the badly written proxies.
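
A small simulation of the above, as a sketch only. `brokenClientReadValue` is a hypothetical model of a badly written proxy or bot (it rewrites \" to " and then reads the value attribute with naive regexes); real broken clients may behave differently.

```javascript
// Hypothetical model of a broken client: it treats \" as an escaped
// quote, rewrites it to a plain ", then reads the value attribute.
function brokenClientReadValue(html) {
  const mangled = html.replace(/\\"/g, '"');
  const quoted = mangled.match(/value="([^"]*)"/);
  if (quoted) {
    return quoted[1];
  }
  // Unquoted value: runs until whitespace or ">".
  const unquoted = mangled.match(/value=([^\s>]*)/);
  return unquoted ? unquoted[1] : null;
}

const token = 'c02a+\\'; // the token as the server expects it: c02a+\

// Markup with quoted attributes (default) and unquoted attributes
// ($wgWellFormedXml = false).
const quotedHtml = `<input type="hidden" value="${token}" name="wpEditToken">`;
const unquotedHtml = `<input type=hidden value=${token} name=wpEditToken>`;

// Quoted: the client loses the backslash, so its token is rejected.
console.log(brokenClientReadValue(quotedHtml) === token);
// Unquoted: there is no \" to mangle, so the broken client gets through.
console.log(brokenClientReadValue(unquotedHtml) === token);
```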

Change 67603 abandoned by Hashar:
Always quote attribute values ending in a backslash

Reason:
Abandoning old change. Feel free to reopen if there is still an interest in getting this merged.

https://gerrit.wikimedia.org/r/67603

Change 67603 restored by Krinkle:
Always quote attribute values ending in a backslash

Reason:
If an unquoted attribute value ending in \ actually worked in modern browsers, and EditPage wanted older or broken browsers to work regardless, then it would make sense to do the quoting in EditPage. (The token backslash is mostly there to reject bots and scripts, so it makes sense to try to support browsers where possible, since clients are helpless otherwise.)

However that's not the case.

div = document.createElement('div');
div.innerHTML = '<input type=hidden name=token value=+\ />';
div.firstChild.value

"+"

div.innerHTML = '<input type=hidden value=c02a+\ name=wpEditToken>';
div.firstChild.value

"c02a+"

However, it does seem to work when the markup is parsed as part of a server response (instead of in a fragment), which is why third-party wikis disabling good ol' wgWellFormedXml doesn't result in a broken EditPage.

https://gerrit.wikimedia.org/r/67603
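
One thing that may be worth checking about the console test above: in a JavaScript string literal, a backslash followed by a space is just a space, so the backslash might never have reached the HTML parser. That could explain the difference from the server-response case. A quick sketch; `unquotedValue` is a hypothetical helper that only approximates how an unquoted attribute value ends (at whitespace or ">"), it is not a real HTML parser.

```javascript
// Approximation of reading an unquoted attribute value:
// it runs until whitespace or ">".
function unquotedValue(html) {
  const match = html.match(/value=([^\s>]*)/);
  return match ? match[1] : null;
}

// As typed in the console test: "\ " is an escape for a plain space,
// so the resulting string contains no backslash at all.
const asTyped = '<input type=hidden value=c02a+\ name=wpEditToken>';

// With the backslash itself escaped, it survives into the markup.
const escaped = '<input type=hidden value=c02a+\\ name=wpEditToken>';

console.log(asTyped.includes('\\'));  // is there a backslash in the string?
console.log(unquotedValue(asTyped));
console.log(escaped.includes('\\'));
console.log(unquotedValue(escaped));
```

If that is what happened, re-running the innerHTML test with `\\` in the string literal would be a fairer comparison with what the server sends.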

Bawolff closed this task as Resolved. May 23 2016, 12:07 AM
Bawolff claimed this task.
Bawolff added a subscriber: Bawolff.

Fixed by removing $wgWellFormedXml in https://gerrit.wikimedia.org/r/#/c/286495/