Page MenuHomePhabricator

Fix rare unsafe use of Message::parse() in paragraph wrapper
Closed, ResolvedPublic

Description

Exibit A
	Html::rawElement( 'p', [], $msg->parse() )

I found less than 10 instances of in all WMF production codebases, so this is definitely rare and more likely the result of refactoring or copy-paste mistakes than any general lack of testing or understanding. But, I figured I'd file this task to track those fixes, reference as place to explain the issue, and possibly to talk about ways to statically detect this if we think it's worthwile (eg. CodeSniffer or Phan).

Problem

Interface messages, especially those that render as a dedicated block-level paragraph on special pages, are often translated or overriden on-wiki to contain rich text that can produce one or more block-level elements.

Even simple translation values that merely write two lines of text instead of one line, would break in the above scenario. There would be two <p> elements which are then wrapped inside a hardoded <p> element. This is actually illegal in HTML5, and is interpreted in modern and standards-compliant browsers, as redundant void elements. This is similar to what happens when you attempt to nest <li>, <img> or <input>` albeit for slightly different spec reasons.

HTML5: <p> <p>foo</p> <p>bar</p> </p>
DOM:  <p></p> <p>foo</p> <p>bar</p> <p></p>

Another common way to break this, and why we rarely wrap Message->parse() in paragraphs manually, is when translations or on-wiki overrides introduce a list. In that case you'd have <ul> inside the list.

And lastly, the most common way to break it is on-wiki overrides that use templates to generate <div>, <table> and what not.

Mitigation

The Parser actually always produces block. Akin to Markdown, even a single line of plain text would generate at least a paragraph element. The vast majority of interface messages are used in a context where inline flow is expected, such as link labels, tooltip attribute values, list item values, heading values, etc.

As such, Message::parse() defaults to stripping the single outer paragraph away before returning the value. See the private Message::format() method for how this works.

Fortunately, the solution is quite straight-forward, and this is already what core and 99% of extensions do when rendering a message block — we can call Message::parseAsBlock() instead, which returns the parser output unmodified, as paragraph or as whatever other block(s) the message contains.

Event Timeline

Change 851736 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/extensions/Collection@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/851736

Change 851737 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/extensions/DiscussionTools@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/851737

Change 851738 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/extensions/FileImporter@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/851738

Change 851739 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/extensions/MassMessage@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/851739

Change 851739 merged by jenkins-bot:

[mediawiki/extensions/MassMessage@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/851739

Change 851738 merged by jenkins-bot:

[mediawiki/extensions/FileImporter@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/851738

Change 851736 merged by jenkins-bot:

[mediawiki/extensions/Collection@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/851736

Umherirrender renamed this task from Fix rare unafe use of Message::parse() in paragraph wrapper to Fix rare unsafe use of Message::parse() in paragraph wrapper.Nov 2 2022, 10:24 AM
Umherirrender updated the task description. (Show Details)

talk about ways to statically detect this if we think it's worthwile (eg. CodeSniffer or Phan).

CodeSniffer is not a static analysis tool, so I would recommend against using it for this purpose. With phan it should be easier, perhaps we could have a custom plugin in mediawiki-phan-config.

Change 851737 merged by jenkins-bot:

[mediawiki/extensions/DiscussionTools@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/851737

Change 853070 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/extensions/CentralNotice@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/853070

Found one more, in CentralNotice. It eluded my prior search as it still uses Xml::tags to build HTML rather than Html::element.

Change 853070 merged by jenkins-bot:

[mediawiki/extensions/CentralNotice@master] Avoid unsafe wrapping of Message::parse() into paragraph

https://gerrit.wikimedia.org/r/853070