Page MenuHomePhabricator

Does Language::convert() also do HTML escaping?
Closed, ResolvedPublic

Description

The phan-taint-check-plugin seems to be treating output of Language->convert() as already HTML escaped. Is this correct? I tried following the code but didn't see an obvious spot where the escaping happened.

It would be good to document this in the @return for this method.

Event Timeline

Legoktm created this task.Aug 22 2018, 7:43 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptAug 22 2018, 7:43 PM
Anomie added a subscriber: Anomie.Aug 23 2018, 6:41 PM

It seems to be intending to operate on an input HTML string and transform only bits that aren't part of HTML tags and aren't inside <script>...</script> or <style>...</style>. Assuming that works right and assuming there's no way to piece together an HTML tag from different bits, the output should be just as escaped as the input.

If it doesn't work right or there is a way to piece things together, that'd be a security bug.

Thanks, so to confirm, input to Language::convert() should already be escaped?

For the following code:

return $this->getLinkRenderer()->makeLink(
                        $title,
                        $wgContLang->convert( $title->getPrefixedText() )
                );

The correct way to do it would be:

# HTML safe
$text = htmlspecialchars( $title->getPrefixedText() );
# Still HTML safe
$converted = $wgContLang->convert( $text )
# Safe because HtmlArmor prevents double escaping
$link = $this->getLinkRenderer()->makeLink( $title, new HtmlArmor( $converted ) );

Is that right?

That seems right at first glance. I haven't walked through the code.

$wgContLang->convert( $title->getPrefixedText() ) specifically seems like it should be as safe as just $title->getPrefixedText(), since a title can't contain { or } so I think all ->convert() will do is convert alphabets. Again, though, I haven't looked through everything.

Change 456331 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/core@master] Document expected input and return value for Language::convert()

https://gerrit.wikimedia.org/r/456331

Change 456331 merged by jenkins-bot:
[mediawiki/core@master] Document expected input and return value for Language::convert()

https://gerrit.wikimedia.org/r/456331

matmarex removed a subscriber: matmarex.Aug 30 2018, 9:27 PM

Change 456836 had a related patch set uploaded (by Legoktm; owner: Brian Wolff):
[mediawiki/core@master] Add taint annotation and warnings to Language::convert() et al

https://gerrit.wikimedia.org/r/456836

Change 456836 merged by jenkins-bot:
[mediawiki/core@master] Add taint annotation and warnings to Language::convert() et al

https://gerrit.wikimedia.org/r/456836

Bawolff closed this task as Resolved.Sep 2 2018, 3:50 PM
Bawolff claimed this task.
sbassett triaged this task as Medium priority.Oct 15 2019, 7:25 PM