Page MenuHomePhabricator

[WikiSEO] Mark some wiki-seo-param-* messages as raw
Closed, ResolvedPublic

Description

In the parent security review,

wiki-seo-param-* was marked as could have raw html. This isn't true for all these messages so should be mark only some of them as raw and then escape the rest.

wiki-seo-param-title-description is the only one with raw html in english. The rest are either plain text or wikitext.
@Octfx: What's your thoughts?

Event Timeline

Aklapper renamed this task from mark some wiki-seo-param-* messages as raw to [WikiSEO] Mark some wiki-seo-param-* messages as raw.Apr 6 2022, 7:42 AM
Aklapper removed mmartorana as the assignee of this task.
Aklapper added a subscriber: mmartorana.

We could do that, but is it really necessary?

The default behaviour for a valid WikiSEO param is to strip all tags:

# InfoAction.php line 103
default:
	$content = sprintf( '%s', strip_tags( $value ) );
	break;

Side note: I don't know why I've placed the sprintf there, oh well

Shouldn't this be enough to mitigate code injection?
If this is not the case, then I'm all for marking them as raw.

@Octfx: I looked through and given only 1 description message includes rawHtml, it might make sense to mark them. I don't think the param message would ever include rawHtml.

We could mark the param message as ->escaped() and param-*-description as rawHtml.

Okay, I'll rebase my patch and do that once @Zabe's merges.

Change 777399 had a related patch set uploaded (by RhinosF1; author: RhinosF1):

[mediawiki/extensions/WikiSEO@master] InfoAction: fix how some messages are marked

https://gerrit.wikimedia.org/r/777399

RhinosF1 triaged this task as High priority.

Change 777399 merged by jenkins-bot:

[mediawiki/extensions/WikiSEO@master] InfoAction: fix how some messages are marked

https://gerrit.wikimedia.org/r/777399

@Octfx: Please request a dedicated project tag for WikiSEO if you plan to track WikiSEO issues (bug reports, feature requests) in Wikimedia Phabricator. Thanks!

We could do that, but is it really necessary?

It definitely buys a bit more security, per the config doc. And this has been a known attack vector for translatewiki.net and Wikimedia production.

Shouldn't this be enough to mitigate code injection?

In many cases, sure. But the Security-Team typically prefers a layered approach to security to further discourage potential malicious actors and avoid situations where only a single defense is used.

@Octfx: I looked through and given only 1 description message includes rawHtml, it might make sense to mark them. I don't think the param message would ever include rawHtml.

We should be a little careful about messages that will never contain html or are never supposed to contain html and messages that might contain html, because it would be perfectly reasonable to do so. If the messages here are the only ones that can ever or should ever reasonably contain html, then that's fine.

We could mark the param message as ->escaped() and param-*-description as rawHtml.

Well, running them through ->escaped() means they shouldn't be included in RawHtmlMessages since any message html will be escaped at output. So the wiki-seo-param-* messages included in RawHtmlMessages in the above change set should likely be removed if they'll always be passed to ->escaped() within onInfoAction.

Hi @sbasset, only the messages ending in -description should contain raw html and they are not passed to ->escaped(). The rest should not and are now escaped.