Page MenuHomePhabricator

Report implicit or explict usage of Message::__toString in the phan-taint-check-plugin
Open, Needs TriagePublic

Description

Some of the issues reported by taint are missing calls to convert a message object to string. In all the cases the default for message escaping of Message::__toString is used and that can result in double escaping reported by taint.

To find some issues in core I have removed Message::__toString locally and phan reports than issues like:

includes\EditPage.php:1211 PhanTypeMismatchArgument Argument 1 ($text) is $this->context->msg('missing-revision-content', $this->oldid, Message::plaintextParam($this->mTitle->getPrefixedText())) of type \Message|\MessageSpecifier|\Serializable but \OutputPage::parseAsInterface() takes string defined at includes\OutputPage.php:2064
includes\Status.php:211 PhanTypeMismatchArgumentInternal Argument 2 ($pieces) is $errors of type \Message[] but \implode() takes string[]
includes\Status.php:256 PhanTypeMismatchArgument Argument 3 ($params) is $s of type \Message|\MessageSpecifier|\Serializable but \Status::msgInLang() takes string|string[] defined at includes\Status.php:436
includes\Status.php:260 PhanTypeMismatchArgument Argument 3 ($params) is $wrapper of type \Message|\MessageSpecifier|\RawMessage|\Serializable but \Status::msgInLang() takes string|string[] defined at includes\Status.php:436
includes\Status.php:270 PhanTypeMismatchArgument Argument 3 ($params) is $s of type \Message|\MessageSpecifier|\RawMessage|\Serializable but \Status::msgInLang() takes string|string[] defined at includes\Status.php:436
includes\Status.php:274 PhanTypeMismatchArgument Argument 3 ($params) is $wrapper of type \Message|\MessageSpecifier|\RawMessage|\Serializable but \Status::msgInLang() takes string|string[] defined at includes\Status.php:436
includes\actions\RevertAction.php:171 PhanTypeMismatchReturn Returning type \Message but getPageTitle() is declared to return string
includes\actions\RevertAction.php:175 PhanTypeMismatchReturn Returning type \Message but getDescription() is declared to return string
includes\api\ApiFeedWatchlist.php:158 PhanTypeMismatchArgument Argument 1 ($title) is $errorTitle of type \Message|\MessageSpecifier|\Serializable but \FeedItem::__construct() takes \Title|string defined at includes\changes\FeedItem.php:59
includes\api\ApiFeedWatchlist.php:165 PhanTypeMismatchArgument Argument 1 ($title) is $errorTitle of type \Message|\MessageSpecifier|\Serializable but \FeedItem::__construct() takes \Title|string defined at includes\changes\FeedItem.php:59
includes\logging\PageLangLogFormatter.php:60 PhanTypeSuspiciousStringExpression Suspicious type \Message of a variable or expression used to build a string. (Expected type to be able to cast to a string)
includes\logging\PageLangLogFormatter.php:61 PhanTypeSuspiciousStringExpression Suspicious type \Message of a variable or expression used to build a string. (Expected type to be able to cast to a string)
includes\specials\SpecialAllMessages.php:115 PhanTypeMismatchArgument Argument 1 ($msg) is $this->msg('allmessagestext') of type \Message|\MessageSpecifier|\Serializable but \HTMLForm::setIntro() takes string defined at includes\htmlform\HTMLForm.php:754
includes\specials\SpecialBlock.php:692 PhanTypeMismatchReturn Returning type true but validateTargetField() is declared to return \Message
includes\specials\SpecialChangeCredentials.php:142 PhanTypeMismatchArgument Argument 3 ($contents) is $info['provider'] of type \Message|\MessageSpecifier|\Serializable but \Html::element() takes string defined at includes\Html.php:231
includes\specials\SpecialChangeCredentials.php:144 PhanTypeMismatchArgument Argument 3 ($contents) is $info['account'] of type \Message|\MessageSpecifier|\Serializable but \Html::element() takes string defined at includes\Html.php:231
includes\specials\SpecialChangeCredentials.php:200 PhanTypeMismatchArgument Argument 2 ($text) is $info['account'] of type \Message|\MessageSpecifier|\Serializable but \MediaWiki\Linker\LinkRenderer::makeLink() takes \HtmlArmor|null|string defined at includes\linker\LinkRenderer.php:153
includes\specials\SpecialContributions.php:396 PhanTypeSuspiciousStringExpression Suspicious type \Message of a variable or expression used to build a string. (Expected type to be able to cast to a string)
includes\specials\SpecialDeletedContributions.php:191 PhanTypeMismatchReturn Returning type \Message but getSubTitle() is declared to return string
includes\specials\SpecialPasswordPolicies.php:164 PhanTypeMismatchArgument Argument 1 ($list) is $flagMsgs of type non-empty-list<\Message>|non-empty-list<\MessageSpecifier>|non-empty-list<\Serializable> but \Language::commaList() takes string[] defined at languages\Language.php:3420
includes\specials\SpecialPasswordReset.php:176 PhanTypeSuspiciousStringExpression Suspicious type \Message of a variable or expression used to build a string. (Expected type to be able to cast to a string)
includes\specials\SpecialPasswordReset.php:180 PhanTypeSuspiciousStringExpression Suspicious type \Message of a variable or expression used to build a string. (Expected type to be able to cast to a string)
includes\specials\SpecialRecentChanges.php:766 PhanTypeMismatchArgument Argument 1 ($title) is $this->msg('rclistfromreset') of type \Message|\MessageSpecifier|\Serializable but \SpecialRecentChanges::makeOptionsLink() takes string defined at includes\specials\SpecialRecentChanges.php:728
includes\specials\SpecialUserLogout.php:66 PhanTypeMismatchArgument Argument 1 ($msg) is $this->msg('userlogout-continue') of type \Message|\MessageSpecifier|\Serializable but \HTMLForm::addHeaderText() takes string defined at includes\htmlform\HTMLForm.php:806
includes\specials\SpecialVersion.php:818 PhanTypeMismatchArgument Argument 2 ($text) is $this->msg('version-version', $vcsVersion) of type \Message|\MessageSpecifier|\Serializable but \Linker::makeExternalLink() takes string defined at includes\Linker.php:849
includes\specials\SpecialWatchlist.php:636 PhanTypeSuspiciousStringExpression Suspicious type \Message of a variable or expression used to build a string. (Expected type to be able to cast to a string)
includes\specials\SpecialWatchlist.php:669 PhanTypeSuspiciousStringExpression Suspicious type \Message of a variable or expression used to build a string. (Expected type to be able to cast to a string)
includes\specials\pagers\BlockListPager.php:319 PhanTypeSuspiciousStringExpression Suspicious type \Message of a variable or expression used to build a string. (Expected type to be able to cast to a string)

It is not possible to remove Message::__toString for backward compatibilty and for edge cases where stringify was missing. But in simple cases like building html with Html::element or such it should be enforced to use escaped or parse or similar to make the code easy to understand and make taint happy.

A idea on finding issues with taint/escaping easier is to let taint report the implicit or explict use of Message::__toString and inform to use a method of Message class.
I have no idea if it is possible with phan to detect issues like PhanTypeSuspiciousStringExpression or mismatch of types as if the __toString function is not present.
There are also some types documtend as Message|string which are okay and not implict calling __toString in the code.

If this is not possible with phan, I have no problem if it is declined.

Event Timeline

A overview how often it is happen that a explict convert is missing:

Daimona added a subscriber: Daimona.

I think it should be possible to catch this with a phan plugin, not taint-check though. We'd likely have to write one ad-hoc and put it in mediawiki-phan-config. Note that phan doesn't have a "clean" way to do this (in that it won't tell plugins that __toString is going to get called). So I think we have two options:

  • Check for the Message class in places where a string cast can happen (string encapsulation, concatenation, explicit cast, typehint, etc.)
  • Request for something similar to be implemented upstream. A "good" implementation would be to make phan able to report @deprecated annotations on __toString (currently doesn't work). Then we could:
    • Enable Deprecated* warnings in the base config file
    • Write a plugin implementing SuppressionCapability
    • Suppress any "deprecated" warning not coming from Message::__toString()
    • Replace the "deprecated" warning for Message::__toString() with something more specific (unsure if possible)

There are some changes to Message::toString - https://gerrit.wikimedia.org/r/c/mediawiki/core/+/681818

In some situation the function now emits deprecation warnings