Description
Details
Subject | Repo | Branch | Lines +/- | |
---|---|---|---|---|
Use Sanitizer::stripAllTags() when generating notification snippets | mediawiki/extensions/DiscussionTools | master | +598 -596 |
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Resolved | matmarex | T219138 TemplateStyles CSS appears in notification text | |||
Resolved | TheDJ | T228856 RemexStripTagHandler should strip <style> contents |
Event Timeline
I think this is a bug in RemexStripTagHandler (called via EchoMentionPresentationModel::getBodyMessage() -> EchoDiscussionParser::getTextSnippet() -> Sanitizer::stripAllTags()) which IMO tends to be used as a rough equivalent of Element.innerText and should behave somewhat along those lines (specifically, hidden elements should be ignored).
Maybe not bug as such, but poor DX. Made that a separate task: T228856: RemexStripTagHandler should strip <style> contents
Presumably resolved via the subtask! I'm assuming you're not planning formal QA for this task, so I'll just close it.
It looks like you've fixed the bug for notification snippets generated by Echo (we should confirm it's definitely fixed), but we have a similar bug in DiscussionTools.
Code here generates snippets for topic subscriptions and some user talk page notifications (but not for mentions): https://github.com/wikimedia/mediawiki-extensions-DiscussionTools/blob/master/includes/Notifications/EventDispatcher.php#L211
And we have tests that actually confirm that content from <style> tags is included:
I think that method is fine, EchoEditUserTalkPresentationModel::getBodyMessage() should sanitize the text snippet like EchoMentionPresentationModel::getBodyMessage() does. We should probably review all the presentation models to make sure they do that.
DiscussionTools only stores the plain text in the notification data, so that's too late. (Echo stores wikitext there.)
Change 761719 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):
[mediawiki/extensions/DiscussionTools@master] Use Sanitizer::stripAllTags() when generating notification snippets
Right, so same issue. DiscussionTools unwraped the HTML using XML DOM somewhere, and gets the textContent, and does this BEFORE passing it to Echo. As XML DOM doesn't know of Html specifics, it gets the textContents of style and script tags, which textContent on an html element does not do.
The following classes override EchoEventPresentationModel::getBodyMessage():
- EchoEditUserTalkPresentationModel: needs fix
- EchoEmailUserPresentationModel: body generated from plaintext email so no issue
- EchoForeignPresentationModel: no free text
- EchoMentionInSummaryPresentationModel: uses Sanitizer::stripAllTags()
- EchoMentionPresentationModel: uses EchoDiscussionParser::getTextSnippet() which uses Sanitizer::stripAllTags()
- EchoRevertedPresentationModel: uses EchoDiscussionParser::getTextSnippetFromSummary() which uses Sanitizer::stripAllTags()
- EchoUserRightsPresentationModel: uses EchoDiscussionParser::getTextSnippet()
- EchoWatchlistChangePresentationModel: no free text
These are the core ones, there is plenty more in extensions.
Two unrelated remarks:
- many events show a summary (e.g. edit summary, log summary) in the notification body. It seems inconsistent whether that gets (partially) parsed via Linker::formatComment() or treated as plaintext.
- EchoFlyoutFormatter, EchoHtmlEmailFormatter and EchoPlainTextEmailFormatter calls $echoEventPresentationModel->getBodyMessage()->parse(). SpecialNotificationsFormatter calls $echoEventPresentationModel->getBodyMessage()->escaped(). That can't be good.
textContent always returns all tag content. innerText doesn't but that method's specific to the HTML DOM (and Dodo has apparently not implemented it).
Echo apparently also stores plaintext (parsed and stripped wikitext) - the events are generated in EchoDiscussionParser::generateEventsForRevision() which calls to EchoDiscussionParser::detectSectionTitleAndText() which uses getTextSnippet(). So once similar logic is added in DiscussionTools we should be good.
Change 761719 merged by jenkins-bot:
[mediawiki/extensions/DiscussionTools@master] Use Sanitizer::stripAllTags() when generating notification snippets