Page MenuHomePhabricator

Rename help key to help-raw in HTMLForm and deprecate old key name
Open, LowPublicSecurity

Description

The HTMLForm library doesn't escape the input of the help key by default. Is this intended behavior?

For example, the following form descriptor using an interface message through wfMessage()->text() is an XSS vector.

$formDescriptor= [
  'simpletextfield' => [
     'label' => 'Simple Text Field',
     'class' => 'HTMLTextField',
     'help' => wfMessage( 'myextension-mymessage' )->text(), // not escaped by HTMLForm, XSS!
   ]
]

We should rename the key to be help-raw, so users know it is a raw key. We should mark the old key name deprecated but keep it around for a transition period.

Afterwards we should adjust phan-taint-check to make sure it recognizes the new key name.

If you are fixing this, you probably need to edit (the comment in) includes/htmlform/HTMLForm.php and includes/htmlform/HTMLFormField.php as well as mentioning the deprecation in RELEASE-NOTES

  • New help-raw param added, and old help param marked as deprecated.
  • Migrate all Wikimedia production code to use help-raw instead of help.
  • Make use of the help param cause a deprecation warning.
  • Drop support for use of help (in MW 1.44 or later).

Event Timeline

I believe this is intentional, as some help messages can contain small amounts of harmless html for formatting purposes. This is a pattern we'd likely try to correct though, if found in actual code. And we do that in various ways (see also T347742 et al). Otherwise it's generally incumbent upon an engineer or developer to have a good understanding the MediaWiki Messages API and its output modes.

I believe this is intentional, as some help messages can contain small amounts of harmless html for formatting purposes. This is a pattern we'd likely try to correct though, if found in actual code. And we do that in various ways (see also T347742 et al). Otherwise it's generally incumbent upon an engineer or developer to have a good understanding the MediaWiki Messages API and its output modes.

Is there a reason we can't make it use something like 'raw' to make it possible to not escape, I was wondering?

Is there a reason we can't make it use something like 'raw' to make it possible to not escape, I was wondering?

There isn't any technical reason, no. But it might break a bunch of existing help messages and irritate other developers who want this kind of flexibility. If we were going to go this route, we should just make this bug public and put up a gerrit change set for feedback and review.

I've always hated that the key is named help instead of help-raw or help-html. Its super confusing. I'd support making a new name for the key and deprecating the old name.

I've always hated that the key is named help instead of help-raw or help-html. Its super confusing. I'd support making a new name for the key and deprecating the old name.

+1. If nothing it would at least make it consistent with the naming scheme used in for example label/label-raw.

I've always hated that the key is named help instead of help-raw or help-html. Its super confusing. I'd support making a new name for the key and deprecating the old name.

+1. If nothing it would at least make it consistent with the naming scheme used in for example label/label-raw.

patch-welcome. If someone is willing to get started on that, we can definitely make this task public.

I'll give it a try. I need some experience with gerrit, and this looks like a good chance.

sbassett changed Author Affiliation from N/A to Wikimedia Communities.
sbassett edited projects, added SecTeam-Processed; removed Security-Team.
sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)".
sbassett changed the edit policy from "Custom Policy" to "All Users".
sbassett changed Risk Rating from N/A to Low.

Sorry, won't be able to work on this after all.

Sorry, won't be able to work on this after all.

No worries. Patches are always welcome but never an obligation. Raising the issue is in and of itself a contribution to MediaWiki :)

Bawolff renamed this task from HTMLForm doesn't escape the help key by default to Rename help key to help-raw in HTMLForm and deprecate old key name.Feb 17 2024, 9:07 PM
Bawolff added a project: good first task.
Bawolff updated the task description. (Show Details)

@AllUsernamesArePicked Can you plz share me the repo for this issue as i am new to this community answer i want to to contribute eagerly

@AllUsernamesArePicked Can you plz share me the repo for this issue as i am new to this community answer i want to to contribute eagerly

HTMLForm is part of MediaWiki core.

@GauriGupta: Please always go to the code project listed under "Tags" in the sidebar of a task, that page links the code repository.

Change #1055600 had a related patch set uploaded (by Alejandro Alcaide; author: Alejandro Alcaide):

[mediawiki/core@master] Deprecate the 'help' key in form descriptors in favor of 'help-raw'

https://gerrit.wikimedia.org/r/1055600

Change #1055600 merged by jenkins-bot:

[mediawiki/core@master] Deprecate the 'help' key in form descriptors in favor of 'help-raw'

https://gerrit.wikimedia.org/r/1055600

We also need a patch to phan-taint-check to make sure the bew keynane is marked as xss risk.

We also need a patch to phan-taint-check to make sure the bew keynane is marked as xss risk.

Filed T372572.

I could take this on. The way I'm searching for deprecated usage is: 'help' => wfMessage; if I'm missing something, please let me know. Cheers and happy Thanksgiving for those who celebrate.

Change #1099259 had a related patch set uploaded (by CraigKahle; author: CraigKahle):

[mediawiki/core@master] updating form descriptors to use new 'help-raw' key over 'help' to indicate that it is a raw field.

https://gerrit.wikimedia.org/r/1099259

I added a topic branch to get started on this. Before I continue I wanted to ensure I'm on the right path. I am a new contributor and have a couple of questions:

  1. WebInstaller.php has some uses of 'help' => ... but looks like it has it's own custom form param handling within WebInstaller class (and any classes within MediaWiki\Installer namespace). Is that correct?
  2. In HTMLFormField.php, we want to use 'help' in this case because the OOUI\FieldLayout instance is expecting that array key.

I appreciate the patience while I grok the form building code. Cheers.

The way I'm searching for deprecated usage is: 'help' => wfMessage; if I'm missing something, please let me know.

It misses all kinds of things, e.g. 'help' => $this->msg(...) which is a common pattern due to IContextSource. Also, using double quotes or aligning array arrows is uncommon but happens. I'd go with ('help'|"help")\s+=>.

  1. WebInstaller.php has some uses of 'name' => ... but looks like it has it's own custom form param handling within WebInstaller class (and any classes within MediaWiki\Installer namespace). Is that correct?

I assume you meant 'help'. Yeah that doesn't use HTMLForm. WebInstaller also takes raw HTML for help which is maybe not ideal in terms of code consistency, but it also doesn't use user-defined data (you don't have users yet at that point) so security-wise not really a problem.

  1. In HTMLFormField.php, we want to use 'help' in this case because the OOUI\FieldLayout instance is expecting that array key.

That uses help as an option array field for a class that isn't HTMLForm related, so beyond the scope of this task.

(HtmlForm classes mostly use key postfixes to clarify the type. OOUI classes escape everything, and expect raw HTML to be wrapped in HtmlSnippet. It's not great that we have those two conventions parallel in the codebase, but in isolation they both work well, and they are both used a lot so replacing one of them with the other probably isn't really feasible.)

Tagging a parameter migration in one of the most used MediaWiki classes as good first task is a bit ambitious IMO. There are ~50 hits for ('help'|"help")\s+=> in Wikimedia production; someone unfamiliar with the codebase would have to trace every one of those through the call chain until they reach HTMLForm or something that's clearly not using HTMLForm.

Gergő - I really appreciate the detailed response and advice. With that information, I'm going to give it another shot.

If that proves to be problematic, I can step aside and let someone with more experience with the codebase take this on. Thanks again.

You should feel free to work on it, I just feel that tag sets wrong expectations. We usually use it for small self-contained changes. Deprecations across the whole codebase aren't really that.

Change #1100006 had a related patch set uploaded (by CraigKahle; author: CraigKahle):

[mediawiki/core@master] updating form descriptors to use new 'help-raw' key over 'help' to indicate that it is a raw field.

https://gerrit.wikimedia.org/r/1100006

Thanks Gergő. Understood; I would agree with that. After tracing through each result from the regex you provided (thanks), I am only seeing two direct uses of HTMLForm:

  • includes/specials/forms/UploadForm.php
  • includes/preferences/DefaultPreferencesFactory.php

SpecialMovePage.php and CheckMatrixWidget.php are both using OOUI directly.
LoginSignupSpecialPage.php is using 'help-message'.

The uses in /includes/auth are all handling requests, and not building forms, so I reverted those changes and squashed into a single commit. I am running MW locally, enabled uploads and checked that the top two updates are working. Parser tests are passing locally as well.

Thanks for taking a look. Looking at CheckMatrixWidget.php , it looks to be using OOUI ? If that isn't the case I can add to my feature branch. Which extensions should be updated? I am happy to take that on if it is in scope for this ticket.

Hey Gergő. Hope you are doing well. I wanted to check back in on this to ensure that CheckMatrixWidget.php needs updating. I will update the test later today.

Extensions dir is empty for me as I just cloned, but if there are other repos I should check out and make updates to, I am happy to do that. Thanks.

Extensions dir is empty for me as I just cloned, but if there are other repos I should check out and make updates to, I am happy to do that. Thanks.

An empty extensions directory is the default for MediaWiki core, yes. MediaWiki extensions and skins are deployed as git submodules within Wikimedia production. There are a few trivial ways of finding/searching for MediaWiki extensions, a couple being via a gerrit filter and anything prefixed with Extension: in codesearch. And it might make sense to focus on Wikimedia-deployed extensions, as those directly affect projects like Wikipedia.

Change #1100006 merged by jenkins-bot:

[mediawiki/core@master] updating form descriptors to use new 'help-raw' key over 'help'

https://gerrit.wikimedia.org/r/1100006

Change #1099259 abandoned by Bartosz Dziewoński:

[mediawiki/core@master] updating form descriptors to use new 'help-raw' key over 'help' to indicate that it is a raw field.

Reason:

Superseded by https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1100006

https://gerrit.wikimedia.org/r/1099259

Hello again. Thank you very much @sbassett for the explanation. I would love to complete this task, however someone with deeper knowledge of the extension submodule structure may be better equipped to complete this. Thanks again to everyone that assisted and commented.