Page MenuHomePhabricator

Figure out a better solution for parsing & handling URL params for Special:MediaSearch in PHP
Open, Needs TriagePublic

Description

This is a follow-up ticket to T297828 (which has been dealt with for the moment at least). Filing this while the problem is still top-of-mind in case anyone wants to take a stab at refactoring at some point.

Special:MediaSearch needs to extract and parse URL query parameters in PHP, because some of these params need to be handled to set up the server-rendered search results page:

  • The type param (image, audio, video, etc) is very important and determines a lot of other behavior
  • Filter params (filemime, etc) need to be honored, but allowed values for a given filter will change based on the type. Some filters like filemime are supported across different type searchers provided the correct values are provided (filemime=png is invalid for type=video, etc).
  • Global mediawiki url params (debug=true, useskinversion=2, etc) need to be ignored but should be preserved in any links that are generated to change tabs, load more results, etc. The same goes for the optional title=Special:MediaSearch parameter that users will get if they search from the box on the Commons main page.
  • Users could show up on the page with completely arbitrary URL params & values that have been created for spammy or malicious purposes. See T297828 for examples. Attempting to parse URL params with array values (which we don't support or expect) was throwing errors previously.

The patch at https://gerrit.wikimedia.org/r/c/mediawiki/extensions/MediaSearch/+/747644 introduced a basic filter to discard any keys/values from the user's query string that are not simple strings. The URL is not actually changed but MediaSearch will only care about the filtered array when it sets up the page in PHP. Can this solution be improved? Are we stripping out valid values or handling any values that are potentially unsafe?

// Discard query param keys or values that are not strings to sanitize before using
$queryParams = array_filter( $this->getRequest()->getValues(), static function ( $v, $k ) {
    return is_string( $k ) && is_string( $v );
}, ARRAY_FILTER_USE_BOTH );

Relatedly, the mustache template for Special:MediaSearch has to give special treatment to sort and title params (the latter had to be added as part of T297877). Are there any other parameters that need to be preserved as the user changes tabs that we are not preserving currently?

Are there any best practices for how to handle arbitrary URL params in MediaWiki PHP that we should apply here?

Event Timeline

Don't forget to check if most of this is even necessary, as the mustache forms contains the full URL (including all parameters) in the action="…" already, and most of the following hidden fields might just be duplicates.

(Assuming this task is about the SDAW-MediaSearch code project, hence adding that project tag so other people who don't know or don't care about team tags can also find this task when searching via projects. Please set appropriate project tags when possible. Thanks!)