Improve support for non-alphanumeric characters in the interface for downloading User data
Closed, Resolved | Public | 2 Estimated Story Points | BUG REPORT

Description

What is the problem?

If the user's username has certain characters, the Special:CentralAuth link does not always work correctly.

Examples:

These are extreme examples, I grant, but apparently legitimate (i.e. not blocked) usernames with these characters do exist in the wild (e.g. https://en.wikipedia.org/wiki/Special:ListUsers?username=%2B+%2B&group=&wpsubmit=&wpFormIdentifier=mw-listusers-form&limit=500, https://en.wikipedia.org/wiki/Special:ListUsers?username=%22+...Nella+sua+meravigliosa+Luce+%22&group=&wpsubmit=&wpFormIdentifier=mw-listusers-form&limit=500).

Steps to reproduce problem
  1. Log in to a wiki as a user whose username contains a ", a + and/or an &
  2. Go to Special:Preferences
  3. Click the link which says "other Wikimedia projects where you have contributed"

Expected behavior: Takes you to the Special:CentralAuth page for your user
Observed behavior: The Special:CentralAuth page is for the wrong user
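
The failure mode can be illustrated with a minimal Python sketch. It is not MediaWiki code: the `example.org` URL, the `target` query parameter, and both helper names are assumptions made for illustration, and the real link may be built differently. It only shows how an unescaped + or & in a username changes what the receiving side parses.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical base URL and parameter name, chosen only for illustration.
BASE = "https://example.org/w/index.php?title=Special:CentralAuth&target="


def naive_link(username: str) -> str:
    """Build a link by pasting the raw username into the query string."""
    return BASE + username


def target_seen_by_server(url: str) -> str:
    """Return the 'target' value a standard query-string parser extracts."""
    params = parse_qs(urlsplit(url).query)
    return params.get("target", [""])[0]


# '+' is decoded as a space, so a different username is looked up.
print(target_seen_by_server(naive_link("Foo+Bar")))  # Foo Bar

# '&' ends the parameter early, so everything after it is lost.
print(target_seen_by_server(naive_link("Foo&Bar")))  # Foo
```

Under these assumptions the page would load for "Foo Bar" or "Foo" instead of the logged-in user, which is consistent with the observed behavior.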

Event Timeline

Change 666726 had a related patch set uploaded (by DLynch; owner: DLynch):
[mediawiki/core@master] URL encode the username passed to prefs-user-downloaddata-help-message

https://gerrit.wikimedia.org/r/666726

Change 666726 merged by jenkins-bot:
[mediawiki/core@master] URL encode the username passed to prefs-user-downloaddata-help-message

https://gerrit.wikimedia.org/r/666726
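
As a rough illustration of what "URL encode the username" means, here is a Python analogue of the idea. It is not the merged patch, which is in MediaWiki core; the `centralauth_link` helper, the `example.org` URL, and the `target` parameter are hypothetical.

```python
from urllib.parse import quote

# Hypothetical link prefix; the real message and URL layout may differ.
BASE = "https://example.org/w/index.php?title=Special:CentralAuth&target="


def centralauth_link(username: str) -> str:
    """Percent-encode the username before placing it in the URL.

    safe="" means no reserved characters are exempt from encoding,
    so '+', '&', '"' and '/' are all escaped.
    """
    return BASE + quote(username, safe="")


# The "+ +" username from the description no longer contains a raw '+'.
print(centralauth_link("+ +"))
```

With every reserved character escaped, the username can no longer be mistaken for URL syntax by whatever parses the link.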

I cannot reproduce the bug with the usernames in the description, nor with a few others (see below).

For each username, I checked that the HTML on Special:Preferences was not malformed (as it is in the screenshot) and that the CentralAuth link was correct (i.e. the CentralAuth user it displayed information for was the correct user).

Usernames tried:

  • Foo+="$%^*()-&'~ (from description)
  • Bar+=$%^*()-&'~ (from description)
  • Drw456!"$%&'()*+,-.=?\^ `~' (all ASCII punctuation characters MediaWiki considers valid)
  • äÂÉìÙ¨ÓÜÒãöíÝëßܳßëÛÂÓâÉÈß²äèúöò¸ÀþÚ¸ÿܯºÔÈïòÊíïÝ (characters outside the ASCII range)
  • Fooɇ𝕃 (high Unicode range: these characters take more than 2 bytes to encode)
  • Drw           ‎‏‗․‥…

‪‫‬‭‮ ‼‾⁇⁈⁉
    • (sample of Unicode punctuation characters we apparently support)
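
A check of this kind can be approximated offline with a round-trip test. The Python sketch below assumes the username travels in a `target` query parameter on a made-up `example.org` URL. It uses a subset of the usernames above and does not replace checking the rendered Special:Preferences page.

```python
from urllib.parse import quote, urlsplit, parse_qs

# Hypothetical URL layout, used only for this sketch.
BASE = "https://example.org/w/index.php?title=Special:CentralAuth&target="

# Subset of the usernames listed above, plus "+ +" from the description.
USERNAMES = [
    "Foo+=\"$%^*()-&'~",
    "Bar+=$%^*()-&'~",
    "Fooɇ𝕃",
    "+ +",
]


def round_trips(username: str) -> bool:
    """Encode the username into a URL, parse it back, and compare."""
    url = BASE + quote(username, safe="")
    parsed = parse_qs(urlsplit(url).query).get("target", [""])[0]
    return parsed == username


for name in USERNAMES:
    print(repr(name), "ok" if round_trips(name) else "MISMATCH")
```

A "MISMATCH" for any name would mean the link resolves to a different user than the one logged in.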

Test environments: