
MediaWiki:antispoof-unassigned is unclear to users
Open, Needs Triage, Public

Description

The message says "– (U+2013)" because it is an "unknown character".

01d32f4b5fdaf5cb.png (243×328 px, 6 KB)

"– (U+2013)" is not an "unknown character"; the fact you've given me a canonical definition means you don't think it's unknown either.

This was actually changed in 2022 via T94959.
It is still pretty unclear. Part of the problem seems to be that this message has to be distinguishable from the other messages, so that people can figure out why a character is not allowed. But the people who do that are mostly experienced wiki users, and that seems to have been treated as more important than giving understandable feedback to the end user.

Also, perhaps we should make this character recognized within AntiSpoof? En and em dashes are pretty common characters.

Event Timeline

AntiSpoof has a feature that detects mixed scripts and prevents the creation of user names that mix them.
It has a list of known scripts, and the range U+2000–U+206F "General Punctuation" is not in that list, which triggers this message.
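
The lookup described above can be sketched roughly as follows. This is a minimal Python illustration, not AntiSpoof's actual code: the table `KNOWN_RANGES`, the label `LATIN` and the helpers `lookup_script` and `describe` are hypothetical, and only the name SCRIPT_ASCII_PUNCTUATION comes from this task.

```python
import unicodedata

# Hypothetical, heavily abbreviated table of (first, last, label) ranges.
# Only SCRIPT_ASCII_PUNCTUATION is a name taken from this task; the rest
# is made up for illustration and is not AntiSpoof's real data.
KNOWN_RANGES = [
    (0x0021, 0x002F, "SCRIPT_ASCII_PUNCTUATION"),
    (0x0041, 0x005A, "LATIN"),
    (0x0061, 0x007A, "LATIN"),
]


def lookup_script(ch):
    """Return the label of the range containing ch, or None if no range matches."""
    cp = ord(ch)
    for first, last, label in KNOWN_RANGES:
        if first <= cp <= last:
            return label
    return None


def describe(ch):
    """Format a character as 'char (U+XXXX)', the form quoted in the task."""
    return f"{ch} (U+{ord(ch):04X})"


en_dash = "\u2013"
# Unicode knows the character by name, but the range table does not cover it.
print(unicodedata.name(en_dash), lookup_script(en_dash))
```

In this sketch the en dash has a Unicode name but falls outside every listed range, which is the situation the "unassigned" message reports.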

There are possibly several things to do in this task:

  • The wording of the message, which is an i18n issue; the last change was in 5d6157a82ff25e0df7e98878345ebf2ddd8f50cd, as mentioned in the task description
  • Adding the range to AntiSpoof as a script; mapping it to SCRIPT_ASCII_PUNCTUATION might be okay
  • There is no Equivset entry for dashes, hyphens, or minus signs

Equivset should be updated before the script is added, to avoid the creation of similar-looking user names.
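
To illustrate why the ordering matters, here is a minimal Python sketch of dash folding. The table `DASH_EQUIVALENTS` and the helpers `normalize` and `looks_alike` are hypothetical and do not reflect the real Equivset data or API; the sketch only assumes that all dash-like characters would be treated as equivalent to the ASCII hyphen-minus.

```python
# Hypothetical equivalence table: every dash-like character folds to "-".
# Illustration only; this is not data from the real Equivset.
DASH_EQUIVALENTS = {
    "\u2010": "-",  # HYPHEN
    "\u2011": "-",  # NON-BREAKING HYPHEN
    "\u2012": "-",  # FIGURE DASH
    "\u2013": "-",  # EN DASH
    "\u2014": "-",  # EM DASH
    "\u2212": "-",  # MINUS SIGN
}


def normalize(name):
    """Replace each dash-like character in name with the ASCII hyphen-minus."""
    return "".join(DASH_EQUIVALENTS.get(ch, ch) for ch in name)


def looks_alike(a, b):
    """Return True if two user names are identical after dash folding."""
    return normalize(a) == normalize(b)


# "Foo-Bar" written with an en dash versus the ASCII hyphen-minus.
print(looks_alike("Foo\u2013Bar", "Foo-Bar"))
```

Without such a mapping, allowing the General Punctuation range would let both spellings be registered as distinct user names.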