
Allowing spaced slash after username or IP address in User or User talk namespace title is confusing
Open, Needs Triage | Public | BUG REPORT

Description

@Legoktm and I have been working on a Rust library that replicates TitleCodec::splitTitleString. I just implemented IPv4 address normalization and noticed an inconsistency in how the software handles subpages of User or User talk pages.

Inconsistency

If you try to visit User:1.1.1.01 or User:1.1.1.01 , you're sent to User:1.1.1.1 because of normalization. Good.

If you add a slash, the server lets you view User:1.1.1.01 /, which says "User account "1.1.1.01 " is not registered." No IP address normalization.

If, however, you visit User:Example /, when the page User:Example exists, you get a subpage navigation link up to User:Example. Similarly, User:Example / test gives you a link back to User:Example. If you create User:Example / test and link to User:Example/test, it'll show you a redlink if User:Example/test hasn't been created as well. I demonstrated this on English Wiktionary: User:Erutuon / test. Spaces are significant there, even though the subpage navigation link goes to the same user page, whether the slash after the username is spaced or unspaced.

The inconsistency here is that User:Example / gets MediaWiki to recognize that a user exists by that name, but User:1.1.1.01 / doesn't recognize that you'd get a valid IP user page if you remove the slash and therefore doesn't normalize the apparent IP address.
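The inconsistency can be reproduced with a small sketch. It assumes that IP normalization is attempted only when the entire title text (after the namespace prefix) is an IPv4 address. The names `normalize_ipv4` and `normalize_user_title` are hypothetical and are not MediaWiki functions; the real logic lives in PHP and handles many more cases.

```python
import re

# Four dot-separated groups of 1 to 3 ASCII digits (illustrative pattern only).
IPV4_RE = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})")


def normalize_ipv4(text):
    """Strip leading zeros from each octet; return None if text is not an IPv4 address."""
    match = IPV4_RE.fullmatch(text)
    if match is None:
        return None
    octets = [int(group) for group in match.groups()]
    if any(octet > 255 for octet in octets):
        return None
    return ".".join(str(octet) for octet in octets)


def normalize_user_title(text):
    """Assumed behavior: normalize only when the whole trimmed title is an IP address."""
    trimmed = text.strip(" ")
    normalized = normalize_ipv4(trimmed)
    return normalized if normalized is not None else trimmed
```

Under this assumption, `normalize_user_title("1.1.1.01")` gives `"1.1.1.1"`, while `normalize_user_title("1.1.1.01 /")` is returned unchanged because the trailing ` /` stops the text from matching as an IP address.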

Suggested change

A better behavior would be for TitleCodec::splitTitleString to remove spaces around the first slash in the User or User talk namespace title. A slash directly after the username is really just a directory separator, and there shouldn't be a distinction between the title that has a space before or after the first slash and the title that has no spaces around the slash. Such a distinction invites confusion.

Slashes later in user page titles, on the other hand, can legitimately be intended to mean "or", and spaces around them should probably be kept, even though MediaWiki interprets them as subpage separators. Removing them would cause frustration. For example, someone might create a version of the English Wikipedia article Aoraki / Mount Cook in a user subpage (User:Example/Aoraki / Mount Cook) and would not want the spaces around the slash automatically removed (-> User:Example/Aoraki/Mount Cook). So it's probably best not to normalize away spaces around any slashes after the first.
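A minimal sketch of the suggested rule, assuming it runs on the title text after the User or User talk namespace prefix has been split off. The function name is hypothetical; this is not how TitleCodec::splitTitleString is written, only an illustration of the intended result.

```python
def strip_spaces_around_first_slash(text):
    """Remove spaces directly before and after the first slash only.

    Later slashes, and the spaces around them, are left untouched.
    """
    root, slash, rest = text.partition("/")
    if not slash:
        # No slash at all: nothing to normalize.
        return text
    return root.rstrip(" ") + "/" + rest.lstrip(" ")
```

With this rule, `"Example / test"` becomes `"Example/test"`, while `"Example/Aoraki / Mount Cook"` stays as it is because only the first slash is considered.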

This would be a breaking change, and quite a lot of userpage titles with _/ or /_ or _/_ directly after the username would have to be moved. For instance, this query of the English Wikipedia page table contains quite a few cases.
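To estimate which existing titles would need to be moved, a check along these lines could be run over titles in database form (underscores instead of spaces). This is only a sketch with a hypothetical function name; it is not the query linked above.

```python
import re

# Matches when the first slash in the title has an underscore directly
# before it or directly after it. [^/]* cannot cross a slash, so only the
# first slash is ever examined.
SPACED_FIRST_SLASH_RE = re.compile(r"[^/]*(?:_/|/_)")


def has_spaced_first_slash(db_title):
    """Return True if the title has _/ or /_ or _/_ at its first slash."""
    return SPACED_FIRST_SLASH_RE.match(db_title) is not None
```

Titles such as `Example_/_test` would be flagged, while `Example/Aoraki_/_Mount_Cook` would not, since its first slash is unspaced.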

Event Timeline

If we implement this normalization, I'm not sure it should be limited to the user namespace rather than applied to any namespace that has subpages enabled (e.g. https://test.wikipedia.org/wiki/Talk:Foo_/_bar). The case of "Aoraki / Mount Cook" is interesting; I wonder if that's better fixed by {{DISPLAYTITLE:...}}. (Note that if we limit this to namespaces with subpages, it won't affect mainspace anyway. It would affect a potential User namespace draft, though.)

My current thought is that the root part should go through full normalization, and each subpage part should get whitespace normalization.

So User:1.1.1.01 / becomes User:1.1.1.1/, User:Example / test becomes User:Example/test. Aoraki / Mount Cook is weird, because while the mainspace page stays the same, the talk page is actually Talk:Aoraki/Mount Cook. So maybe we should do this subpage normalization regardless of whether the namespace has subpages enabled or not.
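The per-part idea can be sketched as follows. Here "full normalization" of the root is reduced to trimming spaces and stripping leading zeros from IPv4 octets, and "whitespace normalization" of a subpage part is reduced to trimming leading and trailing spaces; both are simplifying assumptions, and the function names are hypothetical.

```python
import re


def normalize_root(root):
    """Stand-in for full normalization: trim spaces and fix up IPv4 octets."""
    root = root.strip(" ")
    if re.fullmatch(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}", root):
        octets = [int(octet) for octet in root.split(".")]
        if all(octet <= 255 for octet in octets):
            return ".".join(str(octet) for octet in octets)
    return root


def normalize_each_part(text):
    """Normalize the root fully and trim spaces from every subpage part."""
    root, *subpages = text.split("/")
    parts = [normalize_root(root)] + [part.strip(" ") for part in subpages]
    return "/".join(parts)
```

This reproduces the examples from the comment: `"1.1.1.01 /"` becomes `"1.1.1.1/"`, `"Example / test"` becomes `"Example/test"`, and `"Aoraki / Mount Cook"` becomes `"Aoraki/Mount Cook"`.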

Note that all changes to title normalization and validation are blocked on T196088: Get cleanupTitles.php into a good enough state that we could run it in production.

After further discussion with @Erutuon and looking at more concrete examples on Wiktionary, I think that limiting this to just user/user talk root pages, as originally proposed, is probably safer, given the number of pages that already have slashes with spaces around them.

Maybe if we were starting from scratch normalizing each subpage part would fly, but it's probably too disruptive today.

Change 745971 had a related patch set uploaded (by Legoktm; author: Legoktm):

[mediawiki/core@master] Normalize IP addresses/usernames in titles regardless of subpage

https://gerrit.wikimedia.org/r/745971

The spaced slash also causes problems for user renames and for moving the subpages, because that select does not search with the extra space.

For non-user pages that is still a problem when using Special:MovePage, but that is not fixable.
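The rename problem can be illustrated with a sketch that assumes subpages are found by matching the prefix "root title plus slash" against titles in database form. The function is hypothetical and is not the actual rename or move code.

```python
def find_subpages(root, db_titles):
    """Return titles that start with the root followed directly by a slash."""
    prefix = root + "/"
    return [title for title in db_titles if title.startswith(prefix)]
```

Under that assumption, `find_subpages("Example", ["Example/test", "Example_/_test"])` finds only `Example/test`; the title with the spaced slash is missed and would be left behind.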