edits where the rev_user_text and ar_user_text fields contain underscores, initial lower-case letters or consecutive spaces, which could occur in the Phase I and II software, are inaccessible using Special:Contributions
Open, Public

Description

Author: timwi

Description:
BUG MIGRATED FROM SOURCEFORGE
http://sourceforge.net/tracker/index.php?func=detail&aid=917040&group_id=34373&atid=411192
Originally submitted by Toby Bartels (tobybartels) 2004-03-16 03:04

On the English Wikipedia in 2001, there was a user
"Ryan_Lackey" whose user name contained an underscore.
You can see edits credited to this user at, for
example, [[Talk:Sealand]]. But these edits are not
recorded at
[http://en.wikipedia.org/w/wiki.phtml?title=Special:Contributions&target=Ryan_Lackey],
which, after all, ''should'' be for a user named "Ryan
Lackey" (who doesn't exist). Similarly,
[[User:Ryan_Lackey]] doesn't think that it's a user
page for an actual user.

This specific case can probably be fixed if a developer
performs a username change (from "Ryan_Lackey" to "Ryan
Lackey") -- assuming that the name changing feature
doesn't break down too! ^_^ But the larger bug probably
applies to other editors from Phase I.


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=34873

bzimport added a subscriber: Unknown Object (MLST).
bzimport set Reference to bz323.
bzimport created this task.Via LegacySep 3 2004, 3:21 AM
bzimport added a comment.Via ConduitNov 20 2005, 5:20 PM

robchur wrote:

Not a bug...underscores are seen as spaces, and so aren't supported in
usernames. Running the username change *might* do it, but it might break their
contribs, too, depending upon how they're linked. Depends.

TobyBartels added a comment.Via ConduitNov 21 2005, 9:13 PM

It's not a bug in the current MediaWiki software, since that software doesn't
allow underscores in usernames. Instead, it's a bug in the English Wikipedia
website, and possibly other websites that used Phase I. (It may also be a bug
in the Phase I software or in the Phase I -> II conversion script, but I don't
think that they matter anymore.) I've fixed the Product field to indicate this.

demon added a comment.Via ConduitJan 25 2008, 5:39 AM

(In reply to comment #1)

> Not a bug...underscores are seen as spaces, and so aren't supported in
> usernames. Running the username change *might* do it, but it might break their
> contribs, too, depending upon how they're linked. Depends.

Yes, it would. I tried manually switching a user name to have _ instead of a space, and it broke (same issue had come up with one of my users, wanted the underscore).

Too many places where _ is getting stripped or un-stripped possibly.

bzimport added a comment.Via ConduitDec 21 2008, 7:17 PM

ayg wrote:

These still haven't been cleaned up . . . for instance, user_id 87496 is Nicholas_Lativy. [[User:Nicholas Lativy]] is a different user, 90574. These should be identified and dealt with, although they're probably long abandoned.

bzimport added a comment.Via ConduitMar 19 2009, 3:39 PM

mike.lifeguard+bugs wrote:

Removed shell keyword as it seems there's nothing to do on shell presently.

Graham87 added a comment.Via ConduitMar 31 2009, 1:13 AM

The most famous victim of this bug would have to be Larry Sanger. See this diff (and note the username of "Larry_Sanger"):
http://en.wikipedia.org/w/index.php?title=Talk:Assistive_technology&diff=234142&oldid=234141

However, there are no contributions for Larry Sanger from the talk namespace in April 2001:
http://en.wikipedia.org/w/index.php?limit=50&tagfilter=&title=Special%3AContributions&contribs=user&target=Larry+Sanger&namespace=1&year=2001&month=4

This bug affects the Nostalgia Wikipedia in exactly the same way.

Graham87 added a comment.Via ConduitMar 31 2009, 1:45 AM

I've just created the account under the user name "Ryan Lackey" to keep the userpage from being deleted by automated scripts that think it's a userpage of an unregistered user. This will also stop malcontents from trying to use the account.

bzimport added a comment.Via ConduitJul 24 2009, 10:27 AM

happy.melon.wiki wrote:

So what actually needs to be done here? What needs to be "cleaned up", and where?

TobyBartels added a comment.Via ConduitJul 24 2009, 4:35 PM

What needs to be done? My estimate:

  • Identify all of the wikis that ever used Phase I software.
  • Identify all of the characters forbidden in current MediaWiki usernames but allowed in Phase I.
  • Identify all of the users registered at those wikis with at least one of those characters in each name.
  • Find an appropriate alternative name (which probably needs to be done ad hoc; we know that [[User:Larry_Sanger]] was the same as [[User:Larry Sanger]], and we know that [[User:Ryan Lackey]] is a dummy account, but we don't know what's up with [[User:Nicholas_Lativy]] and it may be too late to ask).
  • Move the invalidly named account --possibly by hand-editing the database-- to the validly named account.

This is a lot of work for little reward, so maybe we just need to keep this bug open (or is WONTFIX for this sort of thing?) so that people know about the possibility. And try not to let anything else interact badly with it.

Graham87 added a comment.Via ConduitJul 24 2009, 4:49 PM

In the revision table, all underlines in the rev_user_text field need to be changed to spaces. Ditto for the ar_user_text field in the archive table. I think those changes will completely solve the problem, but I'm not 100% sure ... I'm not an expert on the database schema. It would be nice if the user IDs in the revision table were changed as well (so the user ID of Larry_Sanger would be the same as the user ID of Larry Sanger).
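
A minimal sketch of the change described in this comment, assuming toy `revision` and `archive` tables in an in-memory SQLite database rather than the production MySQL schema. Only the `rev_user_text`, `ar_user_text`, `rev_user` and `ar_user` column names come from this report; the `rev_id`/`ar_id` keys and the helper name `underscores_to_spaces` are hypothetical, and the sample rows reuse usernames mentioned in this thread.

```python
import sqlite3

# Toy stand-ins for the MediaWiki tables; SQLite is used here only so the
# sketch is self-contained. The real tables have many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_user INTEGER, rev_user_text TEXT);
    CREATE TABLE archive  (ar_id  INTEGER PRIMARY KEY, ar_user  INTEGER, ar_user_text  TEXT);
    INSERT INTO revision (rev_user, rev_user_text)
        VALUES (0, 'Larry_Sanger'), (0, 'Ryan_Lackey'), (0, 'Graham87');
    INSERT INTO archive (ar_user, ar_user_text)
        VALUES (0, 'Simon_J_Kissane');
""")


def underscores_to_spaces(conn):
    """Replace underscores with spaces in both username text columns.

    Returns the number of rows that were rewritten.
    """
    changed = 0
    for table, column in (("revision", "rev_user_text"), ("archive", "ar_user_text")):
        cur = conn.execute(
            f"UPDATE {table} SET {column} = REPLACE({column}, '_', ' ') "
            f"WHERE INSTR({column}, '_') > 0"
        )
        changed += cur.rowcount
    conn.commit()
    return changed


changed = underscores_to_spaces(conn)
```

The sketch leaves `rev_user` and `ar_user` untouched; reassigning user IDs, as suggested at the end of the comment, would be a separate step.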

Graham87 added a comment.Via ConduitJul 24 2009, 4:55 PM

I comment-conflicted with Toby there. :-) As he said, other special characters caused problems when used in phase I usernames as well. The only one I can think of is "@", which is replaced with ".", as in this edit: http://en.wikipedia.org/w/index.php?title=Wikipedia:UuU&oldid=291430. That problem would be harder to fix though.

Graham87 added a comment.Via ConduitJul 24 2009, 5:02 PM

Re-added shell keyword as fixing this bug requires direct manipulation of the database.

Graham87 added a comment.Via ConduitOct 21 2009, 1:56 AM

I just found a case of this bug where the user had an underline in their name but the account was subsequently taken over by a vandal. I thought I had created accounts for all UseModWiki-era users who didn't have them, but users have occasionally slipped through the cracks. See:
http://en.wikipedia.org/wiki/User_talk:The_ansible#Note

I have also known for a long time about the case of "Simon_J_Kissane", see:
http://en.wikipedia.org/w/index.php?title=Talk:Algebraic_number&oldid=234678

Dinoguy1000 added a comment.Via ConduitNov 28 2009, 11:42 PM

Are we supposed to list all cases we find (as in [[bugzilla:20757]])? Because I just ran across [[User:Alan_D]]: http://en.wikipedia.org/w/index.php?title=Sailor_Moon&oldid=282114 for example does not show up in his contributions.

TobyBartels added a comment.Via ConduitNov 29 2009, 12:45 AM

@ Philip #15

I don't know who decides what we're "supposed" to do, but I think that it would be a good idea, at least until a developer writes in to say that there's no point.

Graham87 added a comment.Via ConduitNov 29 2009, 3:29 AM

There's no point in listing them as far as I can tell. The devs can find them all automatically if they use the method I outlined in comment 11. I don't see the point of listing all instances at bug 20757 either, but it's better to be safe than sorry.

As an aside, to draw more attention to this bug, I've mentioned it at http://en.wikipedia.org/wiki/Wikipedia:MediaWiki/DeveloperMemo/November2009#Requests_-_fixes

bzimport added a comment.Via ConduitNov 29 2009, 3:47 PM

ayg wrote:

There is no point in listing them one by one here. Anyone with even toolserver access can just query the appropriate tables to find the bad rows. E.g., on enwiki,

mysql> SELECT user_name FROM user WHERE user_name LIKE '%\_%';
+-----------------+
| user_name       |
+-----------------+
| Nicholas_Lativy |
+-----------------+
1 row in set (1 min 13.18 sec)

The same can just as easily be done for the other wikis, and other tables.

Graham87 added a comment.Via ConduitDec 13 2009, 1:57 PM

Curiously, the import feature seems to convert underlines to spaces in usernames automatically. I just imported some history from Nostalgia Wikipedia to the English Wikipedia, thanks to bug 20280. Larry Sanger's early contribution list, especially before January 2002, is now quite interesting:

http://en.wikipedia.org/w/index.php?title=Special:Contributions&dir=prev&target=Larry+Sanger

Dinoguy1000 added a comment.Via ConduitDec 14 2009, 12:58 AM

Not really relevant to this bug, but importing such edits also causes diff sizes to be generated for them.

Graham87 added a comment.Via ConduitDec 28 2009, 8:53 AM

Re: Comment 12, the problem is not the "@" being changed to a dot, but the fact that the username begins with a lower-case letter. I've changed the bug name accordingly to take this into account.

Therefore I would consider this bug resolved if someone changed underlines to spaces in the username fields as described in comment 11, then used the same procedure to change initial lower-case letters in usernames to capital letters. The change in the user ID number would be nice, but not strictly necessary, and it would probably be more trouble than it's worth.
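
The two-step procedure in this comment can be sketched for a single stored username as follows. This is an illustration only: the function name `normalise_legacy_username` is hypothetical, and Python's `str.upper` is assumed to be a good enough stand-in for whatever capitalisation rule MediaWiki itself applies to the first character.

```python
def normalise_legacy_username(name: str) -> str:
    """Apply the two proposed fixes to one stored username."""
    # Step 1: underlines become spaces.
    name = name.replace("_", " ")
    # Step 2: capitalise only the first character; the rest is left as stored.
    return name[:1].upper() + name[1:]


examples = ["Larry_Sanger", "maveric149", "Graham87"]
fixed = [normalise_legacy_username(n) for n in examples]
```

Names that are already valid, such as "Graham87", pass through unchanged, so the same routine could be run over every row without first filtering for affected ones.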

Graham87 added a comment.Via ConduitDec 28 2009, 9:02 AM

And it goes without saying that I'd like this bug fixed on all applicable wikis, not just the English Wikipedia. I'm particularly thinking about the Nostalgia Wikipedia here, but other WMF projects might be affected as well.

Dinoguy1000 added a comment.Via ConduitDec 29 2009, 1:42 AM

What WMF projects besides the en.wp were active back on the Phase I software, anyways?

Graham87 added a comment.Via ConduitDec 29 2009, 3:50 AM

(In reply to comment #23)

> What WMF projects besides the en.wp were active back on the Phase I software,
> anyways?

Plenty of them. Compare http://meta.wikimedia.org/w/index.php?title=Wikipedia_software_upgrade_status&oldid=2478 and http://en.wikipedia.org/w/index.php?title=Wikipedia:Complete_list_of_language_wikis_available&oldid=353094 ... that's only the Wikipedias.

It occurs to me that it might be easier to fix this bug by changing the Special:Contributions and deleted contributions pages to check for table rows with underlines and initial lower-case letters in the usernames.
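
As a rough sketch of that alternative, a contributions lookup could expand the requested username into the legacy spellings under which old rows may be stored and query for all of them. The helper name `legacy_variants` is hypothetical, and the sketch assumes that only two variations matter: every space stored as an underline, and the first letter stored in lower case. Mixed forms (for example, only some spaces stored as underlines) are not covered.

```python
def legacy_variants(canonical: str) -> list:
    """Return spellings under which old edits by `canonical` may be stored."""
    variants = []
    # Try the name as given, then with every space replaced by an underline.
    for spaced in (canonical, canonical.replace(" ", "_")):
        # For each of those, also try a lower-case first letter.
        for candidate in (spaced, spaced[:1].lower() + spaced[1:]):
            if candidate not in variants:
                variants.append(candidate)
    return variants


targets = legacy_variants("Larry Sanger")
```

The resulting list could then feed a query of the form `rev_user_text IN (...)`, at the cost of leaving the stored rows themselves unchanged.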

Graham87 added a comment.Via ConduitDec 29 2009, 12:14 PM

This bug also affects some usernames from the Phase II software (which was used in the English Wikipedia from January to July 2002), so I've changed the bug title accordingly. See this edit to "military history":
http://en.wikipedia.org/w/index.php?title=Military_history&diff=98562&oldid=74194

Graham87 added a comment.Via ConduitDec 29 2009, 4:10 PM

(In reply to comment #24)

> It occurs to me that it might be easier to fix this bug by changing the
> Special:Contributions and deleted contributions pages to check for table rows
> with underlines and initial lower-case letters in the usernames.

And it now occurs to me that fixing the problem by changing the contributions special pages, rather than changing the entries in the database, wouldn't fix the problem with importing edits in comment 19. See this page in my userspace:

http://en.wikipedia.org/wiki/User:Graham87/Import#Underline
Therefore my idea in comment 25 would be a second-rate solution.

Graham87 added a comment.Via ConduitJan 10 2010, 3:33 AM

Here's an example of this bug in a non-English Wikipedia: http://it.wikipedia.org/w/index.php?title=Karl_Pearson&oldid=4015

Graham87 added a comment.Via ConduitMar 18 2010, 5:21 AM

In the revision table of the Nostalgia Wikipedia, one of the usernames listed is "Brad_", so it was apparently possible for usernames to end in underlines in the phase I and II software.
In these cases, these usernames should probably be changed to "Brad old" or something similar. Replacing the underlines with spaces in this case would produce the username "Brad ", and the space at the end would still make the username invalid.
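
A small sketch of how such cases could be flagged before any bulk replacement, assuming that a leading or trailing space is what makes the result invalid. The function name `underscore_fix` is hypothetical; choosing the replacement name (such as "Brad old") would still be a manual decision.

```python
def underscore_fix(name: str):
    """Return the name with underlines replaced, plus a manual-review flag.

    The flag is True when the replaced name starts or ends with a space and
    so would still be an invalid username.
    """
    replaced = name.replace("_", " ")
    needs_review = replaced != replaced.strip()
    return replaced, needs_review


brad = underscore_fix("Brad_")
ryan = underscore_fix("Ryan_Lackey")
```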

At the moment, I'm creating English Wikipedia accounts for all usernames that existed in the Nostalgia Wikipedia. Therefore, almost all of the usernames affected by this bug in the English Wikipedia will have a dummy account associated with them.

Graham87 added a comment.Via ConduitMay 10 2010, 7:16 AM

I've found some edits where the username is stored in the database with two consecutive spaces. None of these edits can be found through the user contributions list. I have changed the bug summary accordingly. In this diff, the extra space is not apparent when looking at the page in a browser, but it is obvious when checking the HTML source code:
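
The three kinds of stored username named in the bug summary can be told apart with a simple check like the one below. This is a sketch under stated assumptions: the function name `legacy_problems` and the problem labels are hypothetical, and "consecutive spaces" is taken to mean two or more ASCII space characters in a row.

```python
import re


def legacy_problems(name: str) -> list:
    """List the reasons a stored username would be missed by the contributions list."""
    problems = []
    if "_" in name:
        problems.append("underscore")
    if name[:1].islower():
        problems.append("initial lower-case letter")
    if re.search(r" {2,}", name):
        problems.append("consecutive spaces")
    return problems


report = {n: legacy_problems(n) for n in ["Ryan_Lackey", "maveric149", "Larry  Sanger", "Graham87"]}
```

An empty list means the stored name has none of the three problems described in this report.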
http://en.wikipedia.org/w/index.php?title=Theosophy&diff=291009&oldid=291008

Graham87 added a comment.Via ConduitAug 14 2010, 9:16 AM

Here is an example from Meta of a username with a lower-case letter from the Phase II software:
http://meta.wikimedia.org/w/index.php?title=Market_research_in_progress_at_meta_wikipedia&action=history

I've also changed the bug summary to be more informative.

Graham87 added a comment.Via ConduitAug 14 2010, 3:33 PM

Hmmm, this is probably due to the fact that the rev_user field is non-zero for each of the edits listed in those two links, and in fact is linked to the user ID of the user who made the edit; this never happens in the English Wikipedia, so these methods cannot be used there. The rev_user field shows the user ID of the editor who made a particular edit; the equivalent field in the archive table is ar_user. The user ID for an edit is always 0 for anonymous editors, mass-imports and scripts; it isn't usually zero for normal registered users. If the user ID given for an edit made by a registered user is 0, then the "contribs" link won't show up for the user in the page history. This example comes from a mistaken import, but it is illustrative:
http://en.wikipedia.org/w/index.php?title=Ocean&diff=331657005&oldid=271538

No contribs are found for Ryan_Lackey (see top of bug report) in the API of the English Wikipedia, because none of his edits have an associated non-zero user ID:
http://en.wikipedia.org/w/api.php?action=query&list=usercontribs&ucuser=Ryan_Lackey

Nemo_bis added a comment.Via ConduitAug 17 2010, 5:06 PM

On the examples from Meta: note that in the history (and also Special:Undelete) the links to the user page and user talk page are red even if the pages actually exist (I'm also adding a screenshot for future reference).

Nemo_bis added a comment.Via ConduitAug 17 2010, 5:08 PM

Created attachment 7634
Red link to existing lower case user and user talk pages

See bug 323 comment 35.

Graham87 added a comment.Via ConduitAug 18 2010, 3:37 AM

I've changed the summary once again, so it shows the correct fields!

Nemo_bis added a comment.Via ConduitAug 19 2010, 1:43 PM

Thanks to [[it:User:Mauro742]] you can now find the complete list of all 4336 en.wiki affected revisions at [[User:Nemo_bis/Bug 323 revisions]].

Nemo_bis added a comment.Via ConduitOct 31 2010, 6:09 PM

Some edits of renamed users are affected, too, and have not been moved to the new username: compare http://meta.wikimedia.org/w/index.php?title=Special:Undelete&target=Native+American+Affairs&timestamp=20020127003202 by [[m:user:maveric149]] (lowercase: see also [[m:Special:Contributions/maveric149]] which for some reason is not empty) and http://meta.wikimedia.org/w/index.php?title=Wikimedia_bank_account_history_for_2004&action=history which was created after the user was renamed to Daniel_Mayer (http://meta.wikimedia.org/w/index.php?title=User:Maveric149&diff=1352761&oldid=157121) and is now under the correct username Mav (I've just restored this page).

Dinoguy1000 added a comment.Via ConduitNov 3 2010, 2:30 AM

See also bug 3507, dealing with the usernames themselves instead of edits attributed to those users.

Peachey88 added a comment.Via ConduitJul 9 2011, 3:02 AM

Deblocking from 29757: these have nothing to do with user renames; they are caused by user accounts predating phase3 (aka MediaWiki as we know it today).

Nemo_bis awarded a token.Via WebDec 12 2014, 8:02 AM
Nemo_bis added a project: Epic.Via WebJan 2 2015, 7:01 PM
Nemo_bis set Security to None.
Ricordisamoa awarded a token.Via WebJan 21 2015, 1:25 AM
Ricordisamoa added a subscriber: Ricordisamoa.
scfc added a subscriber: scfc.Via WebJan 21 2015, 4:07 AM
Krenair added a subscriber: Krenair.Via WebFeb 13 2015, 10:13 PM
tomasz removed a project: Shell.Via WebFeb 23 2015, 7:53 PM
epriestley closed this task as "Resolved" by committing Unknown Object (Diffusion Commit).Via DaemonsMar 4 2015, 8:19 AM
TobyBartels added a subscriber: epriestley.Via WebMar 4 2015, 10:09 AM

How did you manage to resolve this? The databases are not currently fixed; is there a script ready to be run that will do this? @epriestley

TTO reopened this task as "Open".Via WebMar 4 2015, 10:17 AM
TTO added a subscriber: TTO.

This was erroneously closed due to a bug.

Aklapper removed a subscriber: epriestley.Via WebMar 4 2015, 11:18 AM
Reguyla moved this task to Special pages on the Wikimedia-General-or-Unknown workboard.Via WebTue, Jun 16, 2:11 PM
