articleinfo-authorship miscounting contributions
Closed, Invalid (Public)

Description

https://xtools.wmflabs.org/articleinfo-authorship/en.wikipedia.org/The_King_Of_Rome credits a bot with writing 51% of the en.Wikipedia article 'The King of Rome' - with a contribution of 28 characters. I'm the only other contributor listed, with 26 characters. That's clearly wrong.

XTools version: 3.3.1-2f87b48

Event Timeline

Pigsonthewing updated the task description.
JJMC89 added a subscriber: JJMC89. Apr 12 2018, 6:24 PM

The linked page is for a redirect (The King Of Rome) to The King of Rome.

The authorship information for The King of Rome is https://xtools.wmflabs.org/articleinfo-authorship/en.wikipedia.org/The_King_of_Rome, which lists Pigsonthewing: 5,201 characters (67.7%).

MusikAnimal closed this task as Invalid. Edited Apr 12 2018, 6:37 PM
MusikAnimal moved this task from Inbox to Complete on the XTools board.
MusikAnimal added a subscriber: MusikAnimal.

I think he's talking about the redirect: https://en.wikipedia.org/w/index.php?title=The_King_Of_Rome&redirect=no (but you're right that he may think he's looking at the target article).

It looks like it's correct, though. Excluding spaces, you added #REDIRECT[[TheKingofRome]] (26 characters) and the bot added {{Rfromothercapitalisation}} (28 characters).
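As a quick sanity check, the two counts can be reproduced with a few lines of Python. This is only a sketch: the spaced wikitext strings are my assumption of what the redirect page contains (the comment above only shows them with spaces removed), and count_non_space_chars is a hypothetical helper, not something from XTools.

```python
def count_non_space_chars(wikitext: str) -> int:
    """Count the characters in a string, ignoring all whitespace."""
    return sum(1 for ch in wikitext if not ch.isspace())


# Assumed wikitext of the redirect page (spacing is a guess).
user_text = "#REDIRECT [[The King of Rome]]"
bot_text = "{{R from other capitalisation}}"

user_count = count_non_space_chars(user_text)  # 26
bot_count = count_non_space_chars(bot_text)    # 28

# Share of the page credited to the bot, as a fraction of all counted characters.
bot_share = bot_count / (user_count + bot_count)
print(user_count, bot_count, f"{bot_share:.1%}")
```

With 26 and 28 characters the bot ends up with a little over half of the 54 counted characters, which is in line with the roughly 51% reported in the task description.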

It is possible, however, that the counts will sometimes be a little off. This is because WikiWho (the service we use for this) attributes authorship by token, not by character. A token is any group of characters, such as a word or {{ (the start of a template). For instance, if you wrote "Gooogle" and someone came along and fixed your typo, they would get credit for the whole word (token). XTools just counts the characters in each token, because I felt that was easier for people to understand.
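To make the "Gooogle" example concrete, here is a deliberately simplified sketch of token-level attribution. It is not WikiWho's actual algorithm: the tokenizer regex, the first-come matching of identical tokens, and the names tokenize and attribute are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical tokenizer: template/link brackets, words, or single symbols.
TOKEN_RE = re.compile(r"\{\{|\}\}|\[\[|\]\]|\w+|[^\w\s]")


def tokenize(text):
    """Split text into tokens, dropping whitespace."""
    return TOKEN_RE.findall(text)


def attribute(revisions):
    """Credit each token of the latest revision to the author who introduced it.

    revisions is a list of (author, text) pairs, oldest first. A token that
    already existed in the previous revision keeps its earlier author; any
    other token is credited to the author of the revision that added it.
    Returns a Counter mapping author to credited character count.
    """
    owners = []  # (token, author) pairs for the revision processed so far
    for author, text in revisions:
        # Pool of surviving tokens from the previous revision, by token text.
        pool = {}
        for tok, owner in owners:
            pool.setdefault(tok, []).append(owner)
        new_owners = []
        for tok in tokenize(text):
            if pool.get(tok):
                new_owners.append((tok, pool[tok].pop(0)))
            else:
                new_owners.append((tok, author))
        owners = new_owners

    credit = Counter()
    for tok, owner in owners:
        credit[owner] += len(tok)  # count characters per token, as XTools does
    return credit


credit = attribute([
    ("Alice", "I searched Gooogle today"),
    ("Bob", "I searched Google today"),
])
print(credit)
```

In this toy run Bob changed a single letter, but he is credited with all 6 characters of "Google", while Alice keeps the 14 characters of the tokens that survived unchanged.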

This may sound like a bad system, but it is just a consequence of how WikiWho measures content persistence, which is a very difficult problem to solve. Nonetheless, their gold standard tests showed their algorithm to be 95% accurate. So in general, I think you can trust the authorship stats.

Finally, note that "characters" are not the same thing as "bytes", which is the unit used for the size changes you see in revision histories (+50 or -50, etc.). On English Wikipedia this should not make a big difference, though, since basic Latin (ASCII) characters take only a single byte each in UTF-8.
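The difference between characters and bytes is easy to see in Python. This small sketch assumes the text is encoded as UTF-8; the helper name chars_and_bytes and the sample strings are purely illustrative.

```python
def chars_and_bytes(text: str) -> tuple[int, int]:
    """Return (number of characters, number of bytes when UTF-8 encoded)."""
    return len(text), len(text.encode("utf-8"))


for sample in ["Rome", "Zürich", "日本"]:
    n_chars, n_bytes = chars_and_bytes(sample)
    print(f"{sample}: {n_chars} characters, {n_bytes} bytes")
```

Plain ASCII text such as "Rome" has the same count either way, while "ü" takes two bytes and each of the two CJK characters takes three, so the two measures drift apart on pages with more non-ASCII text.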

I'm going to close this as invalid, but we love hearing your feedback, so keep it coming :)