
tools.wmflabs.org/pageviews data corrupted
Closed, Invalid · Public

Description

Recently (on 14 June 2019) I had an article on DYK. During the DYK day an editor rudely moved the article to a new but similar name.

Air Lock Diving-Bell Plant was moved to Air lock diving-bell plant.

This meant that tools.wmflabs.org/pageviews (https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2019-05-26&end=2019-06-14&pages=Air_lock_diving-bell_plant) came up with a wrong total for pageviews concerning the article. The page views for the previous name were lost.


Regards

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jun 15 2019, 1:01 PM

Good point. Still, to get the true stat, you would have to know that the article title had been changed, and add it onto the query. This of course will not happen in the future; in time, knowledge of the name change will be effectively buried under the weight of future edits mounting up. The score does not move with the title change; it's effectively lost. A more important article will be affected by this phenomenon: recently ''Windjammers'' was changed to ''Iron-hulled sailing ship''. That's a very old, fairly popular, higher-trafficked article. It's a kind of a bug.

Second comment. I notice there is something odd in the stat report you kindly gave me. The name change was made ''very'' late in the DYK day. After the name change the number of views crashes to 710 with 20 edits, but before the change there are nearly 23,000 views and only 2 edits. Surely it should show 22,789 edits and 20 edits?

> to get the true stat, you would have to know that the article title had been changed, and add it onto the query. ... The score does not move with the title change, it's effectively lost.

I'd call it more of a caveat than a bug, but it's certainly annoying and not super intuitive. There's T141332, but that is a very complicated approach to the problem and wouldn't work all that well. The real solution needs to be in the underlying API. For that there's T159046.

There is also Redirect Views (same tool in the screenshot you gave). This usually gives you what you want, since redirects are usually left after a page move. There are plans to integrate this functionality into the main Pageviews tool (T163621).
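As a rough illustration of the manual workaround discussed above (querying both the old and the new title and adding the results), here is a minimal sketch against the public Wikimedia Pageviews REST API. The helper names (`build_url`, `total_views`, `fetch`, `combined_views`) and the User-Agent string are assumptions made for this example and are not part of the Pageviews tool; the sketch also assumes the old title is known in advance, which is exactly the limitation raised in this task.

```python
import json
import urllib.parse
import urllib.request

# Per-article endpoint of the Wikimedia Pageviews REST API.
API = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def build_url(title, start, end, project="en.wikipedia.org",
              access="all-access", agent="user"):
    """Build a daily per-article URL; start/end are YYYYMMDD strings."""
    article = urllib.parse.quote(title.replace(" ", "_"), safe="")
    return f"{API}/{project}/{access}/{agent}/{article}/daily/{start}/{end}"


def total_views(payload):
    """Sum the 'views' field over all items in one API response."""
    return sum(item["views"] for item in payload.get("items", []))


def fetch(url):
    """Fetch and decode one response (needs network access).

    The User-Agent value is a placeholder; replace it with real contact details.
    An HTTP error (for example, no data for the title) raises an exception.
    """
    req = urllib.request.Request(
        url, headers={"User-Agent": "pageviews-example/0.1 (placeholder)"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def combined_views(titles, start, end, fetcher=fetch):
    """Return {title: total views}; pageviews are keyed by title, so each
    name the article has had must be queried separately."""
    return {t: total_views(fetcher(build_url(t, start, end))) for t in titles}


# Example usage (performs real HTTP requests):
# per_title = combined_views(
#     ["Air Lock Diving-Bell Plant", "Air lock diving-bell plant"],
#     "20190526", "20190614")
# print(per_title, sum(per_title.values()))
```

The `fetcher` parameter exists only so the summing logic can be exercised without network access; it is a design choice of this sketch, not something the API requires.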

> I notice there is something odd in the stat report you kindly gave me. The name change was made ''very'' late in the DYK day. After the name change the number of views crashes to 710 with 20 edits, but before the change there are nearly 23,000 views and only 2 edits. Surely it should show 22,789 edits and 20 edits?

Unlike pageviews, edits are moved with the page move. So if you're looking at the old location, you will only see edits for that (2 edits at the time of writing). Not sure if that answers your question.
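To make the asymmetry concrete, here is a deliberately simplified toy model. It is an assumption-laden sketch, not MediaWiki's actual schema or the Pageviews tool's code: edit history is stored per page ID and follows the page when it is moved, while view counts are stored per title string and stay where they were recorded. The class `ToyWiki` and all of its methods are hypothetical.

```python
class ToyWiki:
    """Toy model: edits follow the page, views stay with the title."""

    def __init__(self):
        self.title_to_page = {}  # title -> page id
        self.edit_counts = {}    # page id -> number of edits
        self.view_counts = {}    # title -> number of views
        self._next_id = 1

    def create(self, title):
        """Create a page; creation counts as its first edit."""
        page_id = self._next_id
        self._next_id += 1
        self.title_to_page[title] = page_id
        self.edit_counts[page_id] = 1
        return page_id

    def edit(self, title):
        self.edit_counts[self.title_to_page[title]] += 1

    def view(self, title):
        self.view_counts[title] = self.view_counts.get(title, 0) + 1

    def move(self, old, new):
        """Rename the page and leave a fresh redirect page at the old title."""
        page_id = self.title_to_page.pop(old)
        self.title_to_page[new] = page_id  # history travels with the page id
        self.create(old)                   # redirect starts its own short history

    def edits(self, title):
        return self.edit_counts[self.title_to_page[title]]

    def views(self, title):
        return self.view_counts.get(title, 0)


wiki = ToyWiki()
wiki.create("Old Title")
for _ in range(3):
    wiki.edit("Old Title")
for _ in range(5):
    wiki.view("Old Title")
wiki.move("Old Title", "New title")
wiki.view("New title")

# The old title keeps its views but only the redirect's edits;
# the new title has the full edit history but only post-move views.
print(wiki.edits("Old Title"), wiki.views("Old Title"))
print(wiki.edits("New title"), wiki.views("New title"))
```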

I recognize this system is far from ideal. I think T163621 is probably the best solution in the short term. I'll try to get to it soon.

Thanks for the reply. I guess that effectively closes the issue at this level at least.

MusikAnimal closed this task as Invalid. · Jun 18 2019, 4:45 PM
MusikAnimal moved this task from Backlog to Done on the Tool-Pageviews board.

No problem. I'm going to close this as invalid since there doesn't appear to be a bug.