
Wikistats Bug: all but 2018 data missing?
Closed, Resolved | Public | 5 Estimated Story Points

Description

  • Go to Wikistats 2.0. Select English or French Wikipedia.
  • Select editors
  • Ask to see data for two years
  • Expected result: see two years of data
  • Actual results: looks like I'm seeing only results for 2018

Screen Shot 2018-04-23 at 10.37.28 PM.png (1,496×1,114 px, 207 KB)

https://stats.wikimedia.org/v2/#/en.wikipedia.org/contributing/editors

Event Timeline

Rerunning indexing for 2018-02 snapshot [0033301-180330093100664-oozie-oozi-W]

I re-ran indexing for the 2018-02 snapshot, and segment sizes in Druid are now what I would expect, about 2 GB per segment.

sudo -u hdfs oozie job --oozie $OOZIE_URL -Drefinery_directory=hdfs://analytics-hadoop$(hdfs dfs -ls -d /wmf/refinery/2018* | tail -n 1 | awk '{print $NF}') -Dqueue_name=production -Doozie_launcher_queue_name=production -Dstart_time=2018-02-01T00:00Z -Dstop_time=2018-02-02T00:00Z -config /srv/deployment/analytics/refinery/oozie/mediawiki/history/reduced/coordinator.properties -run
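For anyone rerunning this: the -Drefinery_directory value in the command above is built by listing /wmf/refinery/2018*, keeping the last line, and taking its last whitespace-separated field. A minimal sketch of that selection step follows. The listing here is made-up sample data (hypothetical paths and dates), and it assumes the listing is sorted so that the last line is the most recent deployment.

```shell
# Hypothetical sample of `hdfs dfs -ls -d /wmf/refinery/2018*` output.
# The paths and timestamps are invented for illustration only.
sample_listing='drwxr-xr-x   - hdfs hadoop          0 2018-03-28 14:01 /wmf/refinery/2018-03-28T13.55.10
drwxr-xr-x   - hdfs hadoop          0 2018-04-12 15:30 /wmf/refinery/2018-04-12T15.20.05'

# Same selection logic as in the command above:
# keep the last line, then print its last field (the path).
latest_refinery_path() {
    tail -n 1 | awk '{print $NF}'
}

latest=$(printf '%s\n' "$sample_listing" | latest_refinery_path)

# Prefix with the cluster name used in the command above.
refinery_directory="hdfs://analytics-hadoop${latest}"
echo "$refinery_directory"
```

If the listing were not sorted by name, the last line would not necessarily be the newest deployment, so it is worth checking the resolved path before submitting the job.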

Note that this reindexing did not change the 2018-03 data. We need to take a second look at that snapshot and see whether it is correct. cc @Milimetric

We also need to take a closer look at what happened here.

Nuria triaged this task as Unbreak Now! priority. Apr 24 2018, 5:39 AM

@Nuria's actions fixed the problem for data up to 2018-02. I restarted a job ending in 2018-03, as the problem is not related to snapshots but to incorrect indexing done while testing. I will follow up later today when I am back from my day off.

Job finished, data is up to date. Thanks @Nuria and @Milimetric for spotting the problem and fixing it quickly!

Indeed, confirmed: all looks good. I'll put this in code review so we remember to talk about what happened.

oops, closed by accident.

Nuria set the point value for this task to 5.
Nuria moved this task from In Code Review to Done on the Analytics-Kanban board.

@JAllemandou I see the Editor data now. Thanks. But when I split by editor type, Anonymous users come in as zero. See screenshot. So it looks like anon editor data is still missing? Or am I doing something wrong—or is the fix just not on production yet?

Screen Shot 2018-04-24 at 8.32.34 AM.png (1,027×660 px, 80 KB)

Reindexing the 2018-02 data again and looking into the 2018-03 issue with anonymous editors.

oozie job -info 0034476-180330093100664-oozie-oozi-W

Reindexing done. Still need to delete the last month of the snapshot.

Change 428922 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery/source@master] Correct mediawiki-history job bugs

https://gerrit.wikimedia.org/r/428922

Change 428922 merged by jenkins-bot:
[analytics/refinery/source@master] Correct mediawiki-history job bugs

https://gerrit.wikimedia.org/r/428922

Change 428977 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery/source@master] Add unittest for a Mediawiki-History function already fixed

https://gerrit.wikimedia.org/r/428977

Data up to 2018-02 now comes from the 2018-02 snapshot. Removed the last segment (2018-03).

Change 428977 merged by jenkins-bot:
[analytics/refinery/source@master] Add unittest for a Mediawiki-History function already fixed

https://gerrit.wikimedia.org/r/428977

Vvjjkkii renamed this task from Wikistats Bug: all but 2018 data missing? to meeaaaaaaa. Jul 1 2018, 1:14 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed JAllemandou as the assignee of this task.
Vvjjkkii lowered the priority of this task from Unbreak Now! to High.
Vvjjkkii updated the task description.
Vvjjkkii removed the point value 5 for this task.
Vvjjkkii removed subscribers: gerritbot, Aklapper.
AfroThundr3007730 renamed this task from meeaaaaaaa to Wikistats Bug: all but 2018 data missing?. Jul 1 2018, 6:24 AM
AfroThundr3007730 closed this task as Resolved.
AfroThundr3007730 assigned this task to JAllemandou.
AfroThundr3007730 raised the priority of this task from High to Unbreak Now!.
AfroThundr3007730 updated the task description.
AfroThundr3007730 set the point value for this task to 5.
AfroThundr3007730 edited subscribers, added: GerritBot, Aklapper; removed: JAllemandou.