
Clean the code review queue of analytics/wikistats
Closed, Declined (Public)

Description

(This is an experimental task, and this is why I just CCed everybody subscribed to the main task)

After T88531: Goal: Organize a Gerrit Cleanup Day on September 23, 2015, analytics/wikistats is the Wikimedia repository with the oldest changesets waiting for code review. This task aims to remove this repository from pole position, hopefully through a successful review of all these patches.

https://gerrit.wikimedia.org/r/#/q/status:open+project:analytics/wikistats,n,z shows a common pattern:

  • all patches are submitted by volunteers, to be specific, by one volunteer (@Nemo_bis)
  • they are more than one year old
  • they got no reviews at all, not a single comment (Jenkins aside)
  • they have almost no reviewers nominated (@ezachte is listed in one)
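The same search can also be scripted. Below is a minimal sketch, assuming Gerrit's standard REST endpoint `/changes/` is reachable under the same `/r` base URL as the link above. The helper names (`open_changes_url`, `parse_changes`, `oldest_first`, `fetch_open_changes`) are made up for illustration and not part of any Wikimedia tooling.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the REST API lives under the same base URL as the web UI link.
GERRIT_BASE = "https://gerrit.wikimedia.org/r"
# Gerrit prefixes JSON responses with this line to guard against XSSI.
XSSI_PREFIX = ")]}'"


def open_changes_url(project, base=GERRIT_BASE):
    """Build the REST query URL for all open changes of one project."""
    query = f"status:open project:{project}"
    return f"{base}/changes/?{urlencode({'q': query})}"


def parse_changes(body):
    """Strip the XSSI guard (if present) and decode the JSON list of changes."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def oldest_first(changes):
    """Sort changes by their 'created' timestamp, oldest first.

    Gerrit timestamps look like 'yyyy-mm-dd hh:mm:ss.fffffffff',
    so plain string comparison orders them chronologically.
    """
    return sorted(changes, key=lambda change: change["created"])


def fetch_open_changes(project):
    """Query the server (network access needed) and return changes oldest first."""
    with urlopen(open_changes_url(project)) as response:
        return oldest_first(parse_changes(response.read().decode("utf-8")))
```

Calling `fetch_open_changes("analytics/wikistats")` and printing `_number`, `created` and `subject` for each entry would give a quick overview of how old the waiting patches are.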

Analytics-Engineering, do you want to play this game during DevRel-October-2015?

Also asked on the mailing list.


Event Timeline

Qgil created this task. Sep 25 2015, 7:06 AM
Qgil assigned this task to Aklapper.
Qgil removed Aklapper as the assignee of this task.
Qgil raised the priority of this task from to Medium.
Qgil updated the task description.
Qgil set Security to None.
Qgil removed projects: ECT-July-2015, ECT-August-2015.
Qgil added subscribers: Jay8g, mmodell, Tgr and 17 others.

One roadblock to merging them is that the CI might not work anymore: T113725: job analytics-wikistats has zuul trigger but vanished from integration/config

I might be able to give a "+1 I would merge this" on some of them. But even if I had permission to +2 them, the deployment would still need to happen. There is no use in merging a change and not putting it live; that just makes life hard for the next deployment. I found no documentation (e.g. at https://www.mediawiki.org/wiki/Analytics/Wikistats ) about how to deploy this, so here is my attempt at finding out, based on my reading of the puppet manifests:

There is probably a checkout of that project on stat1002.eqiad.wmnet under /srv/wikistats_git but its creation and update is not puppetized and I found no indication about automation for it. Similarly I find references to /srv/stats.wikimedia.org but nothing that creates it. Maybe I missed something. If not, it would be worthwhile to improve that a bit. But again, to deploy those improvements it is a good idea to have the people who know how to deploy changes there sanity-check them.

Tgr added a comment. Sep 25 2015, 10:21 PM

There is probably a checkout of that project on stat1002.eqiad.wmnet under /srv/wikistats_git but its creation and update is not puppetized and I found no indication about automation for it. Similarly I find references to /srv/stats.wikimedia.org but nothing that creates it.

They are actually on stat1003, and have not been updated since 2013, so that's probably not where the actual wikistats lives.

I use the WikiStats code on my own server for non-Wikimedia wikis, so I actually care about the merge even without deployment. :)

@Analytics: Who should be the assignee of this task?

so that's probably not where the actual wikistats lives.

Sounds like Analytics-Engineering need to check https://www.mediawiki.org/wiki/Analytics/Wikistats and potentially update it, as one outcome for Wikimedia wikis.

Oh, I missed that stats.wikimedia.org is served by stat1001 (according to the misc varnish config), so there is a third host that has parts of this around.

Aklapper updated the task description. Sep 28 2015, 12:01 PM
ezachte added a subscriber: Milimetric. Edited Sep 28 2015, 12:59 PM

Some info on these Wikistats issues:

1 Wikistats runs on two servers

stat1002 for monthly reports, using stub dumps
stat1003 for additional metrics only available from full archive dumps (word count, links, trends in avg page size). As these dumps are insanely huge this happens on a much slower cycle

(right now I don't seem to have access to stat1003 via putty, but the job has been running for a long time without interruptions)

2 Most monthly jobs run from my home directory on stat1002

Also on stat1002 is /a/wikistats_git/, which used to be the place to run jobs from, but there were push issues (this dates back to before gerrit) and at that time no one knew why, so I couldn't update bash files from there, etc.

3 Obviously running from home dir is not ideal for others when they need to take over in an emergency.
More things are not up to par for Wikistats, but Wikistats is going to be made obsolete at WMF asap (actually since hadoop arrived, 4 years ago, but real soon now). As I work 1/3 FTE (by my own request), I have to set priorities. Restructuring code is not one of them. Helping Wikistats be superseded should be one of them, and I'm eager to help with that.

Nothing is puppetized, BUT as I said Wikistats is near end of life at WMF, so better not bother with puppet (preserving the code base is another thing of course, if only for @Nemo_bis)

4 For a year my git repo wasn't updated because of some lingering sync issues. Some months ago @Milimetric helped me to clean up my totally out-of-date git repo. Which was great, and I want to thank him once again! So now all my code is in git and Nemo's code as well, as far as I know, please correct me if I'm wrong. So my understanding is those gerrit tasks are obsolete. I'm back to pull/push and no longer use gerrit myself (burn me).

Thanks

So my understanding is those gerrit tasks are obsolete.

At least one I tested is not obsolete. https://gerrit.wikimedia.org/r/#/c/92066/ cleanly rebased to master and still had content (I did that locally and didn't push the rebase).
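For anyone who wants to repeat that kind of local check on another change, here is a rough sketch of the steps as a small Python helper that only builds the git commands (it does not run them). It relies on Gerrit's `refs/changes/<last two digits>/<change>/<patchset>` ref layout. The patchset number, the remote name `origin` and the helper names are assumptions for illustration, not something stated in this task.

```python
def change_ref(change_number, patchset):
    """Return the Gerrit ref for one patchset of a change.

    Gerrit shards change refs by the last two digits of the change number,
    zero-padded, e.g. change 92066 -> refs/changes/66/92066/<patchset>.
    """
    return f"refs/changes/{change_number % 100:02d}/{change_number}/{patchset}"


def rebase_check_commands(change_number, patchset, remote="origin", branch="master"):
    """List the git commands for a local, unpushed rebase check.

    Assumptions: 'remote' points at the Gerrit repository and 'patchset'
    is the patchset you want to test; neither is given in the task.
    """
    target = f"{remote}/{branch}"
    return [
        # Refresh the target branch.
        ["git", "fetch", remote, branch],
        # Fetch the patchset; git stores it as FETCH_HEAD.
        ["git", "fetch", remote, change_ref(change_number, patchset)],
        # Work on a detached HEAD so no local branch is touched.
        ["git", "checkout", "--detach", "FETCH_HEAD"],
        # Replay the change on top of the current target branch.
        ["git", "rebase", target],
        # A non-empty stat means the change still has content after the rebase.
        ["git", "diff", "--stat", target, "HEAD"],
    ]


if __name__ == "__main__":
    # Patchset 1 is a placeholder; check the change page for the real number.
    for command in rebase_check_commands(92066, 1):
        print(" ".join(command))
```

Nothing here pushes anything; if the rebase stops with conflicts, `git rebase --abort` returns to the fetched state.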

More things are not up to par for Wikistats, but Wikistats is going to be made obsolete at WMF asap
[...]
So now all my code is in git and Nemo's code as well, as far as I know, please correct me if I'm wrong. So my understanding is those gerrit tasks are obsolete.

Thanks for elaborating! From a quick comparison of a fresh git checkout of analytics/wikistats and the five unreviewed patchsets in Gerrit, I don't see any of those five changesets fully reflected in the git codebase. So they don't seem to be obsolete to me.

@ezachte: What would be the plan forward to make decisions on these five proposed changesets in Gerrit?
Do you plan to review those? Or is the repository "deprecated" from your point of view and nobody should provide any patches to that codebase anymore? Or something else or in-between?
Thanks for your help!

matmarex removed a subscriber: matmarex. Sep 29 2015, 1:26 AM

@ezachte: Could you answer the last comment please?

Sorry, @Aklapper, I missed your previous comment. And @JanZerebecki, answering yours is still on my list.

This week I will have to focus on T114379, as that one has been prioritized and others depend on my input. But I will follow-up on this one.

@ezachte: Thank you, great to hear this is on the radar! I'll assign this task to you for the time being.

Tgr added a comment. Oct 13 2015, 10:58 PM

There is a proposed WikiDev16 session about improving code review, especially for volunteers: T114419: Event on "Make code review not suck". You are welcome to comment there if interested.

This week I will have to focus on T114379, as that one has been prioritized and others depend on my input. But I will follow-up on this one.

@ezachte: Any vague timeframe? :)

I reached out by mail to @Nemo_bis with comments on each open patch.

@Nemo_bis says this may have to wait till Feb.
In the meantime I can look further into '[Full dump analysis] Reduce edits_only and reverts_only intricacy'.

Aklapper moved this task from Backlog to Doing on the DevRel-February-2016 board.
Milimetric moved this task from Analytics Query Service to Radar on the Analytics board.
Aklapper removed ezachte as the assignee of this task. Nov 18 2019, 11:52 AM

Removing assignee @ezachte as that Phabricator account has been deactivated. (If there are questions, it seems that @erik_zachte could be contacted.)

Milimetric closed this task as Declined. Edited Nov 19 2019, 4:16 AM

Wikistats 1 is no longer maintained. (And then I realized the irony of declining this without cleaning up the gerrit changes, so I went and abandoned everything open in analytics/wikistats. Leaving the same note for Nemo here too: let me know if this is undesirable in some way, we can figure something out.)