
Stats tool for Wikimedia events outcomes
Closed, Resolved (Public)



The organization of editathons and online writing contests and challenges is common within the Wikimedia movement. The good results of these events encourage us to continue organizing them. However, obtaining statistics after the events is usually a manual, complex and tedious task. There are already some tools (such as Fountain) to deal with this, but they are too complex to activate, incomplete, or dependent on one person/developer.


We would like a new tool that is more user-friendly and easy for anyone to use at any time. Reuse is a core value in the Wikimedia projects, so the tool has to be available in every language and project within the Wikimedia world.

The tool would measure, within a given time frame and given a set of user account names and a certain list of articles, the number of articles edited, the number of bytes and the number of edits. It would also be interesting to allow filtering out articles that do not meet a certain size, and to have a count that does not measure the number of bytes from tables and templates.
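
As a rough illustration of the measurement requested here, the following is a minimal Python sketch. The `Revision` record and the `event_stats` function are hypothetical names made up for this example and do not come from any existing tool; it also assumes the revision data has already been fetched, and it reports bytes as a net change (additions minus removals).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Revision:
    """Hypothetical record for one edit; not an existing tool's data model."""
    user: str
    title: str
    timestamp: datetime
    size_diff: int  # bytes changed by this edit (negative if content was removed)


def event_stats(revisions, users, titles, start, end):
    """Summarise edits made by `users` to `titles` between `start` and `end`."""
    users, titles = set(users), set(titles)
    # Keep only edits by participants, on listed articles, inside the time frame.
    selected = [
        r for r in revisions
        if r.user in users and r.title in titles and start <= r.timestamp <= end
    ]
    return {
        "articles_edited": len({r.title for r in selected}),
        "edits": len(selected),
        "net_bytes": sum(r.size_diff for r in selected),
    }
```

Restricting the count to both the participant list and the article list is what keeps unrelated contributions by the same accounts out of the totals.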

Event Timeline

Toniher renamed this task from Stats tool to Stats tool for Wikimedia events outcomes.Apr 30 2018, 10:09 PM

Maybe grantmetrics will suit you. I haven't used it (yet); I just saw a presentation of it by @MusikAnimal at a conference.

Hi! Indeed Grant Metrics might serve some of your needs. It was built to be simple and easy to use. Despite the name, we hope it will be suitable for general editathons and not just grant-based programmes.

I will comment on things pointed out that Grant Metrics currently does not support, so you can gauge whether it is/will be the right application for you:

tool has to be available in every language and project within the Wikimedia world

We currently only support Wikipedias, but we will eventually support all projects. We want to make sure the generated statistics are tailored for that project. For instance, with Wikisource you may want the number of texts uploaded, proofread, validated, etc., and not just raw number of pages created/improved.

given a ... certain list of articles

Currently we only take usernames as the basis for the event (along with a time frame). Soon we will allow you to go by template transclusions (e.g. Art+Feminism Editathon), and categories. It seems taking an explicit list of articles would be a simple feature that we can look into adding.

measure number of bytes

Currently we report the number of pages edited, pages improved, "new editors", and new editors that were retained (edited 7+ days after the event). Adding a metric for bytes added is doable, and this would give you a good sense of participation. However I should note this is not the same as bytes retained. It would be more telling (in my opinion) to see how much content added by your participants is still live. That involves content persistence, which is a very complex problem.
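
To make the retention metric concrete, here is a minimal Python sketch. It assumes "retained" means the editor made at least one edit seven or more days after the event ended, which is one reading of the description above; the function name and the input shape (a mapping from username to edit timestamps) are hypothetical.

```python
from datetime import datetime, timedelta


def retained_editors(edits_by_user, event_end, min_days=7):
    """Return the users with at least one edit `min_days` or more after `event_end`.

    `edits_by_user` maps a username to a list of edit timestamps (assumed input).
    """
    cutoff = event_end + timedelta(days=min_days)
    return {
        user
        for user, timestamps in edits_by_user.items()
        if any(ts >= cutoff for ts in timestamps)
    }
```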

filtering articles that do not meet a certain size

This is a nice idea, but I wonder if it is a commonly desired metric. We want to keep Grant Metrics as simple as can be, so maybe for something like this you could use the CSV export feature and run the equations that way.
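
For example, a size filter over an exported CSV could look roughly like the sketch below. The column names (`title`, `size`) are assumptions for illustration; the actual columns of the export may differ.

```python
import csv
import io


def filter_by_min_size(csv_text, min_bytes, size_column="size"):
    """Return the rows whose size is at least `min_bytes`.

    `size_column` is a hypothetical column name; adjust it to the real export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if int(row[size_column]) >= min_bytes]
```

Calling `filter_by_min_size(exported_text, 1000)` would then keep only the articles of at least 1000 bytes.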

does not measure the number of bytes from tables and templates

This sounds like you want to measure prose, as in raw content that is not wikitext syntax. This is doable. My thinking is instead of "bytes added" as noted above, we could measure the difference in prose for a given article from the start of the event to the end (this may include new articles). So what that means is that it may include prose added by non-participants, but I think this is okay (e.g. some patrollers will end up helping out too, which is good!). You will still get a good sense of how much things improved as a result of the editathon. Measuring prose added by each individual participant, on each individual article, is expensive to compute and probably not something we'll explore in the near term.
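
A very rough Python sketch of the "difference in prose" idea follows. It is an approximation under stated assumptions, not how Grant Metrics works: it only drops `{{...}}` templates and `{|...|}` tables with a simple nesting scanner, leaves other wikitext syntax (links, references, bold markup) in place, and can be confused by constructs such as `{{{1|}}}`. The function names are hypothetical; a dedicated wikitext parser would be more robust.

```python
def strip_templates_and_tables(wikitext):
    """Drop {{...}} templates and {|...|} tables, including nested ones."""
    out = []
    stack = []  # currently open delimiters, innermost last
    i = 0
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair in ("{{", "{|"):
            stack.append(pair)
            i += 2
        elif stack and stack[-1] == "{{" and pair == "}}":
            stack.pop()
            i += 2
        elif stack and stack[-1] == "{|" and pair == "|}":
            stack.pop()
            i += 2
        else:
            if not stack:  # only keep characters outside templates/tables
                out.append(wikitext[i])
            i += 1
    return "".join(out)


def prose_bytes(wikitext):
    """Size in UTF-8 bytes of the text left after stripping."""
    return len(strip_templates_and_tables(wikitext).encode("utf-8"))


def prose_delta(before, after):
    """Change in prose bytes between two versions of the same article."""
    return prose_bytes(after) - prose_bytes(before)
```

Here `before` would be the article text at the start of the event (an empty string for a new article) and `after` the text at the end, so the result includes prose added by non-participants, as described above.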

That being said, do you think Grant Metrics will suit your needs? I would love to hear more.

I should say that, at least with what we've done with Grant Metrics, the effort involved definitely exceeded what could be done at a single Hackathon. Of course don't let me discourage you from working on a new tool, though! :)

In addition to Grant Metrics, Community Tech will soon be looking into other tools for program and events organizers, which may fulfill your requirements. This will probably involve improvements to Grant Metrics, but it's possible we'll build some separate, supplemental applications.

I will be at the Wikimedia Hackathon if you want to talk more :) I'd love to hear any feedback.

Hi! Thank you both very much for the information. However, although Grant Metrics seems like a good tool (I didn't know it), it still does not meet the requirements I raised. I explain below:

It would be interesting if it were a tool for any person or event, not just to measure metrics of a given grant program.

It does not take into account a list of articles so the statistics would not correspond to the reality of a particular event, since it measures any contribution of an account.

The number of bytes, or at least the bytes of text (excluding tables and templates), seems to me another interesting figure when evaluating the success of an activity, since it allows us, for example, to calculate the average size of each article or contribution.

If you could incorporate those improvements it would be fantastic.

I am aware that these requirements are perhaps too complex, but after years of experience organizing events we still manage them very manually, and we consider a useful and easy-to-use tool necessary. Something similar happened with the management of Wiki Loves contests, until the wonderful Montage tool appeared; since then everything has been much easier :).

Unfortunately I will not be able to attend the Wikimedia Hackathon :(

Thanks for the feedback. I will share this with my team. I can't say for sure if we'll implement all the proposed features in Grant Metrics, but it's good to hear what people want, because addressing the needs of event organizers such as yourself is one of our goals for this year.

If anyone is interested in working on this at the Hackathon, feel free to reach out to me. I don't know what Community Tech's plans are for the hackathon, or if we'd be able to help, but we'd love to at least see what's in the works as this task aligns with some of our commitments for 2018.

@Rodelar have you tried using Programs & Events Dashboard for your editathons? Art + Feminism had good success with it this year for helping organizers set up a few hundred editathons in March, and it's possible to configure it so that it will only report stats on the set of articles you want to track. If it doesn't work for your use case, we're eager to identify how it can be improved, as editathon campaigns are a very common use case for it.

And in case you need to file any bugs:

Also, although I won't be at the 2018 Hackathon, several developers familiar with the project will be. Anyone interested in this area can let me know if you want to get in touch with them. :-)

Yes, me not mentioning the dashboard was not deliberate. You should give it a try! Community Tech hopes to fulfill some of your needs, too, but it's painful to hear you're currently doing these things manually, so do explore the available options :)

Hi Ragesoss and MusikAnimal.

I know the Dashboard and we have used it, for example, with Art+Feminism, but it does not work the way we would like. For example, it adds up everything the registered editors do, which is not always what we need; we need only what they have added to a certain set of articles. Adding users to the list is complicated, and the list of articles that is provided does not work properly.

In view of all this, I think that Grant Metrics could be our solution, provided the features I suggested in the previous comment are added. Counting users, articles or bytes manually requires a huge effort and a lot of time, so if those improvements could be implemented, we would appreciate it :).

CommunityTechBot raised the priority of this task from High to Needs Triage.Jul 3 2018, 1:51 AM

Hi @Rodelar, the Event Tool (working title) CommTech is planning should meet these needs. Please be sure to give us your thoughts about metrics you require by answering the questions in this talk page post: Event Tool metrics—what do you want? Thanks!

I'm going to remove Community Tech and Event Metrics from this ticket, as it is more or less an Epic that duplicates the entire Event Metrics project. @Rodelar, if you have comments about our plans for Event Metrics, we want to hear from you.

Hello. Thank you @jmatazzoni, I'm OK with it. Looking forward to checking out the new features of the Event Metrics tool :)

I'm going to remove Community Tech and Event Metrics from this ticket, as it is more or less an Epic that duplicates the entire Event Metrics project.

@Rodelar: Is it fine to close this task, as it seems there's nothing left to follow up on here?

Sorry for the delay in responding. I agree to close it. Thank u!