Research are working on core metrics guidelines for Communications, so this is a good opportunity to nail down all the details of this metric.
Details at meta:Research:Defining monthly active editors, 2016.
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | nshahquinn-wmf | T117221 [Epic] Update official Wikimedia press kit with accurate numbers
Resolved | | nshahquinn-wmf | T144639 Propose metrics along with qualifiers for the press kit
Resolved | | nshahquinn-wmf | T151507 Refine definition of active editors metric
@leila, @ezachte, and I just met on this, and we agreed on how to resolve most of the inconsistencies.
Action items:
@Neil_P._Quinn_WMF thanks for documenting it and your help. Just a note that we agreed to run the final proposal re the updated metric by Dario before sending it out. I'll take care of that as well.
I've documented the new definition at meta:Research:Active editor and updated our metric calculations to match (results available at mw:Wikimedia Product#Editing).
Remaining action items:
Other than that, I believe my work here is done.
@Neil_P._Quinn_WMF I haven't yet, thanks for flagging this. I'll set aside time for reviewing it this week.
Now that we've developed a consensus definition for active editors
Really? Was there a discussion somewhere on analytics or wiki-research-l or wikimedia-l or some other relevant discussion venue?
I've not had the opportunity to comment before, so I left some comments now: https://meta.wikimedia.org/wiki/Research_talk:Active_editor
I expect other people may have comments too, and it would be better to have a discussion now rather than later (e.g. when they are caught by surprise due to WikiStats changes). So, again, please notify/open a discussion at least on the main mailing lists.
@Nemo_bis I responded to your comments on the talk page.
On the larger question of having a broader discussion, this project (as I said just now on the talk page) was only about agreeing on minor technical details that had previously only been decided implicitly by the implementers of these metrics (such as me and @Erik_Zachte). It was not about making any major changes.
There are major changes I would like to make, like including non-content namespaces in this calculation or even moving from an edit-count-based metric to a session-time-based one. However, those would be much more disruptive, so I would absolutely propose those for a broader discussion first on the relevant mailing lists (analytics-l, wiki-research, wikitech, and maybe wikimedia-l as well).
Anyway, if you think I'm totally wrong about the significance of these changes, you're welcome to start a mailing list discussion and see if significant numbers of other people agree with you. If that turns out to be the case, we will definitely adjust.
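To make the distinction between minor details and major changes concrete, here is a minimal sketch of an edit-count-based monthly active editors calculation. It is not the production query: the threshold of 5 edits, the choice of namespace 0 as the only content namespace, and the `(user, month, namespace)` input shape are all assumptions for illustration. Passing a wider namespace set shows how including non-content namespaces would change the result.

```python
from collections import Counter

# Assumed values for illustration only, not the agreed definition.
CONTENT_NAMESPACES = frozenset({0})
MIN_EDITS = 5


def monthly_active_editors(edits, content_namespaces=CONTENT_NAMESPACES,
                           min_edits=MIN_EDITS):
    """Count users per month with at least `min_edits` edits in the given namespaces.

    `edits` is an iterable of (user, month, namespace) tuples, a hypothetical
    input shape chosen to keep the sketch self-contained.
    """
    edits_per_user_month = Counter()
    for user, month, namespace in edits:
        if namespace in content_namespaces:
            edits_per_user_month[(user, month)] += 1

    active_per_month = Counter()
    for (user, month), count in edits_per_user_month.items():
        if count >= min_edits:
            active_per_month[month] += 1
    return dict(active_per_month)


sample_edits = (
    [("A", "2016-11", 0)] * 5      # 5 content edits
    + [("B", "2016-11", 0)] * 4    # 4 content edits
    + [("B", "2016-11", 1)] * 3    # 3 talk-page edits
)
content_only = monthly_active_editors(sample_edits)
all_namespaces = monthly_active_editors(sample_edits, content_namespaces={0, 1})
```

With the sample data, user B only reaches the threshold when namespace 1 is counted, which is the kind of shift a namespace change would introduce.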
I also responded on the talk page. And actually reconsidered while doing so. It seemed adding extensive code to detect internationalized redirects wasn't worth the trouble for filtering a few edits. But then I realized we need that detection anyway to filter redirect pages from article counts (or else English Wikipedia would have 13 million 'articles'). Hmm
Regretfully other urgent work got in the way this week and I won't be able to give detailed feedback on this until I am back on January 10. Thanks for leading this effort so far, y'all.
That's true, but consider that there is no good historical source of data about redirects other than manually going over the text of old revisions via the dumps or the API. You can't do it in the (MariaDB) application databases, because they don't contain the text of old revisions (that's in External Storage). You also can't do it even for present data in the Data Lake, because it does not have info on redirects or internal links.
So eliminating the redirect requirement from active editors still dramatically increases the number of ways it can be calculated. It doesn't help us with calculating historical article counts, but that's a separate issue.
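For reference, checking old revision text for redirects could look roughly like the sketch below. It assumes the wikitext of each revision has already been pulled from the dumps or the API, and the alias list is a hypothetical stand-in: a real check would need each wiki's full set of localized redirect keywords, and the pattern is a simplification rather than MediaWiki's own parsing logic.

```python
import re


def make_redirect_matcher(aliases=("#REDIRECT",)):
    """Build a function that reports whether wikitext starts with a redirect keyword.

    `aliases` is an illustrative list of redirect keywords; localized wikis
    accept additional ones that would have to be supplied by the caller.
    """
    keywords = "|".join(re.escape(alias) for alias in aliases)
    # Keyword at the start of the text, optional colon, then a link target.
    pattern = re.compile(r"^\s*(?:" + keywords + r")\s*:?\s*\[\[", re.IGNORECASE)

    def is_redirect(wikitext):
        return bool(pattern.match(wikitext))

    return is_redirect


is_redirect_en = make_redirect_matcher()
is_redirect_de = make_redirect_matcher(("#REDIRECT", "#WEITERLEITUNG"))
```

Each distinct keyword set needs its own matcher, which is part of why this detection is costly to maintain across all language editions.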
@Neil_P._Quinn_WMF from my point of view, we have converged here. If you and Erik agree, let's close it. Thank you for all your work on it. :)