Mon, Jun 11
Thu, Jun 7
Fri, Jun 1
Thu, May 31
Wed, May 30
Thanks @ezachte! That helps a lot. I'm using the article count data as part of our monthly metrics, so I want them on an ongoing basis. Happily, I've figured out how to use the zipped CSVs so I don't need anything new 😁
Tue, May 29
Mon, May 28
This is mostly done! I've heavily overhauled the calculation pipeline and created a notebook for calculating the monthly metrics table.
Fri, May 25
Thu, May 24
@ezachte, I've found the location of the data I want within the Wikistats dumpfiles, but I'm still not sure which folder is the canonical one.
It looks like this is because there have been several schema changes since March 2017 and I assume the queries haven't been updated.
Wed, May 23
Tue, May 22
@ezachte, @Erik_Zachte (I don't know which is the real one!): I'm working on producing these new metrics for our Board reporting, and I'm trying to get historical monthly data on the article counts for all our projects. This seems really tricky to calculate myself, but I think I can get it from Wikistats instead :)
@ggellerman did you ever make a research pipeline project? 😁
Mon, May 21
Sat, May 19
This looks like it will be harder than I thought. We want to start tracking some content metrics (namely the number of articles, number of media files, number of Wikidata entities, and number of Wikidata claims).
May 17 2018
May 16 2018
May 15 2018
@Jdforrester-WMF, have you noticed this?
Benoît is back and, as far as I can tell, I didn't break anything 😁
May 14 2018
This is actually resolved! When I originally filed this, the timestamps were stored in MediaWiki's string format (e.g. 20180514215953), but in June 2017 we started storing them in JDBC format (e.g. 2018-05-14 21:59:53.0). This means Hive's date and time functions can now operate on them.
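For illustration, here's a minimal Python sketch of how the two formats relate. The function names are hypothetical (they aren't taken from any actual pipeline), and the format strings are inferred only from the two example values above.

```python
from datetime import datetime

# Assumed layouts, inferred from the two example timestamps in the comment above.
MEDIAWIKI_FORMAT = "%Y%m%d%H%M%S"       # e.g. 20180514215953
JDBC_FORMAT = "%Y-%m-%d %H:%M:%S.%f"    # e.g. 2018-05-14 21:59:53.0


def mediawiki_to_jdbc(ts: str) -> str:
    """Convert a MediaWiki-style timestamp string to a JDBC-style one.

    MediaWiki timestamps have no sub-second part, so ".0" is appended.
    """
    parsed = datetime.strptime(ts, MEDIAWIKI_FORMAT)
    return parsed.strftime("%Y-%m-%d %H:%M:%S") + ".0"


def parse_jdbc(ts: str) -> datetime:
    """Parse a JDBC-style timestamp string into a datetime object."""
    return datetime.strptime(ts, JDBC_FORMAT)


if __name__ == "__main__":
    print(mediawiki_to_jdbc("20180514215953"))
    print(parse_jdbc("2018-05-14 21:59:53.0"))
```

The old format is just an undelimited digit string, which is why generic date functions couldn't work on it without a conversion step like this.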
@Pginer-WMF, do you want the Product Analytics team to work on this, T194647, T194648, and T194650? If so, we can definitely consider it, although you should note that (1) it may take a few weeks to triage since we're working through a big backlog right now and (2) we'll continue to be very capacity-limited until our new data analyst starts, hopefully in 2-3 weeks.
May 12 2018
May 11 2018
For history purposes, today's TechCom radar email said about this proposal: "discussed and decided not to continue discussion within TechCom, because the standard is not currently relevant".
May 10 2018
Sadly, I never got a chance to do this. @egalvezwmf will be very kindly doing some basic analysis of this year's data for us.
We still want to do the open subtasks, but they're not really linked to each other, so there's no point in keeping this parent task open.
Damn it. Today is not my day.
Hmm, let's try to actually unsubscribe Morten!
Actually, let's not overload Morten with notifications before he starts :)