Tue, Sep 18
Thu, Sep 13
Wed, Sep 12
Hm, @bd808 do you think it would actually be useful to address issues like that one by one or just wait for a better way to group pages with all their redirects? (as in, parsing wikitext historically and building the redirect graph)
I acknowledge my bias here, as a member of the team that's going to implement this feature. We have meetings to make sure we understand and include the use cases that Tilman's thinking of for the design of the metadata management part of this project. I believe that's the only remaining item, and there's a path to resolve it, so in my opinion, yes, this can go to last call as far as TechCom is concerned. The decision to use git as storage seems noncontroversial, and what happens with metadata doesn't seem to interest TechCom too much, though obviously it should be handled with care.
Tue, Sep 11
So it looks like Edit_11319708 and onward have this data. @Neil_P._Quinn_WMF, when you have a sec, let me know what exactly you're looking for. I think it's weird that the older tables don't have wikitext, but I seem to remember it wasn't instrumented back then?
Mon, Sep 10
Thanks for that, @Tbayer. I'm not at all sure what the inconsistency could be, but I've only looked for trivial, obvious problems so far. I'll keep this tab open in my browser and hope to return to it.
@Amire80, Francisco's right about the setup. However, if you have a config page, testing it locally isn't too important. I published your dashboard and it looks good (we always do that step anyway):
We got lucky and the Turnilo folks fixed this upstream: https://github.com/allegro/turnilo/pull/173
Hmmmm, the queries / data / graphs / my recollection all seem to point to attempt, success, and failure being saved for wikitext: https://edit-analysis.wmflabs.org/compare/
@sahil505 I was a little too busy at the end of your GSoC session to say this, but I wanted to make sure it's somewhere public: you did an amazing job, and put Wikistats 2 in a much better place than where you found it. Thank you so much, you should be proud of your work here.
The suggestion to collect use cases is great, thank you for that. I think X!Tool could use this data so it doesn't have to crunch it itself, but maybe that's not a strong enough use case for now. Ultimately we want to have a lot more data available via this and similar APIs, and as we evolve towards that goal we'll have to solve this problem. It seems natural then to use the same solutions that the community agrees upon with X!Tool, namely opt-in. So when we do think about publishing this again, we could proceed as follows:
Fri, Aug 31
ping @Anomie or @Tgr: were either of you interested in finishing this change https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/331100/ or should I consider taking it over when I have some time?
Nice! Let's make sure to update the documentation (https://wikitech.wikimedia.org/wiki/Analytics/Systems/Superset#Access ) regarding any improvements.
Mon, Aug 27
Fri, Aug 24
We have to run events through refine anyway to see what's wrong, since nobody remembers, but let's do this:
Thu, Aug 23
@Neil_P._Quinn_WMF we moved this to radar because we can't ingest the events as they are right now, so the editing team needs to change the schema to comply with the guidelines. Do ask us if you need help crafting the schema.
Aug 21 2018
If all you're looking for is a list of revisions that have been deleted, then the data lake will indeed help you out. If you need help with Hive, let me know, but essentially start hive and do this:
Aug 20 2018
Sahil did this
Deprioritizing this task, unless @Ottomata disagrees.
Untagging Analytics as there hasn't been any input in a while. When you know more details, tag us again.
There's a work-around, and maybe they'll actually fix the bug upstream sometime.
Aug 17 2018
Aug 15 2018
Ok, starting work on this. Basic plan:
Aug 14 2018
Something really really strange is happening. I added this file back in January: https://github.com/wikimedia/puppet/blob/53abe99dc8604f176e95ae7028efd6cf76cf6645/modules/dumps/manifests/web/html.pp#L58 (content here: https://github.com/wikimedia/puppet/blob/53abe99dc8604f176e95ae7028efd6cf76cf6645/modules/dumps/files/web/html/pageviews_readme.html) so I have no idea why it just showed up, there was no puppet change that I can see that would've done anything to it... But when I navigate to that link, sometimes it shows up and sometimes it doesn't. Maybe someone with better puppet-fu than me can take a look? @ArielGlenn?
No, @WMDE-leszek, we have a lot of other stuff we need to do first. It is a relatively small task, so I'd be happy to mentor someone doing it. I'll also try to grab it when I'm ahead on my other work, but that somehow never seems to happen :/
Aug 13 2018
Thanks @Nettrom. Backfilling that data so it shows up on the dashboard is a bit of a pain, but if you think it would be useful I'll do it.
Aug 10 2018
As of June 21st, the query started returning 0. Indeed, there's no data after that day: