It looks like maybe this is to blame? https://github.com/wikimedia/analytics-quarry-web/blob/4b3583c4cf7f45b7bac56b8df9dfd0799a12111a/quarry/web/output.py#L80
May 24 2019
May 22 2019
I trained some damaging and goodfaith models. They are performing... OK. We're getting in the upper 80s for ROC-AUC. I would expect a solid model to be in the mid-90s so there's definitely some more work to do. But, it looks like these models will be *useful*. So I'll get a pull request together.
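For anyone unfamiliar with the metric, here is a minimal sketch of what ROC-AUC measures: the chance that a randomly chosen damaging edit gets a higher score than a randomly chosen non-damaging one. The labels and scores below are made up for illustration; they are not from these models, and this is not how revscoring computes the statistic.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up data: True means the edit was labeled damaging.
labels = [True, True, False, False, False]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"ROC-AUC: {auc:.3f}")
```

A value of 0.5 is chance and 1.0 is a perfect ranking, so "upper 80s" means roughly 85 to 89 of every 100 such pairs are ordered correctly.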
I worked with @zhuyifei1999 to develop https://etherpad.wikimedia.org/p/chinese_word_lists and then implemented it in https://github.com/wikimedia/revscoring/pull/438
Would 1800 UTC on Friday, May 24th work for you, @Bstorm?
Yup. We've got that in the config. At the very least, we'll need to do a restart to reconnect to the new DB based on the DNS change. But we'll also probably want a little downtime for when we are in read-only mode.
For clarity, I don't think ORES was ever "banned", but we did receive resistance when we first sought to document ORES on dewiki. That was back in 2014. I think *some* discussion is necessary before we do any deployment on German Wikipedia.
Hi @Bstorm, I just got back from the Wikimedia Hackathon and I'm catching up on other things. I don't think we can schedule maintenance and make the switch today. Friday seems more likely. Could that work?
May 20 2019
May 19 2019
Just talked to @Krenair for a bit about swift and he asked me a good question: Why are we looking to get off of NFS again? Just how slow is it? Are we sure it is a performance bottleneck? How big of a bottleneck?
Talked to @yuvipanda and he suggested we look into https://docs.min.io
Nice work! Thanks for working on this!
OK should be good in this one: https://wikitech.wikimedia.org/w/index.php?title=Hiera:Ores&diff=1826595&oldid=1826556
https://github.com/wikimedia/revscoring/pull/439 Here's the updated code.
https://etherpad.wikimedia.org/p/dutch_badwords @Catrope and I worked on this.
@Ciell, maybe you have some ideas for how we could get started here. Or if an article quality prediction model would be helpful at all.
I looked into this with @RonnieV and we weren't able to find any documentation about an article quality scale on nlwiki. I think defining article quality in nlwiki terms will be a good first step to building this model. Alternatively, one could translate the English Wikipedia article quality scale for use on nlwiki. See https://en.wikipedia.org/wiki/Wikipedia:Content_assessment#Grades
May 18 2019
Looks like this is set from here: https://github.com/wikimedia/puppet/blob/011a633fe9f09e64715dc198d1e0fae9028d7cf1/hieradata/labs.yaml#L271
Looks like we have allowed_hosts=172.16.7.178. So maybe the icinga2 server changed?
May 17 2019
Yes. We will need to have the Growth-Team update the thresholds before they'll be reflected on-wiki. Essentially, that will complete this task.
*ping*
Here are some details about connecting to our replica databases from a tool in toolforge: https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#Python
We did some past work on newcomer reputation in T205926: Create a newcomerquality meta-model for revscoring
Here's a query of the top contributors to Czech Wikipedia in the last 30 days: https://quarry.wmflabs.org/query/36234
Maybe reputation can be a reward for a newcomer. E.g. stack overflow uses reputation to attract people to do more contributions --@Ferenczy
May 15 2019
We currently have mwparserfromhell 0.5.2 deployed. So it looks like we're due for an update. I'll get on that.
May 14 2019
https://labels.wmflabs.org/stats/jawiki/15 looks like this is complete!
Regretfully, https://meta.wikimedia.org/wiki/Conflict_of_interest_editing doesn't look very relevant.
I rebuilt the python virtualenv and restarted the web service. Looks like the issue is resolved.
Met with @Ladsgroup and @akosiaris to understand some of this better. Here are our notes: https://etherpad.wikimedia.org/p/ores-to-k8s
Huge boost in model fitness! This is now one of the best "goodfaith" models that we have! I've submitted my work for review. See https://github.com/wikimedia/editquality/pull/195 Will update about deployments of the new model when that is ready.
May 13 2019
Just sat down with this again. Here's the old dataset:
OK the model is deployed. I've also configured a simple gadget to allow you to see the predictions in svwiki. See https://sv.wikipedia.org/wiki/Anv%C3%A4ndare:EpochFail/common.js for how to enable it for your user account.
This has been merged and deployed. Thresholds will need to be updated. See T223164
This is the difference between needing to review 50% of edits to catch 90% of the damage and needing to review only 15% of edits to catch that same 90%. HUGE!
We went up from 0.08 precision @ 0.90 recall to 0.26 precision @ 0.90 recall. So that's quite a jump.
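To show how the precision numbers relate to the review workload, here is a rough arithmetic sketch. It assumes the share of edits flagged for review equals recall × damage rate / precision, and it back-derives the damage rate from the 50% figure, since the actual rate isn't stated here. Function names are just for illustration.

```python
def review_fraction(precision, recall, prevalence):
    """Share of all edits that get flagged at a given operating point.

    true positives = recall * prevalence, and flagged = true positives / precision.
    """
    return recall * prevalence / precision


def implied_prevalence(precision, recall, flagged):
    """Damage rate implied by a known flagged share (inverse of the above)."""
    return flagged * precision / recall


recall = 0.90
old_precision, new_precision = 0.08, 0.26

# Assumption: the old model required reviewing 50% of edits, as stated above.
prevalence = implied_prevalence(old_precision, recall, 0.50)
new_flagged = review_fraction(new_precision, recall, prevalence)

print(f"implied damage rate: {prevalence:.1%}")
print(f"edits to review with the new model: {new_flagged:.1%}")
```

Under those assumptions the new model comes out at a bit over 15% of edits to review, which lines up with the numbers quoted.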
https://labels.wmflabs.org/ui/eswikiversity/ The labeling interface is ready to go!
OK I made the change so that autoconfirmed users are "trusted" so long as their edits are not reverted and they have never been blocked. That got us down below 4828 revisions.
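A minimal sketch of that rule, to make it concrete. The record type and field names are hypothetical and not taken from the editquality code, and it reads "their edits are not reverted" as applying to the revision in question.

```python
from dataclasses import dataclass


@dataclass
class RevisionInfo:
    # Hypothetical fields, for illustration only.
    user_autoconfirmed: bool
    reverted: bool
    user_ever_blocked: bool


def is_trusted(rev):
    """Trusted if autoconfirmed, not reverted, and the user was never blocked."""
    return rev.user_autoconfirmed and not rev.reverted and not rev.user_ever_blocked


def needs_review(revisions):
    """Keep only the revisions that still need a human label."""
    return [rev for rev in revisions if not is_trusted(rev)]


sample = [
    RevisionInfo(user_autoconfirmed=True, reverted=False, user_ever_blocked=False),
    RevisionInfo(user_autoconfirmed=True, reverted=True, user_ever_blocked=False),
    RevisionInfo(user_autoconfirmed=False, reverted=False, user_ever_blocked=False),
    RevisionInfo(user_autoconfirmed=True, reverted=False, user_ever_blocked=True),
]
remaining = needs_review(sample)
print(f"{len(remaining)} of {len(sample)} revisions still need review")
```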
Oh! Forgot to respond to the other concern:
Re. the "skill share overview", I agree @Rfarrand. I'll arrive around 1:30PM on Thursday, but I have meetings with some WMDE folks that afternoon and a remote keynote to give. I might be able to meet up with y'all after 9PM if anyone is still awake. I'm adjusting my sleep schedule in advance, so I expect to be *mostly* awake and competent.
May 10 2019
This is now deployed to our beta (testing) service. See http://ores-beta.wmflabs.org/v3/scores/svwiki/
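If you want to try it, here is a small sketch that builds a query URL for the beta endpoint using the `models` and `revids` query parameters. The model name and revision IDs are placeholders, not real svwiki revisions, and I'm assuming a "damaging" model is what you want to check. It only builds the URL and does not make a request.

```python
from urllib.parse import urlencode

# Base URL from the comment above.
BETA_BASE = "http://ores-beta.wmflabs.org/v3/scores/svwiki/"


def build_scores_url(models, rev_ids, base=BETA_BASE):
    """Return a scores URL with pipe-separated models and revision IDs."""
    query = urlencode({
        "models": "|".join(models),
        "revids": "|".join(str(rev_id) for rev_id in rev_ids),
    })
    return f"{base}?{query}"


# Placeholder model name and revision IDs, for illustration only.
url = build_scores_url(["damaging"], [123456, 123457])
print(url)
```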