May 10 2019
Never mind. I needed to run "git lfs fetch --all" first.
Looks like this work-around isn't working now.
May 9 2019
Of the ~17k edits we get, here's the breakdown by our "autolabeler":
Turns out that cuts out only about 500 edits -- from 10727 to 10214. That's still a lot of edits to label. We want to get this down to about 5k at the most. I'll try cutting the "trusted" edit-count threshold down to 200.
OK let me re-run. We were already considering admins "trusted" but I'll see how much of a difference it makes to include your edits.
@Lsanabria, I'm still waiting on your response to my last questions. No rush. Just want to make sure you know I'm blocked on you taking a look.
May 8 2019
You can get fitness statistics directly from the service. Here are some relevant statistics for the English Wikipedia models and the German Wikipedia models.
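For reference, a request for those statistics can be sketched against the public ORES API. The `/v3/scores` path and the `model_info=statistics` parameter reflect my understanding of the service, and the model names are illustrative, so double-check them against the live endpoint:

```python
from urllib.parse import urlencode

def model_info_url(context, models, base="https://ores.wikimedia.org/v3/scores"):
    """Build a URL asking ORES for fitness statistics of the given models."""
    query = urlencode({"models": "|".join(models), "model_info": "statistics"})
    return f"{base}/{context}/?{query}"

# e.g. the English Wikipedia damaging/goodfaith models:
url = model_info_url("enwiki", ["damaging", "goodfaith"])
```

Fetching that URL should return JSON containing per-model statistics (AUC, filter rates, etc.) alongside the usual score output.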
Sorry for the delay. I've been on vacation for almost a week. I'll be looking to get this deployed some time this week and then I'll set you up with a simple gadget that will help you explore the quality of the predictions. I'll ping back here with updates.
May 1 2019
I can run the venv and import "encodings"
I found this at the end of the uwsgi.log:
Looks like I can "become ores-support-checklist"
Found some docs here: https://github.com/wikimedia/ores-support-checklist
Marking this high instead of "unbreak" because it's not a critical service and only serves informational needs.
Apr 30 2019
In the meantime, I added the campaign here: https://labels.wmflabs.org/ui/srwiki/ Please pick up these edits and re-label them as we were doing in the etherpad. Once we're done with this, we can re-examine the data and update the training/testing set.
OK it's clear that we would benefit from re-labeling these 500 revisions using Wiki labels. I'm working to get a campaign loaded. I'd like to call it something like "Edit quality (500 edits re-review)" or something like that. Could someone help me get a Serbian translation of that?
OK we have a model. Fitness isn't really that great, but it'll be interesting to see how it works in practice.
Excellent! Thank you.
No worries. I can work from this. Do you have any datasets extracted that I could work from? Or maybe the extractor is just fast enough to run again.
Apr 29 2019
@Gilles, are you still working on this task?
We can definitely play with the "trusted edits" set. @Lsanabria, are there any user-rights on Spanish Wikiversity that you think might indicate a "trusted" status? Also, do you think if we labeled edits by anyone with over a couple hundred edits as "trusted", would that mostly work out OK? Note that even "trusted" edits get loaded into Wiki Labels for review if they are reverted.
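The rule being discussed might look something like this rough sketch. The rights set and the 200-edit cutoff are placeholders, not settled values:

```python
# Hypothetical autolabeling rule: an edit is "trusted" if its author holds a
# trusted user-right or has more than ~200 edits -- but reverted edits always
# go to human review regardless.
TRUSTED_RIGHTS = {"sysop", "bot"}   # assumption; the real set is per-wiki
EDIT_COUNT_THRESHOLD = 200          # the cutoff being experimented with

def needs_human_review(user_rights, user_edit_count, was_reverted):
    """Return True if the edit should be loaded into Wiki Labels."""
    if was_reverted:
        return True  # even "trusted" edits get reviewed when reverted
    trusted = (bool(TRUSTED_RIGHTS & set(user_rights))
               or user_edit_count > EDIT_COUNT_THRESHOLD)
    return not trusted
```

With a rule like this, shrinking the review queue is a matter of tuning `TRUSTED_RIGHTS` and `EDIT_COUNT_THRESHOLD` against the label distribution.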
I think mentor matching needs at least half an hour. I'd like to have a bit of buffer time, so an hour would be better if we can swing it.
It looks like a lot of the edits that were labeled "badfaith" but that we have now re-labeled "goodfaith" were saved by @Zoranzoki21. That might be simply because Zoranzoki21 did a lot of labeling work. Would you take a look at them to see if you agree with our re-assessment? Maybe there is some confusion as to the meaning of "goodfaith".
So, as it stands, more than half of the items labeled badfaith are actually goodfaith upon review. I'll look into these labels to see if I can find some sort of consistency in them.
An etherpad is directly editable. You should be able to just type into it.
Apr 26 2019
I just labeled a few. I'm seeing some edits that look like they are goodfaith in this set. I wonder if I am missing something.
I've dumped all of the edits labeled "badfaith" into this etherpad: https://etherpad.wikimedia.org/p/srwiki_badfaith_edits
@Ladsgroup I just pinged you in the task because it looks like the data is a little weird and I had some questions about it a few weeks ago that look like they are still unanswered.
Apr 25 2019
@JTannerWMF, per T164331: Define a process for adding ORES filters to new wikis when ORES is enabled on those wikis, I believe we'd decided that it is the #Growth team's responsibility to get buy-in for enabling these features since you're maintaining the UI. I'd suggest reaching out to the Wikipedians who did the most labeling work.
In conversation with @Ladsgroup, we couldn't figure out whether unused thresholds should be invisible (e.g. https://he.wikipedia.org/wiki/%D7%9E%D7%99%D7%95%D7%97%D7%93:ORESModels )
Apr 24 2019
I'd prefer PNG over JPG if SVG or another vector format is not an option. No one wants those JPG artifacts messing with your nice, crisp design. :D
Here are titles for African Diaspora (Hard mode): https://quarry.wmflabs.org/run/366495/output/0/json
Here are titles for Women Scientists (Probably less hard): https://quarry.wmflabs.org/run/366489/output/0/json
I just updated https://github.com/halfak/taxonomy_examples with a new dataset called "vital_10k_taxonomy.json". I'll be working on getting another dataset with pages that fall into a specific topic cross-section next.
(EC) @JEumerus/@Thryduulf, false positives/negatives don't come into play until there is a model. We'll certainly be looking at fitness statistics and manually reviewing false positives once that model is first built. This has been the pattern for vandalism fighting ORES models and I think it is wholly appropriate for this modeling work as well. Once we know what the model is able to detect and what it can be useful for, then we can discuss "tools" and use cases.
Hey folks. I've been following this task, but I might not have the full context, so take what I say with a grain of salt that is appropriately sized.
Apr 23 2019
Wow! There shouldn't be. That was so long ago. Like, 5 years! I'm not sure how I would check. If there was one running, it would have split newcomer groups by their user_id. E.g. odd user_ids would have been bucketed in experimental and even user_ids would have been bucketed in control (or vice versa).
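A minimal sketch of the parity bucketing described above (which arm gets odd vs. even could have been either way):

```python
def bucket(user_id):
    """Assign a newcomer to an experiment arm by user_id parity."""
    return "experimental" if user_id % 2 == 1 else "control"
```

Checking for a leftover experiment would then amount to looking for a systematic difference between odd- and even-id newcomers.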
@Catrope, I figured you might know what's up.
I just manually added back two that I knew we had.
This is probably due to some changes to dashboards that @Ladsgroup recently made.
Apr 19 2019
It's really easy to read at a glance. I like it.
Apr 18 2019
Looks like joblib uses cloudpickle, so we should probably use that directly.
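For what it's worth, cloudpickle mirrors the stdlib `pickle` interface (`dumps`/`loads`), so using it directly should be close to a drop-in change. A minimal sketch, using stdlib `pickle` here so it runs anywhere:

```python
import pickle  # `import cloudpickle as pickle` would be the drop-in swap

# Round-trip a model-like object through serialized bytes. cloudpickle shares
# this dumps/loads interface, but can additionally serialize lambdas,
# closures, and interactively defined classes, which stdlib pickle cannot.
model = {"weights": [0.1, 0.2], "threshold": 0.5}
blob = pickle.dumps(model)
restored = pickle.loads(blob)
```

The main reason to prefer cloudpickle directly over going through joblib is controlling exactly which serializer handles the object, rather than relying on joblib's internal choice.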
Looks good to me.
Apr 17 2019
Thank you very much @eranroz! I'm excited to sharpen up the predictions for hewiki :D