- Deploy models to beta and smoke test.
- Deploy to production.
- Enable new models via configuration.
- Announce on-wiki.
| Status | Assigned | Task |
| --- | --- | --- |
| Resolved | awight | T192498 Deploy ORES advanced editquality models to arwiki |
| Resolved | Halfak | T189710 Train and test damaging/goodfaith model for arwiki |
| Resolved | awight | T131669 Complete edit quality campaign for Arabic Wikipedia |
@Ghassanmas Would you be willing to draft a release announcement for these upcoming models? We expect to have the deployment ready by the end of the month.
Yes, of course! I would be happy to work on that; it's really a small, easy task. To follow up on the deployment date: will there be an announcement on email@example.com when the exact time is known?
Thanks! The deployment will take several steps, and only the very last one, "Enable new models via configuration", is visible to wiki users. We can use this task for coordinating, and the Global-Collaboration team should do the final deployment, since they own the "recent changes new filters" interface where ORES features will appear.
I looked at the generated models for arwiki and the goodfaith one is not good enough to use on the wiki. See T193905: arwiki goodfaith model is not usable. The damaging model is usable but not great:
- At 15% precision we get 88% recall. This is excellent for a "may have problems" filter
- At 49.7% precision (nearest to 45%), we get 10.6% recall
- At 62.1% precision (nearest to 60%), we get 7.4% recall. This is where we put "likely have problems" by default, but we're probably better off with the 50/10 threshold above.
- Queries for 75%, 90%, and higher precision levels all returned a setting with 100% precision but 0.3% recall. We'd normally put "very likely have problems" at 90% precision, but in this case we can't use that, so we just wouldn't have this filter.
I'm going to deploy this with a "may have problems" filter at 15/88 (precision/recall), a "likely have problems" filter at 50/10, and no "very likely have problems" filter, but it might be worth looking into improving the damaging model. I won't deploy any goodfaith filters at all, because that model provides no usable thresholds.
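For anyone following along, here is a rough Python sketch of the selection reasoning above. It is not how ORES or the filter configuration computes thresholds. The operating points are the ones quoted in this comment; the filter names (`maybe`, `likely`, `verylikely`), the helper functions, and the 5% minimum-recall cutoff are all assumptions made for illustration, since the actual decision here was made by judgment.

```python
from typing import Dict, List, NamedTuple, Optional


class OperatingPoint(NamedTuple):
    precision: float
    recall: float


# Precision/recall pairs quoted above for the arwiki damaging model.
ARWIKI_DAMAGING: List[OperatingPoint] = [
    OperatingPoint(0.15, 0.88),
    OperatingPoint(0.497, 0.106),
    OperatingPoint(0.621, 0.074),
    OperatingPoint(1.0, 0.003),
]

# Hypothetical filter names mapped to target precision levels.
TARGETS: Dict[str, float] = {"maybe": 0.15, "likely": 0.45, "verylikely": 0.90}


def pick_point(
    points: List[OperatingPoint], target_precision: float, min_recall: float
) -> Optional[OperatingPoint]:
    """Return the highest-recall point that meets the target precision.

    Points whose recall falls below min_recall are treated as unusable,
    so the result is None when no point qualifies.
    """
    candidates = [
        p
        for p in points
        if p.precision >= target_precision and p.recall >= min_recall
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.recall)


def choose_filters(
    points: List[OperatingPoint],
    targets: Dict[str, float],
    min_recall: float = 0.05,  # assumed cutoff, not a value from this task
) -> Dict[str, Optional[OperatingPoint]]:
    """Pick one operating point per filter, or None to leave the filter out."""
    return {
        name: pick_point(points, target, min_recall)
        for name, target in targets.items()
    }


filters = choose_filters(ARWIKI_DAMAGING, TARGETS)
for name, point in filters.items():
    if point is None:
        print(f"{name}: no usable threshold, filter omitted")
    else:
        print(f"{name}: precision={point.precision:.1%}, recall={point.recall:.1%}")
```

With these inputs the sketch lands on the same three outcomes as above: 15/88 for "may have problems", roughly 50/10 for "likely have problems", and no "very likely have problems" filter. A different minimum-recall cutoff would change which filters survive.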