- Deploy models to beta and smoke test.
- Deploy to production.
- Enable new models via configuration.
- Announce on-wiki.
| Status | Assigned | Task |
| Resolved | Catrope | T192501 Deploy ORES advanced editquality models to cawiki |
| Resolved | Ladsgroup | T187732 Train/test damaging/goodfaith model for Catalan Wikipedia |
@Townie Would you be willing to draft a release announcement for these upcoming models? We expect to have the deployment ready by the end of the month.
Note that the goodfaith model for cawiki isn't very helpful. There's a fairly good "may be bad faith" filter with 32% precision and 89.1% recall, but its threshold range is 0.000 through 0.999. So the only way to keep "very likely good faith" from overlapping with "may be bad faith" was to set its threshold range to 1 through 1 (i.e. just the value 1), which has 99.8% precision and 97.2% recall. This seems to indicate that a very large proportion of all inputs get a score of 1.
Using the default settings would have pegged "very likely good faith" at 99.5% precision, which comes with 99.5% recall and corresponds to a threshold range of 0.883 through 1. That would have overlapped with both "may be bad faith" (0 through 0.999, at 32% precision / 89% recall) and "likely bad faith" (0 through 0.926, at 60% precision / 70% recall). On the other end of the spectrum, the "very likely bad faith" filter gets 90.4% precision with 29.2% recall, but its threshold range is only 0 through 0.008.
Maybe this is all fine and the model is just very confident, but I felt a bit dirty setting this up.
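For anyone who wants to double-check the overlap reasoning, here is a minimal Python sketch using the threshold ranges quoted above. The function and variable names are made up for illustration and are not part of ORES or the extension's configuration. It treats each range as a closed interval on the goodfaith score, which is an assumption about how the filters are applied.

```python
from typing import Dict, List, Tuple

# (low, high) bounds on the goodfaith score, treated as inclusive (assumption).
Range = Tuple[float, float]

# Bad-faith filter ranges as quoted in this task.
BAD_FAITH_FILTERS: Dict[str, Range] = {
    "may be bad faith": (0.000, 0.999),
    "likely bad faith": (0.000, 0.926),
    "very likely bad faith": (0.000, 0.008),
}


def overlaps(a: Range, b: Range) -> bool:
    """Return True if two closed intervals share at least one score."""
    return max(a[0], b[0]) <= min(a[1], b[1])


def conflicting_filters(candidate: Range, others: Dict[str, Range]) -> List[str]:
    """List the filters whose range overlaps the candidate range."""
    return [name for name, rng in others.items() if overlaps(candidate, rng)]


# "very likely good faith" under the default 99.5% precision target.
default_range: Range = (0.883, 1.0)
# "very likely good faith" as actually configured (just the value 1).
chosen_range: Range = (1.0, 1.0)

print(conflicting_filters(default_range, BAD_FAITH_FILTERS))
print(conflicting_filters(chosen_range, BAD_FAITH_FILTERS))
```

With these numbers, the default range collides with "may be bad faith" and "likely bad faith", while the 1-through-1 range collides with nothing, which matches what I described above.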