Technically this model is language-agnostic, but it requires the following per-wiki statistics in order to calculate quality features:
- avg article length
- avg number of media
- avg number of categories
- avg number of headings
- avg number of wikilinks
- avg number of references
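As a rough illustration, the six values above could be kept in a per-wiki lookup table and checked for completeness before a wiki is marked as supported. This is only a sketch: the names `REQUIRED_STATS`, `WIKI_STATS`, `missing_stats`, the key names, and the `examplewiki` entry are all hypothetical, the numbers are placeholders, and the actual layout of constants.py may differ.

```python
# Hypothetical layout for illustration; the real constants.py may be organized differently.
REQUIRED_STATS = (
    "avg_article_length",
    "avg_media",
    "avg_categories",
    "avg_headings",
    "avg_wikilinks",
    "avg_references",
)

# Placeholder numbers only, not real per-wiki statistics.
WIKI_STATS = {
    "examplewiki": {
        "avg_article_length": 1000.0,
        "avg_media": 1.0,
        "avg_categories": 3.0,
        "avg_headings": 4.0,
        "avg_wikilinks": 20.0,
        "avg_references": 5.0,
    },
}


def missing_stats(wiki_stats):
    """Return {wiki: [missing stat names]} for wikis lacking any required value."""
    problems = {}
    for wiki, stats in wiki_stats.items():
        missing = [name for name in REQUIRED_STATS if name not in stats]
        if missing:
            problems[wiki] = missing
    return problems
```

A check like this could run in a unit test, so that a newly added wiki with an incomplete entry is caught before the model binary is rebuilt.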
This task involves adding these values for new languages and updating the model binary to accurately reflect the total number of supported wikis.
- Add quality feature values for 35 new languages to constants.py using this new file created by @diego (see the commit message for details on how default values were generated for wikis)
- Update the supported_wikis attribute on the serialized RevertRiskModel
- Bump model version from 1.0 to 2.0
- Test the new model binary
- Pass it on to the ML team (sha512 checksum for the serialized model: P52800)
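The update, version bump, and checksum steps could look roughly like the sketch below. It assumes the model is serialized with `pickle`, which the task does not state, and it uses a `SimpleNamespace` as a stand-in for the real RevertRiskModel object. The `model_version` attribute name, the helper functions, and the wiki names are hypothetical; only `supported_wikis` and the 1.0 to 2.0 bump come from the task.

```python
import hashlib
import pickle
from types import SimpleNamespace


def add_supported_wikis(model, new_wikis, new_version):
    """Merge new wikis into model.supported_wikis and bump the version."""
    model.supported_wikis = sorted(set(model.supported_wikis) | set(new_wikis))
    model.model_version = new_version  # hypothetical attribute name
    return model


def sha512_of(data: bytes) -> str:
    """Return the hex-encoded SHA-512 digest of the serialized bytes."""
    return hashlib.sha512(data).hexdigest()


# Stand-in for the deserialized model; the real class is not shown here.
model = SimpleNamespace(supported_wikis=["aawiki", "bbwiki"], model_version="1.0")
model = add_supported_wikis(model, ["ccwiki"], "2.0")

# Serialize (pickle is an assumption), then round-trip to test the new binary.
blob = pickle.dumps(model)
restored = pickle.loads(blob)
checksum = sha512_of(blob)
```

Computing the checksum over the exact bytes that are handed over lets the receiving team confirm they have the same binary.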