The existing evaluation covers the model across five languages: Arabic, English, French, Hungarian, and Turkish. Chinese Wikipedia (zhwiki) now also uses the PageAssessments extension, from which we retrieve the ground-truth data, so we can add it to that list!
Adding it will require:
- Implementing a zhqual_to_enqual function (examples for French/Arabic/Turkish/Hungarian)
- Adding the analysis to the bottom of the notebook
- Adding the results to the summary at the top
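A minimal sketch of what the mapping function could look like, assuming it takes one raw rating string and returns one of the English classes or None (the signature of the existing French/Arabic/Turkish/Hungarian functions may differ). The keys in ZH_TO_EN are placeholders, not the real zhwiki values; they need to be replaced after checking the rubric and the ratings actually present in the database.

```python
from typing import Optional

# ASSUMPTION: placeholder keys only. Replace them with the raw class values
# actually stored for zhwiki, as found in the rubric and the database.
ZH_TO_EN = {
    "fa": "fa",
    "ga": "ga",
    "b": "b",
    "c": "c",
    "start": "start",
    "stub": "stub",
}


def zhqual_to_enqual(zh_class: Optional[str]) -> Optional[str]:
    """Map a raw zhwiki rating to stub/start/c/b/ga/fa, or None if untranslatable."""
    if zh_class is None:
        return None
    # Normalize whitespace and case before the lookup.
    return ZH_TO_EN.get(zh_class.strip().lower())
```

Returning None for anything outside the table keeps untranslatable ratings easy to filter out and count later.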
Resources:
- Chinese Wikipedia quality rubric
- Current ratings in the database that will need to be translated into English quality classes
NOTE: not all of the existing quality ratings for zhwiki will be translatable into the English quality classes we're using (stub/start/c/b/ga/fa), but we should hopefully be able to get examples for each of these classes and not throw away too much data.
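One way to check both concerns in the note (every class has examples, not too much data is dropped) is a small coverage report. This is only a sketch: coverage_report and its to_enqual argument are hypothetical names, and the sample ratings are made up for illustration rather than taken from zhwiki.

```python
from collections import Counter

EN_CLASSES = ("stub", "start", "c", "b", "ga", "fa")


def coverage_report(raw_ratings, to_enqual):
    """Summarize how raw ratings map onto the English quality classes.

    to_enqual is any callable that returns an English class or None.
    """
    mapped = Counter()
    dropped = Counter()
    for raw in raw_ratings:
        en = to_enqual(raw)
        if en in EN_CLASSES:
            mapped[en] += 1
        else:
            dropped[raw] += 1  # keep the raw label so gaps are easy to inspect
    total = sum(mapped.values()) + sum(dropped.values())
    return {
        "per_class": {c: mapped[c] for c in EN_CLASSES},
        "missing_classes": [c for c in EN_CLASSES if mapped[c] == 0],
        "dropped": dict(dropped),
        "dropped_fraction": sum(dropped.values()) / total if total else 0.0,
    }


# Illustrative input only; a toy mapping stands in for the real function.
sample = ["Stub", "Stub", "B", "List"]
report = coverage_report(
    sample, lambda r: r.lower() if r.lower() in EN_CLASSES else None
)
```

The "dropped" counts show which raw labels are being discarded, which helps decide whether any of them deserve a mapping after all.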