Using the tool built for this purpose in T273062, conduct the following test of the image recommendations POC:
- Evaluate results on Arabic, Cebuano, English, Vietnamese, Bengali and Czech wikis
- Evaluate 500 unillustrated articles from each wiki
- For each result for each unillustrated article, manually decide whether the match is good, okay, or bad. Evaluators also have the option to choose "unsure" if they're not confident in their selection.
- Based on the percentage of good matches, estimate the likely revert rate if bots were to add these images to articles
- Evaluate which match source(s) provide the best matches (Wikidata, interwiki, Commons category, MediaSearch)
- Evaluate whether MediaSearch provides valuable results where the other sources have none
- Evaluate performance based on logged response time
- Evaluate what percentage of matches are offensive or NSFW, to help decide whether we should apply a safe-search filter to the recommendations
The estimated time of work is 3 hours for the 500 images. However, if the 3 hours pass before you finish the test, please leave a comment.
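
For the analysis steps in the list above (share of good matches per source, and whether MediaSearch fills gaps left by the other sources), a minimal sketch of the tallying could look like the following. The record layout, field names, and source labels are assumptions made for illustration; the actual export format of the evaluation tool is not described here, and the sample rows are placeholders, not real results.

```python
from collections import Counter, defaultdict

# Hypothetical records: one row per (article, match source) rating.
# Field names and values are assumed, not taken from the real tool.
ratings = [
    {"article": "A", "source": "wikidata", "rating": "good"},
    {"article": "A", "source": "mediasearch", "rating": "okay"},
    {"article": "B", "source": "mediasearch", "rating": "good"},
    {"article": "C", "source": "commons_category", "rating": "bad"},
    {"article": "C", "source": "interwiki", "rating": "unsure"},
]


def rating_share_by_source(records):
    """Return, per source, the fraction of its ratings in each category."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["source"]][record["rating"]] += 1
    shares = {}
    for source, counter in counts.items():
        total = sum(counter.values())
        shares[source] = {rating: n / total for rating, n in counter.items()}
    return shares


def mediasearch_only_articles(records):
    """Return articles whose only match source is MediaSearch."""
    sources = defaultdict(set)
    for record in records:
        sources[record["article"]].add(record["source"])
    return sorted(a for a, s in sources.items() if s == {"mediasearch"})


shares = rating_share_by_source(ratings)
only_mediasearch = mediasearch_only_articles(ratings)
```

In this sketch "unsure" ratings stay in the denominator; whether to exclude them before computing the good-match percentage is a choice to make when interpreting the results.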