Using the tool built for this purpose in T273062, run the following test of the image recommendations proof of concept (POC):
- Evaluate results on Arabic, Cebuano, English, Vietnamese, Bengali and Czech wikis
- Evaluate 500 unillustrated articles from each wiki
- For each result for each unillustrated article, manually decide whether the match is strong, okay, or weak
- Based on the percentage of strong matches, estimate the likely revert rate if bots added these images to articles
- Evaluate which match source(s) provide the strongest match (Wikidata, interwiki, Commons category, MediaSearch)
- Evaluate whether MediaSearch provides valuable results where the other sources have none
- Evaluate performance based on logged response time
- Evaluate what percentage of matches are offensive or NSFW, to help decide whether we should apply a safe search filter to the recommendations
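
The tallying steps above could be sketched roughly as follows, once the manual ratings are exported from the tool. This is only an illustration: the `Rating` record, its field names, the source labels, and the sample rows are all assumptions made for the example, not the tool's actual export format or real evaluation results.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """One manually rated match (hypothetical record layout)."""
    wiki: str
    article: str
    source: str   # assumed labels: wikidata, interwiki, commons_category, mediasearch
    rating: str   # "strong", "okay", or "weak"
    offensive: bool = False


def percent(part, whole):
    """Percentage of part in whole; 0.0 when there is nothing to count."""
    return 100.0 * part / whole if whole else 0.0


def strong_share_by(ratings, key):
    """Percentage of strong matches, grouped by a Rating field such as 'source' or 'wiki'."""
    totals, strong = Counter(), Counter()
    for r in ratings:
        group = getattr(r, key)
        totals[group] += 1
        if r.rating == "strong":
            strong[group] += 1
    return {group: percent(strong[group], totals[group]) for group in totals}


def mediasearch_only_articles(ratings):
    """Articles whose only results came from MediaSearch."""
    sources = defaultdict(set)
    for r in ratings:
        sources[(r.wiki, r.article)].add(r.source)
    return sorted(k for k, s in sources.items() if s == {"mediasearch"})


def offensive_share(ratings):
    """Percentage of all rated matches flagged as offensive or NSFW."""
    return percent(sum(r.offensive for r in ratings), len(ratings))


# Made-up sample rows, for illustration only.
sample = [
    Rating("cswiki", "A", "wikidata", "strong"),
    Rating("cswiki", "A", "mediasearch", "weak"),
    Rating("cswiki", "B", "mediasearch", "strong"),
    Rating("arwiki", "C", "interwiki", "okay", offensive=True),
]

print(strong_share_by(sample, "source"))
print(strong_share_by(sample, "wiki"))
print(mediasearch_only_articles(sample))
print(offensive_share(sample))
```

Grouping by `source` addresses the question of which match source is strongest, grouping by `wiki` gives the per-wiki strong-match percentage that feeds the revert rate estimate, and the MediaSearch-only list shows where MediaSearch fills gaps left by the other sources.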