
Image suggestion evaluation September 2020
Closed, ResolvedPublic

Description

This is the second round of evaluation of the image recommendation algorithm. Per T256081#6474174, this version is very simple: it only suggests images if they are attached to the article's Wikidata item or are in the Commons category attached to the Wikidata item. We are going to evaluate these the same way as we did in T260857: Image suggestion evaluation August 2020.
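The rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the function name `suggest_images` and the in-memory dict layout are hypothetical. P18 (image) and P373 (Commons category) are real Wikidata property IDs, but that the algorithm reads exactly these properties is an assumption.

```python
def suggest_images(item, category_members):
    """Return candidate image file names for an article's Wikidata item.

    item: dict with optional keys "P18" (list of image file names) and
          "P373" (Commons category name). Hypothetical layout.
    category_members: dict mapping a Commons category name to the list of
          file names in that category. Hypothetical stand-in for a lookup.
    """
    candidates = []

    # 1. Images attached directly to the Wikidata item.
    candidates.extend(item.get("P18", []))

    # 2. Images in the Commons category attached to the Wikidata item.
    category = item.get("P373")
    if category:
        candidates.extend(category_members.get(category, []))

    # Drop duplicates while keeping the original order.
    seen = set()
    unique = []
    for name in candidates:
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique
```

An item with neither an image nor a Commons category yields an empty list, which corresponds to "no suggestion" for that article.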

I put the files in tabs in this spreadsheet: https://docs.google.com/spreadsheets/d/1gk_3tPhDP49AwZQJj0Mcz4AOq1_054OGJHPj-Q-qEkw/edit#gid=253656325. We will evaluate the first 50 articles in each list. I sorted the articles randomly so we are evaluating a representative group.
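For anyone who wants to reproduce the sampling step, here is a minimal sketch, assuming the article list is available as a plain Python sequence. The function name `pick_evaluation_sample` and the `seed` parameter are hypothetical; the actual sorting was done in the spreadsheet.

```python
import random


def pick_evaluation_sample(articles, sample_size=50, seed=None):
    """Shuffle the article list and return the first `sample_size` entries."""
    rng = random.Random(seed)   # a fixed seed makes the order reproducible
    shuffled = list(articles)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return shuffled[:sample_size]
```

If the list has fewer than 50 articles, the whole list is returned in random order.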

We'll classify the "top. image" into these categories, adding explanatory comments where useful. NOTE: there is an additional classification for articles that should not have an image at all.

| Classification | Explanation |
| ----- | ----- |
| 2 | Great match for the article, illustrating the thing that is the title of the article; e.g. the article is "Food" and it is an image of food. |
| 1 | Good match, but difficult to confirm for the article unless the user has some context, and would need a good caption; e.g. the article is "Food" and it is an image of a famous chef. |
| 0 | Not a fit for the article at all; e.g. the article is "Food" and the image is a car. |
| -1 | Image is correct for the subject, but does not match the local culture; e.g. the article is "Food" and the image is a specific food from a specific culture that is not recognizable in the local culture. |
| -2 | Misleading image that a newcomer could accidentally think is correct; e.g. the article is "Taco" and the image is a burrito. |
| -3 | Page should not have an image, e.g. disambiguation pages, lists, or "given name" articles. |
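Once a list has been graded, the scores can be tallied along these lines. This is only a sketch: the function name `summarize_scores` is hypothetical, and leaving the -3 pages out of the denominator when computing the share of "2" results is an assumption, not an agreed-upon metric.

```python
from collections import Counter

# The six classifications defined in the table above.
VALID_SCORES = {2, 1, 0, -1, -2, -3}


def summarize_scores(scores):
    """Count each classification and compute the share of great matches.

    Returns (counts, great_share), where great_share is the fraction of
    score-2 results among pages that should have an image (score != -3).
    """
    for score in scores:
        if score not in VALID_SCORES:
            raise ValueError(f"unknown classification: {score!r}")

    counts = Counter(scores)
    # Assumption: pages that should not have an image are excluded.
    rated = [s for s in scores if s != -3]
    great_share = counts[2] / len(rated) if rated else 0.0
    return counts, great_share
```

For example, `summarize_scores([2, 2, 1, 0, -3])` counts two great matches out of four pages that should have an image.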

Event Timeline

This time it's easier and at the same time harder to evaluate. Below are some notable aspects I noticed:

  • The suggestions mostly focused on locations and fauna/flora, which is even less diverse than last time. Or maybe that is because the number of articles on those 2 topics in the Vietnamese wiki is much higher than on other topics? (We once had bots creating tons of articles on these 2 topics.)
  • More "2" results are given, which is definitely better than last time.
  • Some images are on topic, but if I were deciding whether to put them in an article, I would choose not to, because they are either low quality or do not actually provide any useful information for the article (despite being on topic), e.g.: Image/Article in English. In this case I'm unsure whether to give a 0 or a 2... Please get back to me on this so I can fix my grade.
  • Some images don't have enough information on the file page (and are not used in any articles), so I cannot grade them.

So overall it's better, but I think that is because most of the suggestions are on locations and fauna/flora, and the matching rate for these 2 topics was already high the last time we evaluated. I'd love to have suggestions that filter out these 2 topics.

I also want to point out that some of the images only have captions in languages other than English, or have no caption at all, and are not used in any articles, so it might be difficult for newcomers to really understand what those pictures are about and to judge whether they fit the articles.

Used the page created by @Urbanecm_WMF, thanks!
Images evaluated for arwiki and frwiki:

  • same results in both wikis,
  • big improvement, many accurate images,
  • images with a score of 1 can fit the articles in some contexts; I was a bit severe.

{{done}}

In my case, I am unable to spot any topic bias (just like last time). The algorithm is much more accurate, basically the only mistakes were mistakes in the source data and disambig pages.

That leads me to a thought: When we ask newcomers to evaluate an image, and they say no, can we fix the source data on their behalf, to improve Wikimedia in each case? Removing a category/WD item assignment should be easy enough IMO. @MMiller_WMF, do you think that's a good idea to consider?

Quite a few animals, train stations (specifically in Japan), cities in China (the mainland), and otherwise diverse topics. One "1946 (Year)" page.

We have all languages covered and some comments. @MMiller_WMF, I'll let you check whether this task is done. If so, please close it.

This has been resolved for a while; I forgot to close it.