Suggested Tags should not suggest tags about image format or genre
Open, Needs Triage · Public

Description

The Depicts guidelines at Commons explain that depicts statements should not describe the format or genre of an image, because that information belongs in other structured data statements (such as fabrication method, genre, or material used). We should therefore filter out the following tags, which are commonly suggested by Google:

  • portrait
  • drawing
  • macro photography
  • illustration
  • visual arts
  • art of painting
  • art
  • modern art
  • stock photography (debatable, but this is a very subjective quality assessment that also has racial bias)
  • text (debatable)

There are probably more, but those seem to be the most common.
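
As a rough illustration of the filtering requested here, the following is a minimal sketch and not the actual Suggested Tags implementation. It assumes suggestions arrive as plain-text labels (the production filter lists are configuration variables, and their contents are not reproduced here), and the names `FORMAT_OR_GENRE_TAGS` and `filter_suggestions` are hypothetical.

```python
# Hypothetical blocklist built only from the tags named in this task.
FORMAT_OR_GENRE_TAGS = frozenset({
    "portrait",
    "drawing",
    "macro photography",
    "illustration",
    "visual arts",
    "art of painting",
    "art",
    "modern art",
    "stock photography",
    "text",
})


def filter_suggestions(suggested_labels):
    """Return the suggested labels with format/genre tags removed.

    Matching ignores case and surrounding whitespace; the order of the
    remaining labels is preserved.
    """
    return [
        label
        for label in suggested_labels
        if label.strip().lower() not in FORMAT_OR_GENRE_TAGS
    ]


if __name__ == "__main__":
    print(filter_suggestions(["dog", "Portrait", "grass", "stock photography"]))
```

With this sketch, a suggestion list containing "dog", "Portrait", "grass", and "stock photography" would be reduced to "dog" and "grass".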

Event Timeline

kaldari created this task. Apr 10 2020, 1:15 AM
Restricted Application added a subscriber: Aklapper. Apr 10 2020, 1:15 AM
kaldari renamed this task from "Suggested Tags should not suggest tags about image format" to "Suggested Tags should not suggest tags about image format or genre". Apr 10 2020, 1:17 AM
kaldari updated the task description.

Hi @kaldari. Some of these items have already been added to the filter list (art of painting and stock photography for sure). Unfortunately, T249273 (which addresses an issue where filter updates didn't apply to previously analyzed images) hasn't landed on production yet. Once it does, you'll see them go away on older images. They're already being filtered out on new uploads.

Additionally, we've created https://commons.wikimedia.org/wiki/Commons_talk:Structured_data/Computer-aided_tagging/Blacklist where community members (and WMFers) can suggest additional items to be filtered out. You don't need to add the ones on this ticket because we know about them already (and they're being considered/debated). This is just FYI for future reference (and a lovely place for lively discussion 😄)

@Ramsey-WMF - That's awesome. Thanks for the info!

@Ramsey-WMF - I hope y'all are planning to move the blacklist on-wiki. It would make life easier for everyone.

Yup, it's absolutely being considered. There are currently some internal and external factors that necessitated using this approach first.

There are currently some internal and external factors that necessitated using this approach first.

@Ramsey-WMF - Anything I can help with? I created an on-wiki configuration system for WikiLove a few years ago. It's pretty easy to implement.

You don't need to add the ones on this ticket because we know about them already (and they're being considered/debated).

@Ramsey-WMF - Was there any outcome of the debate on the tags I suggested in the description? Of the 10 suggestions, it looks like 3 are moot, but the other 7 are still active problems causing unnecessary extra work for the community:

  • portrait: listed in wgMachineVisionWithholdImageList
  • drawing
  • macro photography
  • illustration
  • visual arts
  • art of painting: listed in wgMachineVisionWikidataIdBlacklist
  • art
  • modern art
  • stock photography: listed in wgMachineVisionWikidataIdBlacklist
  • text
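
The tally above (3 moot, 7 still active) can be checked with a short sketch. It only encodes the list as given in this comment; the dictionary name `TAG_STATUS` is hypothetical, and `None` marks a tag that the comment does not show as listed in either configuration variable.

```python
# Status of each tag as reported in the list above.
# None means no configuration variable was named for that tag.
TAG_STATUS = {
    "portrait": "wgMachineVisionWithholdImageList",
    "drawing": None,
    "macro photography": None,
    "illustration": None,
    "visual arts": None,
    "art of painting": "wgMachineVisionWikidataIdBlacklist",
    "art": None,
    "modern art": None,
    "stock photography": "wgMachineVisionWikidataIdBlacklist",
    "text": None,
}

# Tags already covered by one of the two configuration variables.
handled = [tag for tag, setting in TAG_STATUS.items() if setting is not None]

# Tags that are still being suggested.
outstanding = [tag for tag, setting in TAG_STATUS.items() if setting is None]

if __name__ == "__main__":
    print(f"handled: {len(handled)}, outstanding: {len(outstanding)}")
    for tag in outstanding:
        print(f"  still suggested: {tag}")
```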

The debate isn't quite resolved yet, but hopefully we are entering conditions where it will be. As we work on MediaSearch and do real-world testing on the effects of depicts on search results, we hope to have solid evidence that will settle it one way or the other.

CBogen moved this task from Triage to SDoC on the Structured-Data-Backlog board. Aug 25 2020, 6:48 PM