While developing T299781: [EPIC] Image suggestions backend, we collected image Wikidata QIDs from two properties, P18 and P373. The dataset lives in the image_suggestions_wikidata_data Hive table.
We may gather additional image QIDs via relevant Wikidata properties that have an image range.
This spike aims to understand the pros and cons of using properties that are more generic than P18.
Update
We sampled 200 random topics and queried Wikidata for all properties that expect a Commons media file (this includes files other than images). Result:
| topics | section score | topics with values | total media property values | gain vs. P18 |
| 200 | > 10 | 96 | 146 | +58 |
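The lookup described above could be sketched roughly as follows. This is a minimal illustration, not the code used for the spike. It assumes the public Wikidata Query Service endpoint and the `wikibase:propertyType wikibase:CommonsMedia` triple pattern to list properties with a Commons media range. The helper names (`run_query`, `property_ids`, `summarize`), the User-Agent string, and the input shape of `summarize` are hypothetical. The "gain" computed here is one plausible reading of the table column (total media values minus P18 values), which the task does not define explicitly.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Lists every property whose datatype is "Commons media file".
MEDIA_PROPERTIES_QUERY = """
SELECT ?property ?propertyLabel WHERE {
  ?property wikibase:propertyType wikibase:CommonsMedia .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""


def run_query(query, endpoint=WDQS_ENDPOINT):
    """Send a SPARQL query and return the parsed JSON response (needs network)."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url, headers={"User-Agent": "media-properties-spike-example/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def property_ids(results):
    """Extract property IDs such as 'P18' from a SPARQL JSON result."""
    return [
        binding["property"]["value"].rsplit("/", 1)[-1]
        for binding in results["results"]["bindings"]
    ]


def summarize(claims_by_topic, media_pids, baseline="P18"):
    """Count media values per sample.

    claims_by_topic is a hypothetical mapping {qid: {pid: [values]}}.
    The gain is assumed to be total media values minus baseline values.
    """
    topics_with_values = total_values = baseline_values = 0
    for claims in claims_by_topic.values():
        count = sum(len(v) for pid, v in claims.items() if pid in media_pids)
        if count:
            topics_with_values += 1
        total_values += count
        baseline_values += len(claims.get(baseline, []))
    return {
        "topics": len(claims_by_topic),
        "topics_with_values": topics_with_values,
        "total_values": total_values,
        "gain_vs_baseline": total_values - baseline_values,
    }
```

Calling `property_ids(run_query(MEDIA_PROPERTIES_QUERY))` would return the property IDs to feed into `summarize` as `media_pids`, together with the claims of the sampled topics.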
Observations
- There's a slight gain compared to using only P18
- The additional media mainly consists of logos, signatures, icons, and audio files, none of which are relevant image suggestion candidates
- A few properties look relevant, although the gain would be even slighter:
- The implementation cost of adding these properties is relatively high
Related
- Full list of properties with Commons media range here
- See also the plan in T316149: [L] Create tool for manual evaluation of section-level image suggestions
Conclusion
We already leverage P18, and the other properties don't seem worth the effort.