[AOI] Investigation: Can we make it easier to locate usable images for Wikipedia on Commons or Flickr?
Closed, Resolved (Public)

Description

Per http://www.allourideas.org/wikimediaaccesorios/results?locale=es (5th item), is there anything that we can build to make it easier to locate usable images for Wikipedia on Commons or Flickr?

Event Timeline

kaldari raised the priority of this task from to Needs Triage.
kaldari updated the task description.
kaldari added a project: Community-Tech.
kaldari subscribed.

It looks like this need may already be covered by Magnus's FIST tool: http://tools.wmflabs.org/fist/fist.php

kaldari moved this task from New & TBD Tickets to Ready on the Community-Tech board.
kaldari added a subscriber: Magnus.

To round this up, WD-FIST can find images on Wikipedia (and Commons) and add them to Wikidata, which in turn is the most complete and convenient image-for-an-article store we have at the moment.

kaldari renamed this task from [AOI] Spike: Can we make it easier to locate usable images for Wikipedia on Commons or Flickr? to [AOI] Investigation: Can we make it easier to locate usable images for Wikipedia on Commons or Flickr?. (Aug 12 2015, 2:29 AM)

According to @Jdforrester-WMF, image searching (via the Media Insertion interface already in VisualEditor) will be part of the initial release of the new WikiText editor, so it probably doesn't make sense for us to add anything to the existing WikiText editor.

Some tickets related to improving the Media Insertion interface: T51662, T72284, T53031

AFAIK there is a new search engine by Creative Commons in the works, to offer a one-stop free image search. That would take care of the current "source fragmentation", but I don't know when this will go online.

@Magnus: There's http://search.creativecommons.org/ , a front-end to a number of CC web searches. It only searches one at a time, but it's at least a single starting point.

ETA: Code is here: https://github.com/creativecommons/garmonbozia

Ways of finding images on or for Commons include the on-wiki search interfaces (Commons, media results in other wiki searches, VE's media search capabilities), associated tools (bots, or Labs tools), and external search interfaces.

The on-wiki search interface is the responsibility of the Discovery team, but no one is specifically working on it. Associated bugs include: T104565, T96535, T10738, T18933, T67636. Categorized images have a nice browser, but the interface for the search itself doesn't surface the images very well.
VE's media search is the responsibility of the Editing team. Associated bugs include: T51662, T72284, T95223, T53031, T99741.

Existing tools are listed at https://commons.wikimedia.org/wiki/Commons:Tools. Of particular note:

  • GLAMify looks for images from a collection that are present in one language's Wikipedia but absent from another's (for instance, seeing where a Swedish image collection is used in Swedish Wikipedia but not in other Wikipedias). Runs as a bot on the Meta page.
  • Wikidata Images looks for images associated with a subject in Wikidata that are not included in a given article. For enwiki, it updates Category: No local image but image on Wikidata. It would be helpful if there was some way to locate the corresponding image(s) to make it easy to include them.
  • File Image Search Tool (FIST). Searches a wide variety of sources. Is a bit slow but perhaps not surprisingly so when it's querying several APIs?
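
On locating the image that Wikidata already holds for a subject: one possible approach is to read the item's `image` property (P18) through the Wikidata API's `wbgetclaims` module. The sketch below is only an illustration under that assumption. The helper names (`build_claims_url`, `extract_image_filenames`) are hypothetical and not taken from any of the tools above, and the code only builds the request URL and parses an already-decoded JSON response, so it makes no network call.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
IMAGE_PROPERTY = "P18"  # Wikidata's "image" property


def build_claims_url(item_id):
    """Build a wbgetclaims request URL for the image claims of one item."""
    params = {
        "action": "wbgetclaims",
        "entity": item_id,
        "property": IMAGE_PROPERTY,
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


def extract_image_filenames(response):
    """Return the Commons file names found in a parsed wbgetclaims response.

    Claims without a concrete value (snaktype other than "value") are skipped.
    """
    names = []
    for claim in response.get("claims", {}).get(IMAGE_PROPERTY, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            names.append(snak["datavalue"]["value"])
    return names
```

A caller would fetch the URL returned by `build_claims_url("Q42")` (Q42 is just an example item ID), decode the JSON, and pass it to `extract_image_filenames` to get file names that can then be looked up on Commons.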

Other resources, primarily external:

Getting on-wiki search working well and giving it a better UI would be the best long-term solution for finding images on Commons, but that's out of our scope. For Commons as well as external databases, FIST seems to be the most centralized wiki-specific search tool available and could perhaps be more widely publicized (also, @Magnus, how's its i18n/l10n?). It could perhaps also be expanded to search more image galleries.

Here's a crazy thought:

  • Set up a new wikibase installation (on Labs?)
  • Create one item for each file on Commons (we wanted to do that anyway...)
  • Create one item for each free image on the free media sources listed above, maybe even Flickr (might be too large)
  • Update those occasionally
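
To make the "one item per file, updated occasionally" part concrete, here is a rough sketch of the bookkeeping a bot would need. It uses a plain in-memory dictionary as a stand-in and does not talk to Wikibase at all; the record fields, the source labels, and the `Catalog` class are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FreeFileRecord:
    source: str      # hypothetical source label, e.g. "commons" or "noaa"
    source_id: str   # identifier of the file within that source
    title: str
    license: str


class Catalog:
    """In-memory stand-in for the item store, keyed by (source, source_id)."""

    def __init__(self):
        self._items = {}

    def upsert(self, record):
        """Create the item if it is new, otherwise update it if it changed."""
        key = (record.source, record.source_id)
        if key not in self._items:
            self._items[key] = record
            return "created"
        if self._items[key] == record:
            return "unchanged"
        self._items[key] = record
        return "updated"

    def __len__(self):
        return len(self._items)
```

Keying on the pair (source, source ID) means a repeated scrape of the same source updates existing items instead of creating duplicates, which is what would keep the periodic update step cheap.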

Largest catalog of free images on the web EVER! Bots can update the available images, community can improve it for better search/query results. We have the parts, we just need to be brave enough to scale it up.

@Magnus: That would be awesome. I'm not sure whether Labs is set up to handle that, but if there's a good base of people interested in a project like this, let us know!

@Fhocutt: It begins (scraping NOAA as an exercise):

https://tools.wmflabs.org/freefiles/index.php?title=Special:RecentChanges&limit=500&hidebots=0

I will not be online much the next 1-2 weeks, but if anyone is interested, please add yourself to the tool (freefiles)!

It looks like most of the high-impact tasks here fall under the purview of either the Search and Discovery team or the Editing team. Magnus' idea for a master database of all free images is interesting, although it sounds like a maintenance nightmare and is probably not something that the Community Tech team should take on right now.