
Image Classification Research and Development
Open, Needs Triage (Public)

Description

This is the master task gathering our efforts towards developing in-house image classification models to be used across the organization.
It includes tasks on estimating data size and access, resource availability, model development and product applications.

  • Image data: @Miriam + @Gilles to work on estimating the size of Commons image corpus at different resolutions. T215250
  • GPUs: @elukey + @EBernhardson to work on connecting the GPU to stat1005 when time allows; Miriam will test GPU models afterwards. See the GPU task here. It was suggested that GPUs are useful for others in Research (e.g. @diego and @Isaac) and Search working on text analysis. T148843
  • Evaluating existing classifiers: This is a short-term effort towards developing our own classification models. The Research team will work on a protocol for evaluating generalisability and biases of existing image classifiers that SDC (@Ramsey-WMF @Abit @dr0ptp4kt @Cparle) or others (@MusikAnimal) might want to use, based on diverse image sets from Wikidata/Commons. The Research team will also help with the integration between Wikidata items and the labels from existing image classifiers.
  • Longer-term: Training our own image classifiers: The longer-term plan, once data and processing units are available, is to train our own image classifiers for various purposes: object detection, adult image filtering, image quality, image authenticity, etc. T331134
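
As a rough illustration of what the evaluation protocol for existing classifiers could measure, the sketch below computes accuracy separately for each subset of a test collection and reports the largest gap between subsets. This is only one possible shape for such a protocol, not an agreed design; the group names, labels, and the functions `accuracy_by_group` and `largest_gap` are hypothetical.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per group.

    records: iterable of (group, true_label, predicted_label) tuples.
    The grouping (e.g. by region or subject area) is an assumption here.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        if truth == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def largest_gap(per_group):
    """Difference between the best and worst per-group accuracy."""
    values = list(per_group.values())
    return max(values) - min(values)


# Hypothetical toy predictions, for illustration only.
records = [
    ("region_a", "cat", "cat"),
    ("region_a", "dog", "dog"),
    ("region_b", "cat", "dog"),
    ("region_b", "dog", "dog"),
]
per_group = accuracy_by_group(records)
gap = largest_gap(per_group)
```

A large gap between groups would suggest the classifier generalises unevenly across the image sets being compared.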

Related Objects

Status     Assigned
Open       Miriam
Resolved   elukey
Declined   Cmjohnson
Resolved   RobH
Resolved   Ottomata
Declined   None
Resolved   elukey
Resolved   Cmjohnson
Resolved   Cmjohnson
Resolved   elukey
Resolved   Miriam
Declined   Gilles
Resolved   jijiki
Resolved   elukey
Resolved   Gilles
Resolved   Miriam
Invalid    Miriam
Resolved   Miriam
Resolved   Miriam
Resolved   Miriam
Resolved   Miriam
Declined   Miriam
Resolved   Miriam
Resolved   AikoChou
Open       tizianopiccardi
Open       Miriam

Event Timeline

One image classifier worth building, and one that probably wouldn't run into the existing, politically challenging bias problems, is a classifier that determines whether an image is a photograph or not. We have a long-standing limitation of only visually optimising (slightly sharpening) thumbnails for JPGs, because that is the only file type that is mostly photographs. This leaves thumbnails of photographs uploaded as PNG and TIFF visually flat. See T192744 for some context.

An image classifier that can tell photographs apart from diagrams, maps, schematics, etc. would be quite useful for the visual quality of the thumbnails we render. It could either be inserted directly into our thumbnailing pipeline at the time thumbnails are rendered, or it could tag images with structured data (which would allow humans to override the decision made by the classifier) that would then inform the thumbnailing process.
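
To make the two options concrete, here is a minimal sketch of how the sharpening decision could be ordered: a human-set structured-data value wins, then the classifier's verdict, then the current file-type rule. This does not reflect the actual thumbnailing code; the function `should_sharpen` and its parameters are hypothetical.

```python
# File types treated as "mostly photographs" when nothing else is known.
# This mirrors the current JPG-only behaviour described above.
PHOTO_BY_DEFAULT = {"image/jpeg"}


def should_sharpen(mime_type, classifier_says_photo=None, human_override=None):
    """Decide whether a thumbnail should get the photo sharpening step.

    human_override: value set by a person via structured data (assumed), or None.
    classifier_says_photo: output of a photo/non-photo classifier, or None.
    """
    if human_override is not None:
        return human_override          # humans can overrule the classifier
    if classifier_says_photo is not None:
        return classifier_says_photo   # otherwise trust the classifier
    return mime_type in PHOTO_BY_DEFAULT  # fall back to the file-type rule
```

With this ordering, a PNG photograph is sharpened once the classifier flags it, and a wrong classification can still be corrected by hand.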

If we go down the path of trying to identify which images are photographs, we should look into work by a former colleague of mine on detecting visualizations on Commons (in some ways, the inverse task): http://brenthecht.com/publications/www18_vizbywiki.pdf

He (Allen Lin) might have some insight into some easy wins or pitfalls in building a model like that.

@Gilles thanks for this! Photographs and graphics have very different underlying image statistics, so it is fairly easy for a classifier to tell them apart. This should be feasible.
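
As a toy illustration of what "different image statistics" can mean, the sketch below measures how much of an image is covered by its few most frequent colours: flat graphics tend to score high, photographs low. This particular feature is an assumption chosen for simplicity, not the statistic used in any of the work referenced in this thread, and the pixel data is synthetic.

```python
from collections import Counter


def top_color_coverage(pixels, k=4):
    """Fraction of pixels taken up by the k most frequent colours.

    pixels: non-empty list of (r, g, b) tuples.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = Counter(pixels)
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(pixels)


# Synthetic "diagram": two flat colours only.
diagram = [(255, 255, 255)] * 90 + [(0, 0, 0)] * 10

# Synthetic "photo": 100 pixels, each with a different colour.
photo = [(i, (i * 7) % 256, (i * 13) % 256) for i in range(100)]

diagram_score = top_color_coverage(diagram)
photo_score = top_color_coverage(photo)
```

A real classifier would learn from many such cues at once, but even a single feature like this separates the two toy inputs cleanly.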

If we can collect some training data by finding one or more Commons categories with a substantial number of diverse graphics images, I can try to quickly build a graphics vs. photo classifier by fine-tuning an existing image classifier (it won't be perfect, but no GPU needed ;) ). @Isaac maybe your colleague can help with this by sharing which categories and keywords he used to create his training data?
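
For the data collection step, a minimal sketch of building a MediaWiki Action API request that lists the files in a Commons category (`list=categorymembers`) could look like the following. The category name is only a placeholder, not a recommendation for the training set, and the helper `category_files_url` is hypothetical. The code only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"


def category_files_url(category, limit=500):
    """Build a query URL listing the files in one Commons category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,   # e.g. "Category:Diagrams" (placeholder)
        "cmtype": "file",      # files only, no subcategories or pages
        "cmlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


url = category_files_url("Category:Diagrams")
```

Larger categories would need to be paged through using the continuation value returned in each response.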

As a side note, such a classifier can also help improve the accuracy of other image classifiers (e.g. object detectors or image quality classifiers), which are typically trained on photographic material and therefore fail completely when classifying non-photographic images.
We did studies in the past to quantitatively explain the importance and the nature of the difference between graphics and photographs: https://www.dropbox.com/s/y97h8kjx84hbrzk/p242-redi.pdf?dl=0

FYI, some developments in the area of using image classification in the Wikiverse:

We now have a Wikidata Distributed Game - Depicts that uses image classification ML to generate candidates. I did this as a project with The Met Museum and Microsoft.

https://outreach.wikimedia.org/wiki/GLAM/Newsletter/January_2019/Contents/USA_report

Not sure if this is relevant, but this seemed the best place to note.

I just came across:
https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_YARN
It seems relatively easy to package up (e.g. on a notebook host), ship to HDFS, and then include in a Spark job.

Miriam added a subscriber: Harej.
Miriam claimed this task.
Miriam renamed this task from Image Classification Working Group to Image Classification Research and Development. (Jul 25 2023, 5:48 PM)